ShuriZma/xemu - xemu

Commit Graph

Author	SHA1	Message	Date
Richard Henderson	b09d78bf22	tcg/tci: Merge INDEX_op_ld16s_{i32,i64} Eliminating a TODO for ld16s_i64. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	77c38c7c3f	tcg/tci: Merge INDEX_op_ld16u_{i32,i64} Eliminating a TODO for ld16u_i32. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	850163eb4d	tcg/tci: Merge INDEX_op_ld8s_{i32,i64} Eliminating a TODO for ld8s_i32. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	7f33f5cd0a	tcg/tci: Merge INDEX_op_ld8u_{i32,i64} Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	5410e4347b	tcg/tci: Inline tci_write_reg64 into 64-bit callers Note that we had two functions of the same name: a 32-bit version which took two register numbers and a 64-bit version which was a no-op wrapper for tcg_write_reg. After this, we are left with only the 32-bit version. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	85bbbf7088	tcg/tci: Inline tci_write_reg32 into all callers For a 64-bit TCI, the upper bits of a 32-bit operation are undefined (much like a native ppc64 32-bit operation). It simplifies everything if we don't force-extend the result. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	43c8a40279	tcg/tci: Inline tci_write_reg16 into the only caller Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	475a15611f	tcg/tci: Inline tci_write_reg8 into its callers Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	9592e8974f	tcg/tci: Inline tci_write_reg32s into the only caller Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Stefan Weil	cbec0754be	tcg/tci: Implement INDEX_op_ld8s_i64 That TCG opcode is used by debian-buster (arm64) running ffmpeg: qemu-aarch64 /usr/bin/ffmpeg -i theora.mkv theora.webm Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20210128020425.2055454-1-sw@weilnetz.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Stefan Weil	49a5a75f3e	tcg/tci: Implement INDEX_op_ld16s_i32 That TCG opcode is used by debian-buster (arm64) running ffmpeg: qemu-aarch64 /usr/bin/ffmpeg -i theora.mkv theora.webm Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20210128024814.2056958-1-sw@weilnetz.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	13e71f08bf	tcg/tci: Make tci_tb_ptr thread-local Each thread must have its own pc, even under TCI. Remove the GETPC ifdef, because GETPC is always available for helpers, and thus is always required. Move the assignment under INDEX_op_call, because the value is only visible when we make a call to a helper function. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204014509.882821-6-richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	2dfa2f1871	tcg/aarch64: Do not convert TCGArg to temps that are not temps Fixes INDEX_op_rotli_vec for aarch64 host, where the 3rd argument is an integer, not a temporary, which now tickles an assert added in `e89b28a635`. Previously, the value computed into v2 would be garbage for rotli_vec, but as the value was unused it caused no harm. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Philippe Mathieu-Daudé	8e43c5a1f2	tcg/s390: Fix compare instruction from extended-immediate facility The code is currently comparing c2 to the type promotion of uint32_t and int32_t. That is, the conversion rules are as: (common_type) c2 == (common_type) (uint32_t) (is_unsigned ? (uint32_t)c2 : (uint32_t)(int32_t)c2) In the signed case we lose the desired sign extensions because of the argument promotion rules of the ternary operator. Solve the problem by doing the round-trip parsing through the intermediate type and back to the desired common type (all at one expression). Fixes: `a534bb15f3` ("tcg/s390: Use constant pool for cmpi") Tested-by: Richard W.M. Jones <rjones@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reported-by: Miroslav Rezanina <mrezanin@redhat.com> Reported-by: Richard W.M. Jones <rjones@redhat.com> Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210204182902.1742826-1-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Richard Henderson	0c823e5968	tcg: Remove TCG_TARGET_CON_SET_H All backends have now been converted to tcg-target-con-set.h, so we can remove the fallback code. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	63b29fda4e	tcg/tci: Split out constraint sets to tcg-target-con-set.h This requires finishing the conversion to tcg_target_op_def. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	0d11dc7c97	tcg/sparc: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	d1c36a9032	tcg/s390: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	665be288ac	tcg/riscv: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	6893016b90	tcg/ppc: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	0263330bce	tcg/mips: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	7166eebb9b	tcg/arm: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	39e7522b4a	tcg/aarch64: Split out constraint sets to tcg-target-con-set.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	4c22e84088	tcg/i386: Split out constraint sets to tcg-target-con-set.h This exports the constraint sets from tcg_target_op_def to a place we will be able to manipulate more in future. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	8c07f3262e	tcg: Remove TCG_TARGET_CON_STR_H All backends have now been converted to tcg-target-con-str.h, so we can remove the fallback code. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:43 -10:00
Richard Henderson	77f268e80b	tcg/sparc: Split out target constraints to tcg-target-con-str.h Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	c947deb13e	tcg/s390: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	fc63a4c5c8	tcg/riscv: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	51800e4346	tcg/mips: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	7d1820a755	tcg/tci: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	85d251d7ec	tcg/ppc: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	abc730e18e	tcg/aarch64: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	3440d583d6	tcg/arm: Split out target constraints to tcg-target-con-str.h Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:31 -10:00
Richard Henderson	358b492392	tcg/i386: Split out target constraints to tcg-target-con-str.h This eliminates the target-specific function target_parse_constraint and folds it into the single caller, process_op_defs. Since this is done directly into the switch statement, duplicates are compilation errors rather than silently ignored at runtime. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:29 -10:00
Richard Henderson	df903b94b3	tcg/i386: Tidy register constraint definitions Create symbolic constants for all low-byte-addressable and second-byte-addressable registers. Create a symbol for the registers that need reserving for softmmu. There is no functional change for 's', as this letter is only used for i386. The BYTEL name is correct for the action we wish from the constraint. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:12:07 -10:00
Richard Henderson	c7c778b5b9	tcg/i386: Move constraint type check to tcg_target_const_match Rather than check the type when filling in the constraint, check it when matching the constant. This removes the only use of the type argument to target_parse_constraint. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:11:49 -10:00
Richard Henderson	2366c858f9	tcg/tci: Remove TCG_TARGET_HAS_* ifdefs The opcodes always exist, regardless of whether or not they are enabled. Remove the unnecessary ifdefs. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:04:26 -10:00
Richard Henderson	0a19f167de	tcg/tci: Drop L and S constraints These are identical to the 'r' constraint. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-02 12:04:25 -10:00
Philippe Mathieu-Daudé	5fa6ab7ecc	tcg/tci: Restrict tci_write_reg16() to 64-bit hosts Restrict tci_write_reg16() to 64-bit hosts to fix on 32-bit ones: [520/1115] Compiling C object libqemu-arm-linux-user.fa.p/tcg_tci.c.o FAILED: libqemu-arm-linux-user.fa.p/tcg_tci.c.o tcg/tci.c:132:1: error: 'tci_write_reg16' defined but not used [-Werror=unused-function] tci_write_reg16(tcg_target_ulong *regs, TCGReg index, uint16_t value) ^~~~~~~~~~~~~~~ Fixes: `2f160e0f97` ("tci: Add implementation for INDEX_op_ld16u_i64") Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20210123094107.2340222-1-f4bug@amsat.org> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-01-27 17:15:40 +01:00
Richard Henderson	ae30e86661	tcg: Restart code generation when we run out of temps Some large translation blocks can generate so many unique constants that we run out of temps to hold them. In this case, longjmp back to the start of code generation and restart with a smaller translation block. Buglink: https://bugs.launchpad.net/bugs/1912065 Tested-by: BALATON Zoltan <balaton@eik.bme.hu> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-24 08:03:27 -10:00
Roman Bolshakov	653b87eb36	tcg: Toggle page execution for Apple Silicon Pages can't be both write and executable at the same time on Apple Silicon. macOS provides public API to switch write protection [1] for JIT applications, like TCG. 1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon Tested-by: Alexander Graf <agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com> [rth: Inline the qemu_thread_jit_* functions; drop the MAP_JIT change for a follow-on patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:13:00 -10:00
Richard Henderson	10061ffe56	tcg/aarch64: Use tcg_constant_vec with tcg vec expanders Improve rotrv_vec to reduce "t1 = -v2, t2 = t1 + c" to "t1 = -v2, t2 = c - v2". This avoids a serial dependency between t1 and t2. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	44aa59a099	tcg/ppc: Use tcg_constant_vec with tcg vec expanders Improve expand_vec_shi to use sign-extraction for MO_32. This allows a single VSPLTISB instruction to load all of the valid shift constants. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	be986adb35	tcg: Remove tcg_gen_dup{8,16,32,64}i_vec These interfaces have been replaced by tcg_gen_dupi_vec and tcg_constant_vec. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	9739a052ad	tcg/i386: Use tcg_constant_vec with tcg vec expanders Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	efe86b21ea	tcg: Add tcg_reg_alloc_dup2 There are several ways we can expand a vector dup of a 64-bit element on a 32-bit host. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	c58f4c97b2	tcg: Remove movi and dupi opcodes These are now completely covered by mov from a TYPE_CONST temporary. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	1bd1af98d7	tcg/tci: Add special tci_movi_{i32,i64} opcodes The normal movi opcodes are going away. We need something for TCI to use internally. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	88d4005b09	tcg: Use tcg_constant_{i32,i64,vec} with gvec expanders Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	11d11d61bd	tcg: Use tcg_constant_{i32,i64} with tcg int expanders Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	0b4286dd15	tcg: Convert tcg_gen_dupi_vec to TCG_CONST Because we now store uint64_t in TCGTemp, we can now always store the full 64-bit duplicate immediate. So remove the difference between 32- and 64-bit hosts. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	8fe35e0444	tcg/optimize: Use tcg_constant_internal with constant folding Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	8f17a975e6	tcg/optimize: Adjust TempOptInfo allocation Do not allocate a large block for indexing. Instead, allocate for each temporary as they are seen. In general, this will use less memory, if we consider that most TBs do not touch every target register. This also allows us to allocate TempOptInfo for new temps created during optimization. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	4c868ce645	tcg/optimize: Improve find_better_copy Prefer TEMP_CONST over anything else. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	c0522136ad	tcg: Introduce TYPE_CONST temporaries These will hold a single constant for the duration of the TB. They are hashed, so that each value has one temp across the TB. Not used yet, this is all infrastructure. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	54795544e4	tcg: Expand TempOptInfo to 64-bits This propagates the extended value of TCGTemp.val that we did before. In addition, it will be required for vector constants. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	6fcb98eda1	tcg: Rename struct tcg_temp_info to TempOptInfo Fix this name vs our coding style. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	bdb38b95f7	tcg: Expand TCGTemp.val to 64-bits This will reduce the differences between 32-bit and 64-bit hosts, allowing full 64-bit constants to be created with the same interface. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	e01fa97dea	tcg: Add temp_readonly In most, but not all, places that we check for TEMP_FIXED, we are really testing that we do not modify the temporary. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	ee17db83d2	tcg: Consolidate 3 bits into enum TCGTempKind The temp_fixed, temp_global, temp_local bits are all related. Combine them into a single enumeration. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	4e18617555	tcg: Increase tcg_out_dupi_vec immediate to int64_t While we don't store more than tcg_target_long in TCGTemp, we shouldn't be limited to that for code generation. We will be able to use this for INDEX_op_dup2_vec with 2 constants. Also pass along the minimal vece that may be said to apply to the constant. This allows some simplification in the various backends. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	0a6a8bc8eb	tcg: Use tcg_out_dupi_vec from temp_load Having dupi pass though movi is confusing and arguably wrong. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-13 08:39:08 -10:00
Richard Henderson	e5e2e4c739	tcg: Constify TCGLabelQemuLdst.raddr Now that all native tcg hosts support splitwx, make this pointer const. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	c8bc1168ad	tcg: Constify tcg_code_gen_epilogue Now that all native tcg hosts support splitwx, make this pointer const. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	de2fac62d2	tcg: Remove TCG_TARGET_SUPPORT_MIRROR Now that all native tcg hosts support splitwx, remove the define. Replace the one use with a test for CONFIG_TCG_INTERPRETER. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	69478b8b15	tcg/arm: Support split-wx code generation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	df5af1306a	tcg/mips: Support split-wx code generation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	91a7fd1fb6	tcg/mips: Do not assert on relocation overflow This target was not updated with `7ecd02a06f`, and so did not allow re-compilation with relocation overflow. Remove reloc_26 and reloc_26_val as unused. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	793f738196	tcg/riscv: Support split-wx code generation Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	4b6a52d01e	tcg/riscv: Remove branch-over-branch fallback Since `7ecd02a06f`, we are prepared to re-start code generation with a smaller TB if a relocation is out of range. We no longer need to leave a nop in the stream Just In Case. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	844d0442a5	tcg/riscv: Fix branch range checks The offset even checks were folded into the range check incorrectly. By offsetting by 1, and not decrementing the width, we silently allowed out of range branches. Assert that the offset is always even instead. Move tcg_out_goto down into the CONFIG_SOFTMMU block so that it is not unused. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	79dae4ddd8	tcg/s390: Support split-wx code generation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	dd90043f5d	tcg/s390: Use tcg_tbrel_diff Use tcg_tbrel_diff when we need a displacement to a label, and with a NULL argument when we need the normalizing addend. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	0d8b6191ac	tcg/sparc: Support split-wx code generation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	47c2206ba4	tcg/sparc: Use tcg_tbrel_diff Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	d54401dfef	tcg/ppc: Support split-wx code generation Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	2d6f38ebe5	tcg/ppc: Use tcg_out_mem_long to reset TCG_REG_TB The maximum TB code gen size is UINT16_MAX, which the current code does not support. Use our utility function to optimally add an arbitrary constant. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	e6dc7f818f	tcg/ppc: Use tcg_tbrel_diff Use tcg_tbrel_diff when we need a displacement to a label, and with a NULL argument when we need the normalizing addend. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	305daaedf6	tcg/tci: Push const down through bytecode reading Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	ffba3eb34b	tcg/aarch64: Support split-wx code generation Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	f716bab3a9	tcg/aarch64: Use B not BL for tcg_out_goto_long A typo generated a branch-and-link insn instead of plain branch. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	705ed477d5	tcg/i386: Support split-wx code generation Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	eba40358b4	tcg: Return the TB pointer from the rx region from exit_tb This produces a small pc-relative displacement within the generated code to the TB structure that preceeds it. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	a35b3e1415	tcg: Add --accel tcg,split-wx property Plumb the value through to alloc_code_gen_buffer. This is not supported by any os or tcg backend, so for now enabling it will result in an error. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	d997143533	tcg: Make DisasContextBase.tb const There is nothing within the translators that ought to be changing the TranslationBlock data, so make it const. This does not actually use the read-only copy of the data structure that exists within the rx region. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	1acbad0f27	tcg: Adjust tb_target_set_jmp_target for split-wx Pass both rx and rw addresses to tb_target_set_jmp_target. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	755bf9e514	tcg: Adjust tcg_register_jit for const We must change all targets at once, since all must match the declaration in tcg.c. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	92ab8e7d62	tcg: Adjust tcg_out_label for const Simplify the arguments to always use s->code_ptr instead of take it as an argument. That makes it easy to ensure that the value_ptr is always the rx version. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	2be7d76b15	tcg: Adjust tcg_out_call for const We must change all targets at once, since all must match the declaration in tcg.c. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	ffd0e50736	tcg: Adjust TCGLabel for const Change TCGLabel.u.value_ptr to const, and initialize it with tcg_splitwx_to_rx. Propagate const through tcg/host/ only as far as needed to avoid errors from the value_ptr change. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	db0c51a380	tcg: Introduce tcg_splitwx_to_{rx,rw} Add two helper functions, using a global variable to hold the displacement. The displacement is currently always 0, so no change in behaviour. Begin using the functions in tcg common code only. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	8b5c2b6260	tcg: Move tcg epilogue pointer out of TCGContext This value is constant across all thread-local copies of TCGContext, so we might as well move it out of thread-local storage. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	b91ccb3115	tcg: Move tcg prologue pointer out of TCGContext This value is constant across all thread-local copies of TCGContext, so we might as well move it out of thread-local storage. Use the correct function pointer type, and name the variable tcg_qemu_tb_exec, which means that we are able to remove the macro that does the casting. Replace HAVE_TCG_QEMU_TB_EXEC with CONFIG_TCG_INTERPRETER, as this is somewhat clearer in intent. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	1da8de39a3	util: Enhance flush_icache_range with separate data pointer We are shortly going to have a split rw/rx jit buffer. Depending on the host, we need to flush the dcache at the rw data pointer and flush the icache at the rx code pointer. For now, the two passed pointers are identical, so there is no effective change in behaviour. Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	df5d2b1658	tcg: Do not flush icache for interpreter This is currently a no-op within tci/tcg-target.h, but is about to be moved to a more generic location. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	07ce0b0530	tcg: Introduce INDEX_op_qemu_st8_i32 Enable this on i386 to restrict the set of input registers for an 8-bit store, as required by the architecture. This removes the last use of scratch registers for user-only mode. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:06 -10:00
Richard Henderson	d2ef1b83a7	tcg/i386: Adjust TCG_TARGET_HAS_MEMORY_BSWAP Always true when movbe is available, otherwise leave this to generic code. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:06 -10:00
Peter Maydell	aadac5b3d9	From Alex's pull request: * improve cross-build KVM coverage * new --without-default-features configure flag * add __repr__ for ConsoleSocket for debugging * build tcg tests with -Werror * test 32 bit builds with fedora * remove last traces of debian9 * hotfix for centos8 powertools repo * Move lots of feature detection code to meson (Alex, myself) * CFI and LTO support (Daniele) * test-char dangling pointer (Eduardo) * Build system and win32 fixes (Marc-André) * Initialization fixes (myself) * TCG include cleanup (Richard, myself) * x86 'int N' fix (Peter) -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl/1gRUUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPTwAf+J/ffnckmzpckB1gwZ5vEnkYNDreq NrYWDpmnOX6mICXC68WsTmyOvoAvn5es/PF36rOEZ3mDHdF7/RGn/5zxKculLTKp uISs0wdApEC5n78iQwIlec6nzgjteg+DIfaLqQ4P4sVuEtFkuAVsv5E3BJGVoHLg sXy8gTEf95KS9r5bZpzP70rAjIbmxcAjbET4fvdELjkGDNCTRKmpEYPj0sE6qaBp 0/VdqVLpLthuEQoDuEWube7Y2LA/ZuY3Gfxq1em+abXqFJBTAXBf2GET6a/BjLU6 N7wO5FEQ0CUG8fst/Zw3Xp1htGPZTYYMtr0dipYEI2np0A7/CITjTWsekg== =rsil -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging From Alex's pull request: * improve cross-build KVM coverage * new --without-default-features configure flag * add __repr__ for ConsoleSocket for debugging * build tcg tests with -Werror * test 32 bit builds with fedora * remove last traces of debian9 * hotfix for centos8 powertools repo * Move lots of feature detection code to meson (Alex, myself) * CFI and LTO support (Daniele) * test-char dangling pointer (Eduardo) * Build system and win32 fixes (Marc-André) * Initialization fixes (myself) * TCG include cleanup (Richard, myself) * x86 'int N' fix (Peter) # gpg: Signature made Wed 06 Jan 2021 09:21:25 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (52 commits) win32: drop fd registration to the main-loop on setting non-block configure: move tests/qemu-iotests/common.env generation to meson meson.build: convert --with-default-devices to meson libattr: convert to meson cap_ng: convert to meson virtfs: convert to meson seccomp: convert to meson zstd: convert to meson lzfse: convert to meson snappy: convert to meson lzo: convert to meson rbd: convert to meson libnfs: convert to meson libiscsi: convert to meson bzip2: convert to meson glusterfs: convert to meson curl: convert to meson curl: remove compatibility code, require 7.29.0 brlapi: convert to meson configure: remove CONFIG_FILEVERSION and CONFIG_PRODUCTVERSION ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # trace/meson.build	2021-01-06 15:55:29 +00:00
Zihao Yu	d2f3066eb2	tcg/riscv: Fix illegal shift instructions Out-of-range shifts have undefined results, but must not trap. Mask off immediate shift counts to solve this problem. This bug can be reproduced by running the following guest instructions: xor %ecx,%ecx sar %cl,%eax cmovne %edi,%eax After optimization, the tcg opcodes of the sar are movi_i32 tmp3,$0xffffffffffffffff pref=all sar_i32 tmp3,eax,tmp3 dead: 2 pref=all mov_i32 cc_dst,eax sync: 0 dead: 1 pref=0xffc0300 mov_i32 cc_src,tmp3 sync: 0 dead: 0 1 pref=all movi_i32 cc_op,$0x31 sync: 0 dead: 0 pref=all The sar_i32 opcode is a shift by -1, which unmasked generates 0x200808d618: fffa5b9b illegal Signed-off-by: Zihao Yu <yuzihao@ict.ac.cn> Message-Id: <20201216081206.9628-1-yuzihao@ict.ac.cn> [rth: Reworded the patch description.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-04 06:32:58 -10:00
Richard Henderson	6d3ef04893	tcg: Use memset for large vector byte replication In `f47db80cc0`, we handled odd-sized tail clearing for the case of hosts that have vector operations, but did not handle the case of hosts that do not have vector ops. This was ok until `e2e7168a21`, which changed the encoding of simd_desc such that the odd sizes are impossible. Add memset as a tcg helper, and use that for all out-of-line byte stores to vectors. This includes, but is not limited to, the tail clearing operation in question. Cc: qemu-stable@nongnu.org Buglink: https://bugs.launchpad.net/bugs/1907817 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-04 06:32:58 -10:00
Richard Henderson	084cfca143	util: Extract flush_icache_range to cacheflush.c This has been a tcg-specific function, but is also in use by hardware accelerators via physmem.c. This can cause link errors when tcg is disabled. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Joelle van Dyne <j@getutm.app> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20201214140314.18544-3-richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:36 +01:00
Daniele Buono	c905a3680d	cfi: Initial support for cfi-icall in QEMU LLVM/Clang, supports runtime checks for forward-edge Control-Flow Integrity (CFI). CFI on indirect function calls (cfi-icall) ensures that, in indirect function calls, the function called is of the right signature for the pointer type defined at compile time. For this check to work, the code must always respect the function signature when using function pointer, the function must be defined at compile time, and be compiled with link-time optimization. This rules out, for example, shared libraries that are dynamically loaded (given that functions are not known at compile time), and code that is dynamically generated at run-time. This patch: 1) Introduces the CONFIG_CFI flag to support cfi in QEMU 2) Introduces a decorator to allow the definition of "sensitive" functions, where a non-instrumented function may be called at runtime through a pointer. The decorator will take care of disabling cfi-icall checks on such functions, when cfi is enabled. 3) Marks functions currently in QEMU that exhibit such behavior, in particular: - The function in TCG that calls pre-compiled TBs - The function in TCI that interprets instructions - Functions in the plugin infrastructures that jump to callbacks - Functions in util that directly call a signal handler Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> Acked-by: Alex Bennée <alex.bennee@linaro.org Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:35 +01:00
Thomas Huth	d84568b773	tcg/optimize: Add fallthrough annotations To be able to compile this file with -Werror=implicit-fallthrough, we need to add some fallthrough annotations to the case statements that might fall through. Unfortunately, the typical "/* fallthrough */" comments do not work here as expected since some case labels are wrapped in macros and the compiler fails to match the comments in this case. But using __attribute__((fallthrough)) seems to work fine, so let's use that instead (by introducing a new QEMU_FALLTHROUGH macro in our compiler.h header file). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20201211152426.350966-11-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2020-12-18 09:14:23 +01:00
Richard Henderson	c56caea3b2	tcg: Revert "tcg/optimize: Flush data at labels not TCG_OPF_BB_END" This reverts commit `cd0372c515`. The patch is incorrect in that it retains copies between globals and non-local temps, and non-local temps still die at the end of the BB. Failing test case for hppa: .globl _start _start: cmpiclr,= 0x24,%r19,%r0 cmpiclr,<> 0x2f,%r19,%r19 ---- 00010057 0001005b movi_i32 tmp0,$0x24 sub_i32 tmp1,tmp0,r19 mov_i32 tmp2,tmp0 mov_i32 tmp3,r19 movi_i32 tmp1,$0x0 ---- 0001005b 0001005f brcond_i32 tmp2,tmp3,eq,$L1 movi_i32 tmp0,$0x2f sub_i32 tmp1,tmp0,r19 mov_i32 tmp2,tmp0 mov_i32 tmp3,r19 movi_i32 tmp1,$0x0 mov_i32 r19,tmp1 setcond_i32 psw_n,tmp2,tmp3,ne set_label $L1 In this case, both copies of "mov_i32 tmp3,r19" are removed. The second because opt thought it was redundant. The first is removed later by liveness because tmp3 is known to be dead. This leaves the setcond_i32 with an uninitialized input. Revert the entire patch for 5.2, and a proper optimization across the branch may be considered for the next development cycle. Reported-by: qemu@igor2.repo.hu Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-11-04 10:35:40 -08:00
Richard Henderson	f14bed3fd4	tcg: Remove assert from set_jmp_reset_offset Since `6e6c4efed9`, there has been a more appropriate range check done later at the end of tcg_gen_code. There, a failing range check results in a returned error code, which causes the TB to be restarted at half the size. Reported-by: Sai Pavan Boddu <saipava@xilinx.com> Tested-by: Sai Pavan Boddu <sai.pavan.boddu@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-11-04 10:35:40 -08:00
Richard Henderson	cd0372c515	tcg/optimize: Flush data at labels not TCG_OPF_BB_END We can easily propagate temp values through the entire extended basic block (in this case, the set of blocks connected by fallthru), simply by not discarding the register state at the branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-27 09:48:07 -07:00
Richard Henderson	b4cb76e620	tcg: Do not kill globals at conditional branches We can easily register allocate the entire extended basic block (in this case, the set of blocks connected by fallthru), simply by not discarding the register state at the branch. This does not help blocks starting with a label, as they are reached via a taken branch, and that would require saving the complete register state at the branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-27 09:48:07 -07:00
Richard Henderson	cae5d53b9e	tcg: Remove TCG_TARGET_HAS_cmp_vec The cmp_vec opcode is mandatory; this symbol is unused. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	1dc4fe7012	tcg/optimize: Fold dup2_vec When the two arguments are identical, this can be reduced to dup_vec or to mov_vec from a tcg_constant_vec. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	a5b30d950c	tcg: Fix generation of dupi_vec for 32-bit host The definition of INDEX_op_dupi_vec is that it operates on units of tcg_target_ulong -- in this case 32 bits. It does not work to use this for a uint64_t value that happens to be small enough to fit in tcg_target_ulong. Fixes: `d2fd745fe8` Fixes: `db432672dc` Cc: qemu-stable@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	f80d09b599	tcg/i386: Fix dupi for avx2 32-bit hosts The previous change wrongly stated that 32-bit avx2 should have used VPBROADCASTW. But that's a 16-bit broadcast and we want a 32-bit broadcast. Fixes: `7b60ef3264` Cc: qemu-stable@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	bc2b17e6ea	tcg: Move some TCG_CT_* bits to TCGArgConstraint bitfields These are easier to set and test when they have their own fields. Reduce the size of alias_index and sort_index to 4 bits, which is sufficient for TCG_MAX_OP_ARGS. This leaves only the bits indicating constants within the ct field. Move all initialization to allocation time, rather than init individual fields in process_op_defs. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	74a117906b	tcg: Remove TCG_CT_REG This wasn't actually used for anything, really. All variable operands must accept registers, and which are indicated by the set in TCGArgConstraint.regs. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	66792f90f1	tcg: Move sorted_args into TCGArgConstraint.sort_index This uses an existing hole in the TCGArgConstraint structure and will be convenient for keeping the data in one place. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	9be0d08019	tcg: Drop union from TCGArgConstraint The union is unused; let "regs" appear in the main structure without the "u.regs" wrapping. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	e2e7168a21	tcg: Adjust simd_desc size encoding With larger vector sizes, it turns out oprsz == maxsz, and we only need to represent mismatch for oprsz <= 32. We do, however, need to represent larger oprsz and do so without reducing SIMD_DATA_BITS. Reduce the size of the oprsz field and increase the maxsz field. Steal the oprsz value of 24 to indicate equality with maxsz. Tested-by: Frank Chang <frank.chang@sifive.com> Reviewed-by: Frank Chang <frank.chang@sifive.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-08 05:57:32 -05:00
Richard Henderson	4c389f6edf	disas: Move host asm annotations to tb_gen_code Instead of creating GStrings and passing them into log_disas, just print the annotations directly in tb_gen_code. Fix the annotations for the slow paths of the TB, after the part implementing the final guest instruction. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-10-03 04:25:14 -05:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Richard Henderson	fe4b0b5bfa	tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem We already support duplication of 128-bit blocks. This extends that support to 256-bit blocks. This will be needed by SVE2. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-09-03 13:13:58 -07:00
Richard Henderson	6a17646176	tcg: Eliminate one store for in-place 128-bit dup_mem Do not store back to the exact memory from which we just loaded. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-09-03 13:13:58 -07:00
Stephen Long	e7e8f33fb6	tcg: Fix tcg gen for vectorized absolute value The fallback inline expansion for vectorized absolute value, when the host doesn't support such an insn was flawed. E.g. when a vector of bytes has all elements negative, mask will be 0xffff_ffff_ffff_ffff. Subtracting mask only adds 1 to the low element instead of all elements becase -mask is 1 and not 0x0101_0101_0101_0101. Signed-off-by: Stephen Long <steplong@quicinc.com> Message-Id: <20200813161818.190-1-steplong@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-09-03 13:13:58 -07:00
Peter Maydell	dd8014e4e9	ppc patch queue 2020-08-18 Here's my first pull request for qemu-5.2, which has quite a few accumulated things. Highlights are: * Preliminary support for POWER10 (Power ISA 3.1) instruction emulation * Add documentation on the (very confusing) pseries NUMA configuration * Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip * Fix icount for a number of POWER registers * Many cleanups to error handling in XIVE code * Validate size of -prom-env data -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl87VpwACgkQbDjKyiDZ s5LjIxAAs8YAQe3uDRz1Wb9GftoMmEHdq7JQoO0FbXDQIVXzpTAXmFLSBtCWKl6p O1MEIy/o48b5ORXJqSDSA5LgxbHxYfHdIPEY5Tbn/TGvTvKyCukx9n11milUG8In JxRrOTQBnQAAHkLoyuZyrWKOauC0N1scNrnX9Geuid13GcmqHg1d2alXAUu8jEeC HSiVmtMqqyyqTx2xA4vfhaGuuwTthnKNfbGdg9ksVqBsCW+etn6ZKGImt8hBe3qO 5iqbQZvFbkpzgbjkhDzUDM6tmUAFN55y/Y+y7I8Tz4/IX7d3WbdqpplwrXXVWkpq 2gcBBjQ/9a1hPTBRVN9jn4CvHfhILBfeHIElUiLpSTQZQQALymTnnI2pLCgKoEFX LcchXbjiX+pZ2OJnAijpwBcknjgT2U/ZNyiqHJfSQ6jzlYx1YtUf4xGUsgloSiK8 9QDK8o2k0Cm8Be+lPMBMmTctoi8bq+8SN5UUF710WQL235J58o9+z1vuGO2HVk3x flBtv/+B890wcCDpGU80DPs/LSzR0xTTbA5JsWft2fvO569mda0MoWkJH5w6jvSc ZLYqljCzFCVW+tKiGHzaBalJaMwn0+QMDTsxzP3yTt5LmmEeRXpBELgvrW64IobD xBeryH3nG4SwxFSJq+4ATfvUzjy/Eo58lTTl6c53Ji8/D3aFwsA= =L9Wi -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.2-20200818' into staging ppc patch queue 2020-08-18 Here's my first pull request for qemu-5.2, which has quite a few accumulated things. Highlights are: * Preliminary support for POWER10 (Power ISA 3.1) instruction emulation * Add documentation on the (very confusing) pseries NUMA configuration * Fix some bugs handling edge cases with XICS, XIVE and kernel_irqchip * Fix icount for a number of POWER registers * Many cleanups to error handling in XIVE code * Validate size of -prom-env data # gpg: Signature made Tue 18 Aug 2020 05:18:36 BST # gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full] # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full] # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full] # gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown] # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-5.2-20200818: (40 commits) spapr/xive: Use xive_source_esb_len() nvram: Exit QEMU if NVRAM cannot contain all -prom-env data spapr/xive: Simplify error handling of kvmppc_xive_cpu_synchronize_state() ppc/xive: Simplify error handling in xive_tctx_realize() spapr/xive: Simplify error handling in kvmppc_xive_connect() ppc/xive: Fix error handling in vmstate_xive_tctx_*() callbacks spapr/xive: Fix error handling in kvmppc_xive_post_load() spapr/kvm: Fix error handling in kvmppc_xive_pre_save() spapr/xive: Rework error handling of kvmppc_xive_set_source_config() spapr/xive: Rework error handling in kvmppc_xive_get_queues() spapr/xive: Rework error handling of kvmppc_xive_[gs]et_queue_config() spapr/xive: Rework error handling of kvmppc_xive_cpu_[gs]et_state() spapr/xive: Rework error handling of kvmppc_xive_mmap() spapr/xive: Rework error handling of kvmppc_xive_source_reset() spapr/xive: Rework error handling of kvmppc_xive_cpu_connect() spapr: Simplify error handling in spapr_phb_realize() spapr/xive: Convert KVM device fd checks to assert() ppc/xive: Introduce dedicated kvm_irqchip_in_kernel() wrappers ppc/xive: Rework setup of XiveSource::esb_mmio target/ppc: Integrate icount to purr, vtb, and tbu40 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2020-08-24 09:35:21 +01:00
Paolo Bonzini	139c1837db	meson: rename included C source files to .c.inc With Makefiles that have automatically generated dependencies, you generated includes are set as dependencies of the Makefile, so that they are built before everything else and they are available when first building the .c files. Alternatively you can use a fine-grained dependency, e.g. target/arm/translate.o: target/arm/decode-neon-shared.inc.c With Meson you have only one choice and it is a third option, namely "build at the beginning of the corresponding target"; the way you express it is to list the includes in the sources of that target. The problem is that Meson decides if something is a source vs. a generated include by looking at the extension: '.c', '.cc', '.m', '.C' are sources, while everything else is considered an include---including '.inc.c'. Use '.c.inc' to avoid this, as it is consistent with our other convention of using '.rst.inc' for included reStructuredText files. The editorconfig file is adjusted. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-08-21 06:18:30 -04:00
Lijun Pan	73ebe95e8e	target/ppc: add vmulld to INDEX_op_mul_vec case Group vmuluwm and vmulld. Make vmulld-specific changes since it belongs to new ISA 3.1. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-3-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Richard Henderson	69c918d2ef	tcg: Save/restore vecop_list around minmax fallback Forgetting this asserts when tcg_gen_cmp_vec is called from within tcg_gen_cmpsel_vec. Fixes: `72b4c792c7` Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-16 13:09:22 -07:00
Liao Pingfang	895bfa84fe	tcg/riscv: Remove superfluous breaks Remove superfluous breaks, as there is a "return" before them. Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <1594600421-22942-1-git-send-email-wang.yi59@zte.com.cn> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>	2020-07-13 17:25:37 -07:00
Richard Henderson	852f933e48	tcg: Fix do_nonatomic_op_* vs signed operations The smin/smax/umin/umax operations require the operands to be properly sign extended. Do not drop the MO_SIGN bit from the load, and additionally extend the val input. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reported-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20200701165646.1901320-1-richard.henderson@linaro.org>	2020-07-06 10:58:19 -07:00
Catherine A. Frederick	94248cfc04	tcg/ppc: Sanitize immediate shifts Sanitize shift constants so that shift operations with large constants don't generate invalid instructions. Signed-off-by: Catherine A. Frederick <chocola@animebitch.es> Message-Id: <20200607211100.22858-1-agrecascino123@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-06 10:58:19 -07:00
Emilio G. Cota	938e897a66	tcg: call qemu_spin_destroy for tb->jmp_lock Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Robert Foley <robert.foley@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [RF: minor changes + remove tb_destroy_func] Message-Id: <20200609200738.445-7-robert.foley@linaro.org> Message-Id: <20200612190237.30436-10-alex.bennee@linaro.org>	2020-06-16 14:49:05 +01:00
Richard Henderson	61f15c487f	tcg: Improve move ops in liveness_pass_2 If the output of the move is dead, then the last use is in the store. If we propagate the input to the store, then we can remove the move opcode entirely. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	ab87a66fa2	tcg/ppc: Implement INDEX_op_rot[lr]v_vec We already had support for rotlv, using a target-specific opcode; convert to use the generic opcode. Handle rotrv via simple negation. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	7cff8988fa	tcg/aarch64: Implement INDEX_op_rotl{i,v}_vec For immediate rotate , we can implement this in two instructions, using SLI. For variable rotate, the oddness of aarch64 right-shift- as-negative-left-shift means a backend-specific expansion works best. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	885b1706df	tcg/i386: Implement INDEX_op_rotl{i,s,v}_vec For immediates, we must continue the special casing of 8-bit elements. The other element sizes and shift types are trivially implemented with shifts. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	23850a74af	tcg: Implement gvec support for rotate by scalar No host backend support yet, but the interfaces for rotls are in place. Only implement left-rotate for now, as the only known use of vector rotate by scalar is s390x, so any right-rotate would be unused and untestable. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	3d5bb2ea5c	tcg: Remove expansion to shift by vector from do_shifts We do not reflect this expansion in tcg_can_emit_vecop_list, so it is unused and unusable. However, we actually perform the same expansion in do_gvec_shifts, so it is also unneeded. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	5d0ceda902	tcg: Implement gvec support for rotate by vector No host backend support yet, but the interfaces for rotlv and rotrv are in place. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Drop the generic expansion from rot to shift; we can do better for each backend, and then this code becomes unused.	2020-06-02 08:42:37 -07:00
Richard Henderson	b0f7e7444c	tcg: Implement gvec support for rotate by immediate No host backend support yet, but the interfaces for rotli are in place. Canonicalize immediate rotate to the left, based on a survey of architectures, but provide both left and right shift interfaces to the translators. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Alex Bennée	e5ef4ec28b	disas: include an optional note for the start of disassembly This will become useful shortly for providing more information about output assembly inline. While there fix up the indenting and code formatting in disas(). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200513175134.19619-9-alex.bennee@linaro.org>	2020-05-15 15:25:16 +01:00
Richard Henderson	07dada0336	tcg: Fix integral argument type to tcg_gen_rot[rl]i_i{32,64} For the benefit of compatibility of function pointer types, we have standardized on int32_t and int64_t as the integral argument to tcg expanders. We converted most of them in `474b2e8f0f`, but missed the rotates. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:10 -07:00
Richard Henderson	ac09ae627e	tcg: Add load_dest parameter to GVecGen2 We have this same parameter for GVecGen2i, GVecGen3, and GVecGen3i. This will make some SVE2 insns easier to parameterize. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:09 -07:00
Richard Henderson	f47db80cc0	tcg: Improve vector tail clearing Better handling of non-power-of-2 tails as seen with Arm 8-byte vector operations. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:07 -07:00
Richard Henderson	398f21412a	tcg: Remove tcg_gen_gvec_dup{8,16,32,64}i These interfaces are now unused. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:05 -07:00
Richard Henderson	03ddb6f315	tcg: Use tcg_gen_gvec_dup_imm in logical simplifications Replace the outgoing interface. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:04 -07:00
Richard Henderson	44c94677fe	tcg: Add tcg_gen_gvec_dup_imm Add a version of tcg_gen_dup_* that takes both immediate and a vector element size operand. This will replace the set of tcg_gen_gvec_dup{8,16,32,64}i functions that encode the element size within the function name. Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:24:58 -07:00
lixinyu	a4e57084c1	tcg/mips: mips sync* encode error OPC_SYNC_WMB, OPC_SYNC_MB, OPC_SYNC_ACQUIRE, OPC_SYNC_RELEASE and OPC_SYNC_RMB have wrong encode. According to the mips manual, their encode should be 'OPC_SYNC \| 0x?? << 6' rather than 'OPC_SYNC \| 0x?? << 5'. Wrong encode can lead illegal instruction errors. These instructions often appear with multi-threaded simulation. Fixes: `6f0b99104a` ("tcg/mips: Add support for fence") Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: lixinyu <precinct@mail.ustc.edu.cn> Message-Id: <20200411124612.12560-1-precinct@mail.ustc.edu.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-04-12 14:07:07 -07:00
Richard Henderson	cce743abbf	tcg/i386: Fix %r12 guest_base initialization When %gs cannot be used, we use register offset addressing. This path is almost never used, so it was clearly not tested. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200406174803.8192-1-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2020-04-07 16:19:49 +01:00
Richard Henderson	e20cb81d9c	tcg/i386: Fix INDEX_op_dup2_vec We were only constructing the 64-bit element, and not replicating the 64-bit element across the rest of the vector. Cc: qemu-stable@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-03-30 11:16:17 -07:00
Richard Henderson	312b426fea	tcg/i386: Bound shift count expanding sari_vec A given RISU testcase for SVE can produce tcg-op-vec.c:511: do_shifti: Assertion `i >= 0 && i < (8 << vece)' failed. because expand_vec_sari gave a shift count of 32 to a MO_32 vector shift. In `44f1441dbe`, we changed from direct expansion of vector opcodes to re-use of the tcg expanders. So while the comment correctly notes that the hw will handle such a shift count, we now have to take our own sanity checks into account. Which is easy in this particular case. Fixes: `44f1441dbe` Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-03-17 08:41:07 -07:00
Richard Henderson	fc1bfdfd0c	tcg/arm: Expand epilogue inline It is, after all, just two instructions. Profiling on a cortex-a15, using -d nochain to increase the number of exit_tb that are executed, shows a minor improvement of 0.5%. Signed-off-by: Richard Henderson <rth@twiddle.net>	2020-02-28 10:58:41 -08:00
Richard Henderson	d6fa50f536	tcg/arm: Split out tcg_out_epilogue We will shortly use this function from tcg_out_op as well. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2020-02-28 10:58:41 -08:00
Alex Bennée	fcc54ab5c7	tcg: save vaddr temp for plugin usage While do_gen_mem_cb does copy (via extu_tl_i64) vaddr into a new temp this won't help if the vaddr temp gets clobbered by the actual load/store op. To avoid this clobbering we explicitly copy vaddr before the op to ensure it is live my the time we do the instrumentation. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Cc: qemu-stable@nongnu.org Message-Id: <20200225124710.14152-18-alex.bennee@linaro.org>	2020-02-25 20:20:23 +00:00
Richard Henderson	2445971604	tcg: Add tcg_gen_gvec_5_ptr Extend the vector generator infrastructure to handle 5 vector arguments. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Taylor Simpson <tsimpson@quicinc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-12 14:58:36 -08:00
Philippe Mathieu-Daudé	d3582cfd27	tcg: Move TCG headers to include/tcg/ Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-4-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Philippe Mathieu-Daudé	2b434dd127	tcg: Search includes in the parent source directory All the *.inc.c files included by tcg/$TARGET/tcg-target.inc.c are in tcg/, their parent directory. To simplify the preprocessor search path, include the relative parent path: '..'. Patch created mechanically by running: $ for x in tcg-pool.inc.c tcg-ldst.inc.c; do \ sed -i "s,#include \"$x\",#include \"../$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-3-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Philippe Mathieu-Daudé	dcb32f1d8f	tcg: Search includes from the project root source directory We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Richard Henderson	fc4120a378	cputlb: Rename helper_ret_ld_cmmu to cpu_ld_code There are no uses of the _cmmu names other than the bare wrapping within the _code inlines. Therefore rename the functions so we can drop the inlines. Use abi_ptr instead of target_ulong in preparation for user-only; the two types are identical for softmmu. Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Robert Foley	7606488c0e	Add use of RCU for qemu_logfile. This now allows changing the logfile while logging is active, and also solves the issue of a seg fault while changing the logfile. Any read access to the qemu_logfile handle will use the rcu_read_lock()/unlock() around the use of the handle. To fetch the handle we will use atomic_rcu_read(). We also in many cases do a check for validity of the logfile handle before using it to deal with the case where the file is closed and set to NULL. The cases where we write to the qemu_logfile will use atomic_rcu_set(). Writers will also use call_rcu() with a newly added qemu_logfile_free function for freeing/closing when readers have finished. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-6-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Robert Foley	fc59d2d870	qemu_log_lock/unlock now preserves the qemu_logfile handle. qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Peter Maydell	cb974c95df	tcg/LICENSE: Remove out of date claim about TCG subdirectory licensing Since 2008 the tcg/LICENSE file has not changed: it claims that everything under tcg/ is BSD-licensed. This is not true and hasn't been true for years: in 2013 we accepted the tcg/aarch64 target code under a GPLv2-or-later license statement. We also have generic vector optimisation code under the LGPL2.1-or-later, and the TCI backend is GPLv2-or-later. Further, many of the files are not BSD licensed but MIT licensed. We don't really consider the tcg subdirectory to be a distinct part of QEMU anyway. Remove the LICENSE file, since claiming false information about the license of the code is confusing. Update the main project LICENSE file also to be clearer about the licenses used by TCG. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-5-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	2552e30cba	tcg/ppc/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for tcg/i386/tcg-target.opc.h. This file has had only two commits, `4b06c21682` and `d9897efa1f`, both by a Linaro engineer. The license is MIT, since that's what the rest of tcg/ppc/ is. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-4-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	2029bf7e52	tcg/i386/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for tcg/i386/tcg-target.opc.h. This file has had only one commit, `770c2fc7bb`, by a Linaro engineer. The license is MIT, since that's what the rest of tcg/i386/ is. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-3-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	97105f2921	tcg/aarch64/tcg-target.opc.h: Add copyright/license Add the copyright/license boilerplate for target/aarch64/tcg-target.opc.h. This file has only had two commits: `14e4c1e235` and `79525dfd08`, both by the same Linaro engineer. The license is GPL-2-or-later, since that's what the rest of tcg/aarch64 uses. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20191025155848.17362-2-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-11-11 15:11:21 +01:00
Peter Maydell	68d8ef4ec5	TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAl23BZMACgkQ+9DbCVqe KkRPegf/QHygZ4ER2jOaWEookxiOEcik+dzQKVGNqLNXeMLvo5fGjGVpFoFxSgfv ZvCAL4xbW44zsYlVfh59tfn4Tu9qK7s7/qM3WXpHsmuvEuhoWef0Lt2jSe+D46Rs KeG/aX+rHLUR8rr9eCgE+1/MQmxPUj3VUonkUpNkk2ebBbSNoLSOudB4DD9Vcyl7 Pya1kPvA6W9bwI20ZSWihE7flg13o62Pp+LgAFLrsfxXOxOMkPrU8Pp+B0Dvr+hL 5Oh0clZLhiRi75x+KVGZ90TVsoftdjYoOWGMOudS/+NNmqKT1NTLm0K1WJYyRMQ1 V0ne4/OcGNq7x8gcOx/xs09ADu5/VA== =UXR/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst # gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full] # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits) travis.yml: enable linux-gcc-debug-tcg cache MAINTAINERS: add me for the TCG plugins code scripts/checkpatch.pl: don't complain about (foo, /* empty */) .travis.yml: add --enable-plugins tests include/exec: wrap cpu_ldst.h in CONFIG_TCG accel/stubs: reduce headers from tcg-stub tests/plugin: add hotpages to analyse memory access patterns tests/plugin: add instruction execution breakdown tests/plugin: add a hotblocks plugin tests/tcg: enable plugin testing tests/tcg: drop test-i386-fprem from TESTS when not SLOW tests/tcg: move "virtual" tests to EXTRA_TESTS tests/tcg: set QEMU_OPTS for all cris runs tests/tcg/Makefile.target: fix path to config-host.mak tests/plugin: add sample plugins linux-user: support -plugin option vl: support -plugin option plugin: add qemu_plugin_outs helper plugin: add qemu_plugin_insn_disas helper plugin: expand the plugin_init function to include an info block ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-30 14:10:32 +00:00
Alex Bennée	7dec71d5ff	cputlb: ensure _cmmu helper functions follow the naming standard We document this in docs/devel/load-stores.rst so lets follow it. The 32 bit and 64 bit access functions have historically not included the sign so we leave those as is. We also introduce some signed helpers which are used for loading immediate values in the translator. Fixes: `282dffc8` Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191021150910.23216-1-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	e6d86bed50	tcg: let plugins instrument virtual memory accesses To capture all memory accesses we need hook into all the various helper functions that are involved in memory operations as well as the injected inline helper calls. A later commit will allow us to resolve the actual guest HW addresses by replaying the lookup. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: drop haddr handling, just deal in vaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	38b47b19ec	plugin-gen: add module for TCG-related code We first inject empty instrumentation from translator_loop. After translation, we go through the plugins to see what they want to register for, filling in the empty instrumentation. If if turns out that some instrumentation remains unused, we remove it. This approach supports the following features: - Inlining TCG code for simple operations. Note that we do not export TCG ops to plugins. Instead, we give them a C API to insert inlined ops. So far we only support adding an immediate to a u64, e.g. to count events. - "Direct" callbacks. These are callbacks that do not go via a helper. Instead, the helper is defined at run-time, so that the plugin code is directly called from TCG. This makes direct callbacks as efficient as possible; they are therefore used for very frequent events, e.g. memory callbacks. - Passing the host address to memory callbacks. Most of this is implemented in a later patch though. - Instrumentation of memory accesses performed from helpers. See the corresponding comment, as well as a later patch. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: add alloc_tcg_plugin_context, use glib, rm hwaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	c87fb14fde	tcg: add tcg_gen_st_ptr Will gain a user soon. Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2019-10-28 15:12:38 +00:00
Alex Bennée	504f73f7b3	trace: add mmu_index to mem_info We are going to re-use mem_info later for plugins and will need to track the mmu_idx for softmmu code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2019-10-28 15:12:38 +00:00
Alex Bennée	4cef72d042	cputlb: ensure _cmmu helper functions follow the naming standard We document this in docs/devel/load-stores.rst so lets follow it. The 32 bit and 64 bit access functions have historically not included the sign so we leave those as is. We also introduce some signed helpers which are used for loading immediate values in the translator. Fixes: `282dffc8` Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191021150910.23216-1-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:26:02 +01:00
Stefan Weil	2f160e0f97	tci: Add implementation for INDEX_op_ld16u_i64 This fixes "make check-tcg" on a Debian x86_64 host. Signed-off-by: Stefan Weil <sw@weilnetz.de> Tested-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190410194838.10123-1-sw@weilnetz.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 10:26:02 +01:00
Richard Henderson	b7ce3cff21	tcg/ppc: Update vector support for v3.00 dup/dupi These new instructions are conditional on MSR.VEC for TX=1, so we can consider these Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:53 -07:00
Richard Henderson	6e11cde150	tcg/ppc: Update vector support for v3.00 load/store These new instructions are a mix of those like LXSD that are only conditional only on MSR.VEC and those like LXV that are conditional on MSR.VEC for TX=1. Thus, in the end, we can consider all of these as Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:44 -07:00
Richard Henderson	d7cd6a2f25	tcg/ppc: Update vector support for v3.00 Altivec These new instructions are conditional only on MSR.VEC and are thus part of the Altivec instruction set, and not VSX. This includes negation and compare not equal. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:40 -07:00
Richard Henderson	7097312d37	tcg/ppc: Update vector support for v2.07 FP These new instructions are conditional on MSR.FP when TX=0 and MSR.VEC when TX=1. Since we only care about the Altivec registers, and force TX=1, we can consider these to be Altivec instructions. Since Altivec is true for any use of vector types, we only need test have_isa_2_07. This includes moves to and from the integer registers. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:33 -07:00
Richard Henderson	b2dda6400c	tcg/ppc: Update vector support for v2.07 VSX These new instructions are conditional only on MSR.VSX and are thus part of the VSX instruction set, and not Altivec. This includes double-word loads and stores. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:29 -07:00
Richard Henderson	64ff1c6d21	tcg/ppc: Update vector support for v2.07 Altivec These new instructions are conditional only on MSR.VEC and are thus part of the Altivec instruction set, and not VSX. This includes lots of double-word arithmetic and a few extra logical operations. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:25 -07:00
Richard Henderson	47c906ae6f	tcg/ppc: Update vector support for VSX The VSX instruction set instructions include double-word loads and stores, double-word load and splat, double-word permute, and bit select. All of which require multiple operations in the Altivec instruction set. Because the VSX registers map %vsr32 to %vr0, and we have no current intention or need to use vector registers outside %vr0-%vr19, force on the {ax,bx,cx,tx} bits within the added VSX insns so that we don't have to otherwise modify the VR[TABC] macros. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:20 -07:00
Richard Henderson	68f340d4cd	tcg/ppc: Enable Altivec detection Now that we have implemented the required tcg operations, we can enable detection of host vector support. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> (PPC32) Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:10:17 -07:00
Richard Henderson	597cf97892	tcg/ppc: Support vector dup2 This is only used for 32-bit hosts. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:13 -07:00
Richard Henderson	d9897efa1f	tcg/ppc: Support vector multiply For Altivec, this is always an expansion. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:10 -07:00
Richard Henderson	dabae0971b	tcg/ppc: Support vector shift by immediate For Altivec, this is done via vector shift by vector, and loading the immediate into a register. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:06 -07:00
Richard Henderson	e9d1a53ae6	tcg/ppc: Add support for vector saturated add/subtract Add support for vector saturated add/subtract using Altivec instructions: VADDSBS, VADDSHS, VADDSWS, VADDUBS, VADDUHS, VADDUWS, and VSUBSBS, VSUBSHS, VSUBSWS, VSUBUBS, VSUBUHS, VSUBUWS. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:10:02 -07:00
Richard Henderson	d67508117d	tcg/ppc: Add support for vector add/subtract Add support for vector add/subtract using Altivec instructions: VADDUBM, VADDUHM, VADDUWM, VSUBUBM, VSUBUHM, VSUBUWM. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:58 -07:00
Richard Henderson	e238297282	tcg/ppc: Add support for vector maximum/minimum Add support for vector maximum/minimum using Altivec instructions VMAXSB, VMAXSH, VMAXSW, VMAXUB, VMAXUH, VMAXUW, and VMINSB, VMINSH, VMINSW, VMINUB, VMINUH, VMINUW. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:54 -07:00
Richard Henderson	6ef14d7ebe	tcg/ppc: Add support for load/store/logic/comparison Add various bits and peaces related mostly to load and store operations. In that context, logic, compare, and splat Altivec instructions are used, and, therefore, the support for emitting them is included in this patch too. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:50 -07:00
Richard Henderson	4b06c21682	tcg/ppc: Enable tcg backend vector compilation Introduce all of the flags required to enable tcg backend vector support, and a runtime flag to indicate the host supports Altivec instructions. For now, do not actually set have_isa_altivec to true, because we have not yet added all of the code to actually generate all of the required insns. However, we must define these flags in order to disable ifndefs that create stub versions of the functions added here. The change to tcg_out_movi works around a buglet in tcg.c wherein if we do not define tcg_out_dupi_vec we get a declared but not defined Werror, but if we only declare it we get a defined but not used Werror. We need to this change to tcg_out_movi eventually anyway, so it's no biggie. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:44 -07:00
Richard Henderson	63922f467a	tcg/ppc: Replace HAVE_ISEL macro with a variable Previously we've been hard-coding knowledge that Power7 has ISEL, but it was an optional instruction before that. Use the AT_HWCAP2 bit, when present, to properly determine support. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:40 -07:00
Richard Henderson	4e33fe0137	tcg/ppc: Replace HAVE_ISA_2_06 This is identical to have_isa_2_06, so replace it. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:36 -07:00
Richard Henderson	7d9dae0a10	tcg/ppc: Create TCGPowerISA and have_isa Introduce an enum to hold base < 2.06 < 3.00. Use macros to preserve the existing have_isa_2_06 and have_isa_3_00 predicates. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-14 07:09:30 -07:00
Richard Henderson	b82f769cc1	tcg/ppc: Introduce macros VRT(), VRA(), VRB(), VRC() Introduce macros VRT(), VRA(), VRB(), VRC() used for encoding elements of Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:26 -07:00
Richard Henderson	1838905eb3	tcg/ppc: Introduce macro VX4() Introduce macro VX4() used for encoding Altivec instructions. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:09:22 -07:00
Richard Henderson	42281ec646	tcg/ppc: Introduce Altivec registers Altivec supports 32 128-bit vector registers, whose names are by convention v0 through v31. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2019-10-14 07:08:30 -07:00
Peter Maydell	9de65783e1	Allow page table bit to swap endianness. Reorganize watchpoints out of i/o path. Return host address from probe_write / probe_access. -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAl1uiyYdHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8AuwgAnYLQQbL8kjSqzp7q gRlj0M2SX41ZW3fMkI794RwsljD9Z0QS7YGnpzHolig9XUYrGnip7STrMvlCr/1L CIMWNHlgitgBMszLqg42/TB+6RxXn+DMX/ShUzTagC6xQhinCIpdEjoLaTKSgeP+ foIyJ2uoJLKOBP8cPTQp8evongtoQIljpsZZ0K8a4sreO1d6ytH+olkuoGiROft+ VoJkA+kNHd9cE+LPCva8UFGu1QE6uCySvhepzOpnvOtK+SXKUm2yLOFGu7RWP1pT RkE0oRyRnImtg+cViHfUUFogIffFROdL5tuYMQVuqbINeROPUgJPav+R1Nz1P60a xM2HEw== =bLLU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190903' into staging Allow page table bit to swap endianness. Reorganize watchpoints out of i/o path. Return host address from probe_write / probe_access. # gpg: Signature made Tue 03 Sep 2019 16:47:50 BST # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190903: (36 commits) tcg: Factor out probe_write() logic into probe_access() tcg: Make probe_write() return a pointer to the host page s390x/tcg: Pass a size to probe_write() in do_csst() hppa/tcg: Call probe_write() also for CONFIG_USER_ONLY mips/tcg: Call probe_write() for CONFIG_USER_ONLY as well tcg: Enforce single page access in probe_write() tcg: Factor out CONFIG_USER_ONLY probe_write() from s390x code s390x/tcg: Fix length calculation in probe_write_access() s390x/tcg: Use guest_addr_valid() instead of h2g_valid() in probe_write_access() tcg: Check for watchpoints in probe_write() cputlb: Handle watchpoints via TLB_WATCHPOINT cputlb: Remove double-alignment in store_helper cputlb: Fix size operand for tlb_fill on unaligned store exec: Factor out cpu_watchpoint_address_matches cputlb: Fold TLB_RECHECK into TLB_INVALID_MASK exec: Factor out core logic of check_watchpoint() exec: Move user-only watchpoint stubs inline target/sparc: sun4u Invert Endian TTE bit target/sparc: Add TLB entry with attributes cputlb: Byte swap memory transaction attribute ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-09-04 16:29:18 +01:00
Tony Nguyen	14776ab5a1	tcg: TCGMemOp is now accelerator independent MemOp Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependant attributes are conditionalized upon NEED_CPU_H. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-09-03 08:30:38 -07:00
Emilio G. Cota	2bc89637b7	tcg/README: fix typo s/afterwise/afterwards/ Afterwise is "wise after the fact", as in "hindsight". Here we meant "afterwards" (as in "subsequently"). Fix it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190828165307.18321-7-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-09-03 16:20:35 +01:00
tony.nguyen@bt.com	52bf9771fd	configure: Define target access alignment in configure This patch moves the define of target access alignment earlier from target/foo/cpu.h to configure. Suggested in Richard Henderson's reply to "[PATCH 1/4] tcg: TCGMemOp is now accelerator independent MemOp" Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Message-Id: <11e818d38ebc40e986cfa62dd7d0afdc@tpw09926dag18e.domain1.systemhost.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: tony.nguyen@bt.com <tony.nguyen@bt.com>	2019-08-20 17:26:19 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	6a0acfff99	Clean up inclusion of exec/cpu-common.h migration/qemu-file.h neglects to include it even though it needs ram_addr_t. Fix that. Drop a few superfluous inclusions elsewhere. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-14-armbru@redhat.com>	2019-08-16 13:31:52 +02:00
Richard Henderson	1789d4274b	tcg/aarch64: Fix output of extract2 opcodes This patch fixes two problems: (1) The inputs to the EXTR insn were reversed, (2) The input constraints use rZ, which means that we need to use the REG0 macro in order to supply XZR for a constant 0 input. Fixes: `464c2969d5` Reported-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-14 12:19:00 +02:00
Richard Henderson	80f4d7c3ae	tcg: Fix constant folding of INDEX_op_extract2_i32 On a 64-bit host, discard any replications of the 32-bit sign bit when performing the shift and merge. Fixes: https://bugs.launchpad.net/bugs/1834496 Tested-by: Christophe Lyon <christophe.lyon@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-14 12:19:00 +02:00
Richard Henderson	11978f6f58	tcg: Fix expansion of INDEX_op_not_vec This operation can always be emitted, even if we need to fall back to xor. Adjust the assertions to match. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-09 08:26:11 +02:00
Alistair Francis	7ab7e9c7c7	tcg/riscv: Fix RISC-VH host build failure Commit `269bd5d8` "cpu: Move the softmmu tlb to CPUNegativeOffsetState' broke the RISC-V host build as there are two variables that are used but not defined. This patch renames the undefined variables mask_off and table_off to the existing (but unused) mask_ofs and table_ofs variables. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <79729cc88ca509e08b5c4aa0aa8a52847af70c0f.1561039316.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-07-09 08:26:11 +02:00
Like Xu	5cc8767d05	general: Replace global smp variables with smp machine properties Basically, the context could get the MachineState reference via call chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode. A local variable of the same name would be introduced in the declaration phase out of less effort OR replace it on the spot if it's only used once in the context. No semantic changes. Signed-off-by: Like Xu <like.xu@linux.intel.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2019-07-05 17:07:36 -03:00
Markus Armbruster	f91005e195	Supply missing header guards Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604181618.19980-5-armbru@redhat.com>	2019-06-12 13:20:21 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Richard Henderson	43b3952dea	tcg/arm: Remove mostly unreachable tlb special case There was nothing armv7 specific about the bic+cmp sequence, however looking at the set of guests more closely shows that the 8-bit immediate operand for the bic can only be satisfied with one guest in tree: baseline m-profile -- 10-bit pages with aligned 4-byte memory ops. Therefore it does not seem useful to keep this path. Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	057b6e370b	tcg/arm: Use LDRD to load tlb mask+table This changes the code generation for the tlb from e.g. ldr ip, [r6, #-0x10] ldr r2, [r6, #-0xc] and ip, ip, r4, lsr #8 ldrd r0, r1, [r2, ip]! ldr r2, [r2, #0x18] to ldrd r0, r1, [r6, #-0x10] and r0, r0, r4, lsr #8 ldrd r2, r3, [r1, r0]! ldr r1, [r1, #0x18] for armv7 hosts. Rearranging the register allocation in order to avoid overlap between the two ldrd pairs causes the patch to be larger than it ordinarily would be. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	65b23204d6	tcg/aarch64: Use LDP to load tlb mask+table This changes the code generation for the tlb from e.g. ldur x0, [x19, #0xffffffffffffffe0] ldur x1, [x19, #0xffffffffffffffe8] and x0, x0, x20, lsr #8 add x1, x1, x0 ldr x0, [x1] ldr x1, [x1, #0x18] to ldp x0, x1, [x19, #-0x20] and x0, x0, x20, lsr #8 add x1, x1, x0 ldr x0, [x1] ldr x1, [x1, #0x18] Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	269bd5d8f6	cpu: Move the softmmu tlb to CPUNegativeOffsetState We have for some time had code within the tcg backends to handle large positive offsets from env. This move makes sure that need not happen. Indeed, we are able to assert at build time that simple offsets suffice for all hosts. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Richard Henderson	a40ec84ee2	tcg: Create struct CPUTLB Move all softmmu tlb data into this structure. Arrange the members so that we are able to place mask+table together and at a smaller absolute offset from ENV. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Richard Henderson	11e2bfef79	tcg/i386: Use MOVDQA for TCG_TYPE_V128 load/store This instruction raises #GP, aka SIGSEGV, if the effective address is not aligned to 16-bytes. We have assertions in tcg-op-gvec.c that the offset from ENV is aligned, for vector types <= V128. But the offset itself does not validate that the final pointer is aligned -- one must also remember to use the QEMU_ALIGNED() attribute on the vector member within ENV. PowerPC Altivec has vector load/store instructions that silently discard the low 4 bits of the address, making alignment mistakes difficult to discover. Aid that by making the most popular host visibly signal the error. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	9e27f58b99	tcg/aarch64: Allow immediates for vector ORR and BIC The allows immediates to be used for ORR and BIC, as well as the trivial inversions, ORC and AND. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	02f3a5b474	tcg/aarch64: Build vector immediates with two insns Use MOVI+ORR or MVNI+BIC in order to build some vector constants, as opposed to dropping them to the constant pool. This includes all 16-bit constants and a similar set of 32-bit constants. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	7e308e003e	tcg/aarch64: Use MVNI in tcg_out_dupi_vec The compliment of a subset of immediates can be computed with a single instruction. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	984fdcee34	tcg/aarch64: Split up is_fimm There are several sub-classes of vector immediate, and only MOVI can use them all. This will enable usage of MVNI and ORRI, which use progressively fewer sub-classes. This patch adds no new functionality, merely splits the function and moves part of the logic into tcg_out_dupi_vec. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	a9e434a5dc	tcg/aarch64: Support vector bitwise select value The instruction set has 3 insns that perform the same operation, only varying in which operand must overlap the destination. We can represent the operation without overlap and choose based on the operands seen. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	ebcfb91abe	tcg/i386: Use umin/umax in expanding unsigned compare Using umin(a, b) == a as an expansion for TCG_COND_LEU is a better alternative to (a - INT_MIN) <= (b - INT_MIN). Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	3ec3538a45	tcg/i386: Remove expansion for missing minmax This is now handled by code within tcg-op-vec.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	904c5e1967	tcg/i386: Support vector comparison select value We already had backend support for this feature. Expand the new cmpsel opcode using vpblendb. The combination allows us to avoid an extra NOT for some comparison codes. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	25c012b400	tcg: Add TCG_OPF_NOT_PRESENT if TCG_TARGET_HAS_foo is negative If INDEX_op_foo is always expanded by tcg_expand_vec_op, then there may be no reasonable set of constraints to return from tcg_target_op_def for that opcode. Let TCG_TARGET_HAS_foo be specified as -1 in that case. Thus a boolean test for TCG_TARGET_HAS_foo is true, but we will not assert within process_op_defs when no constraints are specified. Compare this with tcg_can_emit_vec_op, which already uses this tri-state indication. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	72b4c792c7	tcg: Expand vector minmax using cmp+cmpsel Provide a generic fallback for the min/max operations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	17f79944eb	tcg: Introduce do_op3_nofail for vector expansion This makes do_op3 match do_op2 in allowing for failure, and thus fall back expansions. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	f75da2988e	tcg: Add support for vector compare select Perform a per-element conditional move. This combination operation is easier to implement on some host vector units than plain cmp+bitsel. Omit the usual gvec interface, as this is intended to be used by target-specific gvec expansion call-backs. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	38dc12947e	tcg: Add support for vector bitwise select This operation performs d = (b & a) \| (c & ~a), and is present on a majority of host vector units. Include gvec expanders. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	532ba368a1	tcg: Fix missing checks and clears in tcg_gen_gvec_dup_mem The paths through tcg_gen_dup_mem_vec and through MO_128 were missing the check_size_align. The path through MO_128 was also missing the expand_clr. This last was not visible because the only user is ARM SVE, which would set oprsz == maxsz, and not require the clear. Fix by adding the check_size_align and using do_dup directly instead of duplicating the check in tcg_gen_gvec_dup_{i32,i64}. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	7b60ef3264	tcg/i386: Fix dupi/dupm for avx1 and 32-bit hosts The VBROADCASTSD instruction only allows %ymm registers as destination. Rather than forcing VEX.L and writing to the entire 256-bit register, revert to using MOVDDUP with an %xmm register. This is sufficient for an avx1 host since we do not support TCG_TYPE_V256 for that case. Also fix the 32-bit avx2, which should have used VPBROADCASTW. Fixes: `1e262b49b5` Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-22 15:09:43 -04:00
Richard Henderson	a7b6d286cf	tcg/aarch64: Do not advertise minmax for MO_64 The min/max instructions are not available for 64-bit elements. Fixes: `93f332a503` Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	a456394ae5	tcg/aarch64: Support vector absolute value Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	18f9b65f1a	tcg/i386: Support vector absolute value Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	bcefc90208	tcg: Add support for vector absolute value Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	ff1f11f7f8	tcg: Add support for integer absolute value Remove a function of the same name from target/arm/. Use a branchless implementation of abs gleaned from gcc. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	0a8d7a3bf5	tcg/i386: Support vector scalar shift opcodes Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	b4578cd91c	tcg: Add gvec expanders for vector shift by scalar Allow expansion either via shift by scalar or by replicating the scalar for shift by vector. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Use a private structure for do_gvec_shifts.	2019-05-13 22:52:08 +00:00
Richard Henderson	79525dfd08	tcg/aarch64: Support vector variable shift opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	a2ce146a06	tcg/i386: Support vector variable shift opcodes Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	5ee5c14cac	tcg: Add gvec expanders for variable shift The gvec expanders perform a modulo on the shift count. If the target requires alternate behaviour, then it cannot use the generic gvec expanders anyway, and will have to have its own custom code. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	37ee55a081	tcg: Add INDEX_op_dupm_vec Allow the backend to expand dup from memory directly, instead of forcing the value into a temp first. This is especially important if integer/vector register moves do not exist. Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we could fix this up relatively easily: VECE == 32/64: Load the value into a vector register, then dup. Both of these must work. VECE == 8/16: If the value happens to be at an offset such that an aligned load would place the desired value in the least significant end of the register, go ahead and load w/garbage in high bits. Load the value w/INDEX_op_ld{8,16}_i32. Attempt a move directly to vector reg, which may fail. Store the value into the backing store for OTS. Load the value into the vector reg w/TCG_TYPE_I32, which must work. Duplicate from the vector reg into itself, which must work. All of which is well and good, except that all supported hosts can support dupm for all vece, so all of the failure paths would be dead code and untestable. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	f23e5e15ed	tcg/aarch64: Implement tcg_out_dupm_vec The LD1R instruction does all the work. Note that the only useful addressing mode is a base register with no offset. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:50:35 +00:00
Richard Henderson	1e262b49b5	tcg/i386: Implement tcg_out_dupm_vec At the same time, improve tcg_out_dupi_vec wrt broadcast from the constant pool. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	d6ecb4a978	tcg: Add tcg_out_dupm_vec to the backend interface Currently stubbed out in all backends that support vectors. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	bab1671f0f	tcg: Manually expand INDEX_op_dup_vec This case is similar to INDEX_op_mov_* in that we need to do different things depending on the current location of the source. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- v3: Added some commentary to the tcg_reg_alloc_* functions.	2019-05-13 14:44:03 -07:00
Richard Henderson	e7632cfa8b	tcg: Promote tcg_out_{dup,dupi}_vec to backend interface The i386 backend already has these functions, and the aarch64 backend could easily split out one. Nothing is done with these functions yet, but this will aid register allocation of INDEX_op_dup_vec in a later patch. Adjust the aarch64 tcg_out_dupi_vec signature to match the new interface. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	240c08d099	tcg: Support cross-class moves without instruction support PowerPC Altivec does not support direct moves between vector registers and general registers. So when tcg_out_mov fails, we can use the backing memory for the temporary to perform the move. Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	78113e83e0	tcg: Return bool success from tcg_out_mov This patch merely changes the interface, aborting on all failures, of which there are currently none. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	c16f52b2c5	tcg/arm: Use tcg_out_mov_reg in tcg_out_mov We have a function that takes an additional condition parameter over the standard backend interface. It already takes care of eliding no-op moves. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	d63e3b6e69	tcg: Assert fixed_reg is read-only The only fixed_reg is cpu_env, and it should not be modified during any TB. Therefore code that tries to special-case moves into a fixed_reg is dead. Remove it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	53229a7703	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	ce27c5d1a3	tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded PowerPC Altivec does not support add and subtract of 64-bit elements. Prepare for that configuration by not assuming the operation is universally supported. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
Richard Henderson	ac383dde33	tcg: Do not recreate INDEX_op_neg_vec unless supported Use tcg_can_emit_vec_op instead of just TCG_TARGET_HAS_neg_vec, so that we check the type and vece for the actual operation. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:26:28 -07:00
David Hildenbrand	e1227bb6e5	tcg: Implement tcg_gen_gvec_3i() Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed for now. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190416185301.25344-2-david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:26:28 -07:00
Richard Henderson	b4b82d7e9c	tcg/arm: Restrict constant pool displacement to 12 bits This will not necessarily restrict the size of the TB, since for v7 the majority of constant pool usage is for calls from the out-of-line ldst code, which is already at the end of the TB. But this does allow us to save one insn per reference on the off-chance. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-25 10:39:39 -07:00
Richard Henderson	a7cdaf710f	tcg/ppc: Allow the constant pool to overflow at 32k There is no point in coding for a 2GB offset when the max TB size is already limited to 64k. If we further restrict to 32k then we can eliminate the extra ADDIS instruction. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	aeee05f53a	tcg: Restart TB generation after out-of-line ldst overflow This is part c of relocation overflow handling. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	1768987b73	tcg: Restart TB generation after constant pool overflow This is part b of relocation overflow handling. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:28 -07:00
Richard Henderson	7ecd02a06f	tcg: Restart TB generation after relocation overflow If the TB generates too much code, such that backend relocations overflow, try again with a smaller TB. In support of this, move relocation processing from a random place within tcg_out_op, in the handling of branch opcodes, to a new function at the end of tcg_gen_code. This is not a complete solution, as there are additional relocs generated for out-of-line ldst handling and constant pools. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:05:21 -07:00
Richard Henderson	6e6c4efed9	tcg: Restart after TB code generation overflow If a TB generates too much code, try again with fewer insns. Fixes: https://bugs.launchpad.net/bugs/1824853 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	464c2969d5	tcg/aarch64: Support INDEX_op_extract2_{i32,i64} Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	3b832d67a9	tcg/arm: Support INDEX_op_extract2_i32 Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	c6fb8c0cf7	tcg/i386: Support INDEX_op_extract2_{i32,i64} Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	b0a6056719	tcg: Use extract2 in tcg_gen_deposit_{i32,i64} Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	02616bad6f	tcg: Use deposit and extract2 in tcg_gen_shifti_i64 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Richard Henderson	fce1296f13	tcg: Add INDEX_op_extract2_{i32,i64} This will let backends implement the double-word shift operation. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
David Hildenbrand	2089fcc9e7	tcg: Implement tcg_gen_extract2_{i32,i64} Will be helpful for s390x. Input 128 bit and output 64 bit only, which is sufficient for now. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20190225154204.26751-1-david@redhat.com> [rth: Add matching tcg_gen_extract2_i32.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-24 13:04:33 -07:00
Markus Armbruster	3de2faa9a8	tcg: Simplify how dump_exec_info() prints dump_exec_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_jit() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-5-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Markus Armbruster	d4c51a0af3	tcg: Simplify how dump_opcount_info() prints dump_opcount_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_opcount() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-4-armbru@redhat.com>	2019-04-18 22:18:59 +02:00
Richard Henderson	9e564a1dde	tcg: Remove TODO file The last update to this file was 9 years ago. In the meantime, 4 of the 6 ideas have actually been completed. The lat two do not actually make sense anymore. Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-21 10:22:24 -08:00
Mark Cave-Ayland	3115584d39	tcg/i386: fix unsigned vector saturating arithmetic Due to a cut/paste error in the original implementation, the unsigned vector saturating arithmetic was erroneously being calculated as signed vector saturating arithmetic. Fixes: `8ffafbcec2` ("tcg/i386: Implement vector saturating arithmetic") Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20190207224258.426-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-11 08:52:44 -08:00
Richard Henderson	bef16ab4e6	tcg: Diagnose referenced labels that have not been emitted Currently, a jump to a label that is not defined anywhere will be emitted not be relocated. This results in a jump to a random jump target. With tcg debugging, print a diagnostic to the -d op file and abort. This could help debug or detect errors like `c2d9644e6d` ("target/arm: Fix crash on conditional instruction in an IT block") Reported-by: Roman Kapl <code@rkapl.cz> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-11 08:52:44 -08:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Richard Henderson	e77c89fb08	cputlb: Remove static tlb sizing Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove the define and the old code. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:35 -08:00
Richard Henderson	0a9a83d6bf	tcg/tci: enable dynamic TLB sizing This is automatic due to TCI using the other softtlb macros. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	ac33373e0e	tcg/mips: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	a31aa4ce00	tcg/mips: Fix tcg_out_qemu_ld_slow_path Patch the branch after it has been emitted rather than before it exists. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	cd7d3cb7a2	tcg/arm: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	41b70f220b	tcg/riscv: enable dynamic TLB sizing Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:24 -08:00
Richard Henderson	4f47e338f6	tcg/s390: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	17ff9f7801	tcg/sparc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	644f591ab0	tcg/ppc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	f7bcd96669	tcg/aarch64: enable dynamic TLB sizing Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Emilio G. Cota	54eaf40b8f	tcg/i386: enable dynamic TLB sizing As the following experiments show, this series is a net perf gain, particularly for memory-heavy workloads. Experiments are run on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. 1. System boot + shudown, debian aarch64: - Before (v3.1.0): Performance counter stats for './die.sh v3.1.0' (10 runs): 9019.797015 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 29,910,312,379 cycles # 3.316 GHz ( +- 0.14% ) 54,699,252,014 instructions # 1.83 insn per cycle ( +- 0.08% ) 10,061,951,686 branches # 1115.541 M/sec ( +- 0.08% ) 172,966,530 branch-misses # 1.72% of all branches ( +- 0.07% ) 9.084039051 seconds time elapsed ( +- 0.23% ) - After: Performance counter stats for './die.sh tlb-dyn-v5' (10 runs): 8624.084842 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 28,556,123,404 cycles # 3.311 GHz ( +- 0.13% ) 51,755,089,512 instructions # 1.81 insn per cycle ( +- 0.05% ) 9,526,513,946 branches # 1104.641 M/sec ( +- 0.05% ) 166,578,509 branch-misses # 1.75% of all branches ( +- 0.19% ) 8.680540350 seconds time elapsed ( +- 0.24% ) That is, a 4.4% perf increase. 2. System boot + shutdown, ubuntu 18.04 x86_64: - Before (v3.1.0): 56100.574751 task-clock (msec) # 1.016 CPUs utilized ( +- 4.81% ) 200,745,466,128 cycles # 3.578 GHz ( +- 5.24% ) 431,949,100,608 instructions # 2.15 insn per cycle ( +- 5.65% ) 77,502,383,330 branches # 1381.490 M/sec ( +- 6.18% ) 844,681,191 branch-misses # 1.09% of all branches ( +- 3.82% ) 55.221556378 seconds time elapsed ( +- 5.01% ) - After: 56603.419540 task-clock (msec) # 1.019 CPUs utilized ( +- 10.19% ) 202,217,930,479 cycles # 3.573 GHz ( +- 10.69% ) 439,336,291,626 instructions # 2.17 insn per cycle ( +- 14.14% ) 80,538,357,447 branches # 1422.853 M/sec ( +- 16.09% ) 776,321,622 branch-misses # 0.96% of all branches ( +- 3.77% ) 55.549661409 seconds time elapsed ( +- 10.44% ) No improvement (within noise range). Note that for this workload, increasing the time window too much can lead to perf degradation, since it flushes the TLB very frequently. 3. x86_64 SPEC06int: x86_64-softmmu speedup vs. v3.1.0 for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 5.5 +------------------------------------------------------------------------+ \| +-+ \| 5 \|-+.................+-+...............................tlb-dyn-v5.......+-\| \| * * \| 4.5 \|-+..................................................................+-\| \| * * \| 4 \|-+..................................................................+-\| \| * * \| 3.5 \|-+..................................................................+-\| \| * * \| 3 \|-+......+-+........................................................+-\| \| * * * \| 2.5 \|-+.................................................+-+...........+-\| \| * * * * * \| 2 \|-+..............................................................+-\| \| * * * * * * +-+ \| 1.5 \|-+....................................................+-+.+-+.+-\| \| * * +-+ * +-+ +-+ +-+ +-+ * * * * * \| 1 \|++++-++++++++++++++++-+++-+++-++++-++++-++++++++++++++\| \| * * * * * * * * * * * * * * * * * * * * * * * * * * \| 0.5 +------------------------------------------------------------------------+ 400.perlb401.bzip403.g429445.g456.hm462.libq464.h471.omn47483.xalancbgeomean png: https://imgur.com/YRF90f7 That is, a 1.51x average speedup over the baseline, with a max speedup of 5.17x. Here's a different look at the SPEC06int results, using KVM as the baseline: x86_64-softmmu slowdown vs. KVM for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 25 +---------------------------------------------------------------------------+ \| +-+ +-+ \| \| * * +-+ v3.1.0 \| \| * * +-+ tlb-dyn-v5 \| \| * * * * +-+ \| 20 \|-+................................................+-+...............+-\| \| * * # # * * \| \| +-+ * * * # # * * \| \| * * * * * # # * * \| 15 \|-+..............................................#.#.......+-+......+-\| \| * * * * * # # * #\|# \| \| * * * * +-+ * # # * +-+ \| \| * * +-+ * * ++-+ +-+ * # # * # # +-+ \| \| * * +-+ * * * ## \| +-+ # # * # # +-+ \| 10 \|-+..........+-+...........##.......++-+..+-+.#.#.......#.#....+-\| \| * * +-+ * * * ## +-+ # # # #* # # +-+ * # # * * \| \| * * * # # * * +-+ * ## * +-+ # # # #* # # * * * # # +-+ \| \| * * # # * * * +-+ * ## * # # # # # #* # # * * * # # * ## \| 5 \|-+.......+-+.#.#.....#.#..##..#.#.#.#..#.#.#.#.....#.#..##.+-\| \| * # #* # # * +-+* # # * ## * # # # # # #* # # * * * # # * ## \| \| * # #* # # * # #* # # * ## * # # # # # #* # # * +-+* # # * ## \| \| ++-+ * # #* # # * # #* # # * ## * # # # # # #* # # * # #* # # * ## \| \|+++#+#++#+#+#+#++#+#+#+#++##++#+#+#+#++#+#+#+#++#+#+#+#+*+##+++\| 0 +---------------------------------------------------------------------------+ 400.perlbe401.bzi403.gc429445.go456.h462.libqu464.h471.omne4483.xalancbmgeomean png: https://imgur.com/YzAMNEV After this series, we bring down the average SPEC06int slowdown vs KVM from 11.47x to 7.58x. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Emilio G. Cota	86e1eff8bc	tcg: introduce dynamic TLB sizing Disabled in all TCG backends for now. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	93f332a503	tcg/aarch64: Implement vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	d32648d445	tcg/aarch64: Implement vector saturating arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	bc37faf4cb	tcg/i386: Implement vector minmax arithmetic The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8ffafbcec2	tcg/i386: Implement vector saturating arithmetic Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	44f1441dbe	tcg/i386: Split subroutines out of tcg_expand_vec_op This routine was becoming too large. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	dd0a0fcdd8	tcg: Add opcodes for vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8afaf05066	tcg: Add opcodes for vector saturated arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	5d6acdd4a4	tcg: Add write_aofs to GVecGen4 This allows writing 2 output, 3 input operations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	f550805d83	tcg: Add gvec expanders for nand, nor, eqv Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	9a9eda78e4	tcg: Add logical simplifications during gvec expand We handle many of these during integer expansion, and the rest of them during integer optimization. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Paolo Bonzini	7d37435bd5	avoid TABs in files that only contain a few Most files that have TABs only contain a handful of them. Change them to spaces so that we don't confuse people. disas, standard-headers, linux-headers and libdecnumber are imported from other projects and probably should be exempted from the check. Outside those, after this patch the following files still contain both 8-space and TAB sequences at the beginning of the line. Many of them have a majority of TABs, or were initially committed with all tabs. bsd-user/i386/target_syscall.h bsd-user/x86_64/target_syscall.h crypto/aes.c hw/audio/fmopl.c hw/audio/fmopl.h hw/block/tc58128.c hw/display/cirrus_vga.c hw/display/xenfb.c hw/dma/etraxfs_dma.c hw/intc/sh_intc.c hw/misc/mst_fpga.c hw/net/pcnet.c hw/sh4/sh7750.c hw/timer/m48t59.c hw/timer/sh_timer.c include/crypto/aes.h include/disas/bfd.h include/hw/sh4/sh.h libdecnumber/decNumber.c linux-headers/asm-generic/unistd.h linux-headers/linux/kvm.h linux-user/alpha/target_syscall.h linux-user/arm/nwfpe/double_cpdo.c linux-user/arm/nwfpe/fpa11_cpdt.c linux-user/arm/nwfpe/fpa11_cprt.c linux-user/arm/nwfpe/fpa11.h linux-user/flat.h linux-user/flatload.c linux-user/i386/target_syscall.h linux-user/ppc/target_syscall.h linux-user/sparc/target_syscall.h linux-user/syscall.c linux-user/syscall_defs.h linux-user/x86_64/target_syscall.h slirp/cksum.c slirp/if.c slirp/ip.h slirp/ip_icmp.c slirp/ip_icmp.h slirp/ip_input.c slirp/ip_output.c slirp/mbuf.c slirp/misc.c slirp/sbuf.c slirp/socket.c slirp/socket.h slirp/tcp_input.c slirp/tcpip.h slirp/tcp_output.c slirp/tcp_subr.c slirp/tcp_timer.c slirp/tftp.c slirp/udp.c slirp/udp.h target/cris/cpu.h target/cris/mmu.c target/cris/op_helper.c target/sh4/helper.c target/sh4/op_helper.c target/sh4/translate.c tcg/sparc/tcg-target.inc.c tests/tcg/cris/check_addo.c tests/tcg/cris/check_moveq.c tests/tcg/cris/check_swap.c tests/tcg/multiarch/test-mmap.c ui/vnc-enc-hextile-template.h ui/vnc-enc-zywrle.h util/envlist.c util/readline.c The following have only TABs: bsd-user/i386/target_signal.h bsd-user/sparc64/target_signal.h bsd-user/sparc64/target_syscall.h bsd-user/sparc/target_signal.h bsd-user/sparc/target_syscall.h bsd-user/x86_64/target_signal.h crypto/desrfb.c hw/audio/intel-hda-defs.h hw/core/uboot_image.h hw/sh4/sh7750_regnames.c hw/sh4/sh7750_regs.h include/hw/cris/etraxfs_dma.h linux-user/alpha/termbits.h linux-user/arm/nwfpe/fpopcode.h linux-user/arm/nwfpe/fpsr.h linux-user/arm/syscall_nr.h linux-user/arm/target_signal.h linux-user/cris/target_signal.h linux-user/i386/target_signal.h linux-user/linux_loop.h linux-user/m68k/target_signal.h linux-user/microblaze/target_signal.h linux-user/mips64/target_signal.h linux-user/mips/target_signal.h linux-user/mips/target_syscall.h linux-user/mips/termbits.h linux-user/ppc/target_signal.h linux-user/sh4/target_signal.h linux-user/sh4/termbits.h linux-user/sparc64/target_syscall.h linux-user/sparc/target_signal.h linux-user/x86_64/target_signal.h linux-user/x86_64/termbits.h pc-bios/optionrom/optionrom.h slirp/mbuf.h slirp/misc.h slirp/sbuf.h slirp/tcp.h slirp/tcp_timer.h slirp/tcp_var.h target/i386/svm.h target/sparc/asi.h target/xtensa/core-dc232b/xtensa-modules.inc.c target/xtensa/core-dc233c/xtensa-modules.inc.c target/xtensa/core-de212/core-isa.h target/xtensa/core-de212/xtensa-modules.inc.c target/xtensa/core-fsf/xtensa-modules.inc.c target/xtensa/core-sample_controller/core-isa.h target/xtensa/core-sample_controller/xtensa-modules.inc.c target/xtensa/core-test_kc705_be/core-isa.h target/xtensa/core-test_kc705_be/xtensa-modules.inc.c tests/tcg/cris/check_abs.c tests/tcg/cris/check_addc.c tests/tcg/cris/check_addcm.c tests/tcg/cris/check_addoq.c tests/tcg/cris/check_bound.c tests/tcg/cris/check_ftag.c tests/tcg/cris/check_int64.c tests/tcg/cris/check_lz.c tests/tcg/cris/check_openpf5.c tests/tcg/cris/check_sigalrm.c tests/tcg/cris/crisutils.h tests/tcg/cris/sys.c tests/tcg/i386/test-i386-ssse3.c ui/vgafont.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20181213223737.11793-3-pbonzini@redhat.com> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Acked-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Eric Blake <eblake@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Stefan Markovic <smarkovic@wavecomp.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 15:46:56 +01:00
Paolo Bonzini	eae3eb3e18	qemu/queue.h: simplify reverse access to QTAILQ The new definition of QTAILQ does not require passing the headname, remove it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 15:46:55 +01:00
Paolo Bonzini	b58deb344d	qemu/queue.h: leave head structs anonymous unless necessary Most list head structs need not be given a name. In most cases the name is given just in case one is going to use QTAILQ_LAST, QTAILQ_PREV or reverse iteration, but this does not apply to lists of other kinds, and even for QTAILQ in practice this is only rarely needed. In addition, we will soon reimplement those macros completely so that they do not need a name for the head struct. So clean up everything, not giving a name except in the rare case where it is necessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-01-11 15:46:55 +01:00
Richard Henderson	4250da1092	tcg: Improve call argument loading Free the argument register only after we have verified that the temporary is not already in that register. This case is likely now that we are back propagating the preferred register. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:58:43 +11:00
Richard Henderson	25f49c5f15	tcg: Record register preferences during liveness With these preferences, we can arrange for function call arguments to be computed into the proper registers instead of requiring extra moves. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:58:35 +11:00
Richard Henderson	ae36a246ed	tcg: Add TCG_OPF_BB_EXIT Use this to notice the opcodes that exit the TB, which implies that local temps are really dead and need not be synced. Previously we so marked the true end of the TB, but that was immediately overwritten by the la_bb_end invoked by any TCG_OPF_BB_END opcode, like exit_tb. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:58:20 +11:00
Richard Henderson	f65a061c39	tcg: Split out more subroutines from liveness_pass_1 Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:58:17 +11:00
Richard Henderson	2616c80821	tcg: Rename and adjust liveness_pass_1 helpers No need for a "tcg_" prefix for a static function; we already have another "la_" prefix for indicating liveness analysis. Pass in nb_globals and nb_temps, as we will already have them in registers for other loops within the parent function. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:58:09 +11:00
Richard Henderson	152c35aab4	tcg: Reindent parts of liveness_pass_1 There are two blocks of the form if (foo) { stuff1; goto bar; } else { baz: stuff2; } which have unnecessary and confusing indentation. Remove the else and unindent stuff2. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:57 +11:00
Richard Henderson	1894f69a61	tcg: Dump register preference info with liveness Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:48 +11:00
Richard Henderson	d62816f2db	tcg: Improve register allocation for matching constraints Try harder to honor the output_pref. When we're forced to allocate a second register for the input, it does not need to use the input constraint; that will be honored by the register we allocate for the output and a move is already required. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:40 +11:00
Richard Henderson	69e3706d2b	tcg: Add output_pref to TCGOp Allocate storage for, but do not yet fill in, per-opcode preferences for the output operands. Pass it in to the register allocation routines for output operands. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:37 +11:00
Richard Henderson	ba87719cd2	tcg: Add preferred_reg argument to tcg_reg_alloc_do_movi Pass this through to temp_sync. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:34 +11:00
Richard Henderson	98b4e186c1	tcg: Add preferred_reg argument to temp_sync Pass this through to tcg_reg_alloc. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:32 +11:00
Richard Henderson	b722452aef	tcg: Add preferred_reg argument to temp_load Pass this through to tcg_reg_alloc. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:29 +11:00
Richard Henderson	b016486e7b	tcg: Add preferred_reg argument to tcg_reg_alloc This new argument will aid register allocation by indicating how the temporary will be used in future. If the preference cannot be satisfied, fall back to the constraints of the current insn. Short circuit the preference when it cannot be satisfied or if it does not further constrain the operation. With an eye toward optimizing function call sequences, optimize for the preferred_reg set containing a single register. For the moment, all users pass 0 for preference. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:57:06 +11:00
Richard Henderson	b4fc67c7af	tcg: Add reachable_code_pass Delete trivially dead code that follows unconditional branches and noreturn helpers. These can occur either via optimization or via the structure of a target's translator following an exception. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:30 +11:00
Richard Henderson	d88a117eaa	tcg: Reference count labels Increment when adding branches, and decrement when removing them. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:27 +11:00
Richard Henderson	15d7409260	tcg: Add TCG_CALL_NO_RETURN Remember which helpers have been marked noreturn. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:24 +11:00
Richard Henderson	3b50352b05	tcg: Renumber TCG_CALL_* flags Previously, the low 4 bits were used for TCG_CALL_TYPE_MASK, which was removed in `6a18ae2d29`. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:15 +11:00
Alistair Francis	7a5549f2ae	tcg/riscv: Add the target init code Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <dd6e439ab81883974b8fd91f904f6de26ab5d697.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	92c041c59b	tcg/riscv: Add the prologue generation and register the JIT Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <c4d023127967a0217d8d1eabdf5de6c0e8f8c228.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	bdf503819e	tcg/riscv: Add the out op decoder Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <7c47f00cb4a9a777120456e0704b4076a5d943ab.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	03a7d0213d	tcg/riscv: Add direct load and store instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <2e047a95c39c007c66cda024c095e29b0ac4c43e.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	efbea94c76	tcg/riscv: Add slowpath load and store instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1a0a7e8f3347764f212c5efa5c07c9be17efdec6.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	15840069e1	tcg/riscv: Add branch and jump instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <c356657e627168d89cb5b012b7e21e4efbbe83f3.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	28ca738e9d	tcg/riscv: Add the add2 and sub2 instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <5665a57809e32b35775e8e98fdab898853af37b8.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	61535d4988	tcg/riscv: Add the out load and store instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <d5d88ff29163788938368bbdbd18815d59cef6a0.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	27fd64144b	tcg/riscv: Add the extract instructions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <c4d2afba46efefa9388cf3205fcedbb9a5fa411f.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	6cd2eda39f	tcg/riscv: Add the mov and movi instruction Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <bd6a45c73a67b77ddaa2fe590a6bb8ee422b9683.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	dfa8e74f94	tcg/riscv: Add the relocation functions Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <6ac4f4b0d5ea93cb0ee9a3b8b47ee9f7b3711494.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	bedf14e335	tcg/riscv: Add the instruction emitters Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <c740aca183675625bb9cf3ce7b9e8b9d431ca694.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	54a9ce0f68	tcg/riscv: Add the immediate encoders Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <d54dc56303fd1b0d7ed53869de2dbb59b111c7ca.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	8ce23a1312	tcg/riscv: Add support for the constraints Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <dba7315e4e20e879933f72d47ccf98f1cc612b8a.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	505e75c592	tcg/riscv: Add the tcg target registers Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <6e43abaa64361d57b9bc9439820d0e7701f2d47e.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Alistair Francis	fb1f70f368	tcg/riscv: Add the tcg-target.h file Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Michael Clark <mjc@sifive.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <a135ee1a88cd7bd08993a519d4d654da27785254.1545246859.git.alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-26 06:40:02 +11:00
Emilio G. Cota	ac1043f6d6	tcg: Drop nargs from tcg_op_insert_{before,after} It's unused since `75e8b9b7aa`. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181209193749.12277-9-cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Alistair Francis	161dec9d1b	tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS Instead of hard coding 31 for the shift right use TCG_TARGET_REG_BITS - 1. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <7dfbddf7014a595150aa79011ddb342c3cc17ec3.1544648105.git.alistair.francis@wdc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	e1dcf3529d	tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP For now, defined universally as true, since we previously required backends to implement swapped memory operations. Future patches may now remove that support where it is onerous. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	6498594c8e	tcg/optimize: Optimize bswap Somehow we forgot these operations, once upon a time. This will allow immediate stores to have their bswap optimized away. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	9e821eab0a	tcg: Clean up generic bswap64 Based on the only current user, Sparc: New code uses 2 constants that take 2 insns to load from constant pool, plus 13. Old code used 6 constants that took 1 or 2 insns to create, plus 21. The result is a new total of 17 vs an old total of 29. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	a686dc71d8	tcg: Clean up generic bswap32 Based on the only current user, Sparc: New code uses 1 constant that takes 2 insns to create, plus 8. Old code used 2 constants that took 2 insns to create, plus 9. The result is a new total of 10 vs an old total of 13. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	5785c17f31	tcg/i386: Add setup_guest_base_seg for FreeBSD Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	913c2bddc2	tcg/i386: Precompute all guest_base parameters These values are constant between all qemu_ld/st invocations; there is no need to figure this out each time. If we cannot use a segment or an offset directly for guest_base, load the value into a register in the prologue. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	4810d96f03	tcg/i386: Assume 32-bit values are zero-extended We now have an invariant that all TCG_TYPE_I32 values are zero-extended, which means that we do not need to extend them again during qemu_ld/st, either explicitly via a separate tcg_out_ext32u or implicitly via P_ADDR32. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	75478279a0	tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	3dbc8c61de	tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	1d21d95b61	tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	55dfd8fedc	tcg/s390x: Return false on failure from patch_reloc This does require an extra two checks within the slow paths to replace the assert that we're moving. Also add two checks within existing functions that lacked any kind of assert for out of range branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	d513290351	tcg/ppc: Return false on failure from patch_reloc The reloc_pc{14,24}_val routines retain their asserts. Use these directly within the slow paths. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	43fabd30e2	tcg/arm: Return false on failure from patch_reloc This does require an extra two checks within the slow paths to replace the assert that we're moving. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	214bfe83d5	tcg/aarch64: Return false on failure from patch_reloc This does require an extra two checks within the slow paths to replace the assert that we're moving. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	bec3afd5fc	tcg/i386: Return false on failure from patch_reloc Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	6ac1778676	tcg: Return success from patch_reloc This will move the assert for success from within (subroutines of) patch_reloc into the callers. It will also let new code do something different when a relocation is out of range. For the moment, all backends are trivially converted to return true. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	8c1b079279	tcg/mips: Remove retranslation code There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	791645f022	tcg/sparc: Remove retranslation code There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	3661612fc3	tcg/s390: Remove retranslation code There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	f9c7246faa	tcg/ppc: Fold away "noaddr" branch routines There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	37ee93a974	tcg/arm: Fold away "noaddr" branch routines There are one use apiece for these. There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	2672ccc7ee	tcg/arm: Remove reloc_pc24_atomic It is unused since `3fb53fb4d1`. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	733589b338	tcg/aarch64: Fold away "noaddr" branch routines There are one use apiece for these. There is no longer a need for preserving branch offset operands, as we no longer re-translate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	90d6cb7811	tcg/aarch64: Remove reloc_pc26_atomic It is unused since `b68686bd4b`. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	66c0285df4	tcg/i386: Move TCG_REG_CALL_STACK from define to enum Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	5740d9f714	tcg/i386: Always use %ebp for TCG_AREG0 For x86_64, this can remove a REX prefix resulting in smaller code when manipulating globals of type i32, as we move them between backing store via cpu_env, aka TCG_AREG0. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	f6823cbe37	target/sparc: Remove the constant pool Partially reverts `ab20bdc116`. The 14-bit displacement that we allowed to reach the constant pool is not always sufficient. Retain the tb-relative addressing, as that is how most return values from the tb are computed. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:42 +03:00
Thomas Huth	6fa2cef205	tcg/tcg.h: Remove GCC check for tcg_debug_assert() macro Both GCC v4.8 and Clang v3.4 (our minimum versions) support __builtin_unreachable(), so we can remove the version check here now. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-12 10:01:12 +01:00
Peter Maydell	a7ce790a02	tcg/tcg-op.h: Add multiple include guard The tcg-op.h header was missing the usual guard against multiple inclusion; add it. (Spotted by lgtm.com's static analyzer.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20181108125256.30986-1-peter.maydell@linaro.org	2018-11-08 15:15:32 +00:00
Richard Henderson	e6cd4bb59b	tcg: Split CONFIG_ATOMIC128 GCC7+ will no longer advertise support for 16-byte __atomic operations if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still has support for __sync_compare_and_swap_16 and we can make use of that. AArch64 does not have, nor ever has had such support, so open-code it. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:36 -07:00
Emilio G. Cota	72fd2efbbd	tcg: distribute tcg_time into TCG contexts When we implemented per-vCPU TCG contexts, we forgot to also distribute the tcg_time counter, which has remained as a global accessed without any serialization, leading to potentially missed counts. Fix it by distributing the field over the TCG contexts, embedding it into TCGProfile with a field called "cpu_exec_time", which is more descriptive than "tcg_time". Add a function to query this value directly, and for completeness, fill in the field in tcg_profile_snapshot, even though its callers do not use it. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-5-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Emilio G. Cota	dd1d7da23b	tcg: plug holes in struct TCGProfile This plugs two 4-byte holes in 64-bit. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Emilio G. Cota	c1f543b739	tcg: fix use of uninitialized variable under CONFIG_PROFILER We forgot to initialize n in commit `15fa08f845` ("tcg: Dynamically allocate TCGOps", 2017-12-29). Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Richard Henderson	d7f425fdea	tcg: Implement CPU_LOG_TB_NOCHAIN during expansion Rather than test NOCHAIN before linking, do not emit the goto_tb opcode at all. We already do this for goto_ptr. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Roman Kapl	93bf9a4273	tcg/i386: fix vector operations on 32-bit hosts The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers. This was defined as no-op for 32-bit x86, with the assumption that we have eight registers anyway. This assumption is not true once we have xmm regs. Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes and have overflown into other opcode fields, wreaking havoc. To trigger these problems, you can try running the "movi d8, #0x0" AArch64 instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated, but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2". Fixes: `770c2fc7bb` ("Add vector operations") Signed-off-by: Roman Kapl <rka@sysgo.com> Message-Id: <20180824131734.18557-1-rka@sysgo.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-09-26 09:02:51 -07:00
Richard Henderson	1fb57da72a	tcg/optimize: Do not skip default processing of dup_vec If we do not opimize away dup_vec, we must mark its output as changed. Fixes: `170ba88f45` Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20180805233258.31892-1-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-08-06 14:57:48 +01:00
Richard Henderson	672189cd58	tcg/i386: Mark xmm registers call-clobbered When host vector registers and operations were introduced, I failed to mark the registers call clobbered as required by the ABI. Fixes: `770c2fc7bb` Cc: qemu-stable@nongnu.org Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-23 09:21:14 -07:00
Alex Bennée	e65a5f227d	tcg/aarch64: limit mul_vec size In AdvSIMD we can only do 32x32 integer multiples although SVE is capable of larger 64 bit multiples. As a result we can end up generating invalid opcodes. Fix this by only reprting we can emit mul vector ops if the size is small enough. Fixes a crash on: sve-all-short-v8.3+sve@vq3/insn_mul_z_zi___INC.risu.bin When running on AArch64 hardware. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20180719154248.29669-1-alex.bennee@linaro.org> [rth: Removed the tcg_debug_assert -- there are plenty of other cases that we do not diagnose within the insn encoding helpers.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-19 09:07:31 -07:00
Richard Henderson	499748d768	tcg: Restrict check_size_impl to multiples of the line size Normally this is automatic in the size restrictions that are placed on vector sizes coming from the implementation. However, for the legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final 24 bytes of the vector register. Without this check, do_dup selects TCG_TYPE_V128 and clears only 16 bytes. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180705191929.30773-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-07-09 14:51:34 +01:00
Richard Henderson	9f75462065	tcg: Reduce max TB opcode count Also, assert that we don't overflow any of two different offsets into the TB. Both unwind and goto_tb both record a uint16_t for later use. This fixes an arm-softmmu test case utilizing NEON in which there is a TB generated that runs to 7800 opcodes, and compiles to 96k on an x86_64 host. This overflows the 16-bit offset in which we record the goto_tb reset offset. Because of that overflow, we install a jump destination that goes to neverland. Boom. With this reduced op count, the same TB compiles to about 48k for aarch64, ppc64le, and x86_64 hosts, and neither assertion fires. Cc: qemu-stable@nongnu.org Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 09:39:53 -10:00
Emilio G. Cota	0ac20318ce	tcg: remove tb_lock Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	128ed2278c	tcg: move tb_ctx.tb_phys_invalidate_count to tcg_ctx Thereby making it per-TCGContext. Once we remove tb_lock, this will avoid an atomic increment every time a TB is invalidated. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	be2cdc5e35	tcg: track TBs with per-region BST's This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
John Arbuckle	1019242af1	tcg/i386: Use byte form of xgetbv instruction The assembler in most versions of Mac OS X is pretty old and does not support the xgetbv instruction. To go around this problem, the raw encoding of the instruction is used instead. Signed-off-by: John Arbuckle <programmingkidx@gmail.com> Message-Id: <20180604215102.11002-1-programmingkidx@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Peter Maydell	163670542f	tcg-next queue -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJbEdLqAAoJEGTfOOivfiFfQaEH/Rq96S5bo94495KmRJY9e/jw lV321YYI7nx7sHtViG/B3iTkvnxzZPWcc7XbBMxyV5xmMQ/5zjS/ynZPFyy/cYRn zLM4W0SJ38EqhHTZpkkvw9Nle8UbNWKm5PgND2TyE4hmeuQ98OrQ6Y1GvP4MFpXs uQErbmMjYHMq7thbfCO6ulJjjEliRy3AJ2C3fCCCUgBQrJt6JeqbGr/Zzi2y88M9 IhoK8RbJiWT2O5Tl95q2NOQvr11WbFlu/K0nuaVgbfTwd2tp3ygmRKPpeZ24qA52 qtwgcIjWHHkkC5s1qaP8oW4FtoMQZdsaOwSOPw0ZBnG+VA7P/h33fWr9f5SistA= =UVdE -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/tcg-next-pull-request' into staging tcg-next queue # gpg: Signature made Sat 02 Jun 2018 00:12:42 BST # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/tcg-next-pull-request: tcg: Pass tb and index to tcg_gen_exit_tb separately Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-06-04 11:28:31 +01:00
Richard Henderson	07ea28b418	tcg: Pass tb and index to tcg_gen_exit_tb separately Do the cast to uintptr_t within the helper, so that the compiler can type check the pointer argument. We can also do some more sanity checking of the index argument. Reviewed-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-01 15:15:27 -07:00
Philippe Mathieu-Daudé	23c11b04dc	target: Do not include "exec/exec-all.h" if it is not necessary Code change produced with: $ git grep '#include "exec/exec-all.h"' \| \ cut -d: -f-1 \| \ xargs egrep -L "(cpu_address_space_init\|cpu_loop_\|tlb_\|tb_\|GETPC\|singlestep\|TranslationBlock)" \| \ xargs sed -i.bak '/#include "exec\/exec-all.h"/d' Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180528232719.4721-10-f4bug@amsat.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-01 14:15:10 +02:00
Emilio G. Cota	1d34982155	tcg: fix s/compliment/complement/ typos Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2018-05-20 08:25:23 +03:00
Peter Maydell	f5583c527f	target-arm queue: * hw/arm/iotkit.c: fix minor memory leak * softfloat: fix wrong-exception-flags bug for multiply-add corner case * arm: isolate and clean up DTB generation * implement Arm v8.1-Atomics extension * Fix some bugs and missing instructions in the v8.2-FP16 extension -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJa9IUCAAoJEDwlJe0UNgzeEGMQAKKjVRzZ7MBgvxQj0FJSWhSP BZkATf3ktid255PRpIssBZiY9oM+uY6n+/IRozAGvfDBp9eQOkrZczZjfW5hpe0B YsQadtk5cUOXqQzRTegSMPOoMmz8f5GaGOk4R6AEXJEX+Rug/zbOn9Q8Yx7JTd7o yBvU1+fys3galSiB88cffA95B9fwGfLsM7rP6OC4yNdUBYwjHf3wtY53WsxtWqX9 oX4keEiROQkrOfbSy9wYPZzu/0iRo8v35+7wIZhvNSlf02k6yJ7a+w0C4EQIRhWm 5zciE+aMYr7nOGpj7AEJLrRekhwnD6Ppje6aUd15yrxfNRZkpk/FeECWnaOPDis7 QNijx5Zqg6+GyItQKi5U4vFVReMj09OB7xDyAq77xDeBj4l3lg2DNkRfRhqQZAcv 2r4EW+pfLNj76Ah1qtQ410fprw462Sopb6bHmeuFbf1QFbQvJ4CL1+7Jl3ExrDX4 2+iQb4sQghWDxhDLfRSLxQ7K+bX+mNfGdFW8h+jPShD/+JY42dTKkFZEl4ghNgMD mpj8FrQuIkSMqnDmPfoTG5MVTMERacqPU7GGM7/fxudIkByO3zTiLxJ/E+Iy8HvX 29xKoOBjKT5FJrwJABsN6VpA3EuyAARgQIZ/dd6N5GZdgn2KAIHuaI+RHFOesKFd dJGM6sdksnsAAz28aUEJ =uXY+ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20180510' into staging target-arm queue: * hw/arm/iotkit.c: fix minor memory leak * softfloat: fix wrong-exception-flags bug for multiply-add corner case * arm: isolate and clean up DTB generation * implement Arm v8.1-Atomics extension * Fix some bugs and missing instructions in the v8.2-FP16 extension # gpg: Signature made Thu 10 May 2018 18:44:34 BST # gpg: using RSA key 3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20180510: (21 commits) target/arm: Clear SVE high bits for FMOV target/arm: Fix float16 to/from int16 target/arm: Implement vector shifted FCVT for fp16 target/arm: Implement vector shifted SCVF/UCVF for fp16 target/arm: Enable ARM_FEATURE_V8_ATOMICS for user-only target/arm: Implement CAS and CASP target/arm: Fill in disas_ldst_atomic target/arm: Introduce ARM_FEATURE_V8_ATOMICS and initial decode target/riscv: Use new atomic min/max expanders tcg: Use GEN_ATOMIC_HELPER_FN for opposite endian atomic add tcg: Introduce atomic helpers for integer min/max target/xtensa: Use new min/max expanders target/arm: Use new min/max expanders tcg: Introduce helpers for integer min/max atomic.h: Work around gcc spurious "unused value" warning make sure that we aren't overwriting mc->get_hotplug_handler by accident arm/boot: split load_dtb() from arm_load_kernel() platform-bus-device: use device plug callback instead of machine_done notifier pc: simplify MachineClass::get_hotplug_handler handling softfloat: Handle default NaN mode after pickNaNMulAdd, not before ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> # Conflicts: # target/riscv/translate.c	2018-05-11 17:41:54 +01:00
Richard Henderson	5507c2bf35	tcg: Introduce atomic helpers for integer min/max Given that this atomic operation will be used by both risc-v and aarch64, let's not duplicate code across the two targets. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180508151437.4232-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-05-10 18:10:57 +01:00
Richard Henderson	b87fb8cd9f	tcg: Introduce helpers for integer min/max These operations are re-invented by several targets so far. Several supported hosts have insns for these, so place the expanders out-of-line for a future introduction of tcg opcodes. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180508151437.4232-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-05-10 18:10:57 +01:00
Richard Henderson	abebf92597	tcg: Limit the number of ops in a TB In `6001f7729e` we partially attempt to address the branch displacement overflow caused by `15fa08f845`. However, gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqtbX.c is a testcase that contains a TB so large as to overflow anyway. The limit here of 8000 ops produces a maximum output TB size of 24112 bytes on a ppc64le host with that test case. This is still much less than the maximum forward branch distance of 32764 bytes. Cc: qemu-stable@nongnu.org Fixes: `15fa08f845` ("tcg: Dynamically allocate TCGOps") Reviewed-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-09 08:30:57 -07:00
Peter Maydell	7eb30ef0ba	tcg/i386: Fix dup_vec in non-AVX2 codepath The VPUNPCKLD* instructions are all "non-destructive source", indicated by "NDS" in the encoding string in the x86 ISA manual. This means that they take two source operands, one of which is encoded in the VEX.vvvv field. We were incorrectly treating them as if they were destructive-source and passing 0 as the 'v' argument of tcg_out_vex_modrm(). This meant we were always using %xmm0 as one of the source operands, causing incorrect results if the register allocator happened to want to use something else. For instance the input AArch64 insn: DUP v26.16b, w21 which becomes TCG IR ops: dup_vec v128,e8,tmp2,x21 st_vec v128,e8,tmp2,env,$0xa40 was assembled to: 0x607c568c: c4 c1 7a 7e 86 e8 00 00 vmovq 0xe8(%r14), %xmm0 0x607c5694: 00 0x607c5695: c5 f9 60 c8 vpunpcklbw %xmm0, %xmm0, %xmm1 0x607c5699: c5 f9 61 c9 vpunpcklwd %xmm1, %xmm0, %xmm1 0x607c569d: c5 f9 70 c9 00 vpshufd $0, %xmm1, %xmm1 0x607c56a2: c4 c1 7a 7f 8e 40 0a 00 vmovdqu %xmm1, 0xa40(%r14) 0x607c56aa: 00 when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1". This resulted in our incorrectly setting the output vector to q26=0000320000003200:0000320000003200 when given an input of x21 == 0000000002803200 rather than the expected all-zeroes. Pass the correct source register number to tcg_out_vex_modrm() for these insns. Fixes: `770c2fc7bb` Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180504153431.5169-1-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-09 08:30:57 -07:00
Laurent Vivier	6001f7729e	tcg: workaround branch instruction overflow in tcg_out_qemu_ld/st ppc64 uses a BC instruction to call the tcg_out_qemu_ld/st slow path. BC instruction uses a relative address encoded on 14 bits. The slow path functions are added at the end of the generated instructions buffer, in the reverse order of the callers. So more we have slow path functions more the distance between the caller (BC) and the function increases. This patch changes the behavior to generate the functions in the same order of the callers. Cc: qemu-stable@nongnu.org Fixes: `15fa08f845` ("tcg: Dynamically allocate TCGOps") Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20180429235840.16659-1-lvivier@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-01 11:56:55 -07:00
Richard Henderson	5bfa803448	tcg: Improve TCGv_ptr support Drop TCGV_PTR_TO_NAT and TCGV_NAT_TO_PTR internal macros. Add tcg_temp_local_new_ptr, tcg_gen_brcondi_ptr, tcg_gen_ext_i32_ptr, tcg_gen_trunc_i64_ptr, tcg_gen_extu_ptr_i64, tcg_gen_trunc_ptr_i32. Use inlines instead of macros where possible. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-01 11:56:16 -07:00
Richard Henderson	9a938d86b0	tcg: Allow wider vectors for cmp and mul In `db432672`, we allow wide inputs for operations such as add. However, in `212be173` and `3774030a` we didn't do the same for compare and multiply. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-01 11:56:16 -07:00
Henry Wertz	3f814b8037	tcg/arm: Fix memory barrier encoding I found with qemu 2.11.x or newer that I would get an illegal instruction error running some Intel binaries on my ARM chromebook. On investigation, I found it was quitting on memory barriers. qemu instruction: mb $0x31 was translating as: 0x604050cc: 5bf07ff5 blpl #0x600250a8 After patch it gives: 0x604050cc: f57ff05b dmb ish In short, I found INSN_DMB_ISH (memory barrier for ARMv7) appeared to be correct based on online docs, but due to some endian-related shenanigans it had to be byte-swapped to suit qemu; it appears INSN_DMB_MCR (memory barrier for ARMv6) also should be byte swapped (and this patch does so). I have not checked for correctness of aarch64's barrier instruction. Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Henry Wertz <hwertz10@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-01 11:56:07 -07:00
Richard Henderson	d103021269	tcg: Document INDEX_mul[us]h_* Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-01 11:41:56 -07:00
Peter Maydell	161dfd1e7f	tcg/mips: Handle large offsets from target env to tlb_table The MIPS TCG target makes the assumption that the offset from the target env pointer to the tlb_table is less than about 64K. This used to be true, but gradual addition of features to the Arm target means that it's no longer true there. This results in the build-time assertion failing: In file included from /home/pm215/qemu/include/qemu/osdep.h:36:0, from /home/pm215/qemu/tcg/tcg.c:28: /home/pm215/qemu/tcg/mips/tcg-target.inc.c: In function ‘tcg_out_tlb_load’: /home/pm215/qemu/include/qemu/compiler.h:90:36: error: static assertion failed: "not expecting: offsetof(CPUArchState, tlb_table[NB_MMU_MODES - 1][1]) > 0x7ff0 + 0x7fff" #define QEMU_BUILD_BUG_MSG(x, msg) _Static_assert(!(x), msg) ^ /home/pm215/qemu/include/qemu/compiler.h:98:30: note: in expansion of macro ‘QEMU_BUILD_BUG_MSG’ #define QEMU_BUILD_BUG_ON(x) QEMU_BUILD_BUG_MSG(x, "not expecting: " #x) ^ /home/pm215/qemu/tcg/mips/tcg-target.inc.c:1236:9: note: in expansion of macro ‘QEMU_BUILD_BUG_ON’ QEMU_BUILD_BUG_ON(offsetof(CPUArchState, ^ /home/pm215/qemu/rules.mak:66: recipe for target 'tcg/tcg.o' failed An ideal long term approach would be to rearrange the CPU state so that the tlb_table was not so far along it, but this is tricky because it would move it from the "not cleared on CPU reset" part of the struct to the "cleared on CPU reset" part. As a simple fix for the 2.12 release, make the MIPS TCG target handle an arbitrary offset by emitting more add instructions. This will mean an extra instruction in the fastpath for TCG loads and stores for the affected guests (currently just aarch64-softmmu). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20180413142336.32163-1-peter.maydell@linaro.org	2018-04-16 11:51:37 +01:00
Richard Henderson	9743cd5736	tcg: Introduce tcg_set_insn_start_param The parameters for tcg_gen_insn_start are target_ulong, which may be split into two TCGArg parameters for storage in the opcode on 32-bit hosts. Fixes the ARM target and its direct use of tcg_set_insn_param, which would set the wrong argument in the 64-on-32 case. Cc: qemu-stable@nongnu.org Reported-by: alarson@ddci.com Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180410003558.2470-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-10 13:02:26 +01:00
Richard Henderson	f2f1dde751	tcg: Mark muluh_i64 and mulsh_i64 as 64-bit ops Failure to do so results in the tcg optimizer sign-extending any constant fold from 32-bits. This turns out to be visible in the RISC-V testsuite using a host that emits these opcodes (e.g. any non-x86_64). Reported-by: Michael Clark <mjc@sifive.com> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-03-28 12:45:16 +08:00
Richard Henderson	adb196cbd5	tcg: Add choose_vector_size This unifies 5 copies of checks for supported vector size, and in the process fixes a missing check in tcg_gen_gvec_2s. This lead to an assertion failure for 64-bit vector multiply, which is not available in the AVX instruction set. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-03-16 00:55:04 +08:00
Richard Henderson	7f34ed4bcd	tcg/i386: Support INDEX_op_dup2_vec for -m32 Unknown why -m32 was passing with gcc but not clang; it should have failed for both. This would be used for tcg_gen_dup_i64_vec, and visible with the right TB and an aarch64 guest. Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-03-16 00:55:04 +08:00
Richard Henderson	b2e3ae9452	tcg: Improve tcg_gen_muli_i32/i64 Convert multiplication by power of two to left shift. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-03-16 00:55:04 +08:00
Richard Henderson	14e4c1e235	tcg/aarch64: Add vector operations Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:08 +00:00
Richard Henderson	770c2fc7bb	tcg/i386: Add vector operations The x86 vector instruction set is extremely irregular. With newer editions, Intel has filled in some of the blanks. However, we don't get many 64-bit operations until SSE4.2, introduced in 2009. The subsequent edition was for AVX1, introduced in 2011, which added three-operand addressing, and adjusts how all instructions should be encoded. Given the relatively narrow 2 year window between possible to support and desirable to support, and to vastly simplify code maintainence, I am only planning to support AVX1 and later cpus. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:08 +00:00
Richard Henderson	170ba88f45	tcg/optimize: Handle vector opcodes during optimize Trivial move and constant propagation. Some identity and constant function folding, but nothing that requires knowledge of the size of the vector element. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:06 +00:00
Richard Henderson	22fc352703	tcg: Add generic vector helpers with a scalar operand Use dup to convert a non-constant scalar to a third vector. Add addition, multiplication, and logical operations with an immediate. Add addition, subtraction, multiplication, and logical operations with a non-constant scalar. Allow for the front-end to build operations in which the scalar operand comes first. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:06 +00:00
Richard Henderson	f49b12c6e6	tcg: Add generic helpers for saturating arithmetic No vector ops as yet. SSE only has direct support for 8- and 16-bit saturation; handling 32- and 64-bit saturation is much more expensive. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:06 +00:00
Richard Henderson	3774030a3e	tcg: Add generic vector ops for multiplication Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:06 +00:00
Richard Henderson	212be173f0	tcg: Add generic vector ops for comparisons Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:05 +00:00
Richard Henderson	d0ec97967f	tcg: Add generic vector ops for constant shifts Opcodes are added for scalar and vector shifts, but considering the varied semantics of these do not expose them to the front ends. Do go ahead and provide them in case they are needed for backend expansion. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:05 +00:00
Richard Henderson	db432672dc	tcg: Add generic vector expanders Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:05 +00:00
Richard Henderson	474b2e8f0f	tcg: Standardize integral arguments to expanders Some functions use intN_t arguments, some use uintN_t, some just used "unsigned". To aid putting function pointers in tables, we need consistency. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:05 +00:00
Richard Henderson	d2fd745fe8	tcg: Add types and basic operations for host vectors Nothing uses or enables them yet. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:54:04 +00:00
Richard Henderson	da73a4abca	tcg: Allow multiple word entries into the constant pool This will be required for storing vector constants. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-08 15:53:34 +00:00
Richard Henderson	030ffe39dd	tcg/ppc: Allow a 32-bit offset to the constant pool We recently relaxed the limit of the number of opcodes that can appear in a TranslationBlock. In certain cases this has resulted in relocation overflow. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-01-16 08:21:56 -08:00
Richard Henderson	4a64e0fd68	tcg/ppc: Support tlb offsets larger than 64k AArch64 with SVE has an offset of 80k to the 8th TLB. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-01-16 08:21:56 -08:00
Richard Henderson	71f9cee9d0	tcg/arm: Support tlb offsets larger than 64k AArch64 with SVE has an offset of 80k to the 8th TLB. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-01-16 08:21:56 -08:00
Richard Henderson	7170ac3313	tcg/arm: Fix double-word comparisons The code sequence we were generating was only good for unsigned comparisons. For signed comparisions, use the sequence from gcc. Fixes booting of ppc64 firmware, with a patch changing the code sequence for ppc comparisons. Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-01-16 08:20:39 -08:00
Richard Henderson	1df3caa946	tcg: Allow 6 arguments to TCG helpers We already handle this in the backends, and the lifetime datum for the TCGOp is already large enough. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:40 -08:00
Richard Henderson	923ed17501	tcg: Add tcg_signed_cond Complimenting the existing tcg_unsigned_cond. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:40 -08:00
Richard Henderson	cd9090aa9d	tcg: Generalize TCGOp parameters We had two fields specific to INDEX_op_call. Rename these and add some macros so that the fields may be reused for other opcodes. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:39 -08:00
Richard Henderson	15fa08f845	tcg: Dynamically allocate TCGOps With no fixed array allocation, we can't overflow a buffer. This will be important as optimizations related to host vectors may expand the number of ops used. Use QTAILQ to link the ops together. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:39 -08:00
Richard Henderson	f764718d0c	tcg: Remove TCGV_UNUSED* and TCGV_IS_UNUSED* These are now trivial sets and tests against NULL. Unwrap. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:39 -08:00
Richard Henderson	ba2c747992	tcg/s390x: Use constant pool for prologue Rather than have separate code only used for guest_base, rely on a recent change to handle constant pool entries. Cc: qemu-s390x@nongnu.org Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-11-03 09:33:45 +01:00
Richard Henderson	5b38ee3161	tcg: Allow constant pool entries in the prologue Both ARMv6 and AArch64 currently may drop complex guest_base values into the constant pool. But generic code wasn't expecting that, and the pool is not emitted. Correct that. Tested-by: Emilio G. Cota <cota@braap.org> Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-11-03 09:33:45 +01:00
Richard Henderson	1c2adb958f	tcg: Initialize cpu_env generically This is identical for each target. So, move the initialization to common code. Move the variable itself out of tcg_ctx and name it cpu_env to minimize changes within targets. This also means we can remove tcg_global_reg_new_{ptr,i32,i64}, since there are no longer global-register temps created by targets. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	3468b59e18	tcg: enable multiple TCG contexts in softmmu This enables parallel TCG code generation. However, we do not take advantage of it yet since tb_lock is still held during tb_gen_code. In user-mode we use a single TCG context; see the documentation added to tcg_region_init for the rationale. Note that targets do not need any conversion: targets initialize a TCGContext (e.g. defining TCG globals), and after this initialization has finished, the context is cloned by the vCPU threads, each of them keeping a separate copy. TCG threads claim one entry in tcg_ctxs[] by atomically increasing n_tcg_ctxs. Do not be too annoyed by the subsequent atomic_read's of that variable and tcg_ctxs; they are there just to play nice with analysis tools such as thread sanitizer. Note that we do not allocate an array of contexts (we allocate an array of pointers instead) because when tcg_context_init is called, we do not know yet how many contexts we'll use since the bool behind qemu_tcg_mttcg_enabled() isn't set yet. Previous patches folded some TCG globals into TCGContext. The non-const globals remaining are only set at init time, i.e. before the TCG threads are spawned. Here is a list of these set-at-init-time globals under tcg/: Only written by tcg_context_init: - indirect_reg_alloc_order - tcg_op_defs Only written by tcg_target_init (called from tcg_context_init): - tcg_target_available_regs - tcg_target_call_clobber_regs - arm: arm_arch, use_idiv_instructions - i386: have_cmov, have_bmi1, have_bmi2, have_lzcnt, have_movbe, have_popcnt - mips: use_movnz_instructions, use_mips32_instructions, use_mips32r2_instructions, got_sigill (tcg_target_detect_isa) - ppc: have_isa_2_06, have_isa_3_00, tb_ret_addr - s390: tb_ret_addr, s390_facilities - sparc: qemu_ld_trampoline, qemu_st_trampoline (build_trampolines), use_vis3_instructions Only written by tcg_prologue_init: - 'struct jit_code_entry one_entry' - aarch64: tb_ret_addr - arm: tb_ret_addr - i386: tb_ret_addr, guest_base_flags - ia64: tb_ret_addr - mips: tb_ret_addr, bswap32_addr, bswap32u_addr, bswap64_addr Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	e8feb96fcc	tcg: introduce regions to split code_gen_buffer This is groundwork for supporting multiple TCG contexts. The naive solution here is to split code_gen_buffer statically among the TCG threads; this however results in poor utilization if translation needs are different across TCG threads. What we do here is to add an extra layer of indirection, assigning regions that act just like pages do in virtual memory allocation. (BTW if you are wondering about the chosen naming, I did not want to use blocks or pages because those are already heavily used in QEMU). We use a global lock to serialize allocations as well as statistics reporting (we now export the size of the used code_gen_buffer with tcg_code_size()). Note that for the allocator we could just use a counter and atomic_inc; however, that would complicate the gathering of tcg_code_size()-like stats. So given that the region operations are not a fast path, a lock seems the most reasonable choice. The effectiveness of this approach is clear after seeing some numbers. I used the bootup+shutdown of debian-arm with '-tb-size 80' as a benchmark. Note that I'm evaluating this after enabling per-thread TCG (which is done by a subsequent commit). * -smp 1, 1 region (entire buffer): qemu: flush code_size=83885014 nb_tbs=154739 avg_tb_size=357 qemu: flush code_size=83884902 nb_tbs=153136 avg_tb_size=363 qemu: flush code_size=83885014 nb_tbs=152777 avg_tb_size=364 qemu: flush code_size=83884950 nb_tbs=150057 avg_tb_size=373 qemu: flush code_size=83884998 nb_tbs=150234 avg_tb_size=373 qemu: flush code_size=83885014 nb_tbs=154009 avg_tb_size=360 qemu: flush code_size=83885014 nb_tbs=151007 avg_tb_size=370 qemu: flush code_size=83885014 nb_tbs=151816 avg_tb_size=367 That is, 8 flushes. * -smp 8, 32 regions (80/32 MB per region) [i.e. this patch]: qemu: flush code_size=76328008 nb_tbs=141040 avg_tb_size=356 qemu: flush code_size=75366534 nb_tbs=138000 avg_tb_size=361 qemu: flush code_size=76864546 nb_tbs=140653 avg_tb_size=361 qemu: flush code_size=76309084 nb_tbs=135945 avg_tb_size=375 qemu: flush code_size=74581856 nb_tbs=132909 avg_tb_size=375 qemu: flush code_size=73927256 nb_tbs=135616 avg_tb_size=360 qemu: flush code_size=78629426 nb_tbs=142896 avg_tb_size=365 qemu: flush code_size=76667052 nb_tbs=138508 avg_tb_size=368 Again, 8 flushes. Note how buffer utilization is not 100%, but it is close. Smaller region sizes would yield higher utilization, but we want region allocation to be rare (it acquires a lock), so we do not want to go too small. * -smp 8, static partitioning of 8 regions (10 MB per region): qemu: flush code_size=21936504 nb_tbs=40570 avg_tb_size=354 qemu: flush code_size=11472174 nb_tbs=20633 avg_tb_size=370 qemu: flush code_size=11603976 nb_tbs=21059 avg_tb_size=365 qemu: flush code_size=23254872 nb_tbs=41243 avg_tb_size=377 qemu: flush code_size=28289496 nb_tbs=52057 avg_tb_size=358 qemu: flush code_size=43605160 nb_tbs=78896 avg_tb_size=367 qemu: flush code_size=45166552 nb_tbs=82158 avg_tb_size=364 qemu: flush code_size=63289640 nb_tbs=116494 avg_tb_size=358 qemu: flush code_size=51389960 nb_tbs=93937 avg_tb_size=362 qemu: flush code_size=59665928 nb_tbs=107063 avg_tb_size=372 qemu: flush code_size=38380824 nb_tbs=68597 avg_tb_size=374 qemu: flush code_size=44884568 nb_tbs=79901 avg_tb_size=376 qemu: flush code_size=50782632 nb_tbs=90681 avg_tb_size=374 qemu: flush code_size=39848888 nb_tbs=71433 avg_tb_size=372 qemu: flush code_size=64708840 nb_tbs=119052 avg_tb_size=359 qemu: flush code_size=49830008 nb_tbs=90992 avg_tb_size=362 qemu: flush code_size=68372408 nb_tbs=123442 avg_tb_size=368 qemu: flush code_size=33555560 nb_tbs=59514 avg_tb_size=378 qemu: flush code_size=44748344 nb_tbs=80974 avg_tb_size=367 qemu: flush code_size=37104248 nb_tbs=67609 avg_tb_size=364 That is, 20 flushes. Note how a static partitioning approach uses the code buffer poorly, leading to many unnecessary flushes. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	34184b0718	tcg: allocate optimizer temps with tcg_malloc Groundwork for supporting multiple TCG contexts. While at it, also allocate temps_used directly as a bitmap of the required size, instead of using a bitmap of TCG_MAX_TEMPS via TCGTempSet. Performance-wise we lose about 1.12% in a translation-heavy workload such as booting+shutting down debian-arm: Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \ -machine type=virt -nographic -smp 1 -m 4096 \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-device,netdev=unet \ -drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \ -device virtio-blk-device,drive=myblock \ -kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \ -name arm,debug-threads=on -smp 1' (10 runs): exec time (s) Relative slowdown wrt original (%) --------------------------------------------------------------- original 20.213321616 0. tcg_malloc 20.441130078 1.1270214 TCGContext 20.477846517 1.3086662 g_malloc 20.780527895 2.8061013 The other two alternatives shown in the table are: - TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps in TCGContext. This is simple enough but it isn't faster than using tcg_malloc; moreover, it wastes memory. - g_malloc: allocate/deallocate both temps and used_temps every time tcg_optimize is executed. Suggested-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	c3fac1138e	tcg: distribute profiling counters across TCGContext's This is groundwork for supporting multiple TCG contexts. To avoid scalability issues when profiling info is enabled, this patch makes the profiling info counters distributed via the following changes: 1) Consolidate profile info into its own struct, TCGProfile, which TCGContext also includes. Note that tcg_table_op_count is brought into TCGProfile after dropping the tcg_ prefix. 2) Iterate over the TCG contexts in the system to obtain the total counts. This change also requires updating the accessors to TCGProfile fields to use atomic_read/set whenever there may be conflicting accesses (as defined in C11) to them. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	df2cce2968	tcg: introduce **tcg_ctxs to keep track of all TCGContext's Groundwork for supporting multiple TCG contexts. Note that having n_tcg_ctxs is unnecessary. However, it is convenient to have it, since it will simplify iterating over the array: we'll have just a for loop instead of having to iterate over a NULL-terminated array (which would require n+1 elems) or having to check with ifdef's for usermode/softmmu. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	26689780f8	gen-icount: fold exitreq_label into TCGContext Groundwork for supporting multiple TCG contexts. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	b1311c4acf	tcg: define tcg_init_ctx and make tcg_ctx a pointer Groundwork for supporting multiple TCG contexts. The core of this patch is this change to tcg/tcg.h: > -extern TCGContext tcg_ctx; > +extern TCGContext tcg_init_ctx; > +extern TCGContext tcg_ctx; Note that for now we set tcg_ctx to whatever TCGContext is passed to tcg_context_init -- in this case &tcg_init_ctx. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	44ded3d048	tcg: take tb_ctx out of TCGContext Groundwork for supporting multiple TCG contexts. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	e82d5a2460	tcg: check CF_PARALLEL instead of parallel_cpus Thereby decoupling the resulting translated code from the current state of the system. The tb->cflags field is not passed to tcg generation functions. So we add a field to TCGContext, storing there a copy of tb->cflags. Most architectures have <= 32 registers, which results in a 4-byte hole in TCGContext. Use this hole for the new field. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:42 -07:00
Emilio G. Cota	4e2ca83e71	tcg: define CF_PARALLEL and use it for TB hashing along with CF_COUNT_MASK This will enable us to decouple code translation from the value of parallel_cpus at any given time. It will also help us minimize TB flushes when generating code via EXCP_ATOMIC. Note that the declaration of parallel_cpus is brought to exec-all.h to be able to define there the "curr_cflags" inline. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:41 -07:00
Richard Henderson	e89b28a635	tcg: Use offsets not indices for TCGv_* Using the offset of a temporary, relative to TCGContext, rather than its index means that we don't use 0. That leaves offset 0 free for a NULL representation without having to leave index 0 unused. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 13:53:36 -07:00
Richard Henderson	11f4e8f8bf	tcg: Remove TCGV_EQUAL* When we used structures for TCGv_*, we needed a macro in order to perform a comparison. Now that we use pointers, this is just clutter. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:50:15 +02:00
Richard Henderson	dc41aa7d34	tcg: Remove GET_TCGV_* and MAKE_TCGV_* The GET and MAKE functions weren't really specific enough. We now have a full complement of functions that convert exactly between temporaries, arguments, tcgv pointers, and indices. The target/sparc change is also a bug fix, which would have affected a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:49:30 +02:00
Richard Henderson	085272b35e	tcg: Introduce temp_tcgv_{i32,i64,ptr} Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:48:59 +02:00
Richard Henderson	ae8b75dc6e	tcg: Introduce tcgv_{i32,i64,ptr}_{arg,temp} Transform TCGv_* to an "argument" or a temporary. For now, an argument is simply the temporary index. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:47:46 +02:00
Richard Henderson	960c50e077	tcg: Push tcg_ctx into tcg_gen_callN Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:47:29 +02:00
Richard Henderson	b7e8b17a77	tcg: Push tcg_ctx into generator functions Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-24 21:45:07 +02:00
Richard Henderson	6349039d0b	tcg: Use per-temp state data in optimize While we're touching many of the lines anyway, adjust the naming of the functions to better distinguish when "TCGArg" vs "TCGTemp" should be used. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:45:07 +02:00
Richard Henderson	54534d7cfd	tcg: Remove unused TCG_CALL_DUMMY_TCGV Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:45:07 +02:00
Richard Henderson	2272e4a791	tcg: Change temp_allocate_frame arg to TCGTemp Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:44:52 +02:00
Richard Henderson	ac3b88911e	tcg: Avoid loops against variable bounds Copy s->nb_globals or s->nb_temps to a local variable for the purposes of iteration. This should allow the compiler to use low-overhead looping constructs on some hosts. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:44:34 +02:00
Richard Henderson	b83eabeac0	tcg: Use per-temp state data in liveness This avoids having to allocate external memory for each temporary. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:44:34 +02:00
Richard Henderson	1807f4c400	tcg: Introduce temp_arg, export temp_idx At the same time, drop the TCGContext argument and use tcg_ctx instead. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:44:12 +02:00
Richard Henderson	c6c7d84df8	tcg: Return NULL temp for TCG_CALL_DUMMY_ARG Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:44:02 +02:00
Richard Henderson	fa477d2547	tcg: Add temp_global bit to TCGTemp This avoids needing to test the index of a temp against nb_globals. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:43:50 +02:00
Richard Henderson	434391390b	tcg: Introduce arg_temp Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:43:36 +02:00
Richard Henderson	dd18629201	tcg: Propagate TCGOp down to allocators Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:34:47 +02:00
Richard Henderson	efee3746fa	tcg: Propagate args to op->args in tcg.c Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:34:47 +02:00
Richard Henderson	acd937019b	tcg: Propagate args to op->args in optimizer Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:34:47 +02:00
Richard Henderson	75e8b9b7aa	tcg: Merge opcode arguments into TCGOp Rather than have a separate buffer of 10*max_ops entries, give each opcode 10 entries. The result is actually a bit smaller and should have slightly more cache locality. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-10-24 21:34:47 +02:00
Jiang Biao	8df8d529ed	tcg/mips: delete commented out extern keyword. Delete commented out extern keyword on link_error(). Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Message-Id: <1506762042-32145-1-git-send-email-jiang.biao2@zte.com.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 09:45:01 -07:00
Emilio G. Cota	a505785cd2	tcg: define TCG_HIGHWATER Will come in handy very soon. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 09:45:00 -07:00
Emilio G. Cota	619205fd1f	tcg: take .helpers out of TCGContext Groundwork for supporting multiple TCG contexts. The hash table becomes read-only after it is filled in, so we can save space by keeping just a global pointer to it. Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	5e75150cdf	tci: move tci_regs to tcg_qemu_tb_exec's stack Groundwork for supporting multiple TCG contexts. Compile-tested for all targets on an x86_64 host. Suggested-by: Richard Henderson <rth@twiddle.net> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	e7e168f413	exec-all: extract tb->tc_* into a separate struct tc_tb In preparation for adding tc.size to be able to keep track of TB's using the binary search tree implementation from glib. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	7f11636dbe	tcg: remove addr argument from lookup_tb_ptr It is unlikely that we will ever want to call this helper passing an argument other than the current PC. So just remove the argument, and use the pc we already get from cpu_get_tb_cpu_state. This change paves the way to having a common "tb_lookup" function. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	d453ec7825	tcg/mips: constify tcg_target_callee_save_regs Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Emilio G. Cota	e268f4c036	tcg/i386: constify tcg_target_callee_save_regs Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-10-10 07:37:10 -07:00
Richard Henderson	89b2e37e65	tcg/mips: Fully convert tcg_target_op_def Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	9be44a16c2	tcg/sparc: Fully convert tcg_target_op_def Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	6cb3658a04	tcg/ppc: Fully convert tcg_target_op_def Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	7536b82d28	tcg/arm: Fully convert tcg_target_op_def Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	1897cc2eb8	tcg/aarch64: Fully convert tcg_target_op_def Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	80a8b9a910	tcg: Fix types in tcg_regset_{set,reset}_reg There was a potential problem here with an ILP32 host with 64 host registers. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	f46934df66	tcg: Remove tcg_regset_set32 It's not even clear what the interface REG and VAL32 were supposed to mean. All uses had REG = 0 and VAL32 was the bitset assigned to the destination. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	07ddf036fa	tcg: Remove tcg_regset_{or,and,andnot,not} Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	d21369f5fb	tcg: Remove tcg_regset_set Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	ccb1bb66ea	tcg: Remove tcg_regset_clear Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Richard Henderson	be0f34b584	tcg: Add tcg_op_supported Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé	61a3f8f6c0	accel/tcg: move tcg-runtime to accel/tcg/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20170911213328.9701-4-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé	ba026602a6	tcg/ppc: disable atomic write check on ppc32 This fixes building for ppc64 on ppc32 (changed in `5964fca8a1`): tcg/ppc/tcg-target.inc.c: In function 'tb_target_set_jmp_target': include/qemu/compiler.h:86:30: error: static assertion failed: \ "not expecting: sizeof((uint64_t )jmp_addr) > ATOMIC_REG_SIZE" QEMU_BUILD_BUG_ON(sizeof(ptr) > ATOMIC_REG_SIZE); \ ^ tcg/ppc/tcg-target.inc.c:1377:9: note: in expansion of macro 'atomic_set' atomic_set((uint64_t )jmp_addr, pair); ^ Suggested-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20170911204936.5020-1-f4bug@amsat.org> [rth: Added commentary requested by pmm.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-17 06:52:19 -07:00
Philippe Mathieu-Daudé	04ef33052c	tcg/tci: do not use ldst label (never implemented) changed in `659ef5cbb8`, this fixes building with --enable-tcg-interpreter: /home/travis/build/qemu/qemu/tcg/tcg.c:116:14: error: ‘tcg_out_ldst_finalize’ used but never defined [-Werror] static bool tcg_out_ldst_finalize(TCGContext *s); ^ Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20170911022839.23231-1-f4bug@amsat.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-11 19:24:05 +01:00
Richard Henderson	53c89efd02	tcg/ppc: Use constant pool for movi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	77bfc7c0b4	tcg/ppc: Look for shifted constants Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	5964fca8a1	tcg/ppc: Change TCG_REG_RA to TCG_REG_TB At this point the conversion is a wash. Loading of TB+ofs is smaller, but the actual return address from exit_tb is larger. There are a few more insns required to transition between TBs. But the expectation is that accesses to the constant pool will on the whole be smaller. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	afe74dbd6a	tcg/arm: Use constant pool for call Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	880ad9626c	tcg/arm: Use constant pool for movi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	2a8ab93c6b	tcg/arm: Extract INSN_NOP We'll want this for tcg_out_nop_fill. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	1507061637	tcg/arm: Code rearrangement Move constants before all of the functions. Move tcg_out_<format> functions before all of the others. No functional change. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	95ede84f4d	tcg/arm: Tighten tlb indexing offset test We are not going to use ldrd for loading the comparator for 32-bit guests, so don't limit cmp_off to 8 bits then. This eliminates one insn in the tlb load for some guests. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	647ab96aaf	tcg/arm: Improve tlb load for armv7 Use UBFX to avoid limitation on CPU_TLB_BITS. Since we're dropping the initial shift, we need to replace the page masking. We can use MOVW+BIC to do this without shifting. The result is the same size as the armv6 path with one less conditional instruction. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	e9823b4c33	tcg/sparc: Use constant pool for movi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	ab20bdc116	tcg/sparc: Introduce TCG_REG_TB Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	55129955e9	tcg/aarch64: Use constant pool for movi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	a534bb15f3	tcg/s390: Use constant pool for cmpi Also use CHI/CGHI for 16-bit signed constants. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	5bf67a9217	tcg/s390: Use constant pool for xori Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	4046d9ca04	tcg/s390: Use constant pool for ori Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	bdcd5d1926	tcg/s390: Use constant pool for andi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	28eef8aaec	tcg/s390: Use constant pool for movi Split out maybe_out_small_movi for use with other operations that want to add to the constant pool. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	e692a3492d	tcg/s390: Fix sign of patch_reloc addend We were passing in -2 instead of +2, but then ignoring the actual contents of addend in the calculation. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	829e1376d9	tcg/s390: Introduce TCG_REG_TB Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	4e45f23943	tcg/i386: Store out-of-range call targets in constant pool Already it saves 2 bytes per call, but also the constant pool entry may well be shared across multiple calls. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	57a269469d	tcg: Infrastructure for managing constant pools A new shared header tcg-pool.inc.c adds new_pool_label, for registering a tcg_target_ulong to be emitted after the generated code, plus relocation data to install a pointer to the data. A new pointer is added to the TCGContext, so that we dump the constant pool as data, not code. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	659ef5cbb8	tcg: Rearrange ldst label tracking Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:35 -07:00
Richard Henderson	a858339336	tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump to be a runtime decision -- based on host cpu capabilities, the size of code_gen_buffer, or a future debugging switch. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-07 11:57:34 -07:00
Richard Henderson	cda4a338c4	tcg/tci: Add TCG_TARGET_DEFAULT_MO Missed being added as part of `71650df7b0`. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-09-07 18:57:34 +01:00
Richard Henderson	4609190b5f	tcg/s390: Use slbgr for setcond le and leu Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:46 -07:00
Richard Henderson	7af525af01	tcg/s390: Use load-on-condition-2 facility This allows LOAD HALFWORD IMMEDIATE ON CONDITION, eliminating one insn in some common cases. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:41 -07:00
Richard Henderson	c2097136ad	tcg/s390: Use distinct-operands facility This allows using a 3-operand insn form for some arithmetic, logicals and shifts. Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:38 -07:00
Richard Henderson	e42349cbd6	tcg/s390: Merge ori+xori facilities check to tcg_target_op_def Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:35 -07:00
Richard Henderson	ba18b07dc6	tcg/s390: Merge add2i facilities check to tcg_target_op_def Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:33 -07:00
Richard Henderson	a8f0269e9e	tcg/s390: Merge muli facilities check to tcg_target_op_def Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:31 -07:00
Richard Henderson	07952d9570	tcg/s390: Merge cmpi facilities check to tcg_target_op_def Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:24:28 -07:00
Richard Henderson	9b5500b697	tcg/s390: Fully convert tcg_target_op_def Use a switch instead of searching a table. Acked-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-09-06 07:22:24 -07:00
Pranith Kumar	b32dc3370a	tcg: Implement implicit ordering semantics Currently, we cannot use mttcg for running strong memory model guests on weak memory model hosts due to missing ordering semantics. We implicitly generate fence instructions for stronger guests if an ordering mismatch is detected. We generate fences only for the orders for which fence instructions are necessary, for example a fence is not necessary between a store and a subsequent load on x86 since its absence in the guest binary tells that ordering need not be ensured. Also note that if we find multiple subsequent fence instructions in the generated IR, we combine them in the TCG optimization pass. This patch allows us to boot an x86 guest on ARM64 hosts using mttcg. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170829063313.10237-4-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-05 13:41:46 -07:00
Pranith Kumar	71650df7b0	tcg: Add tcg target default memory ordering Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170829063313.10237-3-bobby.prani@gmail.com> [rth: Dropped ia64 hunk] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-05 12:56:40 -07:00
Richard Henderson	a46c1244a0	tcg: Remove support for ia64 as host We threatened to remove ia64 as host in v2.9.0. Its time has now come. There are still some usages of defined(__ia64__) throughout the source code that would be triggered if one were to enable TCI on an ia64 host. Leave those alone for now. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-09-05 12:39:25 -07:00
Richard Henderson	13aaef678e	tcg: Increase minimum alignment from tcg_malloc to 8 For a 64-bit ILP32 host, aligning to sizeof(long) is not enough. Guess the minimum for any host is 8, as that covers uint64_t. Qemu doesn't use a host long double or host vectors, except in extremely limited circumstances. Fixes a bus error for a sparc v8plus host. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-08-03 11:00:30 -07:00
Richard Henderson	ca671de8af	tcg/arm: Fix runtime overalignment test Patch `85aa80813d` changed the IF emitting the TST instruction, but failed to change the ?: converting CMP to CMPEQ, so the result of the TST is ignored. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-08-03 10:56:44 -07:00
Philippe Mathieu-Daudé	b208ac07ea	docs: fix broken paths to docs/devel/atomics.txt With the move of some docs/ to docs/devel/ on `ac06724a71`, a couple of references were not updated. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-07-31 13:12:47 +03:00
Richard Henderson	5dd8990841	util: Introduce include/qemu/cpuid.h Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the supplied <cpuid.h> does not contain the bit_AVX2 define that we use when detecting whether the routine can be enabled. Introduce a qemu-specific header that uses the compiler's definition of __cpuid et al, but supplies any missing bit_* definitions needed. This avoids introducing any extra ifdefs to util/bufferiszero.c, and allows quite a few to be removed from tcg/i386/tcg-target.inc.c. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20170719044018.18063-1-rth@twiddle.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-07-24 12:42:55 +01:00
Philippe Mathieu-Daudé	797ed66d29	tcg/tci: enable bswap16_i64 Altough correctly implemented, bswap16_i64() never got tested/executed so the safety TODO() statement was never removed. Since it got now tested the TODO() can be removed. while running Alex Bennée's image aarch64-linux-3.15rc2-buildroot.img: Trace 0x7fa1904b0890 [0: ffffffc00036cd04] ---------------- IN: 0xffffffc00036cd24: 5ac00694 rev16 w20, w20 OP: ---- ffffffc00036cd24 0000000000000000 0000000000000000 ext32u_i64 tmp3,x20 ext16u_i64 tmp2,tmp3 bswap16_i64 x20,tmp2 movi_i64 tmp4,$0x10 shr_i64 tmp2,tmp3,tmp4 ext16u_i64 tmp2,tmp2 bswap16_i64 tmp2,tmp2 deposit_i64 x20,x20,tmp2,$0x10,$0x10 Linking TBs 0x7fa1904b0890 [ffffffc00036cd04] index 0 -> 0x7fa1904b0aa0 [ffffffc00036cd24] Trace 0x7fa1904b0aa0 [0: ffffffc00036cd24] TODO qemu/tci.c:1049: tcg_qemu_tb_exec() qemu/tci.c:1049: tcg fatal error Aborted Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Jaroslaw Pelczar <j.pelczar@samsung.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20170718045540.16322-11-f4bug@amsat.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-19 14:45:16 -07:00
Jiang Biao	4df9cac57f	tcg/mips: reserve a register for the guest_base. Reserve a register for the guest_base using ppc code for reference. By doing so, we do not have to recompute it for every memory load. Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1499677934-2249-1-git-send-email-jiang.biao2@zte.com.cn>	2017-07-19 14:45:15 -07:00
Lluís Vilanova	61a67f71dd	exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state Every vCPU now uses a separate set of TBs for each set of dynamic tracing event state values. Each set of TBs can be used by any number of vCPUs to maximize TB reuse when vCPUs have the same tracing state. This feature is later used by tracetool to optimize tracing of guest code events. The maximum number of TB sets is defined as 2^E, where E is the number of events that have the 'vcpu' property (their state is stored in CPUState->trace_dstate). For this to work, a change on the dynamic tracing state of a vCPU will force it to flush its virtual TB cache (which is only indexed by address), and fall back to the physical TB cache (which now contains the vCPU's dynamic tracing state as part of the hashing function). Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-id: 149915775266.6295.10060144081246467690.stgit@frigg.lan Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-07-17 13:11:05 +01:00
Jiang Biao	8b8d768f19	tcg/mips: Bugfix for crash when running program with qemu-i386. When running a helloworld program with qemu-i386 in linux-user mode on Loongson 3A3000, it will crash. This patch fix the bug. Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn> Message-Id: <1499669979-25904-1-git-send-email-jiang.biao2@zte.com.cn> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-09 21:11:38 -10:00
Pranith Kumar	2acee8b2b5	tcg/aarch64: Enable indirect jump path using LDR (literal) This patch enables the indirect jump path using an LDR (literal) instruction. It will be interesting to test and see which performs better among the two paths. CC: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170630143614.31059-3-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-09 21:10:23 -10:00
Pranith Kumar	b68686bd4b	tcg/aarch64: Use ADRP+ADD to compute target address We use ADRP+ADD to compute the target address for goto_tb. This patch introduces the NOP instruction which is used to align the above instruction pair so that we can use one atomic instruction to patch the destination offsets. CC: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170630143614.31059-2-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-09 21:10:23 -10:00
Pranith Kumar	23b7aa1d2a	tcg/aarch64: Introduce and use long branch to register We can use a branch to register instruction for exit_tb for offsets greater than 128MB. CC: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170630143614.31059-1-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-07-09 21:10:23 -10:00
Paolo Bonzini	beeaef55e4	tcg: move tb_lock out of translate-all.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-04 16:01:16 +02:00
Peter Maydell	db7a99cdc1	Queued TCG patches -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZSBP2AAoJEK0ScMxN0CebnyMH/1ZiDhYiqCD7PYfk4/Y7Db+h MNKNozrWKyChWQp1RzwWqcBaIzbuMZkDYn8dfS419PNtFRNoYtHjhYvjSTfcrxS0 U8dGOoqQUHCr/jlyIDUE4y5+aFA9R/1Ih5IQv+QCi5QNXcfeST8zcYF+ImuikP6C 7heIc7dE9kXdA8ycWJ39kYErHK9qEJbvDx6dxMPmb4cM36U239Zb9so985TXULlQ LoHrDpOCBzCbsICBE8iP2RKDvcwENIx21Dwv+9gW/NqR+nRdKcxhTjKEodkS8gl/ UxMxM/TjIPQOLLUhdck5DFgIgBgQWHRqPMJKqt466I0JlXvSpifmWxckWzslXLc= =R+em -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170619' into staging Queued TCG patches # gpg: Signature made Mon 19 Jun 2017 19:12:06 BST # gpg: using RSA key 0xAD1270CC4DD0279B # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>" # gpg: aka "Richard Henderson <rth@redhat.com>" # gpg: aka "Richard Henderson <rth@twiddle.net>" # Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B * remotes/rth/tags/pull-tcg-20170619: target/arm: Exit after clearing aarch64 interrupt mask target/s390x: Exit after changing PSW mask target/alpha: Use tcg_gen_lookup_and_goto_ptr tcg: Increase hit rate of lookup_tb_ptr tcg/arm: Use ldr (literal) for goto_tb tcg/arm: Try pc-relative addresses for movi tcg/arm: Remove limit on code buffer size tcg/arm: Use indirect branch for goto_tb tcg/aarch64: Use ADR in tcg_out_movi translate-all: consolidate tb init in tb_gen_code tcg: allocate TB structs before the corresponding translated code util: add cacheinfo Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-06-22 10:25:03 +01:00
Richard Henderson	308714e6bc	tcg/arm: Use ldr (literal) for goto_tb The new placement of the TB means that we can use one insn to load the goto_tb destination directly from the TB. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Richard Henderson	9c39b94f14	tcg/arm: Try pc-relative addresses for movi Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Richard Henderson	3fb53fb4d1	tcg/arm: Use indirect branch for goto_tb Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Richard Henderson	cc74d332ff	tcg/aarch64: Use ADR in tcg_out_movi The new placement of the TB means that we can use one insn to load the return value for exit_tb returning the TB pointer. Tested-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Emilio G. Cota	6e3b2bfd6a	tcg: allocate TB structs before the corresponding translated code Allocating an arbitrarily-sized array of tbs results in either (a) a lot of memory wasted or (b) unnecessary flushes of the code cache when we run out of TB structs in the array. An obvious solution would be to just malloc a TB struct when needed, and keep the TB array as an array of pointers (recall that tb_find_pc() needs the TB array to run in O(log n)). Perhaps a better solution, which is implemented in this patch, is to allocate TB's right before the translated code they describe. This results in some memory waste due to padding to have code and TBs in separate cache lines--for instance, I measured 4.7% of padding in the used portion of code_gen_buffer when booting aarch64 Linux on a host with 64-byte cache lines. However, it can allow for optimizations in some host architectures, since TCG backends could safely assume that the TB and the corresponding translated code are very close to each other in memory. See this message by rth for a detailed explanation: https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html Subject: Re: GSoC 2017 Proposal: TCG performance enhancements Message-ID: <1e67644b-4b30-887e-d329-1848e94c9484@twiddle.net> Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1496790745-314-3-git-send-email-cota@braap.org> [rth: Simplify the arithmetic in tcg_tb_alloc] Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Emilio G. Cota	b255b2c8a5	util: add cacheinfo Add helpers to gather cache info from the host at init-time. For now, only export the host's I/D cache line sizes, which we will use to improve cache locality to avoid false sharing. Suggested-by: Richard Henderson <rth@twiddle.net> Suggested-by: Geert Martin Ijewski <gm.ijewski@web.de> Tested-by: Geert Martin Ijewski <gm.ijewski@web.de> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1496794624-4083-1-git-send-email-cota@braap.org> [rth: Move all implementations from tcg/ppc/] Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-19 11:10:59 -07:00
Yang Zhong	244f144134	tcg: move tcg backend files into accel/tcg/ move tcg-runtime.c, translate-all.(ch) and translate-common.c into accel/tcg/ subdirectory and updated related trace-events file. Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <1496383606-18060-4-git-send-email-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-15 11:04:06 +02:00
Aurelien Jarno	5786e0683c	tcg/mips: implement goto_ptr Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <20170430145254.25616-2-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	085c648bef	tcg/arm: Implement goto_ptr Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	702a947484	tcg/arm: Clarify tcg_out_bx for arm4 host In theory this would re-enable usage of QEMU on an armv4 host. Whether this is worthwhile is debatable -- we've been unconditionally issuing the armv5t BX instruction in the prologue since 2011 without complaint. Possibly we should simply require an armv6 host. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	46644483ca	tcg/s390: Implement goto_ptr Tested-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	38f81dc593	tcg/sparc: Implement goto_ptr Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	b19f0c2e7d	tcg/aarch64: Implement goto_ptr Measurements: SPECint06 (test set), x86_64-linux-user. Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz 1.45x +-+-------------------------------------------------------------------------------------------------------------+-+ \| ***** \| \| +++ * * +goto-ptr \| 1.4x +-+...****...................................................................................................+-+ \| +++* * * +++ \| 1.35x +-+................................................................****....................................+-+ \| * * * +++ \| \| * * * * * * \| 1.3x +-+.......................................................................................................+-+ \| * * * * * * \| \| * * * * * * ***** \| 1.25x +-+.................****.........................................................***.................+-+ \| * * * * * * * +++ * * \| 1.2x +-+.................................................................................................+-+ \| * * * * * * * * * * * * \| \| * * * * * * * * * * * * ***** \| 1.15x +-+...............................................................................................+-+ \| * * * * * * * * +++ * * * * * * \| \| * * * * * * * * ***** * * * * * * \| 1.1x +-+........................****.........***..................................................+-+ \| * * * * * * * * * * * * * * * * * * * \| 1.05x +-+.........................................................................................+-+ \| * * ***** * * * * * * * * * * * * * * * * * * \| \| * * * * * * * * * * * * *** *** * * * * * * * * * * \| 1x +-+---***---*---*----*---*---*---*---*---*---*----*---*---***---+-+ astar bzip2 gcc gobmk h264ref hmmlibquantum mcf omnetpperlbench sjenxalancbmk hmean png: http://imgur.com/en9HE8L Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Richard Henderson	0c240785a8	tcg/ppc: Implement goto_ptr Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Emilio G. Cota	5cb4ef80f6	tcg/i386: implement goto_ptr Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-6-git-send-email-cota@braap.org> [rth: Reuse goto_ptr epilogue for exit_tb 0.] Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Emilio G. Cota	cedbcb0152	tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org> [rth: Squashed 4 related commits.] Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-06-05 09:25:42 -07:00
Aurelien Jarno	2f5a5f5774	tcg/mips: fix field extraction opcode The "msb" argument should correspond to (len - 1). Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2017-05-06 12:48:53 +02:00
Richard Henderson	79b1af9062	tcg: Initialize return value after exit_atomic Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize the output. Even though this code is dead, it gets translated, and without the initialization we encounter a tcg_error. Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-04-26 19:26:11 +02:00
Peter Maydell	fa54abb8c2	Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1 We already require gcc 4.1 or newer (for the atomic support), so the fallback codepaths for older gcc versions than that are now dead code and we can just delete them. NB: clang reports itself as gcc 4.2 (regardless of clang version), so clang won't be using the fallbacks either. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Markus Armbruster <armbru@redhat.com>	2017-04-20 18:33:33 +01:00
Peter Maydell	5c32be5baf	tcg/sparc: Zero extend address argument to ld/st helpers The C store helper functions take the address argument as a target_ulong type; if this is 32 bit but the host is 64 bit then the SPARC calling convention requires that the caller must zero extend the value. We weren't doing this, which meant we could pass values to the caller with high bits set and QEMU would crash if it was compiled with optimizations. In particular, the i386 BIOS would not start. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1490871151-29029-3-git-send-email-peter.maydell@linaro.org Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-04-03 12:59:37 +01:00
Peter Maydell	709a340d67	tcg/sparc: Zero extend data argument to store helpers The C store helper functions take the data argument as a uint8_t, uint16_t, etc depending on the store size. The SPARC calling convention requires that data types smaller than the register size must be extended by the caller. We weren't doing this, which meant that if QEMU was compiled with optimizations enabled we could end up storing incorrect values to guest memory. (In particular the i386 guest BIOS would crash on startup.) Add code to the trampolines that call the store helpers to do the zero extension as required. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1490871151-29029-2-git-send-email-peter.maydell@linaro.org Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-04-03 12:59:14 +01:00
Paolo Bonzini	30f3dda24b	Merge branch 'icount-update' into HEAD Merge the original development branch due to breakage caused by the MTTCG merge. Conflicts: cpu-exec.c translate-common.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-03-03 16:39:18 +01:00
Pranith Kumar	dc1eccd661	aarch64: Change ext type to TCGType to fix warnings To fix the following warnings: In file included from /users/pranith/qemu/tcg/tcg.c:255: /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:879:24: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_cmp(s, ext, a, b, b_const); ~~~~~~~~~~~ ^~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:893:36: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_insn(s, 3201, CBZ, ext, a, offset); ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn' glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__) ^ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:895:37: warning: implicit conversion from enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') to different enumeration type 'TCGType' (aka 'enum TCGType') [-Wenum-conversion] tcg_out_insn(s, 3201, CBNZ, ext, a, offset); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:389:65: note: expanded from macro 'tcg_out_insn' glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__) ^ /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:1610:27: warning: implicit conversion from enumeration type 'TCGType' (aka 'enum TCGType') to different enumeration type 'TCGMemOp' (aka 'enum TCGMemOp') [-Wenum-conversion] tcg_out_brcond(s, ext, a2, a0, a1, const_args[1], arg_label(args[3])); ~~~~~~~~~~~~~~ ^~~ Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20170217154311.13920-1-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-03-01 08:28:06 +11:00
Alex Bennée	ca759f9e38	tcg: enable MTTCG by default for ARM on x86 hosts This enables the multi-threaded system emulation by default for ARMv7 and ARMv8 guests using the x86_64 TCG backend. This is because on the guest side: - The ARM translate.c/translate-64.c have been converted to - use MTTCG safe atomic primitives - emit the appropriate barrier ops - The ARM machine has been updated to - hold the BQL when modifying shared cross-vCPU state - defer powerctl changes to async safe work All the host backends support the barrier and atomic primitives but need to provide same-or-better support for normal load/store operations. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Acked-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Pranith Kumar <bobby.prani@gmail.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>	2017-02-24 10:32:46 +00:00
KONRAD Frederic	8d4e9146b3	tcg: add options for enabling MTTCG We know there will be cases where MTTCG won't work until additional work is done in the front/back ends to support. It will however be useful to be able to turn it on. As a result MTTCG will default to off unless the combination is supported. However the user can turn it on for the sake of testing. Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> [AJB: move to -accel tcg,thread=multi\|single, defaults] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-02-24 10:32:45 +00:00
Alex Bennée	2093714314	tcg: move TCG_MO/BAR types into own file We'll be using the memory ordering definitions to define values for both the host and guest. To avoid fighting with circular header dependencies just move these types into their own minimal header. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2017-02-24 10:32:45 +00:00
Paolo Bonzini	1aab16c28a	cpu-exec: unify icount_decr and tcg_exit_req The icount interrupt flag and tcg_exit_req serve almost the same purpose, let's make them completely the same. The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are unified, since we can distinguish them from the value of the interrupt flag. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-22 14:56:34 +01:00
Stefan Weil	77e217d1bf	tci: Remove invalid assertions tb_jmp_insn_offset and tb_jmp_reset_offset are pointers and cannot be used with ARRAY_SIZE. Signed-off-by: Stefan Weil <sw@weilnetz.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 20170202195601.11286-1-sw@weilnetz.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-03 11:38:55 +00:00
Richard Henderson	39f099ec9d	tcg/i386: Always use TZCNT when available I think this is cleaner than sometimes using BSF. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-17 12:02:08 -08:00
Richard Henderson	9bf38308f6	Revert "tcg/i386: Rely on undefined/undocumented behaviour of BSF/BSR" This reverts commit `4ac7691073`. This fixes http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03062.html While I think we could get away with relying on the undocumented behaviour, the tcg constraint system isn't powerful enough to properly describe the required (non-)overlap conditions. Reported-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-17 11:59:13 -08:00
Richard Henderson	8cf9a3d3f7	tcg/aarch64: Fix tcg_out_movi There were some patterns, like 0x0000_ffff_ffff_00ff, for which we would select to begin a multi-insn sequence with MOVN, but would fail to set the 0x0000 lane back from 0xffff. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <20161207180727.6286-3-rth@twiddle.net>	2017-01-13 11:47:29 -08:00
Richard Henderson	b1eb20da62	tcg/aarch64: Fix addsub2 for 0+C When al == xzr, we cannot use addi/subi because that encodes xsp. Force a zero into the temp register for that (rare) case. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <20161207180727.6286-2-rth@twiddle.net>	2017-01-13 11:46:27 -08:00
Richard Henderson	a32b6ae897	tcg/s390: Fix merge error with facilities The variable was renamed s390_facilities. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-13 09:30:40 -08:00
Richard Henderson	993508e43e	tcg/i386: Handle ctpop opcode Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:49:59 -08:00
Richard Henderson	33e75fb9c8	tcg/ppc: Handle ctpop opcode Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:49:59 -08:00
Richard Henderson	14e99210f6	tcg: Use ctpop to generate ctz if needed Particularly when andc is also available, this is two insns shorter than using clz to compute ctz. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:49:59 -08:00
Richard Henderson	a768e4e992	tcg: Add opcode for ctpop The number of actual invocations of ctpop itself does not warrent an opcode, but it is very helpful for POWER7 to use in generating an expansion for ctz. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:48:56 -08:00
Richard Henderson	086920c2c8	tcg: Add helpers for clrsb The number of actual invocations does not warrent an opcode, and the backends generating it. But at least we can eliminate redundant helpers. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	4ac7691073	tcg/i386: Rely on undefined/undocumented behaviour of BSF/BSR The ISA manual documents the output is undefined if the input was zero. However, we document in target-i386 that the behavior of real silicon is to preserve the contents of the output register. We also mention that there are real applications that depend on this. That this is baked into silicon is mentioned as a potential cause for some false sharing behaviour wrt lzcnt/tzcnt. Taking advantage of this allows us to save 2 insns in the normal case, and 4 insns for i686 emulating a 64-bit clz. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	bbf25f90ba	tcg/i386: Handle ctz and clz opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	6a5aed4bdc	tcg/i386: Allow bmi2 shiftx to have non-matching operands Previously we could not have different constraints for different ISA levels, which prevented us from eliding the matching constraint for shifts. We do now have to make sure that the operands match for constant shifts. We can also handle some small left shifts via lea. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	42d5b51492	tcg/i386: Hoist common arguments in tcg_out_op Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	cd26449a50	tcg/i386: Fuly convert tcg_target_op_def Use a switch instead of searching a table. Share constraints between 32-bit and 64-bit, when at all possible. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	ce411066f4	tcg/s390: Handle clz opcode Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	2a1d9d41ae	tcg/mips: Handle clz opcode Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:47:48 -08:00
Richard Henderson	cc0fec8a4d	tcg/arm: Handle ctz and clz opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	53c76c1990	tcg/aarch64: Handle ctz and clz opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	d0b07481fa	tcg/ppc: Handle ctz and clz opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	0e28d0063b	tcg: Add clz and ctz opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	17280ff4a5	tcg: Allow an operand to be matching or a constant This allows an output operand to match an input operand only when the input operand needs a register. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	069ea736b5	tcg: Pass the opcode width to target_parse_constraint This will let us choose how to interpret a given constraint depending on whether the opcode is 32- or 64-bit. Which will let us share more constraint combinations between opcodes. At the same time, change the interface to return the advanced pointer instead of passing it in/out by reference. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	f69d277ece	tcg: Transition flat op_defs array to a target callback This will allow the target to tailor the constraints to the auto-detected ISA extensions. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	82790a8709	tcg: Add markup for output requires new register This is the same concept as, and same markup as, the early clobber markup in gcc. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:11 -08:00
Richard Henderson	333b21b809	tcg/optimize: Fold movcond 0/1 into setcond Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:10 -08:00
Richard Henderson	752b1be947	tcg/s390: Support deposit into zero Since we can no longer use matching constraints, this does mean we must handle that data movement by hand. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:10 -08:00
Richard Henderson	b0bf5fe82d	tcg/s390: Implement field extraction opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:10 -08:00
Richard Henderson	b2c98d9d39	tcg/s390: Expose host facilities to tcg-target.h This lets us expose facilities to TCG_TARGET_HAS_* defines directly, rather than hiding behind function calls. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:06:10 -08:00
Richard Henderson	c05021c3c8	tcg/ppc: Implement field extraction opcodes Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:03:59 -08:00
Richard Henderson	befbb3ced5	tcg/mips: Implement field extraction opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 08:03:59 -08:00
Richard Henderson	78fdbfb946	tcg/i386: Implement field extraction opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	ec903af184	tcg/arm: Implement field extraction opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	40b2ccb156	tcg/arm: Move isa detection to tcg-target.h This allows us to use this detection within the TCG_TARGET_HAS_* macros, instead of requiring a function call into tcg-target.inc.c. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	e2179f94a1	tcg/aarch64: Implement field extraction opcodes Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	07cc68d528	tcg: Add deposit_z expander While we don't require a new opcode, it is handy to have an expander that knows the first source is zero. Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	0d0d309df0	tcg: Minor adjustments to deposit expanders Assert that len is not 0. Since we have asserted that ofs + len <= N, a later check for len == N implies that ofs == 0. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Richard Henderson	7ec8bab3de	tcg: Add field extraction primitives Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of fixed position bitfields, much like we already have for deposit. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2017-01-10 07:59:11 -08:00
Jin Guojie	f0d703314e	tcg-mips: Adjust qemu_ld/st for mips64 Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-11-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:09:10 -08:00
Jin Guojie	999b941633	tcg-mips: Adjust calling conventions for mips64 Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-10-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	98d690761a	tcg-mips: Add tcg unwind info Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-9-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	0973b1cff8	tcg-mips: Adjust prologue for mips64 Take stack frame parameters out from the function body. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-8-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	32b69707df	tcg-mips: Adjust load/store functions for mips64 tcg_out_ldst: using a generic ALIAS_PADD to avoid ifdefs tcg_out_ld: generates LD or LW tcg_out_st: generates SD or SW Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-7-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	2294d05dab	tcg-mips: Adjust move functions for mips64 tcg_out_mov: using OPC_OR as most mips assemblers do; tcg_out_movi: extended to 64-bit immediate. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-6-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	7f54eaa3b7	tcg-mips: Add bswap32u and bswap64 Without the mips32r2 instructions to perform swapping, bswap is quite large, dominating the size of each reverse-endian qemu_ld/qemu_st operation. Create two subroutines in the prologue block. The subroutines require extra reserved registers (TCG_TMP[2, 3]). Using these within qemu_ld means that we need not place additional restrictions on the qemu_ld outputs. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-5-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	0119b1927d	tcg-mips: Support 64-bit opcodes Bulk patch adding 64-bit opcodes into tcg_out_op. Note that mips64 is as yet neither complete nor enabled. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-4-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	57a701fc2b	tcg-mips: Add mips64 opcodes Since the mips manual tables are in octal, reorg all of the opcodes into that format for clarity. Note that the 64-bit opcodes are as yet unused. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-3-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Jin Guojie	bb08afe9f0	tcg-mips: Move bswap code to a subroutine Without the mips32r2 instructions to perform swapping, bswap is quite large, dominating the size of each reverse-endian qemu_ld/qemu_st operation. Create a subroutine in the prologue block. The subroutine requires extra reserved registers (TCG_TMP[2, 3]). Using these within qemu_ld means that we need not place additional restrictions on the qemu_ld outputs. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: James Hogan <james.hogan@imgtec.com> Tested-by: YunQiang Su <wzssyqa@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Jin Guojie <jinguojie@loongson.cn> Message-Id: <1483592275-4496-2-git-send-email-jinguojie@loongson.cn>	2017-01-06 10:03:54 -08:00
Richard Henderson	e45d4ef6e3	tcg/s390: Remove 'R' constraint Since R0 is reserved, we don't need a special case constraint. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-12-23 19:38:27 -08:00
Richard Henderson	65839b56b9	tcg/s390: Fix setcond expansion We can't use LOAD AND TEST for unsigned data and then expect to extract the result with ADD LOGICAL WITH CARRY. Fall through to using COMPARE LOGICAL IMMEDIATE instead. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-12-23 19:38:27 -08:00
Joseph Myers	3ff91d7e85	tcg: correct 32-bit tcg_gen_ld8s_i64 sign-extension The version of tcg_gen_ld8s_i64 for 32-bit systems does a load into the low part of the return value - then attempts a sign extension into the high part, but wrongly sets the high part to a sign extension of itself rather than of the low part. This results in TCG internal errors from the use of the uninitialized high part (in some GCC tests of AArch64 NEON shift intrinsics, in particular). This patch corrects the sign-extension logic, making it match other functions such as tcg_gen_ld16s_i64. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Joseph Myers <joseph@codesourcery.com> Message-Id: <alpine.DEB.2.20.1610272333560.22353@digraph.polyomino.org.uk> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-11-01 10:30:45 -06:00
Peter Maydell	a40d4701bc	tcg/tcg.h: Improve documentation of TCGv_i32 etc types The typedefs we use for the TCGv_i32, TCGv_i64 and TCGv_ptr types are somewhat confusing, because we define them as pointers to structs, but the structs themselves are never defined. Explain in the comments a bit more clearly why this is OK and what is going on under the hood. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1477067922-26202-1-git-send-email-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-11-01 10:30:45 -06:00
Richard Henderson	5087abfb7d	tcg: Add tcg_gen_mulsu2_{i32,i64,tl} This multiply has one signed input and one unsigned input, producing the full double-width result. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1475011433-24456-2-git-send-email-rth@twiddle.net>	2016-11-01 10:30:45 -06:00
Richard Henderson	1ee73216f4	log: Add locking to large logging blocks Reuse the existing locking provided by stdio to keep in_asm, cpu, op, op_opt, op_ind, and out_asm as contiguous blocks. While it isn't possible to interleave e.g. in_asm or op_opt logs because of the TB lock protecting all code generation, it is possible to interleave cpu logs, or to interleave a cpu dump with an out_asm dump. For mingw32, we appear to have no viable solution for this. The locking functions are not properly exported from the system runtime library. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-11-01 10:29:03 -06:00
Paolo Bonzini	7d7500d998	tcg: comment on which functions have to be called with tb_lock held softmmu requires more functions to be thread-safe, because translation blocks can be invalidated from e.g. notdirty callbacks. Probably the same holds for user-mode emulation, it's just that no one has ever tried to produce a coherent locking there. This patch will guide the introduction of more tb_lock and tb_unlock calls for system emulation. Note that after this patch some (most) of the mentioned functions are still called outside tb_lock/tb_unlock. The next one will rectify this. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-31 10:51:16 +01:00
Richard Henderson	91682118aa	tcg: Emit barriers with parallel_cpus Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Richard Henderson	df79b996a7	tcg: Add CONFIG_ATOMIC64 Allow qemu to build on 32-bit hosts without 64-bit atomic ops. Even if we only allow 32-bit hosts to multi-thread emulate 32-bit guests, we still need some way to handle the 32-bit guest using a 64-bit atomic operation. Do so by dropping back to single-step. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Richard Henderson	7ebee43ee3	tcg: Add atomic128 helpers Force the use of cmpxchg16b on x86_64. Wikipedia suggests that only very old AMD64 (circa 2004) did not have this instruction. Further, it's required by Windows 8 so no new cpus will ever omit it. If we truely care about these, then we could check this at startup time and then avoid executing paths that use it. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Richard Henderson	c482cb117c	tcg: Add atomic helpers Add all of cmpxchg, op_fetch, fetch_op, and xchg. Handle both endian-ness, and sizes up to 8. Handle expanding non-atomically, when emulating in serial. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:01 -07:00
Richard Henderson	fdbc2b5722	tcg: Add EXCP_ATOMIC When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-10-26 08:29:00 -07:00
Paolo Bonzini	0fe4fca4e1	tcg: try sti when moving a constant into a dead memory temp This comes from free from unifying tcg_reg_alloc_mov and tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers often on moves to cc_dst, such as the following translation of "sub $0x3c,%esp": before: after: subl $0x3c,%ebp subl $0x3c,%ebp movl %ebp,0x10(%r14) movl %ebp,0x10(%r14) movl $0x3c,%ebx movl $0x3c,0x2c(%r14) movl %ebx,0x2c(%r14) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1473945360-13663-1-git-send-email-pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-24 15:27:19 +02:00
Paolo Bonzini	bf28a69eeb	qemu-tech: move text from qemu-tech to tcg/README Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-07 10:05:18 +02:00
Alex Bennée	550276ae0a	tcg/optimize: move default return out of if statement This is to appease sanitizer builds which complain that: "error: control reaches end of non-void function" Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20160930213106.20186-5-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-04 10:00:25 +02:00
Richard Henderson	ebb90a005d	tcg/i386: Extend TARGET_PAGE_MASK to the proper type TARGET_PAGE_MASK, as defined, has type "int". We need to extend that to the proper target width before oring in an "unsigned". Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-20 11:45:30 -07:00
Pranith Kumar	34f939218c	tcg: Optimize fence instructions This commit optimizes fence instructions. Two optimizations are currently implemented: (1) unnecessary duplicate fence instructions, and (2) merging weaker fences into a stronger fence. [rth: Merge tcg_optimize_mb back into tcg_optimize, so that we only loop over the opcode stream once. Merge "unrelated" weaker barriers into one stronger barrier.] Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160823134825.32578-1-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:12 -07:00
Pranith Kumar	a1e69e2f81	tcg/tci: Add support for fence Cc: Stefan Weil <sw@weilnetz.de> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-11-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:12 -07:00
Pranith Kumar	f8f03b3707	tcg/sparc: Add support for fence Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-10-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	c9314d610e	tcg/s390: Add support for fence Cc: Alexander Graf <agraf@suse.de> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-9-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	7b4af5ee8a	tcg/ppc: Add support for fence Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-8-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	6f0b99104a	tcg/mips: Add support for fence Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-7-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	5bbadbdfd6	tcg/ia64: Add support for fence Cc: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-6-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	40f191ab82	tcg/arm: Add support for fence Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-5-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	c7a59c2a92	tcg/aarch64: Add support for fence Cc: Claudio Fontana <claudio.fontana@gmail.com> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-4-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	a7d00d4eff	tcg/i386: Add support for fence Generate a 'lock orl $0,0(%esp)' instruction for ordering instead of mfence which has similar ordering semantics. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-3-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Pranith Kumar	f65e19bc2c	Introduce TCGOpcode for memory barrier This commit introduces the TCGOpcode for memory barrier instruction. This opcode takes an argument which is the type of memory barrier which should be generated. Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-2-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:11 -07:00
Richard Henderson	85aa80813d	tcg: Support arbitrary size + alignment Previously we allowed fully unaligned operations, but not operations that are aligned but with less alignment than the operation size. In addition, arm32, ia64, mips, and sparc had been omitted from the previous overalignment patch, which would have led to that alignment being enforced. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-09-16 08:12:06 -07:00
Ladi Prosek	d4b84d564e	Remove unused function declarations Unused function declarations were found using a simple gcc plugin and manually verified by grepping the sources. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-09-15 15:32:22 +03:00
Thomas Huth	347519eb9d	tcg: Remove duplicate header includes host-utils.h and timer.h are included twice in tcg.c. One time should be enough. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-09-15 15:32:22 +03:00
Thomas Huth	d41f3c3cc7	Remove remainders of HPPA backend The HPPA backend has been removed by the following commit: `802b508123` tcg-hppa: Remove tcg backend But some small pieces of the HPPA backend still survived until today. Since we also do not have support for a HPPA target in QEMU, we can nowadays safely remove the remaining HPPA parts (like the disassembler code, or the detection of HPPA in the configure script). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-09-15 15:32:22 +03:00
Richard Henderson	5a18407f55	tcg: Lower indirect registers in a separate pass Rather than rely on recursion during the middle of register allocation, lower indirect registers to loads and stores off the indirect base into plain temps. For an x86_64 host, with sufficient registers, this results in identical code, modulo the actual register assignments. For an i686 host, with insufficient registers, this means that temps can be (temporarily) spilled to the stack in order to satisfy an allocation. This as opposed to the possibility of not being able to spill, to allocate a register for the indirect base, in order to perform a spill. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:40 +05:30
Richard Henderson	c0ef05b5e6	tcg: Require liveness analysis Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:40 +05:30
Richard Henderson	bdfb460ef7	tcg: Include liveness info in the dumps Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:40 +05:30
Richard Henderson	c70fbf0a99	tcg: Compress dead_temps and mem_temps into a single array We only need two bits per temporary. Fold the two bytes into one, and reduce the memory and cachelines required during compilation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:40 +05:30
Richard Henderson	bee158cb4d	tcg: Fold life data into TCGOp Reduce the size of other bitfields to make room. This reduces the cache footprint of compilation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:40 +05:30
Richard Henderson	dcb8e75870	tcg: Reorg TCGOp chaining Instead of using -1 as end of chain, use 0, and link through the 0 entry as a fully circular double-linked list. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:18 +05:30
Richard Henderson	a1b3c48d2b	tcg: Compress liveness data to 16 bits This reduces both memory usage and per-insn cacheline usage during code generation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-08-05 21:44:17 +05:30
Paolo Bonzini	8bff06a0bb	compiler: never omit assertions if using a static analysis tool Assertions help both Coverity and the clang static analyzer avoid false positives, but on the other hand both are confused when the condition is compiled as (void)(x != FOO). Always expand assertion macros when using Coverity or clang, through a new QEMU_STATIC_ANALYSIS preprocessor symbol. This fixes a couple false positives in TCG. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-07-17 09:59:21 +02:00
Markus Armbruster	175de52487	Clean up decorations and whitespace around header guards Cleaned up with scripts/clean-header-guards.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:20:46 +02:00
Markus Armbruster	14e54f8ecf	tcg: Clean up tcg-target.h header guards These use guard symbols like TCG_TARGET_$target. scripts/clean-header-guards.pl doesn't like them because they don't match their file name (they should, to make guard collisions less likely). Clean them up: use guard symbol $target_TCG_TARGET_H for tcg/$target/tcg-target.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2016-07-12 16:19:16 +02:00
Sergey Sorokin	1f00b27f17	tcg: Improve the alignment check infrastructure Some architectures (e.g. ARMv8) need the address which is aligned to a size more than the size of the memory access. To support such check it's enough the current costless alignment check implementation in QEMU, but we need to support an alignment size specifying. Signed-off-by: Sergey Sorokin <afarallax@yandex.ru> Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru> Signed-off-by: Richard Henderson <rth@twiddle.net> [rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits available for, though unused by, user-mode. Retain logging difference based on ALIGNED_ONLY.]	2016-07-05 20:50:13 -07:00
Richard Henderson	59d7c14eef	tcg: Optimize spills of constants While we can store constants via constrants on INDEX_op_st_i32 et al, we weren't able to spill constants to backing store. Add a new backend interface, tcg_out_sti, which may store the constant (and is allowed to fail). Rearrange the temp_* helpers so that we only attempt to directly store a constant when the temp is becoming dead/free. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-07-05 20:50:13 -07:00
Richard Henderson	120c1084ed	tcg: Fix name for high-half register Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-07-05 20:50:12 -07:00
Lluís Vilanova	dcdaadb6ea	trace: [all] Add "guest_mem_before" event The event is described in "trace-events". Note that the "MO_AMASK" flag is not traced, since it does not seem to affect the visible semantics of instructions. [s/inline inline/inline/ to fix clang build. --Stefan] Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 146549350711.18437.726780393247474362.stgit@fimbulvetr.bsc.es Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 17:21:56 +01:00
Lluís Vilanova	7c2550432a	exec: [tcg] Track which vCPU is performing translation and execution Information is tracked inside the TCGContext structure, and later used by tracing events with the 'tcg' and 'vcpu' properties. The 'cpu' field is used to check tracing of translation-time events ("_trans"). The 'tcg_env' field is used to pass it to execution-time events ("_exec"). Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 146549350162.18437.3033661139638458143.stgit@fimbulvetr.bsc.es Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-20 15:30:01 +01:00
Paolo Bonzini	63c915526d	cpu: move exec-all.h inclusion out of cpu.h exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	00f6da6a1a	exec: extract exec/tb-context.h TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	33c11879fd	qemu-common: push cpu.h inclusion out of qemu-common.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Stefan Weil	cb8d4c8f54	Fix some typos found by codespell Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-05-18 15:04:27 +03:00
Sergey Fedorov	819af24b9c	tcg: Clean up from 'next_tb' The value returned from tcg_qemu_tb_exec() is the value passed to the corresponding tcg_gen_exit_tb() at translation time of the last TB attempted to execute. It is a little confusing to store it in a variable named 'next_tb'. In fact, it is a combination of 4-byte aligned pointer and additional information in its two least significant bits. Break it down right away into two variables named 'last_tb' and 'tb_exit' which are a pointer to the last TB attempted to execute and the TB exit reason, correspondingly. This simplifies the code and improves its readability. Correct a misleading documentation comment for tcg_qemu_tb_exec() and fix logging in cpu_tb_exec(). Also rename a misleading 'next_tb' in another couple of places. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:42 -10:00
Sergey Fedorov	90aa39a1cc	tcg: Allow goto_tb to any target PC in user mode In user mode, there's only a static address translation, TBs are always invalidated properly and direct jumps are reset when mapping change. Thus the destination address is always valid for direct jumps and there's no need to restrict it to the pages the TB resides in. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Cc: Riku Voipio <riku.voipio@iki.fi> Cc: Blue Swirl <blauwirbel@gmail.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:42 -10:00
Sergey Fedorov	5b053a4a28	tcg: Clean up direct block chaining safety checks We don't take care of direct jumps when address mapping changes. Thus we must be sure to generate direct jumps so that they always keep valid even if address mapping changes. Luckily, we can only allow to execute a TB if it was generated from the pages which match with current mapping. Document tcg_gen_goto_tb() declaration and note the reason for destination PC limitations. Some targets with variable length instructions allow TB to straddle a page boundary. However, we make sure that both of TB pages match the current address mapping when looking up TBs. So it is safe to do direct jumps into the both pages. Correct the checks for some of those targets. Given that, we can safely patch a TB which spans two pages. Remove the unnecessary check in cpu_exec() and allow such TBs to be patched. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	f309101c26	tcg: Clean up direct block chaining data fields Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	c82460a560	tcg/mips: Make direct jump patching thread-safe Ensure direct jump patching in MIPS is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-11-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net> [rth: Merged the deposit32 followup.] [rth: Merged the following followup.] Message-Id: <1462210518-26522-1-git-send-email-sergey.fedorov@linaro.org>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	84f79fb7c6	tcg/sparc: Make direct jump patching thread-safe Ensure direct jump patching in SPARC is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-10-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	9e26911295	tcg/aarch64: Make direct jump patching thread-safe Ensure direct jump patching in AArch64 is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-9-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	7d14e0e2d6	tcg/arm: Make direct jump patching thread-safe Ensure direct jump patching in ARM is atomic by using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	ed3d51ecd7	tcg/s390: Make direct jump patching thread-safe Ensure direct jump patching in s390 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	0d07abf05e	tcg/i386: Make direct jump patching thread-safe Ensure direct jump patching in i386 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. tcg_out_nopn() implementation: Suggested-by: Richard Henderson <rth@twiddle.net>. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:41 -10:00
Sergey Fedorov	399f164857	tcg/ppc: Make direct jump patching thread-safe Ensure direct jump patching in PPC is atomic by: * limiting translation buffer size in 32-bit mode to be addressable by Branch I-form instruction; * using atomic_read()/atomic_set() for code patching. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1461341333-19646-5-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:40 -10:00
Sergey Fedorov	76442a939e	tci: Make direct jump patching thread-safe Ensure direct jump patching in TCI is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() to load/store the address. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-05-12 14:06:40 -10:00
Edgar E. Iglesias	1d41478fd4	tcg: Add tcg_set_insn_param Add tcg_set_insn_param as a mechanism to modify an insn parameter after emiting the insn. This is useful for icount and also for embedding fault information for a specific insn. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 1461931684-1867-2-git-send-email-edgar.iglesias@gmail.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-05-12 13:22:26 +01:00
Aurelien Jarno	8d8fdbae01	tcg: check for CONFIG_DEBUG_TCG instead of NDEBUG Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-2-git-send-email-aurelien@aurel32.net Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-04-21 15:43:20 +01:00
Aurelien Jarno	eabb7b91b3	tcg: use tcg_debug_assert instead of assert (fix performance regression) The TCG code is quite performance sensitive, but at the same time can also be quite tricky. That is why asserts that can be enabled with the --enable-debug-tcg configure option. This used to work the following way: \| #include "config.h" \| \| ... \| \| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG) \| /* define it to suppress various consistency checks (faster) */ \| #define NDEBUG \| #endif \| \| ... \| \| #include <assert.h> Since commit `757e725b` (tcg: Clean up includes) "config.h" as been replaced by "qemu/osdep.h" which itself includes <assert.h>. As a consequence the assertions are always enabled, even when using --disable-debug-tcg, causing a performance regression, especially on targets with many registers. For instance on qemu-system-ppc the speed difference is about 15%. tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already uses in some places. This patch replaces all the calls to assert into calss to tcg_debug_assert. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-04-21 15:41:47 +01:00
James Hogan	2dc7553d0c	tcg/mips: Fix type of tcg_target_reg_alloc_order[] The MIPS TCG backend is the only one to have tcg_target_reg_alloc_order[] elements of type TCGReg rather than int. This resulted in commit `91478cefaa` ("tcg: Allocate indirect_base temporaries in a different order") breaking the build on MIPS since the type differed from indirect_reg_alloc_order[]: tcg/tcg.c:1725:44: error: pointer type mismatch in conditional expression [-Werror] order = rev ? indirect_reg_alloc_order : tcg_target_reg_alloc_order; ^ Make it an array of ints to fix the build and match other architectures. Fixes: `91478cefaa` ("tcg: Allocate indirect_base temporaries in a different order") Signed-off-by: James Hogan <james.hogan@imgtec.com> Acked-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1459522179-6584-1-git-send-email-james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-04-05 12:47:47 -07:00
Alex Bennée	d977e1c2db	qemu-log: dfilter-ise exec, out_asm, op and opt_op This ensures the code generation debug code will honour -dfilter if set. For the "exec" tracing I've added a new inline macro for efficiency's sake. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Aurelien Jarno <aurelien@aureL32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:18 +01:00
Alex Bennée	5bd2ec3d7b	tcg: pass down TranslationBlock to tcg_code_gen My later debugging patches need access to the origin PC which is held in the TranslationBlock structure. Pass down the whole structure as it also holds the information about the code start point. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-3-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:17 +01:00
Veronia Bahaa	f348b6d1a5	util: move declarations out of qemu-common.h Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:17 +01:00
Lluís Vilanova	5d4e1a1081	tcg: Move definition of type TCGv The target-dependant type TCGv must be defined in "tcg/tcg.h" before including the tracing helper wrappers in "tcg/tcg-op.h". It also makes more sense to define it here, where other TCG types are defined too. Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Message-id: 145641860129.30295.17554707227384022653.stgit@localhost Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-03-01 13:27:09 +00:00
Lluís Vilanova	1bcea73e13	tcg: Add type for vCPU pointers Adds the 'TCGv_env' type for pointers to 'CPUArchState' objects. The tracing infrastructure later needs to differentiate between regular pointers and pointers to vCPUs. Also changes all targets to use the new 'TCGv_env' type instead of the generic 'TCGv_ptr'. As of now, the change is merely cosmetic ('TCGv_env' translates into 'TCGv_ptr'), but that could change in the future to enforce the difference. Note that a 'TCGv_env' type (for 'CPUState') is not added, since all helpers currently receive the architecture-specific pointer ('CPUArchState'). Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu> Acked-by: Richard Henderson <rth@twiddle.net> Message-id: 145641859552.30295.7821536833590725201.stgit@localhost Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-03-01 13:27:09 +00:00
Peter Maydell	c3b7f66800	tcg: Remove unnecessary osdep.h includes from tcg-target.inc.c Commit `757e725b58` added a number of #include "qemu/osdep.h" files to the tcg-target.c files (as they were named at the time). These are unnecessary because these files are not standalone C files, and the tcg/tcg.c file which includes them will have already included osdep.h on their behalf. Remove the unneeded include directives. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1456238983-10160-4-git-send-email-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-23 08:31:03 -08:00
Peter Maydell	ce15110981	tcg: Rename tcg-target.c to tcg-target.inc.c Rename the per-architecture tcg-target.c files to tcg-target.inc.c. This makes it clearer that they are not intended to be standalone C files, but are instead #included into another source file. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <1456238983-10160-2-git-send-email-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-23 08:30:38 -08:00
Richard Henderson	91478cefaa	tcg: Allocate indirect_base temporaries in a different order Since we've not got liveness analysis for indirect bases, placing them at the end of the call-saved registers makes it more likely that it'll stay live. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-23 08:07:14 -08:00
Richard Henderson	b3915dbbdc	tcg: Implement indirect memory registers That is, global_mem registers whose base is another global_mem register, rather than a fixed register. Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-23 08:07:14 -08:00
Richard Henderson	869938ae2a	tcg: Work around clang bug wrt enum ranges, part 2 A previous patch patch changed the type of REG from int to enum TCGReg, which provokes the following bug in clang: https://llvm.org/bugs/show_bug.cgi?id=16154 Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-23 08:07:14 -08:00
Peter Maydell	30456d5ba3	all: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-02-23 12:43:05 +00:00
Richard Henderson	40ae5c62eb	tcg: Introduce temp_load Unify all of the places that realize a temporary into a register. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	b13eb728d3	tcg: Change temp_save argument to TCGTemp Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	12b9b11a27	tcg: Change temp_sync argument to TCGTemp Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	f8bf00f102	tcg: Change temp_dead argument to TCGTemp Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	f8b2f20234	tcg: Change reg_to_temp to TCGTemp pointer Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	e4ce0d4eb7	tcg: Remove tcg_get_arg_str_i32/64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	b663866231	tcg: More use of TCGReg where appropriate Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	c807402320	tcg: Work around clang bug wrt enum ranges A subsequent patch patch will change the type of REG from int to enum TCGReg, which provokes the following bug in clang: https://llvm.org/bugs/show_bug.cgi?id=16154 Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:45:34 +11:00
Richard Henderson	7ca4b752fe	tcg: Tidy temporary allocation In particular, make sure the memory is memset before use. Continues the increased use of TCGTemp pointers instead of integer indices where appropriate. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:19:32 +11:00
Richard Henderson	b3a6293956	tcg: Change ts->mem_reg to ts->mem_base Chain the temporaries together via pointers intstead of indices. The mem_reg value is now mem_base->reg. This will be important later. This does require that the frame pointer have a global temporary allocated for it. This is simple bar the existing reserved_regs check. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:19:32 +11:00
Richard Henderson	e1ccc05444	tcg: Change tcg_global_mem_new_* to take a TCGv_ptr Thus, use cpu_env as the parameter, not TCG_AREG0 directly. Update all uses in the translators. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:19:32 +11:00
Richard Henderson	2015770593	tcg: Remove lingering references to gen_opc_buf Three in comments and one in code in the stub tcg_liveness_analysis. Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:19:32 +11:00
Richard Henderson	23dceda62a	tcg: Respect highwater in tcg_out_tb_finalize Undo the workaround at `b17a6d3390`. If there are lots of memory operations in a TB, the slow path code can exceed the highwater reservation. Add a check within the loop. Tested-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2016-02-09 10:19:32 +11:00
Paolo Bonzini	508127e243	log: do not unnecessarily include qom/cpu.h Split the bits that require it to exec/log.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1452174932-28657-8-git-send-email-den@openvz.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-02-03 09:19:10 +00:00
Peter Maydell	757e725b58	tcg: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-16-git-send-email-peter.maydell@linaro.org	2016-01-29 15:07:23 +00:00
Richard Henderson	b17a6d3390	tcg: Increase the highwater reservation If there are a lot of guest memory ops in the TB, the amount of code generated by tcg_out_tb_finalize could be well more than 1k. In the short term, increase the reservation larger than any TB seen in practice. Reported-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-12-01 14:36:32 -08:00
John Clarke	644da9b39e	tcg: Fix highwater check A simple typo in the variable to use when comparing vs the highwater mark. Reports are that qemu can in fact segfault occasionally due to this mistake. Signed-off-by: John Clarke <johnc@kirriwa.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-11-23 13:16:05 +01:00
James Hogan	137d63902f	tcg/mips: Support r6 SEL{NE, EQ}Z instead of MOVN/MOVZ Extend MIPS movcond implementation to support the SELNEZ/SELEQZ instructions introduced in MIPS r6 (where MOVN/MOVZ have been removed). Whereas the "MOVN/MOVZ rd, rs, rt" instructions have the following semantics: rd = [!]rt ? rs : rd The "SELNEZ/SELEQZ rd, rs, rt" instructions are slightly different: rd = [!]rt ? rs : 0 First we ensure that if one of the movcond input values is zero that it comes last (we can swap the input arguments if we invert the condition). This is so that it can exactly match one of the SELNEZ/SELEQZ instructions and avoid the need to emit the other one. Otherwise we emit the opposite instruction first into a temporary register, and OR that into the result: SELNEZ/SELEQZ TMP1, v2, c1 SELEQZ/SELNEZ ret, v1, c1 OR ret, ret, TMP1 Which does the following: ret = cond ? v1 : v2 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1443788657-14537-7-git-send-email-james.hogan@imgtec.com>	2015-10-19 11:04:39 -10:00
James Hogan	bc6d0c22b0	tcg/mips: Support r6 multiply/divide encodings MIPSr6 adds several new integer multiply, divide, and modulo instructions, and removes several pre-r6 encodings, along with the HI/LO registers which were the implicit operands of some of those instructions. Update TCG to use the new instructions when built for r6. The new instructions actually map much more directly to the TCG ops, as they only provide a single 32-bit half of the result and in a normal general purpose register instead of HI or LO. The mulu2_i32 and muls2_i32 operations are no longer appropriate for r6, so they are removed from the TCG opcode table. This is because they would need to emit two separate host instructions anyway (for the high and low half of the result), which TCG can arrange automatically for us in the absense of mulu2_i32/muls2_i32 by splitting it into mul_i32 and mul*h_i32 TCG ops. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1443788657-14537-6-git-send-email-james.hogan@imgtec.com>	2015-10-19 11:04:39 -10:00
James Hogan	6e0d096989	tcg/mips: Support r6 JR encoding MIPSr6 encodes JR as JALR with zero as the link register, and the pre-r6 JR encoding is removed. Update TCG to use the new encoding when built for r6. We still use the old encoding for pre-r6, so as not to confuse return prediction stack hardware which may detect only particular encodings of the return instruction. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1443788657-14537-5-git-send-email-james.hogan@imgtec.com>	2015-10-19 11:04:38 -10:00
James Hogan	ce14bd4d46	tcg/mips: Add use_mips32r6_instructions definition Add definition use_mips32r6_instructions to the MIPS TCG backend which is constant 1 when built for MIPS release 6. This will be used to decide between pre-R6 and R6 instruction encodings. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1443788657-14537-4-git-send-email-james.hogan@imgtec.com>	2015-10-19 11:04:38 -10:00
James Hogan	c0e40dbdcc	tcg-opc.h: Simplify insn_start def We already have a TLADDR_ARGS definition, so rearrange the order slightly and use it in the definition of insn_start, instead of having an #ifdef. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1443788657-14537-2-git-send-email-james.hogan@imgtec.com>	2015-10-19 11:04:38 -10:00
Richard Henderson	1e1df962e3	tcg/ppc: Prefer mask over andi. Prefer the instruction that isn't required to modify cr0. Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-19 11:04:38 -10:00
Richard Henderson	5bfd75a35c	tcg/ppc: Revise goto_tb implementation Restrict the size of code_gen_buffer to 2GB on ppc64, which lets us assert that everything is reachable with addis+addi from tb_ret_addr. This lets us use a max of 4 insns for goto_tb instead of 7. Emit the indirect branch portion of goto_tb up front, which means we only have to update two insns to update any link. With a 64-bit store, we can update the link atomically, which may be required in future. Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-19 11:04:37 -10:00
Richard Henderson	70f897bdc4	tcg/ppc: Adjust exit_tb for change in prologue placement Changing the prologue to the beginning of the code_gen_buffer changes the direction of the "return" branch. Need to change the logic to match. Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-19 11:04:37 -10:00
Richard Henderson	b125f9dc7b	tcg: Check for overflow via highwater mark We currently pre-compute an worst case code size for any TB, which works out to be 122kB. Since the average TB size is near 1kB, this wastes quite a lot of storage. Instead, check for overflow in between generating code for each opcode. The overhead of the check isn't measurable and wastage is minimized. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:53 +11:00
Richard Henderson	8163b74938	tcg: Emit prologue to the beginning of code_gen_buffer By putting the prologue at the end, we risk overwriting the prologue should our estimate of maximum TB size. Given the two different placements of the call to tcg_prologue_init, move the high water mark computation into tcg_prologue_init. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:53 +11:00
Richard Henderson	04fe640001	tcg: Remove tcg_gen_code_search_pc It's no longer used, so tidy up everything reached by it. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:53 +11:00
Richard Henderson	4e5e121515	tcg: Remove gen_intermediate_code_pc It is no longer used, so tidy up everything reached by it. This includes the gen_opc_* arrays, the search_pc parameter and the inline gen_intermediate_code_internal functions. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:52 +11:00
Richard Henderson	fca8a500d5	tcg: Save insn data and use it in cpu_restore_state_from_tb We can now restore state without retranslation. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:51 +11:00
Richard Henderson	bad729e272	tcg: Pass data argument to restore_state_to_opc The gen_opc_* arrays are already redundant with the data stored in the insn_start arguments. Transition restore_state_to_opc to use data from the latter. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:51 +11:00
Richard Henderson	190ce7fbc7	tcg: Add TCG_MAX_INSNS Adjust all translators to respect it. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:50 +11:00
Richard Henderson	9aef40ed1f	tcg: Allow extra data to be attached to insn_start With an eye toward having this data replace the gen_opc_* arrays that each target collects in order to enable restore_state_from_tb. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:46 +11:00
Richard Henderson	765b842ade	tcg: Rename debug_insn_start to insn_start With an eye toward making it mandatory. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-10-07 20:36:26 +11:00
Aurelien Jarno	81dfaf1a8f	tcg/mips: pass oi to tcg_out_tlb_load Instead of computing mem_index and s_bits in both tcg_out_qemu_ld and tcg_out_qemu_st function and passing them to tcg_out_tlb_load, directly pass oi to the tcg_out_tlb_load function and compute mem_index and s_bits there. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-09-19 11:53:15 +02:00
Aurelien Jarno	d9f26847f1	tcg/mips: move tcg_out_addsub2 Somehow the tcg_out_addsub2 function ended-up in the middle of the qemu_ld/st related functions. Move it with other arithmetics related functions. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-09-19 11:53:14 +02:00
James Hogan	5eb4f645eb	tcg/mips: Fix clobbering of qemu_ld inputs The MIPS TCG backend implements qemu_ld with 64-bit targets using the v0 register (base) as a temporary to load the upper half of the QEMU TLB comparator (see line 5 below), however this happens before the input address is used (line 8 to mask off the low bits for the TLB comparison, and line 12 to add the host-guest offset). If the input address (addrl) also happens to have been placed in v0 (as in the second column below), it gets clobbered before it is used. addrl in t2 addrl in v0 1 srl a0,t2,0x7 srl a0,v0,0x7 2 andi a0,a0,0x1fe0 andi a0,a0,0x1fe0 3 addu a0,a0,s0 addu a0,a0,s0 4 lw at,9136(a0) lw at,9136(a0) set TCG_TMP0 (at) 5 lw v0,9140(a0) lw v0,9140(a0) set base (v0) 6 li t9,-4093 li t9,-4093 7 lw a0,9160(a0) lw a0,9160(a0) set addend (a0) 8 and t9,t9,t2 and t9,t9,v0 use addrl 9 bne at,t9,0x836d8c8 bne at,t9,0x836d838 use TCG_TMP0 10 nop nop 11 bne v0,t8,0x836d8c8 bne v0,a1,0x836d838 use base 12 addu v0,a0,t2 addu v0,a0,v0 use addrl, addend 13 lw t0,0(v0) lw t0,0(v0) Fix by using TCG_TMP0 (at) as the temporary instead of v0 (base), pushing the load on line 5 forward into the delay slot of the low comparison (line 10). The early load of the addend on line 7 also needs pushing even further for 64-bit targets, or it will clobber a0 before we're done with it. The output for 32-bit targets is unaffected. srl a0,v0,0x7 andi a0,a0,0x1fe0 addu a0,a0,s0 lw at,9136(a0) -lw v0,9140(a0) load high comparator li t9,-4093 -lw a0,9160(a0) load addend and t9,t9,v0 bne at,t9,0x836d838 - nop + lw at,9140(a0) load high comparator +lw a0,9160(a0) load addend -bne v0,a1,0x836d838 +bne at,a1,0x836d838 addu v0,a0,v0 lw t0,0(v0) Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-09-19 11:53:14 +02:00
Peter Crosthwaite	162e992270	tcg: Move tci_tb_ptr to -common This requires global visibility to common code. Move to tcg-common. Cc: Stefan Weil <sw@weilnetz.de> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <cb0340eba225ab4945aa6cf7c9013f33aa05bcf8.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-16 17:33:33 +02:00
Peter Crosthwaite	7d8f787d9d	tcg: split tcg_op_defs to -common tcg_op_defs (and the _max) are both needed by the TCI disassembler. For multi-arch, tcg.c will be multiple-compiled (arch-obj) with its symbols hidden from common code. So split the definition off to new file, tcg-common.c which will remain a regular obj-y for use by both the TCI disas as well as the multiple tcg.c's. Cc: Stefan Weil <sw@weilnetz.de> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <4b607425886d85aee65878e4935dfad46b3e6085.1441614289.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-16 17:33:33 +02:00
Peter Maydell	a2aa09e181	* Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3 7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA= =nB06 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging * Support for jemalloc * qemu_mutex_lock_iothread "No such process" fix * cutils: qemu_strto* wrappers * iohandler.c simplification * Many other fixes and misc patches. And some MTTCG work (with Emilio's fixes squashed): * Signal-free TCG kick * Removing spinlock in favor of QemuMutex * User-mode emulation multi-threading fixes/docs # gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (44 commits) cutils: work around platform differences in strto{l,ul,ll,ull} cpu-exec: fix lock hierarchy for user-mode emulation exec: make mmap_lock/mmap_unlock globally available tcg: comment on which functions have to be called with mmap_lock held tcg: add memory barriers in page_find_alloc accesses remove unused spinlock. replace spinlock by QemuMutex. cpus: remove tcg_halt_cond and tcg_cpu_thread globals cpus: protect work list with work_mutex scripts/dump-guest-memory.py: fix after RAMBlock change configure: Add support for jemalloc add macro file for coccinelle configure: factor out adding disas configure vhost-scsi: fix wrong vhost-scsi firmware path checkpatch: remove tests that are not relevant outside the kernel checkpatch: adapt some tests to QEMU CODING_STYLE: update mixed declaration rules qmp: Add example usage of strto*l() qemu wrapper cutils: Add qemu_strtoull() wrapper cutils: Add qemu_strtoll() wrapper ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-09-14 16:13:16 +01:00
Pavel Dovgalyuk	282dffc8a4	softmmu: add helper function to pass through retaddr This patch introduces several helpers to pass return address which points to the TB. Correct return address allows correct restoring of the guest PC and icount. These functions should be used when helpers embedded into TB invoke memory operations. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150710095650.13280.32255.stgit@PASHA-ISP> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-09-11 08:15:32 -07:00
Veres Lajos	67cc32ebfd	typofixes - v4 Signed-off-by: Veres Lajos <vlajos@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-09-11 10:45:43 +03:00
KONRAD Frederic	677ef6230b	replace spinlock by QemuMutex. spinlock is only used in two cases: * cpu-exec.c: to protect TranslationBlock * mem_helper.c: for lock helper in target-i386 (which seems broken). It's a pthread_mutex_t in user-mode, so we can use QemuMutex directly, with an #ifdef. The #ifdef will be removed when multithreaded TCG will need the mutex as well. Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> Message-Id: <1439220437-23957-5-git-send-email-fred.konrad@greensocs.com> Signed-off-by: Emilio G. Cota <cota@braap.org> [Merge Emilio G. Cota's patch to remove volatile. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-09-09 15:34:55 +02:00
Aurelien Jarno	08b0b23be6	tcg/i386: omit a few REXW prefixes in softmmu code When computing the TLB address we are likely to mask out the high 32-bits by using shr + and. We can use 32-bit instructions in that case. This saves 2 bytes per TLB access. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1437306632-20655-1-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-09-02 14:24:10 -07:00
Richard Henderson	352bcb0a2b	tcg/aarch64: Fix tcg_out_qemu_{ld, st} for guest_base == 0 In `ffc6372851`, we swapped the guest base to the address base register from the address index register. Except that 31 in the base slot is SP not XZR, so we need to be more intelligent about which reg gets placed in which slot. Cc: qemu-stable@nongnu.org (v2.4.0) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-09-02 14:23:14 -07:00
Laurent Vivier	090d0bfd94	s390: fix softmmu compilation guest_base must be used only in linux-user mode. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Message-id: 1440757421-9674-1-git-send-email-laurent@vivier.eu Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-08-28 16:05:24 +01:00
Laurent Vivier	b76f21a707	linux-user: remove useless macros GUEST_BASE and RESERVED_VA As we have removed CONFIG_USE_GUEST_BASE, we always use a guest base and the macros GUEST_BASE and RESERVED_VA become useless: replace them by their values. Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <1440420834-8388-1-git-send-email-laurent@vivier.eu> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:14:30 -07:00
Laurent Vivier	4cbea59869	linux-user: remove --enable-guest-base/--disable-guest-base All tcg host architectures now support the guest base and as there is no real performance lost, it can be always enabled. Anyway, guest base use can be disabled lively by setting guest base to 0. CONFIG_USE_GUEST_BASE is defined as (USE_GUEST_BASE && USER_ONLY), it should have to be replaced by CONFIG_USER_ONLY in non CONFIG_USER_ONLY parts, but as some other parts are using !CONFIG_SOFTMMU I have chosen to use !CONFIG_SOFTMMU instead. Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Laurent Vivier <laurent@vivier.eu> Message-Id: <1440373328-9788-2-git-send-email-laurent@vivier.eu> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:14:17 -07:00
Richard Henderson	9ee14902bf	tcg/aarch64: Use softmmu fast path for unaligned accesses Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Richard Henderson	a5e39810b9	tcg/s390: Use softmmu fast path for unaligned accesses Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Benjamin Herrenschmidt	68d45bb61c	tcg/ppc: Improve unaligned load/store handling on 64-bit backend Currently, we get to the slow path for any unaligned access in the backend, because we effectively preserve the bottom address bits below the alignment requirement when comparing with the TLB entry, so any non-0 bit there will cause the compare to fail. For the same number of instructions, we can instead add the access size - 1 to the address and stick to clearing all the bottom bits. That means that normal unaligned accesses will not fallback (the HW will handle them fine). Only when crossing a page boundary well we end up having a mismatch because we'll end up pointing to the next page which cannot possibly be in that same TLB entry. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Message-Id: <1437455978.5809.2.camel@kernel.crashing.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	8cc580f6a0	tcg/i386: use softmmu fast path for unaligned accesses Softmmu unaligned load/stores currently goes through through the slow path for two reasons: - to support unaligned access on host with strict alignement - to correctly handle accesses crossing pages x86 is only concerned by the second reason. Unaligned accesses are avoided by compilers, but are not uncommon. We therefore would like to see them going through the fast path, if they don't cross pages. For that we can use the fact that two adjacent TLB entries can't contain the same page. Therefore accessing the TLB entry corresponding to the first byte, but comparing its content to page address of the last byte ensures that we don't cross pages. We can do this check without adding more instructions in the TLB code (but increasing its length by one byte) by using the LEA instruction to combine the existing move with the size addition. On an x86-64 host, this gives a 3% boot time improvement for a powerpc guest and 4% for an x86-64 guest. [rth: Tidied calculation of the offset mask] Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1436467197-2183-1-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Richard Henderson	ecc7b3aa71	tcg: Remove tcg_gen_trunc_i64_i32 Replacing it with tcg_gen_extrl_i64_i32. Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Richard Henderson	609ad70562	tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32 Rather than allow arbitrary shift+trunc, only concern ourselves with low and high parts. This is all that was being used anyway. Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	870ad1547a	tcg: update README about size changing ops Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	8bcb5c8f34	tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops They behave the same as ext32s_i64 and ext32u_i64 from the constant folding and zero propagation point of view, except that they can't be replaced by a mov, so we don't compute the affected value. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	4f2331e5b6	tcg: implement real ext_i32_i64 and extu_i32_i64 ops Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a 32-bit value is always converted to a 64-bit value and not propagated through the register allocator or the optimizer. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Stefan Weil <sw@weilnetz.de> Acked-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	6acd2558fd	tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32 The tcg_gen_trunc_shr_i64_i32 function takes a 64-bit argument and returns a 32-bit value. Directly call tcg_gen_op3 with the correct types instead of calling tcg_gen_op3i_i32 and abusing the TCG types. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	0632e555fc	tcg: rename trunc_shr_i32 into trunc_shr_i64_i32 The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32, and the name in the README doesn't match the name offered to the frontends. Always use the long name to make it clear it is a size changing op. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	299f801304	tcg/optimize: allow constant to have copies Now that copies and constants are tracked separately, we can allow constant to have copies, deferring the choice to use a register or a constant to the register allocation pass. This prevent this kind of regular constant reloading: -OUT: [size=338] +OUT: [size=298] mov -0x4(%r14),%ebp test %ebp,%ebp jne 0x7ffbe9cb0ed6 mov $0x40002219f8,%rbp mov %rbp,(%r14) - mov $0x40002219f8,%rbp mov $0x4000221a20,%rbx mov %rbp,(%rbx) mov $0x4000000000,%rbp mov %rbp,(%r14) - mov $0x4000000000,%rbp mov $0x4000221d38,%rbx mov %rbp,(%rbx) mov $0x40002221a8,%rbp mov %rbp,(%r14) - mov $0x40002221a8,%rbp mov $0x4000221d40,%rbx mov %rbp,(%rbx) mov $0x4000019170,%rbp mov %rbp,(%r14) - mov $0x4000019170,%rbp mov $0x4000221d48,%rbx mov %rbp,(%rbx) mov $0x40000049ee,%rbp mov %rbp,0x80(%r14) mov %r14,%rdi callq 0x7ffbe99924d0 mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp shl $0x20,%rbp mov (%r14),%rbx mov %ebx,%ebx mov %rbx,(%r14) or %rbx,%rbp mov %rbp,0x10(%r14) mov %rbp,0x90(%r14) mov 0x60(%r14),%rbx mov %rbx,0x38(%r14) mov 0x28(%r14),%rbx mov $0x4000220e60,%r12 mov %rbx,(%r12) mov $0x40002219c8,%rbx mov %rbp,(%rbx) mov 0x20(%r14),%rbp sub $0x8,%rbp mov $0x4000004a16,%rbx mov %rbx,0x0(%rbp) mov %rbp,0x20(%r14) mov $0x19,%ebp mov %ebp,0xa8(%r14) mov $0x4000015110,%rbp mov %rbp,0x80(%r14) xor %eax,%eax jmpq 0x7ffbebcae426 lea -0x5f6d72a(%rip),%rax # 0x7ffbe3d437b3 jmpq 0x7ffbebcae426 Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:54 -07:00
Aurelien Jarno	b41059dd9d	tcg/optimize: track const/copy status separately Instead of using an enum which could be either a copy or a const, track them separately. This will be used in the next patch. Constants are tracked through a bool. Copies are tracked by initializing temp's next_copy and prev_copy to itself, allowing to simplify the code a bit. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:53 -07:00
Aurelien Jarno	d9c769c609	tcg/optimize: add temp_is_const and temp_is_copy functions Add two accessor functions temp_is_const and temp_is_copy, to make the code more readable and make code change easier. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:53 -07:00
Aurelien Jarno	1208d7dd5f	tcg/optimize: optimize temps tracking The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate vector registers on most guests, it's not uncommon to have more than 100 used temps. This means we have initialize more than 2kB at least twice per TB, often more when there is a few goto_tb. Instead used a TCGTempSet bit array to track which temps are in used in the current basic block. This means there are only around 16 bytes to initialize. This improves the boot time of a MIPS guest on an x86-64 host by around 7% and moves out tcg_optimize from the the top of the profiler list. [rth: Handle TCG_CALL_DUMMY_ARG] Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:30 -07:00
Aurelien Jarno	29f3ff8d6c	tcg/optimize: fix constant signedness By convention, on a 64-bit host TCG internally stores 32-bit constants as sign-extended. This is not the case in the optimizer when a 32-bit constant is folded. This doesn't seem to have more consequences than suboptimal code generation. For instance the x86 backend assumes sign-extended constants, and in some rare cases uses a 32-bit unsigned immediate 0xffffffff instead of a 8-bit signed immediate 0xff for the constant -1. This is with a ppc guest: before ------ ---- 0x9f29cc movi_i32 tmp1,$0xffffffff movi_i32 tmp2,$0x0 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2 mov_i32 r10,tmp0 0x7fd8c7dfe90c: xor %ebp,%ebp 0x7fd8c7dfe90e: mov %ebp,%r11d 0x7fd8c7dfe911: mov 0x18(%r14),%r9d 0x7fd8c7dfe915: add %r9d,%r10d 0x7fd8c7dfe918: adc %ebp,%r11d 0x7fd8c7dfe91b: add $0xffffffff,%r10d 0x7fd8c7dfe922: adc %ebp,%r11d 0x7fd8c7dfe925: mov %r11d,0x134(%r14) 0x7fd8c7dfe92c: mov %r10d,0x28(%r14) after ----- ---- 0x9f29cc movi_i32 tmp1,$0xffffffffffffffff movi_i32 tmp2,$0x0 add2_i32 tmp0,CA,CA,tmp2,r6,tmp2 add2_i32 tmp0,CA,tmp0,CA,tmp1,tmp2 mov_i32 r10,tmp0 0x7f37010d490c: xor %ebp,%ebp 0x7f37010d490e: mov %ebp,%r11d 0x7f37010d4911: mov 0x18(%r14),%r9d 0x7f37010d4915: add %r9d,%r10d 0x7f37010d4918: adc %ebp,%r11d 0x7f37010d491b: add $0xffffffffffffffff,%r10d 0x7f37010d491f: adc %ebp,%r11d 0x7f37010d4922: mov %r11d,0x134(%r14) 0x7f37010d4929: mov %r10d,0x28(%r14) Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1436544211-2769-2-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-08-24 11:10:08 -07:00
Aurelien Jarno	c99d69694a	tcg/mips: fix add2 The add2 code in the tcg_out_addsub2 function doesn't take into account the case where rl == al == bl. In that case we can't compute the carry after the addition. As it corresponds to a multiplication by 2, the carry bit is the bit 31. While this is a corner case, this prevents x86-64 guests to boot on a MIPS host. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-08-01 09:39:50 +02:00
Aurelien Jarno	3c8691f568	tcg/s390x: Mask TCGMemOp appropriately for indexing Commit `2b7ec66f` fixed TCGMemOp masking following the MO_AMASK addition, but two cases were forgotten in the TCG S390 backend. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-08-01 09:39:37 +02:00
Aurelien Jarno	4214a8cb7c	tcg/mips: Mask TCGMemOp appropriately for indexing Commit `2b7ec66f` fixed TCGMemOp masking following the MO_AMASK addition, but two cases were forgotten in the TCG MIPS backend. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-08-01 09:39:33 +02:00
Aurelien Jarno	e72c4fb81d	tcg/mips: fix TLB loading for BE host with 32-bit guests For 32-bit guest, we load a 32-bit address from the TLB, so there is no need to compensate for the low or high part. This fixes 32-bit guests on big-endian hosts. Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2015-08-01 09:38:36 +02:00
Aurelien Jarno	bbeb82395e	tcg: mark temps as mem_coherent = 0 for mov with a constant When a constant has to be loaded in a mov op, we fail to set mem_coherent = 0. This patch fixes that. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1437994568-7825-3-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-27 07:25:40 -07:00
Aurelien Jarno	7df69deadf	tcg: correctly mark dead inputs for mov with a constant When tcg_reg_alloc_mov propagate a constant, we failed to correctly mark a temp as dead if the liveness analysis hints so. This fixes the following assert when configure with --enable-debug-tcg: qemu-x86_64: tcg/tcg.c:1827: tcg_reg_alloc_bb_end: Assertion `ts->val_type == TEMP_VAL_DEAD' failed. Cc: Richard Henderson <rth@twiddle.net> Reported-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1437994568-7825-2-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-27 07:25:40 -07:00
Aurelien Jarno	961521261a	tcg/optimize: fix tcg_opt_gen_movi Due to a copy&paste, the new op value is tested against mov_i32 instead of movi_i32. The test is therefore always false. Fix that. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1436544211-2769-1-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-23 20:37:12 -07:00
Richard Henderson	80adb8fcad	tcg/aarch64: use 32-bit offset for 32-bit softmmu emulation Similar to the same fix for user-mode, except this instance occurs on the softmmu path. Again, the tlb addend must be the base register, while the guest address is the index. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-23 20:19:44 -07:00
Paolo Bonzini	ffc6372851	tcg/aarch64: use 32-bit offset for 32-bit user-mode emulation Thanks to the previous patch, it is now easy for tcg_out_qemu_ld and tcg_out_qemu_st to use a 32-bit zero extended offset. However, the guest base register x28 must be the base and addr_reg must be the index. Reported-by: Leon Alrae <leon.alrae@imgtec.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1436974021-28978-3-git-send-email-pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-23 15:09:12 -07:00
Paolo Bonzini	6c0f0c0f12	tcg/aarch64: add ext argument to tcg_out_insn_3310 The new argument lets you pick uxtw or uxtx mode for the offset register. For now, all callers pass TCG_TYPE_I64 so that uxtx is generated. The bits for uxtx are removed from I3312_TO_I3310. Reported-by: Leon Alrae <leon.alrae@imgtec.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1436974021-28978-2-git-send-email-pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-07-23 15:09:04 -07:00
Richard Henderson	ee8ba9e4d8	tcg/i386: Extend addresses for 32-bit guests Removing the ??? comment explaining why it (mostly) worked. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1437081950-7206-2-git-send-email-rth@twiddle.net>	2015-07-23 15:09:04 -07:00
Stefan Weil	6e3c0c6edb	tci: Fix regression with INDEX_op_qemu_st_i32, INDEX_op_qemu_st_i64 Commit `59227d5d45` did not update the code in tcg/tci/tcg-target.c for those two cases. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-id: 1436556159-3002-1-git-send-email-sw@weilnetz.de Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-07-13 10:07:38 +01:00
James Hogan	a8f13961fd	tcg/mips: Fix build error from merged memop+mmu_idx parameter Commit `3972ef6f83` ("tcg: Push merged memop+mmu_idx parameter to softmmu routines") caused the following build errors when building TCG for MIPS: In file included from tcg/tcg.c:258:0: tcg/mips/tcg-target.c In function ‘tcg_out_qemu_ld_slow_path’: tcg/mips/tcg-target.c:1015:22: error: ‘lb’ undeclared (first use in this function) tcg/mips/tcg-target.c In function ‘tcg_out_qemu_st_slow_path’: tcg/mips/tcg-target.c:1058:22: error: ‘lb’ undeclared (first use in this function) It looks like lb was meant to refer to the TCGLabelQemuLdst *l parameter, so fix both references to lb to refer to just l. Fixes: `3972ef6f83` ("tcg: Push merged memop+mmu_idx parameter to softmmu routines") Signed-off-by: James Hogan <james.hogan@imgtec.com> Reviewed-by: Leon Alrae <leon.alrae@imgtec.com> Message-id: 1436433435-24898-2-git-send-email-james.hogan@imgtec.com Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Leon Alrae <leon.alrae@imgtec.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-07-09 13:51:27 +01:00
Aurelien Jarno	cd3b29b745	tcg/s390: fix branch target change during code retranslation Make sure to not modify the branch target. This ensure that the branch target is not corrupted during partial retranslation. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Tested-by: Alexander Graf <agraf@suse.de> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-07-07 17:51:47 +02:00
Peter Crosthwaite	6e0b07306d	cpu-defs: Move CPU_TEMP_BUF_NLONGS to tcg The usages of this define are pure TCG and there is no architecture specific variation of the value. Localise it to the TCG engine to remove another architecture agnostic piece from cpu-defs.h. This follows on from `a28177820a` where temp_buf was moved out of the CPU_COMMON obsoleting the need for the super early definition. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Message-Id: <498e8e5325c1a1aff79e5bcfc28cb760ef6b214e.1433052532.git.crosthwaite.peter@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-06-26 16:00:50 +02:00
Aurelien Jarno	36e60ef6ac	tcg/optimize: rename tcg_constant_folding The tcg_constant_folding folding ends up doing all the optimizations (which is a good thing to avoid looping on all ops multiple time), so make it clear and just rename it tcg_optimize. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447607-31184-6-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 07:00:56 -07:00
Aurelien Jarno	97a79eb70d	tcg/optimize: fold constant test in tcg_opt_gen_mov Most of the calls to tcg_opt_gen_mov are preceeded by a test to check if the source temp is a constant. Fold that into the tcg_opt_gen_mov function. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433495958-9508-1-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 07:00:56 -07:00
Aurelien Jarno	5365718a9a	tcg/optimize: fold temp copies test in tcg_opt_gen_mov Each call to tcg_opt_gen_mov is preceeded by a test to check if the source and destination temps are copies. Fold that into the tcg_opt_gen_mov function. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447607-31184-4-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 07:00:56 -07:00
Aurelien Jarno	8d6a91602e	tcg/optimize: remove opc argument from tcg_opt_gen_mov We can get the opcode using the TCGOp pointer. It needs to be dereferenced, but it's anyway done a few lines below to write the new value. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447607-31184-3-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 07:00:56 -07:00
Aurelien Jarno	ebd27391b0	tcg/optimize: remove opc argument from tcg_opt_gen_movi We can get the opcode using the TCGOp pointer. It needs to be dereferenced, but it's anyway done a few lines below to write the new value. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447607-31184-2-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 07:00:56 -07:00
Aurelien Jarno	c19f47bf5e	tcg: fix dead computation for repeated input arguments When the same temp is used twice or more as an input argument to a TCG instruction, the dead computation code doesn't recognize the second use as a dead temp. This is because the temp is marked as live in the same loop where dead inputs are checked. The fix is to split the loop in two parts. This avoid emitting a move and using a register for the movcond instruction when used as "move if true" on x86-64. This might bring more improvements on RISC TCG targets which don't have outputs aliased to inputs. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447228-29425-3-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 06:42:27 -07:00
Aurelien Jarno	7e1df267a7	tcg: fix register allocation with two aliased dead inputs For TCG ops with two outputs registers (add2, sub2, div2, div2u), when the same input temp is used for the two inputs aliased to the two outputs, and when these inputs are both dead, the register allocation code wrongly assigned the same register to the same output. This happens for example with sub2 t1, t2, t3, t3, t4, t5, when t3 is not used anymore after the TCG op. In that case the same register is used for t1, t2 and t3. The fix is to look for already allocated aliased input when allocating a dead aliased input and check that the register is not already used. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-Id: <1433447228-29425-2-git-send-email-aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 06:42:27 -07:00
Richard Henderson	59c4b7e8df	tcg: Handle MO_AMASK in tcg_dump_ops Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com> Tested-by: Yongbok Kim <yongbok.kim@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 06:35:53 -07:00
Richard Henderson	2b7ec66f02	tcg: Mask TCGMemOp appropriately for indexing The addition of MO_AMASK means that places that used inverted masks need to be changed to use positive masks, and places that failed to mask the intended bits need updating. Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com> Tested-by: Yongbok Kim <yongbok.kim@imgtec.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-06-09 06:35:29 -07:00
Paolo Bonzini	006f8638c6	tcg: add TCG_TARGET_TLB_DISPLACEMENT_BITS This will be used to size the TLB when more than 8 MMU modes are used by the target. Limitations come from the limited size of the immediate fields (which sometimes, as in the case of Aarch64, extend to instructions that shift the immediate). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1424436345-37924-2-git-send-email-pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:56 +02:00
Paolo Bonzini	5a58e884d1	tci: do not use CPUArchState in tcg-target.h tcg-target.h does not use any QEMU-specific symbols, save for tci's usage of CPUArchState. Pull that up to tcg/tcg.h. This will make it possible to include tcg-target.h in cpu-defs.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Alexander Graf <agraf@suse.de>	2015-06-03 23:56:55 +02:00
Richard Henderson	dfb3630562	tcg: Add MO_ALIGN, MO_UNALN These modifiers control, on a per-memory-op basis, whether unaligned memory accesses are allowed. The default setting reflects the target's definition of ALIGNED_ONLY. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-05-14 12:15:18 -07:00
Richard Henderson	3972ef6f83	tcg: Push merged memop+mmu_idx parameter to softmmu routines The extra information is not yet used but it is now available. This requires minor changes through all of the tcg backends. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-05-14 12:15:14 -07:00
Richard Henderson	59227d5d45	tcg: Merge memop and mmu_idx parameters to qemu_ld/st At the tcg opcode level, not at the tcg-op.h generator level. This requires minor changes through all of the tcg backends, but none of the cpu translators. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-05-14 12:14:55 -07:00
Emilio G. Cota	00c8fa9ffe	tcg: optimise memory layout of TCGTemp This brings down the size of the struct from 56 to 32 bytes on 64-bit, and to 20 bytes on 32-bit. This leads to memory savings: Before: $ find . -name 'tcg.o' \| xargs size text data bss dec hex filename 41131 29800 88 71019 1156b ./aarch64-softmmu/tcg/tcg.o 37969 29416 96 67481 10799 ./x86_64-linux-user/tcg/tcg.o 39354 28816 96 68266 10aaa ./arm-linux-user/tcg/tcg.o 40802 29096 88 69986 11162 ./arm-softmmu/tcg/tcg.o 39417 29672 88 69177 10e39 ./x86_64-softmmu/tcg/tcg.o After: $ find . -name 'tcg.o' \| xargs size text data bss dec hex filename 40883 29800 88 70771 11473 ./aarch64-softmmu/tcg/tcg.o 37473 29416 96 66985 105a9 ./x86_64-linux-user/tcg/tcg.o 38858 28816 96 67770 108ba ./arm-linux-user/tcg/tcg.o 40554 29096 88 69738 1106a ./arm-softmmu/tcg/tcg.o 39169 29672 88 68929 10d41 ./x86_64-softmmu/tcg/tcg.o Note that using an entire byte for some enums that need less than that wastes a few bits (noticeable in 32 bits, where we use 20 bytes instead of 16) but avoids extraction code, which overall is a win--I've tested several variations of the patch, and the appended is the best performer for OpenSSL's bntest by a very small margin: Before: $ taskset -c 0 perf stat -r 15 -- x86_64-linux-user/qemu-x86_64 img/bntest-x86_64 >/dev/null [...] Performance counter stats for 'x86_64-linux-user/qemu-x86_64 img/bntest-x86_64' (15 runs): 10538.479833 task-clock (msec) # 0.999 CPUs utilized ( +- 0.38% ) 772 context-switches # 0.073 K/sec ( +- 2.03% ) 0 cpu-migrations # 0.000 K/sec ( +-100.00% ) 2,207 page-faults # 0.209 K/sec ( +- 0.08% ) 10.552871687 seconds time elapsed ( +- 0.39% ) After: $ taskset -c 0 perf stat -r 15 -- x86_64-linux-user/qemu-x86_64 img/bntest-x86_64 >/dev/null Performance counter stats for 'x86_64-linux-user/qemu-x86_64 img/bntest-x86_64' (15 runs): 10459.968847 task-clock (msec) # 0.999 CPUs utilized ( +- 0.30% ) 739 context-switches # 0.071 K/sec ( +- 1.71% ) 0 cpu-migrations # 0.000 K/sec ( +- 68.14% ) 2,204 page-faults # 0.211 K/sec ( +- 0.10% ) 10.473900411 seconds time elapsed ( +- 0.30% ) Suggested-by: Stefan Weil <sw@weilnetz.de> Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-05-05 08:44:46 -07:00
Peter Crosthwaite	fee068e4f1	tcg: Delete unused cpu_pc_from_tb() No code uses the cpu_pc_from_tb() function. Delete from tricore and arm which each provide an unused implementation. Update the comment in tcg.h to reflect that this is obsoleted by synchronize_from_tb. Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2015-04-30 16:06:18 +03:00
Peter Maydell	cf811fff2a	tcg/tcg-op.c: Fix ld/st of 64 bit values on 32-bit bigendian hosts Commit `951c6300f7` out-of-lined the 32-bit-host versions of tcg_gen_{ld,st}_i64, but in the process it inadvertently changed an #ifdef HOST_WORDS_BIGENDIAN to #ifdef TCG_TARGET_WORDS_BIGENDIAN. Since the latter doesn't get defined anywhere this meant we always took the "LE host" codepath, and stored the two halves of the value in the wrong order on BE hosts. This typically breaks any 64-bit guest on a 32-bit BE host completely, and will have possibly more subtle effects even for 32-bit guests. Switch the ifdef back to HOST_WORDS_BIGENDIAN. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Tested-by: Andreas Färber <afaerber@suse.de> Message-id: 1428523029-13620-1-git-send-email-peter.maydell@linaro.org	2015-04-09 10:51:10 +01:00
Richard Henderson	2374c4b837	tcg/optimize: Handle or r,a,a with constant a As seen with ubuntu-5.10-live-powerpc.iso. Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-16 08:46:13 -07:00
Richard Henderson	37ed3bf1ee	tcg: Complete handling of ALWAYS and NEVER Missing from movcond, and brcondi_i32 (but not brcondi_i64). Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 13:08:05 -07:00
Richard Henderson	51e3972c41	tcg: Use tcg_malloc to allocate TCGLabel Pre-allocating 512 of them per TB is a waste. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 12:28:18 -07:00
Richard Henderson	bec1631100	tcg: Change generator-side labels to a pointer This is less about improved type checking than enabling a subsequent change to the representation of labels. Acked-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com> Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Stefan Weil <sw@weilnetz.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 12:28:18 -07:00
Richard Henderson	42a268c241	tcg: Change translator-side labels to a pointer This is improved type checking for the translators -- it's no longer possible to accidentally swap arguments to the branch functions. Note that the code generating backends still manipulate labels as int. With notable exceptions, the scope of the change is just a few lines for each target, so it's not worth building extra machinery to do this change in per-target increments. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com> Cc: Michael Walle <michael@walle.cc> Cc: Leon Alrae <leon.alrae@imgtec.com> Cc: Anthony Green <green@moxielogic.com> Cc: Jia Liu <proljc@gmail.com> Cc: Alexander Graf <agraf@suse.de> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 12:28:18 -07:00
Richard Henderson	3f626793a2	tcg-ia64: Use tcg_malloc to allocate TCGLabelQemuLdst Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 12:28:18 -07:00
Richard Henderson	686461c962	tcg: Use tcg_malloc to allocate TCGLabelQemuLdst Pre-allocating 640 of them per TB is a waste. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-03-13 12:28:18 -07:00
Richard Henderson	15fc7daa77	tcg: Remove unused opcodes We no longer need INDEX_op_end to terminate the list, nor do we need 5 forms of nop, since we just remove the TCGOp instead. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	a4ce099a7a	tcg: Implement insert_op_before Rather reserving space in the op stream for optimization, let the optimizer add ops as necessary. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	0c627cdca2	tcg: Remove opcodes instead of noping them out With the linked list scheme we need not leave nops in the stream that we need to process later. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	c45cb8bb89	tcg: Put opcodes in a linked list The previous setup required ops and args to be completely sequential, and was error prone when it came to both iteration and optimization. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	fe700adb3d	tcg: Introduce tcg_op_buf_count and tcg_op_buf_full The method by which we count the number of ops emitted is going to change. Abstract that away into some inlines. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	3a13c3f34c	tcg: Reduce ifdefs in tcg-op.c Almost completely eliminates the ifdefs in this file, improving confidence in the lesser used 32-bit builds. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Richard Henderson	951c6300f7	tcg: Move some opcode generation functions out of line Some of these functions are really quite large. We have a number of things that ought to be circularly dependent, but we duplicated code to break that chain for the inlines. This saved 25% of the code size of one of the translators I examined. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2015-02-12 21:21:38 -08:00
Max Filippov	246ae24d7d	tcg: add separate monitor command to dump opcode counters Currently 'info jit' outputs half of the information to monitor and the rest to qemu log. Dumping opcode counts to monitor as a part of 'info jit' command doesn't sound useful. Add new monitor command 'info opcount' that only dumps opcode counters. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2014-12-17 05:49:32 +03:00
Aurelien Jarno	0a2923f848	tcg/mips: fix store softmmu slow path Commit `9d8bf2d1` moved the softmmu slow path out of line and introduce a regression at the same time by always calling tcg_out_tlb_load with is_load=1. This makes impossible to run any significant code under qemu-system-mips*. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2014-11-02 13:30:00 +01:00
Richard Henderson	b6c73a6d45	tcg: Always enable TCGv type checking Instead of using structures, which imply some amount of overhead on certain ABIs, use pointer types. This actually reduces the size of the binaries vs a NON-debug build on ppc64 and x86_64, due to a reduction in the number of sign-extension insns. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:28 -04:00
Richard Henderson	9c53889ba3	tcg-aarch64: Use 32-bit loads for qemu_ld_i32 The "old" qemu_ld opcode did not specify the size of the result, and so we had to assume full register width. With the new opcodes, we can narrow the result. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:28 -04:00
Richard Henderson	de8301e542	tcg-sparc: Use UMULXHI instruction Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:27 -04:00
Richard Henderson	c470b663f7	tcg-sparc: Rename ADDX/SUBX insns The pre-v9 ADDX/SUBX insns were renamed ADDC/SUBC for v9. Standardizing on the v9 name makes things less confusing. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:27 -04:00
Richard Henderson	9d6a7a8542	tcg-sparc: Use ADDXC in setcond_i64 Similar to the ADDC tricks we use in setcond_i32. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:27 -04:00
Richard Henderson	321b6c0585	tcg-sparc: Fix setcond_i32 uninitialized value We failed to swap c1 and c2 correctly for NE c2 == 0. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:27 -04:00
Richard Henderson	90379ca84e	tcg-sparc: Use ADDXC in addsub2_i64 On T4 and newer Sparc chips we have an add-with-carry insn that takes its input from %xcc instead of %icc. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:27 -04:00
Richard Henderson	609ac1e164	tcg-sparc: Support addsub2_i64 Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-09-29 14:55:26 -04:00
zhanghailiang	d70724cec8	tcg: dump op count into qemu log fopen() may fail and it does not check its return vaule here, it is better to dump op count to the normal log file. Signed-off-by: Li Liu <john.liuli@huawei.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-08-24 13:16:32 +04:00
Peter Maydell	1045fc0439	tcg/ppc: Fix support for 64-bit PPC MacOSX hosts Add back in the support for 64-bit PPC MacOSX hosts that was broken in the recent merge of the 32-bit and 64-bit TCG backends. Reported-by: Andreas Färber <andreas.faerber@web.de> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Tested-by: Andreas Färber <andreas.faerber@web.de>	2014-06-29 11:38:50 +01:00
Richard Henderson	d4cba13bdf	tcg/ppc: Fix failure in tcg_out_mem_long With rt != r0 on loads, we use rt for scratch. If we need an index register different from base, we can't use rt, but r0 is usable. Signed-off-by: Richard Henderson <rth@twiddle.net> Message-id: 1403843160-30332-1-git-send-email-rth@twiddle.net Tested-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-06-27 13:23:41 +01:00
Peter Maydell	4196dca63b	tcg: mark tcg_out* and tcg_patch* with attribute 'unused' The tcg_out* and tcg_patch* functions are utility routines that may or may not be used by a particular backend; mark them with the 'unused' attribute to suppress spurious warnings if they aren't used. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-06-24 20:01:24 +04:00
Stefan Weil	5d831be272	Fix new typos (found by codespell) * accomodate -> accommodate * aquiring -> acquiring * beacuse -> because * loosing -> losing * prefering -> preferring * threshhold -> threshold Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-06-24 20:01:24 +04:00
Richard Henderson	a84ac4cbbb	tcg-ppc: Use the return address as a base pointer This can significantly reduce code size for generation of (some) 64-bit constants. With the side effect that we know for a fact that exit_tb can use the register to good effect. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:33 -07:00
Richard Henderson	224f9fd419	tcg-ppc: Merge cache-utils into the backend As a "utility", it only supported ppc, and in a way that other tcg backends provided directly in tcg-target.h. Removing this disparity is easier now that the two ppc backends are merged. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:30 -07:00
Richard Henderson	40d964b563	tcg-ppc: Rename the tcg/ppc64 backend The other tcg backends that support 32- and 64-bit modes use the 32-bit name for the port. Follow suit. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:23 -07:00
Richard Henderson	b38daef9d4	tcg-ppc: Remove the backend Vectoring the 32-bit build to the ppc64 directory. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:18 -07:00
Richard Henderson	a757e1eef0	tcg-ppc64: Merge ppc32 shifts Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:15 -07:00
Richard Henderson	8fa391a011	tcg-ppc64: Support mulsh_i32 Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:12 -07:00
Richard Henderson	dfca177874	tcg-ppc64: Merge ppc32 register usage Good enough to run some instructions before things go awry. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:09 -07:00
Richard Henderson	7f25c469c7	tcg-ppc64: Merge ppc32 qemu_ld/st Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:06 -07:00
Richard Henderson	abcf61c48e	tcg-ppc64: Merge ppc32 brcond2, setcond2, muluh Now passes tcg_add_target_add_op_defs assertions, but not complete enough to function. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:32:03 -07:00
Richard Henderson	796f1a689d	tcg-ppc64: Begin merging ppc32 with ppc64 Just enough to compile, assuming you edit config-host.mak manually. It will still abort at runtime, due to missing brcond2, setcond2, mulu2. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:58 -07:00
Richard Henderson	b31284cecf	tcg-ppc64: Fix sub2 implementation All sorts of confusion on argument ordering. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:56 -07:00
Richard Henderson	ffcfbecec3	tcg-ppc64: Merge 32-bit ABIs into the prologue / frame code Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:52 -07:00
Ulrich Weigand	77e58d0d60	tcg-ppc64: Adjust tcg_out_call for ELFv2 The new ELFv2 ABI, used by default on powerpc64le-linux hosts, introduced some changes that are incompatible with code currently generated by the ppc64 TGC target. In particular, we no longer use function descriptors. This patch adds support for the ELFv2 ABI in the ppc64 TGC function call and function prologue sequences. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:46 -07:00
Richard Henderson	a2a98f807b	tcg-ppc64: Support the ppc64 elfv2 ABI Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:43 -07:00
Richard Henderson	eaf7d1cfe0	tcg-ppc64: Use the correct test in tcg_out_call The correct test uses the _CALL_AIX macro, not a host-specific macro. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:38 -07:00
Richard Henderson	802ca56e1d	tcg-ppc64: Better parameterize the stack frame In preparation for supporting other ABIs. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:34 -07:00
Richard Henderson	5456788db7	tcg-ppc64: Fix TCG_TARGET_CALL_STACK_OFFSET The calling convention reserves space for the 8 register parameters on the stack, so using only 6*8=48 as the offset was wrong. We never saw this bug because we don't have any helpers with more than 5 parameters. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:29 -07:00
Richard Henderson	a921fddcc1	tcg-ppc64: Move call macros out of tcg-target.h These values are private to tcg.c; we don't need to expose this nonsense to the translators. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:26 -07:00
Richard Henderson	3bf4a1ed61	tcg-ppc64: Make TCG_AREG0 and TCG_REG_CALL_STACK enum constants Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:22 -07:00
Richard Henderson	4c3831a088	tcg-ppc64: Use tcg_out_{ld,st,cmp} internally Rather than using tcg_out32 and opcodes directly. This allows us to remove LD_ADDR and CMP_L macros. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:17 -07:00
Richard Henderson	de7761a39d	tcg-ppc64: Relax register restrictions in tcg_out_mem_long In order to be able to use tcg_out_ld/st sensibly with scratch registers, assert only when we'd incorrectly clobber a scratch. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:14 -07:00
Richard Henderson	d604f1a90d	tcg-ppc64: Move functions around Code movement only. This will allow us to make use of the other tcg_out_* functions in tidying their implementations. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:10 -07:00
Richard Henderson	de3d636d83	tcg-ppc64: Avoid some hard-codings of TCG_TYPE_I64 Using more appropriate _PTR or _REG where possible. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:31:07 -07:00
Richard Henderson	9171478c95	tcg-ppc: Use uintptr_t in ppc_tb_set_jmp_target Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-23 07:29:30 -07:00
Richard Henderson	bc8d688ff3	tcg/optimize: Don't special case TCG_OPF_CALL_CLOBBER With the "old" ldst ops we didn't know the real width of the result of the load, but with the "new" ldst ops we do. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-18 11:39:02 -07:00
Peter Maydell	31e25e3e57	Merge remote-tracking branch 'remotes/bonzini/softmmu-smap' into staging * remotes/bonzini/softmmu-smap: (33 commits) target-i386: cleanup x86_cpu_get_phys_page_debug target-i386: fix protection bits in the TLB for SMEP target-i386: support long addresses for 4MB pages (PSE-36) target-i386: raise page fault for reserved bits in large pages target-i386: unify reserved bits and NX bit check target-i386: simplify pte/vaddr calculation target-i386: raise page fault for reserved physical address bits target-i386: test reserved PS bit on PML4Es target-i386: set correct error code for reserved bit access target-i386: introduce support for 1 GB pages target-i386: introduce do_check_protect label target-i386: tweak handling of PG_NX_MASK target-i386: commonize checks for PAE and non-PAE target-i386: commonize checks for 4MB and 4KB pages target-i386: commonize checks for 2MB and 4KB pages target-i386: fix coding standards in x86_cpu_handle_mmu_fault target-i386: simplify SMAP handling in MMU_KSMAP_IDX target-i386: fix kernel accesses with SMAP and CPL = 3 target-i386: move check_io helpers to seg_helper.c target-i386: rename KSMAP to KNOSMAP ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-06-05 21:06:14 +01:00
Paolo Bonzini	c773828aa9	softmmu: move all load/store functions to cpu_ldst.h Unify pieces of cpu-all.h, exec-all.h, softmmu_exec.h and tcg/tcg.h into a single new header file with all helpers. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-06-05 16:10:33 +02:00
Alexander Graf	e3eb9806c7	TCG: Fix tcg_gen_extr_i64_tl for 32bit We expose a generic helper "tcg_gen_extr_i64_tl" for 64bit targets, but the same function for 32bit targets is a misnomer and refers to an invalid function name. Fix up the definition to point to the correct internal helper names instead. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-04 14:11:45 -07:00
Richard Henderson	3d1b2ff62c	tcg: Remove TCG_TARGET_HAS_new_ldst Since all backends have been converted, remove the compatibility code. Acked-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-04 14:10:26 -07:00
Richard Henderson	76782fab1c	tci: Convert to new ldst opcodes Tested-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-04 14:10:16 -07:00
Richard Henderson	0b91966730	tcg-i386: Fix win64 qemu store The first non-register argument isn't placed at offset 0. Cc: qemu-stable@nongnu.org Reviewed-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-06-04 13:58:39 -07:00
Richard Henderson	24666baf1f	tcg/optimize: Remember garbage high bits for 32-bit ops For a 64-bit host, the high bits of a register after a 32-bit operation are undefined. Adjust the temps mask for all 32-bit ops to reflect that. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:56 -07:00
Richard Henderson	a62f6f5600	tcg/optimize: Move updating of gen_opc_buf into tcg_opt_gen_mov* No functional change, just reduce a bit of redundancy. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:56 -07:00
Richard Henderson	ae18b28dd1	tcg-sparc: Make debug_frame const Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:56 -07:00
Richard Henderson	d2e16f2ce1	tcg-s390: Make debug_frame const Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	1695974187	tcg-arm: Make debug_frame const Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	3d9bddb30b	tcg-aarch64: Make debug_frame const Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	e9a9a5b605	tcg-i386: Make debug_frame const Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	2c90784abf	tcg: Allow the debug_frame data structure to be constant Adjust the FDE to point to the code_buffer after we've copied it to the image, rather than requiring that the backend set it prior. This allows the backend to use read-only storage for its data. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	bbb8a1b455	tcg: Remove sizemask and flags arguments to tcg_gen_callN Take them from the TCGHelperInfo struct instead. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:55 -07:00
Richard Henderson	afb49896fa	tcg: Save flags and computed sizemask in TCGHelperInfo Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	72866e823e	tcg: Register the helper info struct rather than the name This will let us find all the info from the hash table. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	836d6ed96e	tcg: Inline tcg_gen_helperN Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	c017230d9b	tcg: Use helper-gen.h in tcg-op.h No need to open-code the setup of the builtin helpers. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	944eea962b	tcg: Push tcg-runtime routines into exec/helper-* Rather than special casing them, use the standard mechanisms for tcg helper generation. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	2ef6175aa7	tcg: Invert the inclusion of helper.h Rather than include helper.h with N values of GEN_HELPER, include a secondary file that sets up the macros to include helper.h. This minimizes the files that must be rebuilt when changing the macros for file N. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:54 -07:00
Richard Henderson	a763551ad5	tcg: Optimize brcond2 and setcond2 ne/eq If either the high or low pair can be resolved, we can simplify to either a constant or to a 32-bit comparison. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-28 09:33:53 -07:00
Peter Maydell	27aa948502	Merge remote-tracking branch 'remotes/rth/tcg-mips' into staging * remotes/rth/tcg-mips: (24 commits) tcg-mips: Enable direct chaining of TBs tcg-mips: Simplify movcond tcg-mips: Simplify brcond2 tcg-mips: Improve setcond eq/ne vs zeros tcg-mips: Simplify setcond2 tcg-mips: Simplify brcond tcg-mips: Simplify setcond tcg-mips: Commonize opcode implementations tcg-mips: Improve add2/sub2 tcg-mips: Hoist args loads tcg-mips: Fix subtract immediate range tcg-mips: Name the opcode enumeration tcg-mips: Use EXT for AND on mips32r2 tcg-mips: Use T9 for TCG_TMP1 tcg-mips: Introduce TCG_TMP0, TCG_TMP1 tcg-mips: Rearrange register allocation tcg-mips: Convert to new_ldst tcg-mips: Convert to new qemu_l/st helpers tcg-mips: Move softmmu slow path out of line tcg-mips: Split large ldst offsets ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-05-27 18:31:02 +01:00
Richard Henderson	b6bfeea92a	tcg-mips: Enable direct chaining of TBs Now that the code_gen_buffer is constrained to not cross 256mb regions, we are assured that we can use J to reach another TB. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:48:37 -07:00
Richard Henderson	33fac20bb2	tcg-mips: Simplify movcond Use the same table to fold comparisons as with setcond. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:47:14 -07:00
Richard Henderson	3401fd259e	tcg-mips: Simplify brcond2 Emitting a single branch instead of (up to) 3, using setcond2 to generate the composite compare. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:47:08 -07:00
Richard Henderson	1db1c4d7d9	tcg-mips: Improve setcond eq/ne vs zeros The original code results in one too many insns per zero present in the input. And since comparing 64-bit numbers vs zero is common... Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:58 -07:00
Richard Henderson	9a2f0bfe32	tcg-mips: Simplify setcond2 Using tcg_unsigned_cond and tcg_high_cond. Also, move the function up in the file for future cleanups. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:53 -07:00
Richard Henderson	c068896f7f	tcg-mips: Simplify brcond Use the same table to fold comparisons as with setcond. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:49 -07:00
Richard Henderson	fd1cf66630	tcg-mips: Simplify setcond Use a table to fold comparisons to less-than. Also, move the function up in the file for futher simplifications. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:39 -07:00
Richard Henderson	4f048535cd	tcg-mips: Commonize opcode implementations Most opcodes fall in to one of a couple of patterns. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:32 -07:00
Richard Henderson	741f117d9a	tcg-mips: Improve add2/sub2 Reduce insn count from 5 to either 3 or 4. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:25 -07:00
Richard Henderson	22ee3a987d	tcg-mips: Hoist args loads Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:20 -07:00
Richard Henderson	070603f62b	tcg-mips: Fix subtract immediate range Since we must use ADDUI, we would generate incorrect code for -32768. Leaving off subtract of +32768 makes things easier for a follow-on patch. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:08 -07:00
Richard Henderson	ac0f3b1263	tcg-mips: Name the opcode enumeration And use it in the opcode emission functions. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:46:03 -07:00
Richard Henderson	1c4182687e	tcg-mips: Use EXT for AND on mips32r2 At the same time, tidy deposit by introducing tcg_out_opc_bf. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:56 -07:00
Richard Henderson	f216a35f36	tcg-mips: Use T9 for TCG_TMP1 T0 is an argument register for the n32 and n64 abis. T9 is the call address register for the abis, and is more directly under the control of the backend. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:48 -07:00
Richard Henderson	6c530e32f4	tcg-mips: Introduce TCG_TMP0, TCG_TMP1 Use these instead of hard-coding the registers to use for temporaries. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:44 -07:00
Richard Henderson	418839044e	tcg-mips: Rearrange register allocation Use FP (also known as S8) as a normal call-saved register. Include T0 in the allocation order and call-clobbered list even though it's currently used as a TCG temporary. Put the argument registers at the end of the allocation order. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:28 -07:00
Richard Henderson	fbef2cc80f	tcg-mips: Convert to new_ldst Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:24 -07:00
Richard Henderson	ce0236cfbd	tcg-mips: Convert to new qemu_l/st helpers In addition, fill delay slots calling the helpers and tail call to the store helpers. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:20 -07:00
Richard Henderson	9d8bf2d125	tcg-mips: Move softmmu slow path out of line At the same time, tidy up the call helpers, avoiding a memory reference. Split out several subroutines. Use TCGMemOp constants. Make endianness selectable at runtime. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:16 -07:00
Richard Henderson	f9a716325f	tcg-mips: Split large ldst offsets Use this to reduce goto_tb by one insn. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:13 -07:00
Richard Henderson	7dae901d2d	tcg-mips: Fill the exit_tb delay slot Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:09 -07:00
Richard Henderson	f8c9eddb2b	tcg-mips: Use J and JAL opcodes For userland builds calls will normally be in range, and for the exit_tb opcode the branch to the epilogue. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-24 08:45:05 -07:00
Richard Henderson	a3abb29292	tci: Fix tcg_out_call Broken since `dddbb2e1e3`. Do all the rest of the things that tcg_out_op did before and after the big switch statement. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-22 13:25:34 -07:00
Richard Henderson	a10c64e0df	tcg-s390: Implement direct chaining of TBs Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 09:22:32 -07:00
Richard Henderson	7b7066b1db	tcg-s390: Improve setcond There are a variety of common cases for which we can use carry tricks to avoid a conditional branch. On very new hardware, use LOAD ON CONDITION instead of a conditional branch. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 01:33:35 -04:00
Richard Henderson	ad19b35808	tcg-s390: Allow immediate operands to add2 and sub2 Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 01:33:29 -04:00
Richard Henderson	f167dc37da	tcg-s390: Implement tcg_register_jit Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:12:25 -04:00
Richard Henderson	547ec12141	tcg-s390: Use more risbg in the tlb sequence Elides two insns from the sequence. The resulting tlb compare sequence is satisfyingly minimal: risbg %r2,%r8,51,186,56 risbg %r3,%r8,61,178,0 cg %r3,904(%r10,%r2) lg %r2,920(%r10,%r2) jlh tlb_miss Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:10:42 -04:00
Richard Henderson	fb5964152d	tcg-s390: Move ldst helpers out of line That is, the old LDST_OPTIMIZATION. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:10:00 -04:00
Richard Henderson	f24efee41e	tcg-s390: Convert to new ldst opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:09:59 -04:00
Richard Henderson	b8dd88b85c	tcg-s390: Integrate endianness into TCGMemOp Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:09:59 -04:00
Richard Henderson	a5a04f2830	tcg-s390: Convert to TCGMemOp Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:09:59 -04:00
Richard Henderson	a175689654	tcg-s390: Fix off-by-one in wraparound andi Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-15 00:09:47 -04:00
Richard Henderson	450445d543	tcg: Fix tcg_reg_alloc_mov vs no-op truncation Commit `af3cbfbe80` hoisted some "common" loads of the temporary type, forgetting that the types could differ during truncating moves. This affects the correctness of the memory offset on big-endian hosts. Tested-by: Tom Musta <tommusta@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-14 09:56:13 -07:00
Richard Henderson	96d0ee7f09	tcg: Remove unreachable code in tcg_out_op and op_defs The INDEX_op_call case has just been obsoleted; the mov and movi cases have not been reachable for years. Attempt to document this both in each tcg_out_op switch, and via TCG_OPF_NOT_PRESENT. Because of the TCG_OPF_NOT_PRESENT change, this must be done for all targets in a single commit. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:13 -07:00
Richard Henderson	af3cbfbe80	tcg: Use tcg_target_available_regs in tcg_reg_alloc_mov The move opcodes are special in that their constraints must cover all available registers. So instead of checking the constraints, just use the available registers. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	cf06667428	tcg: Make call address a constant parameter Avoid allocating a tcg temporary to hold the constant address, and instead place it directly into the op_call arguments. At the same time, convert to the newly introduced tcg_out_call backend function, rather than invoking tcg_out_op for the call. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	dddbb2e1e3	tci: Create tcg_out_call Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	eb68a4fa4e	tcg-mips: Split out tcg_out_call Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	4e9cf8409a	tcg-sparc: Create tcg_out_call Rename the existing tcg_out_calli to tcg_out_call_nodelay. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	fdd8ec7184	tcg-ppc64: Rename tcg_out_calli to tcg_out_call Merge the existing tcg_out_call into tcg_out_op. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	00d7a1acab	tcg-ppc: Split out tcg_out_call Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	a8111212b3	tcg-s390: Rename tgen_calli to tcg_out_call Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:12 -07:00
Richard Henderson	6bf3e99747	tcg-i386: Rename tcg_out_calli to tcg_out_call Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 11:13:11 -07:00
Richard Henderson	5053361b3e	tcg: Require TCG_TARGET_INSN_UNIT_SIZE Now that all backends do define TCG_TARGET_INSN_UNIT_SIZE, remove the fallback definition. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:07:06 -07:00
Richard Henderson	a7f96f7666	tci: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:07:02 -07:00
Richard Henderson	ae0218e350	tcg-mips: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:06:58 -07:00
Richard Henderson	5588ff2921	tcg-ia64: Define TCG_TARGET_INSN_UNIT_SIZE Using a 16-byte aligned structure achieves best results, both for code cleanliness and compiled code size. However, this means that we can't use the trick of encoding the slot number into the low 2 bits. Thankfully, we only ever use slot2, so make that explicit in the names of the relocation functions, and drop the code for other slots. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:06:58 -07:00
Richard Henderson	8c081b1802	tcg-s390: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:06:58 -07:00
Richard Henderson	8587c30c3e	tcg-aarch64: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Acked-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:06:52 -07:00
Richard Henderson	267c931985	tcg-arm: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:06:29 -07:00
Richard Henderson	abce5964be	tcg-sparc: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	38cf39f739	tcg-ppc: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	e083c4a233	tcg-ppc64: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	f6bff89d06	tcg-i386: Define TCG_TARGET_INSN_UNIT_SIZE And use tcg pointer differencing functions as appropriate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	1813e1758d	tcg: Define tcg_insn_unit for code pointers To be defined by the tcg backend based on the elemental unit of the ISA. During the transition, allow TCG_TARGET_INSN_UNIT_SIZE to be undefined, which allows us to default tcg_insn_unit to the current uint8_t. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	52a1f64ec5	tcg: Introduce byte pointer arithmetic helpers Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Peter Maydell	5c53bb8121	tcg: Avoid undefined behaviour patching code at unaligned addresses To avoid C undefined behaviour when patching generated code, provide wrappers tcg_patch8/16/32/64 which use the usual memcpy trick, and use them in the i386 backend. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Peter Maydell	4387345a96	tcg: Avoid stores to unaligned addresses Avoid stores to unaligned addresses in TCG code generation, by using the usual memcpy() approach. (Using bswap.h would drag a lot of QEMU baggage into TCG, so it's simpler just to do direct memcpy() here.) Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-05-12 10:03:04 -07:00
Richard Henderson	ebd0c614d7	tcg-sparc: Accept stores of zero Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	035b239826	tcg-sparc: Fix small 32-bit movi We tested imm13 before discarding garbage high bits. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	35e2da1556	tcg-sparc: Fixup function argument types Use TCGReg everywhere appropriate. Use int32_t for all arguments that may be registers or immediate constants. Merge tcg_out_addi into its only caller. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	b357f902bf	tcg-sparc: Hoist common argument loads in tcg_out_op Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	98b90bab3f	tcg-sparc: Don't handle mov/movi in tcg_out_op Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	425532d71d	tcg-sparc: Tidy check_fit_* tests Use sextract instead of raw bit shifting for the tests. Introduce a new check_fit_ptr macro to make it clear we're looking at pointers. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	f4c166619e	tcg-sparc: Implement muls2_i32 Using the 32-bit SMUL is a tad more efficient than resorting to extending and using the 64-bit MULX. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	8b66eefe0d	tcg-sparc: Use the RETURN instruction Saves one insn per TB exit over JMPL+RESTORE. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	34b1a49cb1	tcg-sparc: Use 64-bit registers with sparcv8plus Quite a lot of effort was spent composing and decomposing 64-bit quantities in registers, when we should just create them and leave them as one 64-bit register. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	a24fba935a	tcg-sparc: Support trunc_shr_i32 Unlike a 64-bit shift op, allows the output to be in %l or %i registers for sparcv8plus. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	9f44adc573	tcg-sparc: Remove most uses of TCG_TARGET_REG_BITS Replace with SPARC64 define. Soon even sparcv8plus will use 64-bit register as far as TCG is concerned. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:35 -07:00
Richard Henderson	4bb7a41ed6	tcg: Add INDEX_op_trunc_shr_i32 Let the backend do something special for truncation. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:34 -07:00
Richard Henderson	71b926992e	tcg: Fix missed pointer size != TCG_TARGET_REG_BITS changes Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-28 11:06:34 -07:00
Peter Maydell	ad600a4d49	Pull tcg 2014-04-22 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTVthUAAoJEK0ScMxN0Ceb5OUH/0Ri4engOQim3mV1jAOr2Vca 3zg+Hl8b+UioXD0se24Iipr6s+02G1DApbbLPX7DZAnoh9jEBvDtHOdde3pNbMkQ jcTpShTyT7OKSsklRN19ckvk0ffBch5W3Ekkw6/Hg6ys2HIvirRpEL6R58oJNlP6 xcCkQZISZVkakbv5xft8YQo1v8wnU5q2l85OaC1aaDB6g+Y6ZgoA1qkWjqlHkmQk 1asflfbC0r5ke+yx7vz6310f5xBDLSVv17dqsDUr70o1m/6bem6wQXMczwmYUfk5 99OCPiqdiCZLJyVFvIvfwSakL9Bq/nvnmywXTkrB7rovk5VZz3gc4sJkuUkL+KM= =URV1 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/tcg-next-20140422' into staging Pull tcg 2014-04-22 # gpg: Signature made Tue 22 Apr 2014 22:00:04 BST using RSA key ID 4DD0279B # gpg: Can't check signature: public key not found * remotes/rth/tags/tcg-next-20140422: tcg: Use HOST_WORDS_BIGENDIAN tcg: Fix fallback from muls2_i64 to mulu2_i64 tcg: Use tcg_gen_mulu2_i32 in tcg_gen_muls2_i32 tcg: Relax requirement for mulu2_i32 on 32-bit hosts tcg-s390: Remove W constraint tcg-sparc: Use the type parameter to tcg_target_const_match tcg-ppc64: Use the type parameter to tcg_target_const_match tcg-aarch64: Remove w constraint tcg: Add TCGType parameter to tcg_target_const_match tcg: Fix out of range shift in deposit optimizations tci: Mask shift counts to avoid undefined behavior tcg: Mask shift quantities while folding tcg: Use "unspecified behavior" for shifts tcg: Fix warning (1 bit signed bitfield entry) and replace int by bool Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2014-04-24 15:24:52 +01:00
Richard Henderson	02eb19d0ec	tcg: Use HOST_WORDS_BIGENDIAN Instead of rolling a local TCG_TARGET_WORDS_BIGENDIAN. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:37 -07:00
Richard Henderson	662deb908f	tcg: Fix fallback from muls2_i64 to mulu2_i64 Brown Bag sez, don't put the fallback code into the wrong function. Also, check for muluh_i64 and use tcg_gen_mulu2_i64 instead of raw ops. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:37 -07:00
Richard Henderson	f46fc4e6a9	tcg: Use tcg_gen_mulu2_i32 in tcg_gen_muls2_i32 Rather than hard-coding use of mulu2_i32, allow muluh_i32. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:37 -07:00
Richard Henderson	df9ebea53e	tcg: Relax requirement for mulu2_i32 on 32-bit hosts Instead require either mulu2_i32 or muluh_i32. The code in tcg-op.h already supports looking for both. Previous incomplete conversion? Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:37 -07:00
Richard Henderson	671c835b7d	tcg-s390: Remove W constraint Now redundant with the type parameter to tcg_target_const_match. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	4b304cfae1	tcg-sparc: Use the type parameter to tcg_target_const_match Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	1194dcba22	tcg-ppc64: Use the type parameter to tcg_target_const_match Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	170bf9315b	tcg-aarch64: Remove w constraint Now redundant with the type parameter to tcg_target_const_match. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	f6c6afc1d4	tcg: Add TCGType parameter to tcg_target_const_match Most 64-bit targets need to be able to ignore the high bits of a TCG_TYPE_I32 value. Suggested-by: Stuart Brady <sdb@zubnet.me.uk> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	d998e555d2	tcg: Fix out of range shift in deposit optimizations By inspection, for a deposit(x, y, 0, 64), we'd have a shift of (1<<64) and everything else falls apart. But we can reuse the existing deposit logic to get this right. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	50c5c4d125	tcg: Mask shift quantities while folding The TCG result would be undefined, but we can at least produce one plausible result and avoid triggering the wrath of analysis tools. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	20022fa15f	tcg: Use "unspecified behavior" for shifts Change the definition such that shifts are not allowed to crash for any input. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Stefan Weil	ad5171dbd4	tcg: Fix warning (1 bit signed bitfield entry) and replace int by bool Static code analyzers complain about signed bitfields with only a single bit. is_ld is used as a boolean value, so make it bool. ppc64 already used bool for the 2nd argument is_ld of the local function add_qemu_ldst_label. Modify all other TCG targets to do follow this example. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-18 16:57:36 -07:00
Richard Henderson	0374f5089a	tcg-ia64: Convert to new ldst opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:20 -04:00
Richard Henderson	3bf16cb31a	tcg-ia64: Move part of softmmu slow path out of line Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:20 -04:00
Richard Henderson	4bdd547aaa	tcg-ia64: Convert to new ldst helpers Still inline, but updated to the new routines. Always use the LE helpers, reusing the bswap between the fast and slot paths. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:19 -04:00
Richard Henderson	af9fe31070	tcg-ia64: Reduce code duplication in tcg_out_qemu_ld The only differences were in the bswap insns emitted. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:19 -04:00
Richard Henderson	1f91f39219	tcg-ia64: Move tlb addend load into tlb read Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:18 -04:00
Richard Henderson	b672cf66c3	tcg-ia64: Move bswap for store into tlb load Saving at least two cycles per store, and cleaning up the code. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:17 -04:00
Richard Henderson	4c186ee2cf	tcg-ia64: Re-bundle the tlb load This sequencing requires 5 stop bits instead of 6, and has room left over to pre-load the tlb addend, and bswap data prior to being stored. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:17 -04:00
Richard Henderson	dcf91778ca	tcg-ia64: Optimize small arguments to exit_tb Saves one bundle for the common case of exit_tb 0. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-17 16:56:16 -04:00
Richard Henderson	b825025f08	tcg-aarch64: Use tcg_out_mov in preference to tcg_out_movr It's the more canonical interface. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:02 -04:00
Richard Henderson	a056c9faa4	tcg-aarch64: Prefer unsigned offsets before signed offsets for ldst The assembler seems to prefer them, perhaps we should too. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	3d4299f425	tcg-aarch64: Introduce tcg_out_insn_3312, _3310, _3313 Replace aarch64_ldst_op_data with AArch64LdstType, as it wasn't encoded for the proper shift for the field and was confusing. Merge aarch64_ldst_op_data, AArch64LdstType, and a few stray opcode bits into a single I3312_* argument, eliminating some magic numbers from the helper functions. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	dc73dfd4bc	tcg-aarch64: Merge aarch64_ldst_get_data/type into tcg_out_op Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	edd8824cd4	tcg-aarch64: Introduce tcg_out_insn_3507 Cleaning up the implementation of REV and REV16 at the same time. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	e81864a109	tcg-aarch64: Support stores of zero Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	de61d14fa7	tcg-aarch64: Implement TCG_TARGET_HAS_new_ldst Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:01 -04:00
Richard Henderson	667b1cdd4e	tcg-aarch64: Pass qemu_ld/st arguments directly Instead of passing them the "args" array. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	9e4177ad6d	tcg-aarch64: Use TCGMemOp in qemu_ld/st Making the bswap conditional on the memop instead of a compile-time test. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	dc0c8aaf2c	tcg-aarch64: Use ADR to pass the return address to the ld/st helpers Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	ae7ab46aa8	tcg-aarch64: Use tcg_out_call for qemu_ld/st In some cases, a direct branch will be in range. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	6f4724672c	tcg-aarch64: Avoid add with zero in tlb load Some guest env are small enough to reach the tlb with only a 12-bit addition. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	38d195aa05	tcg-aarch64: Implement tcg_register_jit Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:13:00 -04:00
Richard Henderson	95f72aa90a	tcg-aarch64: Introduce tcg_out_insn_3314 Combines 4 other inline functions and tidies the prologue. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:59 -04:00
Richard Henderson	d82b78e48b	tcg-aarch64: Reuse LR in translated code It's obviously call-clobbered, but is otherwise unused. Repurpose it as the TCG temporary. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:59 -04:00
Richard Henderson	3d9e69a238	tcg-aarch64: Use CBZ and CBNZ A compare and branch against zero happens at the start of every single TB. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:59 -04:00
Richard Henderson	cae1f6f3e6	tcg-aarch64: Create tcg_out_brcond Rearrange code to put the compare and branch in the same place. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:59 -04:00
Richard Henderson	81d8a5ee19	tcg-aarch64: Use symbolic names for branches Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:59 -04:00
Richard Henderson	c6e310d938	tcg-aarch64: Use adrp in tcg_out_movi Loading an qemu pointer as an immediate happens often. E.g. - exit_tb $0x7fa8140013 + exit_tb $0x7f81ee0013 ... - : d2800260 mov x0, #0x13 - : f2b50280 movk x0, #0xa814, lsl #16 - : f2c00fe0 movk x0, #0x7f, lsl #32 + : 90ff1000 adrp x0, 0x7f81ee0000 + : 91004c00 add x0, x0, #0x13 Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	d8918df577	tcg-aarch64: Special case small constants in tcg_out_movi Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	4ec4f0bd56	tcg-aarch64: Use ORRI in tcg_out_movi The subset of logical immediates that we support is quite quick to test, and such constants are quite common to want to load. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	dfeb5fe770	tcg-aarch64: Use MOVN in tcg_out_movi When profitable, initialize the register with MOVN instead of MOVZ, before setting the remaining lanes with MOVK. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	929f8b5550	tcg-aarch64: Use TCGType and TCGMemOp constants Rather than raw constants that could mean anything. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	8bf56493f1	tcg-aarch64: Use intptr_t apropriately As opposed to tcg_target_long. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-04-16 12:12:58 -04:00
Richard Henderson	1a8e80d7e8	tcg-arm: Avoid ldrd/strd for user-only emulation The arm ldrd/strd insns must cause alignment traps, whereas at least for armv7 ldr/str must handle unaligned operations. While this is hardly the only problem facing user-only emu, this solves one problem for i386 on armv7 emulation. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reported-by: Huw Davies <huw@codeweavers.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-27 16:33:01 -04:00
Richard Henderson	cab0a7ea00	tcg-sparc: Convert to new ldst opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	7ea5d7256d	tcg-sparc: Convert to new ldst helpers All of the helpers with the explicit big/little endian option require the return address as a parameter. Acquire this via a trampoline. Move the load of areg0 into the trampoline. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	a8b12c108c	tcg-sparc: Tidy tcg_out_tlb_load interface Pass address registers explicitly, rather than as indicies of args[]. It's two argument registers either way. Use more TCGReg as appropriate. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	eef0d9e740	tcg-sparc: Use TCGMemOp within qemu_ldst routines Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	a9c7d27bd1	tcg-sparc: Improve tcg_out_movi If bits 31:13 are zero, reduce the insn count by one. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	1d0a60681a	tcg-sparc: Dont handle constant arguments to ext32 ops Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	5f9eb02555	tcg-sparc: Don't handle remainder The generic fallback is exactly what we implemented. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	c8fc56cedd	tcg-sparc: Use intptr_t as appropriate Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:26 -07:00
Richard Henderson	aad2f06a7f	tcg-sparc: Tidy call+jump patterns Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:25 -07:00
Richard Henderson	d801a8f2ce	tcg-sparc: Fix tlb read We were computing the full address into %o0 and then not using it. Adjust some of the computation to rely less on having to pull immediate values into registers. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:25 -07:00
Richard Henderson	e7bc9004e7	tcg-sparc: Fix ld64 for 32-bit mode Since were not using an annulled branch, we need to put a nop in the delay slot. Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-17 11:13:25 -07:00
Richard Henderson	582ab779c5	tcg-aarch64: Introduce tcg_out_insn_3405 Cleaning up the implementation of tcg_out_movi at the same time. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 11:00:15 -07:00
Richard Henderson	8678b71ce6	tcg-aarch64: Support div, rem Clean up multiply at the same time. For remainder, generic code will produce mul+sub, whereas we can implement with msub. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 11:00:10 -07:00
Richard Henderson	1fcc9ddfb3	tcg-aarch64: Support muluh, mulsh Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 11:00:07 -07:00
Richard Henderson	c6e929e784	tcg-aarch64: Support add2, sub2 Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 11:00:04 -07:00
Richard Henderson	b3c56df769	tcg-aarch64: Support deposit Also tidy the implementation of ubfm, sbfm, extr in order to share code. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 11:00:01 -07:00
Richard Henderson	ed7a0aa8bc	tcg-aarch64: Use tcg_out_insn for setcond Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:58 -07:00
Richard Henderson	04ce397b33	tcg-aarch64: Support movcond Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:55 -07:00
Richard Henderson	14b155ddc4	tcg-aarch64: Support andc, orc, eqv, not, neg Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:52 -07:00
Richard Henderson	e029f29385	tcg-aarch64: Handle constant operands to and, or, xor Handle a simplified set of logical immediates for the moment. The way gcc and binutils do it, with 52k worth of tables, and a binary search depth of log2(5334) = 13, seems slow for the most common cases. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:47 -07:00
Richard Henderson	90f1cd9138	tcg-aarch64: Handle constant operands to add, sub, and compare Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:44 -07:00
Richard Henderson	7d11fc7c2b	tcg-aarch64: Implement mov with tcg_out_insn Avoid the magic numbers in the current implementation. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:41 -07:00
Richard Henderson	096c46c0ff	tcg-aarch64: Introduce tcg_out_insn_3401 This merges the implementation of tcg_out_addi and tcg_out_subi. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:38 -07:00
Richard Henderson	df9351e372	tcg-aarch64: Convert shift insns to tcg_out_insn Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:35 -07:00
Richard Henderson	50573c66eb	tcg-aarch64: Introduce tcg_out_insn Converting the add/sub (3.5.2) and logical shifted (3.5.10) instruction groups to the new scheme. Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Tested-by: Claudio Fontana <claudio.fontana@huawei.com>	2014-03-14 10:59:13 -07:00
Richard Henderson	f8e2484389	tcg-aarch64: Remove nop from qemu_st slow path Commit `023261ef85` failed to remove a nop that's no longer required. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:25 -08:00
Richard Henderson	523fdc08cc	tcg-aarch64: Simplify tcg_out_ldst_9 encoding At first glance the code appears to be using 1's compliment encoding, a-la AArch32. Except that the constant is "off", creating a complicated split field 2's compliment encoding. Much clearer to just use a normal mask and shift. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:25 -08:00
Richard Henderson	017a86f7ad	tcg-aarch64: Use intptr_t apropriately As opposed to tcg_target_long. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:25 -08:00
Richard Henderson	2e796c7621	tcg-aarch64: Remove the shift_imm parameter from tcg_out_cmp It was unused. Let's not overcomplicate things before we need them. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:25 -08:00
Richard Henderson	8d8db193f2	tcg-aarch64: Hoist common argument loads in tcg_out_op This reduces the code size of the function significantly. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:16 -08:00
Richard Henderson	a51a6b6ad5	tcg-aarch64: Don't handle mov/movi in tcg_out_op Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:15 -08:00
Richard Henderson	f029341494	tcg-aarch64: Set ext based on TCG_OPF_64BIT Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:09 -08:00
Richard Henderson	7763ffa017	tcg-aarch64: Change all ext variables to TCGType We assert that the values for _I32 and _I64 are 0 and 1 respectively. This will make a couple of functions declared by tcg.c cleaner. Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:09 -08:00
Richard Henderson	3353d0dcc3	tcg-aarch64: Remove redundant CPU_TLB_ENTRY_BITS check Removed from other targets in `56bbc2f967`. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-03-08 21:23:02 -08:00
Stefan Weil	c5d3c49896	tcg: Fix typo in comment (dependancies -> dependencies) Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2014-03-02 17:12:51 +04:00
Peter Maydell	774d566cdb	tcg/i386: Fix build for systems without working cpuid.h (MacOSX, Win32) Win32 doesn't have a cpuid.h, and MacOSX may have one but without the __cpuid() function we use, which means that commit `9d2eec20` broke the build for those platforms. Fix this by tightening up our configure cpuid.h check to test that the functions we need are present, and adding some missing #ifdef guards in tcg/i386/tcg-target.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>	2014-02-21 10:39:10 +00:00
Richard Henderson	6399ab3325	tcg/i386: Use SHLX/SHRX/SARX instructions These three-operand shift instructions do not require the shift count to be placed into ECX. This reduces the number of mov insns required, with the mere addition of a new register constraint. Don't attempt to get rid of the matching constraint, as that's impossible to manipulate with just a new constraint. In addition, constant shifts still need the matching constraint. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-02-17 10:12:29 -06:00
Richard Henderson	9d2eec202f	tcg/i386: Use ANDN instruction Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C so we must handle constants in the implementation of andc. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2014-02-17 10:12:29 -06:00

... 18 19 20 21 22 ...

2962 Commits