- 17 12月, 2018 5 次提交
-
-
由 Richard Henderson 提交于
This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
This will move the assert for success from within (subroutines of) patch_reloc into the callers. It will also let new code do something different when a relocation is out of range. For the moment, all backends are trivially converted to return true. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 27 9月, 2018 1 次提交
-
-
由 Roman Kapl 提交于
The TCG backend uses LOWREGMASK to get the low 3 bits of register numbers. This was defined as no-op for 32-bit x86, with the assumption that we have eight registers anyway. This assumption is not true once we have xmm regs. Since LOWREGMASK was a no-op, xmm register indidices were wrong in opcodes and have overflown into other opcode fields, wreaking havoc. To trigger these problems, you can try running the "movi d8, #0x0" AArch64 instruction on 32-bit x86. "vpxor %xmm0, %xmm0, %xmm0" should be generated, but instead TCG generated "vpxor %xmm0, %xmm0, %xmm2". Fixes: 770c2fc7 ("Add vector operations") Signed-off-by: NRoman Kapl <rka@sysgo.com> Message-Id: <20180824131734.18557-1-rka@sysgo.com> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 24 7月, 2018 1 次提交
-
-
由 Richard Henderson 提交于
When host vector registers and operations were introduced, I failed to mark the registers call clobbered as required by the ABI. Fixes: 770c2fc7 Cc: qemu-stable@nongnu.org Reported-by: NJason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 16 6月, 2018 2 次提交
-
-
由 Richard Henderson 提交于
Also, assert that we don't overflow any of two different offsets into the TB. Both unwind and goto_tb both record a uint16_t for later use. This fixes an arm-softmmu test case utilizing NEON in which there is a TB generated that runs to 7800 opcodes, and compiles to 96k on an x86_64 host. This overflows the 16-bit offset in which we record the goto_tb reset offset. Because of that overflow, we install a jump destination that goes to neverland. Boom. With this reduced op count, the same TB compiles to about 48k for aarch64, ppc64le, and x86_64 hosts, and neither assertion fires. Cc: qemu-stable@nongnu.org Reported-by: N"Jason A. Donenfeld" <Jason@zx2c4.com> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 John Arbuckle 提交于
The assembler in most versions of Mac OS X is pretty old and does not support the xgetbv instruction. To go around this problem, the raw encoding of the instruction is used instead. Signed-off-by: NJohn Arbuckle <programmingkidx@gmail.com> Message-Id: <20180604215102.11002-1-programmingkidx@gmail.com> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 09 5月, 2018 1 次提交
-
-
由 Peter Maydell 提交于
The VPUNPCKLD* instructions are all "non-destructive source", indicated by "NDS" in the encoding string in the x86 ISA manual. This means that they take two source operands, one of which is encoded in the VEX.vvvv field. We were incorrectly treating them as if they were destructive-source and passing 0 as the 'v' argument of tcg_out_vex_modrm(). This meant we were always using %xmm0 as one of the source operands, causing incorrect results if the register allocator happened to want to use something else. For instance the input AArch64 insn: DUP v26.16b, w21 which becomes TCG IR ops: dup_vec v128,e8,tmp2,x21 st_vec v128,e8,tmp2,env,$0xa40 was assembled to: 0x607c568c: c4 c1 7a 7e 86 e8 00 00 vmovq 0xe8(%r14), %xmm0 0x607c5694: 00 0x607c5695: c5 f9 60 c8 vpunpcklbw %xmm0, %xmm0, %xmm1 0x607c5699: c5 f9 61 c9 vpunpcklwd %xmm1, %xmm0, %xmm1 0x607c569d: c5 f9 70 c9 00 vpshufd $0, %xmm1, %xmm1 0x607c56a2: c4 c1 7a 7f 8e 40 0a 00 vmovdqu %xmm1, 0xa40(%r14) 0x607c56aa: 00 when the vpunpcklwd insn should be "%xmm1, %xmm1, %xmm1". This resulted in our incorrectly setting the output vector to q26=0000320000003200:0000320000003200 when given an input of x21 == 0000000002803200 rather than the expected all-zeroes. Pass the correct source register number to tcg_out_vex_modrm() for these insns. Fixes: 770c2fc7 Cc: qemu-stable@nongnu.org Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Message-Id: <20180504153431.5169-1-peter.maydell@linaro.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 16 3月, 2018 1 次提交
-
-
由 Richard Henderson 提交于
Unknown why -m32 was passing with gcc but not clang; it should have failed for both. This would be used for tcg_gen_dup_i64_vec, and visible with the right TB and an aarch64 guest. Reported-by: NMax Reitz <mreitz@redhat.com> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 08 2月, 2018 1 次提交
-
-
由 Richard Henderson 提交于
The x86 vector instruction set is extremely irregular. With newer editions, Intel has filled in some of the blanks. However, we don't get many 64-bit operations until SSE4.2, introduced in 2009. The subsequent edition was for AVX1, introduced in 2011, which added three-operand addressing, and adjusts how all instructions should be encoded. Given the relatively narrow 2 year window between possible to support and desirable to support, and to vastly simplify code maintainence, I am only planning to support AVX1 and later cpus. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 10 10月, 2017 1 次提交
-
-
由 Emilio G. Cota 提交于
Reviewed-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 17 9月, 2017 2 次提交
-
-
由 Richard Henderson 提交于
It's not even clear what the interface REG and VAL32 were supposed to mean. All uses had REG = 0 and VAL32 was the bitset assigned to the destination. Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 08 9月, 2017 2 次提交
-
-
由 Richard Henderson 提交于
Already it saves 2 bytes per call, but also the constant pool entry may well be shared across multiple calls. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 24 7月, 2017 1 次提交
-
-
由 Richard Henderson 提交于
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the supplied <cpuid.h> does not contain the bit_AVX2 define that we use when detecting whether the routine can be enabled. Introduce a qemu-specific header that uses the compiler's definition of __cpuid et al, but supplies any missing bit_* definitions needed. This avoids introducing any extra ifdefs to util/bufferiszero.c, and allows quite a few to be removed from tcg/i386/tcg-target.inc.c. Signed-off-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NPeter Maydell <peter.maydell@linaro.org> Message-id: 20170719044018.18063-1-rth@twiddle.net Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 06 6月, 2017 1 次提交
-
-
由 Emilio G. Cota 提交于
Suggested-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NRichard Henderson <rth@twiddle.net> Signed-off-by: NEmilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-6-git-send-email-cota@braap.org> [rth: Reuse goto_ptr epilogue for exit_tb 0.] Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 18 1月, 2017 2 次提交
-
-
由 Richard Henderson 提交于
I think this is cleaner than sometimes using BSF. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This reverts commit 4ac76910. This fixes http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03062.html While I think we could get away with relying on the undocumented behaviour, the tcg constraint system isn't powerful enough to properly describe the required (non-)overlap conditions. Reported-by: NEduardo Habkost <ehabkost@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 11 1月, 2017 8 次提交
-
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
The ISA manual documents the output is undefined if the input was zero. However, we document in target-i386 that the behavior of real silicon is to preserve the contents of the output register. We also mention that there are real applications that depend on this. That this is baked into silicon is mentioned as a potential cause for some false sharing behaviour wrt lzcnt/tzcnt. Taking advantage of this allows us to save 2 insns in the normal case, and 4 insns for i686 emulating a 64-bit clz. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Previously we could not have different constraints for different ISA levels, which prevented us from eliding the matching constraint for shifts. We do now have to make sure that the operands match for constant shifts. We can also handle some small left shifts via lea. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Use a switch instead of searching a table. Share constraints between 32-bit and 64-bit, when at all possible. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This will let us choose how to interpret a given constraint depending on whether the opcode is 32- or 64-bit. Which will let us share more constraint combinations between opcodes. At the same time, change the interface to return the advanced pointer instead of passing it in/out by reference. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This will allow the target to tailor the constraints to the auto-detected ISA extensions. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 10 1月, 2017 1 次提交
-
-
由 Richard Henderson 提交于
Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 21 9月, 2016 1 次提交
-
-
由 Richard Henderson 提交于
TARGET_PAGE_MASK, as defined, has type "int". We need to extend that to the proper target width before oring in an "unsigned". Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 16 9月, 2016 2 次提交
-
-
由 Pranith Kumar 提交于
Generate a 'lock orl $0,0(%esp)' instruction for ordering instead of mfence which has similar ordering semantics. Signed-off-by: NPranith Kumar <bobby.prani@gmail.com> Message-Id: <20160714202026.9727-3-bobby.prani@gmail.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Previously we allowed fully unaligned operations, but not operations that are aligned but with less alignment than the operation size. In addition, arm32, ia64, mips, and sparc had been omitted from the previous overalignment patch, which would have led to that alignment being enforced. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 06 7月, 2016 2 次提交
-
-
由 Sergey Sorokin 提交于
Some architectures (e.g. ARMv8) need the address which is aligned to a size more than the size of the memory access. To support such check it's enough the current costless alignment check implementation in QEMU, but we need to support an alignment size specifying. Signed-off-by: NSergey Sorokin <afarallax@yandex.ru> Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru> Signed-off-by: NRichard Henderson <rth@twiddle.net> [rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits available for, though unused by, user-mode. Retain logging difference based on ALIGNED_ONLY.]
-
由 Richard Henderson 提交于
While we can store constants via constrants on INDEX_op_st_i32 et al, we weren't able to spill constants to backing store. Add a new backend interface, tcg_out_sti, which may store the constant (and is allowed to fail). Rearrange the temp_* helpers so that we only attempt to directly store a constant when the temp is becoming dead/free. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 13 5月, 2016 2 次提交
-
-
由 Sergey Fedorov 提交于
Briefly describe in a comment how direct block chaining is done. It should help in understanding of the following data fields. Rename some fields in TranslationBlock and TCGContext structures to better reflect their purpose (dropping excessive 'tb_' prefix in TranslationBlock but keeping it in TCGContext): tb_next_offset => jmp_reset_offset tb_jmp_offset => jmp_insn_offset tb_next => jmp_target_addr jmp_next => jmp_list_next jmp_first => jmp_list_first Avoid using a magic constant as an invalid offset which is used to indicate that there's no n-th jump generated. Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Sergey Fedorov 提交于
Ensure direct jump patching in i386 is atomic by: * naturally aligning a location of direct jump address; * using atomic_read()/atomic_set() for code patching. tcg_out_nopn() implementation: Suggested-by: Richard Henderson <rth@twiddle.net>. Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 21 4月, 2016 2 次提交
-
-
由 Aurelien Jarno 提交于
Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-2-git-send-email-aurelien@aurel32.net Reviewed-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Aurelien Jarno 提交于
The TCG code is quite performance sensitive, but at the same time can also be quite tricky. That is why asserts that can be enabled with the --enable-debug-tcg configure option. This used to work the following way: | #include "config.h" | | ... | | #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG) | /* define it to suppress various consistency checks (faster) */ | #define NDEBUG | #endif | | ... | | #include <assert.h> Since commit 757e725b (tcg: Clean up includes) "config.h" as been replaced by "qemu/osdep.h" which itself includes <assert.h>. As a consequence the assertions are always enabled, even when using --disable-debug-tcg, causing a performance regression, especially on targets with many registers. For instance on qemu-system-ppc the speed difference is about 15%. tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already uses in some places. This patch replaces all the calls to assert into calss to tcg_debug_assert. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net Reviewed-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 24 2月, 2016 1 次提交
-
-
由 Peter Maydell 提交于
Commit 757e725b added a number of #include "qemu/osdep.h" files to the tcg-target.c files (as they were named at the time). These are unnecessary because these files are not standalone C files, and the tcg/tcg.c file which includes them will have already included osdep.h on their behalf. Remove the unneeded include directives. Reviewed-by: NEric Blake <eblake@redhat.com> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org> Message-Id: <1456238983-10160-4-git-send-email-peter.maydell@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-