- 25 10月, 2017 7 次提交
-
-
由 Richard Henderson 提交于
This avoids having to allocate external memory for each temporary. Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
At the same time, drop the TCGContext argument and use tcg_ctx instead. Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This avoids needing to test the index of a temp against nb_globals. Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NEmilio G. Cota <cota@braap.org> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NEmilio G. Cota <cota@braap.org> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NEmilio G. Cota <cota@braap.org> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Rather than have a separate buffer of 10*max_ops entries, give each opcode 10 entries. The result is actually a bit smaller and should have slightly more cache locality. Reviewed-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 11 10月, 2017 1 次提交
-
-
由 Emilio G. Cota 提交于
Will come in handy very soon. Reviewed-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 10 10月, 2017 2 次提交
-
-
由 Emilio G. Cota 提交于
Groundwork for supporting multiple TCG contexts. The hash table becomes read-only after it is filled in, so we can save space by keeping just a global pointer to it. Reviewed-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Emilio G. Cota 提交于
In preparation for adding tc.size to be able to keep track of TB's using the binary search tree implementation from glib. Reviewed-by: NRichard Henderson <rth@twiddle.net> Signed-off-by: NEmilio G. Cota <cota@braap.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 17 9月, 2017 4 次提交
-
-
由 Richard Henderson 提交于
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
由 Richard Henderson 提交于
Reviewed-by: NPhilippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <richard.henderson@linaro.org>
-
- 08 9月, 2017 2 次提交
-
-
由 Richard Henderson 提交于
A new shared header tcg-pool.inc.c adds new_pool_label, for registering a tcg_target_ulong to be emitted after the generated code, plus relocation data to install a pointer to the data. A new pointer is added to the TCGContext, so that we dump the constant pool as data, not code. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 20 6月, 2017 1 次提交
-
-
由 Emilio G. Cota 提交于
Allocating an arbitrarily-sized array of tbs results in either (a) a lot of memory wasted or (b) unnecessary flushes of the code cache when we run out of TB structs in the array. An obvious solution would be to just malloc a TB struct when needed, and keep the TB array as an array of pointers (recall that tb_find_pc() needs the TB array to run in O(log n)). Perhaps a better solution, which is implemented in this patch, is to allocate TB's right before the translated code they describe. This results in some memory waste due to padding to have code and TBs in separate cache lines--for instance, I measured 4.7% of padding in the used portion of code_gen_buffer when booting aarch64 Linux on a host with 64-byte cache lines. However, it can allow for optimizations in some host architectures, since TCG backends could safely assume that the TB and the corresponding translated code are very close to each other in memory. See this message by rth for a detailed explanation: https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg05172.html Subject: Re: GSoC 2017 Proposal: TCG performance enhancements Message-ID: <1e67644b-4b30-887e-d329-1848e94c9484@twiddle.net> Suggested-by: NRichard Henderson <rth@twiddle.net> Reviewed-by: NPranith Kumar <bobby.prani@gmail.com> Signed-off-by: NEmilio G. Cota <cota@braap.org> Message-Id: <1496790745-314-3-git-send-email-cota@braap.org> [rth: Simplify the arithmetic in tcg_tb_alloc] Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 06 6月, 2017 1 次提交
-
-
由 Emilio G. Cota 提交于
Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NEmilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org> [rth: Squashed 4 related commits.] Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 11 1月, 2017 4 次提交
-
-
由 Richard Henderson 提交于
This allows an output operand to match an input operand only when the input operand needs a register. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This will let us choose how to interpret a given constraint depending on whether the opcode is 32- or 64-bit. Which will let us share more constraint combinations between opcodes. At the same time, change the interface to return the advanced pointer instead of passing it in/out by reference. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This will allow the target to tailor the constraints to the auto-detected ISA extensions. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This is the same concept as, and same markup as, the early clobber markup in gcc. Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 02 11月, 2016 1 次提交
-
-
由 Richard Henderson 提交于
Reuse the existing locking provided by stdio to keep in_asm, cpu, op, op_opt, op_ind, and out_asm as contiguous blocks. While it isn't possible to interleave e.g. in_asm or op_opt logs because of the TB lock protecting all code generation, it is possible to interleave cpu logs, or to interleave a cpu dump with an out_asm dump. For mingw32, we appear to have no viable solution for this. The locking functions are not properly exported from the system runtime library. Reviewed-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 24 10月, 2016 1 次提交
-
-
由 Paolo Bonzini 提交于
This comes from free from unifying tcg_reg_alloc_mov and tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers often on moves to cc_dst, such as the following translation of "sub $0x3c,%esp": before: after: subl $0x3c,%ebp subl $0x3c,%ebp movl %ebp,0x10(%r14) movl %ebp,0x10(%r14) movl $0x3c,%ebx movl $0x3c,0x2c(%r14) movl %ebx,0x2c(%r14) Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Message-Id: <1473945360-13663-1-git-send-email-pbonzini@redhat.com> Reviewed-by: NRichard Henderson <rth@twiddle.net> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 15 9月, 2016 1 次提交
-
-
由 Thomas Huth 提交于
host-utils.h and timer.h are included twice in tcg.c. One time should be enough. Signed-off-by: NThomas Huth <thuth@redhat.com> Signed-off-by: NMichael Tokarev <mjt@tls.msk.ru>
-
- 06 8月, 2016 7 次提交
-
-
由 Richard Henderson 提交于
Rather than rely on recursion during the middle of register allocation, lower indirect registers to loads and stores off the indirect base into plain temps. For an x86_64 host, with sufficient registers, this results in identical code, modulo the actual register assignments. For an i686 host, with insufficient registers, this means that temps can be (temporarily) spilled to the stack in order to satisfy an allocation. This as opposed to the possibility of not being able to spill, to allocate a register for the indirect base, in order to perform a spill. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
We only need two bits per temporary. Fold the two bytes into one, and reduce the memory and cachelines required during compilation. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reduce the size of other bitfields to make room. This reduces the cache footprint of compilation. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Instead of using -1 as end of chain, use 0, and link through the 0 entry as a fully circular double-linked list. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
This reduces both memory usage and per-insn cacheline usage during code generation. Reviewed-by: NAurelien Jarno <aurelien@aurel32.net> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 06 7月, 2016 3 次提交
-
-
由 Sergey Sorokin 提交于
Some architectures (e.g. ARMv8) need the address which is aligned to a size more than the size of the memory access. To support such check it's enough the current costless alignment check implementation in QEMU, but we need to support an alignment size specifying. Signed-off-by: NSergey Sorokin <afarallax@yandex.ru> Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru> Signed-off-by: NRichard Henderson <rth@twiddle.net> [rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits available for, though unused by, user-mode. Retain logging difference based on ALIGNED_ONLY.]
-
由 Richard Henderson 提交于
While we can store constants via constrants on INDEX_op_st_i32 et al, we weren't able to spill constants to backing store. Add a new backend interface, tcg_out_sti, which may store the constant (and is allowed to fail). Rearrange the temp_* helpers so that we only attempt to directly store a constant when the temp is becoming dead/free. Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
由 Richard Henderson 提交于
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au> Signed-off-by: NRichard Henderson <rth@twiddle.net>
-
- 19 5月, 2016 1 次提交
-
-
由 Paolo Bonzini 提交于
exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 21 4月, 2016 2 次提交
-
-
由 Aurelien Jarno 提交于
Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-2-git-send-email-aurelien@aurel32.net Reviewed-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
由 Aurelien Jarno 提交于
The TCG code is quite performance sensitive, but at the same time can also be quite tricky. That is why asserts that can be enabled with the --enable-debug-tcg configure option. This used to work the following way: | #include "config.h" | | ... | | #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG) | /* define it to suppress various consistency checks (faster) */ | #define NDEBUG | #endif | | ... | | #include <assert.h> Since commit 757e725b (tcg: Clean up includes) "config.h" as been replaced by "qemu/osdep.h" which itself includes <assert.h>. As a consequence the assertions are always enabled, even when using --disable-debug-tcg, causing a performance regression, especially on targets with many registers. For instance on qemu-system-ppc the speed difference is about 15%. tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already uses in some places. This patch replaces all the calls to assert into calss to tcg_debug_assert. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: NAurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net Reviewed-by: NPeter Maydell <peter.maydell@linaro.org> Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
-
- 23 3月, 2016 2 次提交
-
-
由 Alex Bennée 提交于
This ensures the code generation debug code will honour -dfilter if set. For the "exec" tracing I've added a new inline macro for efficiency's sake. Signed-off-by: NAlex Bennée <alex.bennee@linaro.org> Reviewed-by: NAurelien Jarno <aurelien@aureL32.net> Reviewed-by: NRichard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Alex Bennée 提交于
My later debugging patches need access to the origin PC which is held in the TranslationBlock structure. Pass down the whole structure as it also holds the information about the code start point. Signed-off-by: NAlex Bennée <alex.bennee@linaro.org> Reviewed-by: NRichard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-3-git-send-email-alex.bennee@linaro.org> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-