- 26 Oct, 2016 1 commit
-
-
Committed by Richard Henderson
When we cannot emulate an atomic operation within a parallel context, this exception allows us to stop the world and try again in a serial context. Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 04 Oct, 2016 1 commit
-
-
Committed by Alex Bennée
ThreadSanitizer picks up potential races although we already use barriers to ensure things are in the correct order when processing exit requests. For true C11-defined behaviour across threads we need to use relaxed atomic_set/atomic_read semantics to reassure tsan. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160930213106.20186-9-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
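The relaxed accesses described in this commit can be illustrated with plain C11 atomics. This is only a sketch under assumptions: `DemoCPU`, `demo_request_exit()` and `demo_take_exit_request()` are hypothetical names, and `<stdatomic.h>` stands in for QEMU's own atomic_set/atomic_read helpers, which are not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU structure holding an exit request. */
typedef struct {
    atomic_bool exit_request;
} DemoCPU;

/* Writer side: any thread may ask the vCPU loop to leave its execution loop.
 * A relaxed store is enough to make the access well defined under C11;
 * ordering against other data is assumed to come from separate barriers. */
void demo_request_exit(DemoCPU *cpu)
{
    atomic_store_explicit(&cpu->exit_request, true, memory_order_relaxed);
}

/* Reader side: the vCPU loop polls the flag and clears it once observed. */
bool demo_take_exit_request(DemoCPU *cpu)
{
    if (atomic_load_explicit(&cpu->exit_request, memory_order_relaxed)) {
        atomic_store_explicit(&cpu->exit_request, false, memory_order_relaxed);
        return true;
    }
    return false;
}
```

The relaxed ordering only removes the data race on the flag itself. It adds no ordering guarantees, which is why the commit message still relies on the existing barriers.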
-
- 27 Sep, 2016 1 commit
-
-
Committed by Sergey Fedorov
Use async_safe_run_on_cpu() to make tb_flush() thread safe. This is possible now that code generation does not happen in the middle of execution. It can happen that multiple threads schedule safe work to flush the translation buffer. To keep statistics and debugging output sane, always check if the translation buffer has already been flushed. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> [AJB: minor re-base fixes] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <1470158864-17651-13-git-send-email-alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
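One way to picture the "has it already been flushed?" check is to have each request remember a flush counter and let the deferred worker skip stale requests. The sketch below is an assumption about the general technique, not the code from this commit; `DemoTBContext`, `demo_flush_request()` and `demo_do_flush()` are hypothetical names, and the deferred scheduling itself is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical translation-buffer bookkeeping. */
typedef struct {
    unsigned flush_count;   /* number of flushes performed so far */
    unsigned nb_tbs;        /* translated blocks currently held */
} DemoTBContext;

/* A requester records the flush count it observed when scheduling the work. */
unsigned demo_flush_request(const DemoTBContext *ctx)
{
    return ctx->flush_count;
}

/* Deferred worker, assumed to run while no vCPU is executing code.
 * Returns true if it flushed, false if another request got there first. */
bool demo_do_flush(DemoTBContext *ctx, unsigned count_at_request)
{
    if (ctx->flush_count != count_at_request) {
        return false;       /* already flushed since the request was made */
    }
    ctx->nb_tbs = 0;        /* drop all translated blocks */
    ctx->flush_count++;     /* statistics count each real flush only once */
    return true;
}
```

With two requests scheduled against the same counter value, only the first worker to run performs the flush, so the statistics are not inflated by the duplicate.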
-
- 16 Sep, 2016 1 commit
-
-
Committed by Richard Henderson
Signed-off-by: Richard Henderson <rth@twiddle.net>
-
- 14 Sep, 2016 8 commits
-
-
Committed by Sergey Fedorov
In fact, this function does not exactly perform a lookup by physical address as described in the comment on get_page_addr_code(). Thus it may be a bit confusing to have "physical" in its name. So rename it to tb_htable_lookup() to better reflect its actual functionality. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <20160715175852.30749-13-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sergey Fedorov
These functions are not too big and can be merged together. This makes the locking scheme clearer and easier to follow. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160715175852.30749-12-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sergey Fedorov
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160715175852.30749-11-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Alex Bennée
Lock contention in the hot path of moving between existing patched TranslationBlocks is the main drag in multithreaded performance. This patch pushes the tb_lock() usage down to the two places that really need it:
  - code generation (tb_gen_code)
  - jump patching (tb_add_jump)
The rest of the code doesn't really need to hold a lock as it is either using per-CPU structures, atomically updated or designed to be used in concurrent read situations (qht_lookup). To keep things simple I removed the #ifdef CONFIG_USER_ONLY stuff as the locks become NOPs anyway until the MTTCG work is completed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <20160715175852.30749-10-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Paolo Bonzini
When invalidating a translation block, set an invalid flag in the TranslationBlock structure first. It is also necessary to check whether the target TB is still valid after acquiring 'tb_lock' but before calling tb_add_jump(), since TB lookup is to be performed outside of 'tb_lock' in the future. Note that we don't have to check 'last_tb'; an already invalidated TB will not be executed anyway and it is thus safe to patch it. Suggested-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
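The "re-check validity under the lock before patching" step can be sketched as follows. Everything here is illustrative: `DemoTB`, `demo_invalidate()` and `demo_try_link()` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for 'tb_lock' so the example stays self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced translation block. */
typedef struct DemoTB {
    atomic_bool invalid;            /* set first when the TB is invalidated */
    struct DemoTB *jmp_dest[2];     /* direct-jump targets, NULL if unlinked */
} DemoTB;

/* Stand-in for 'tb_lock': a simple spinlock built on atomic_flag. */
static atomic_flag demo_tb_lock = ATOMIC_FLAG_INIT;

void demo_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_tb_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is released */
    }
}

void demo_unlock(void)
{
    atomic_flag_clear_explicit(&demo_tb_lock, memory_order_release);
}

/* Invalidation marks the TB invalid before doing anything else. */
void demo_invalidate(DemoTB *tb)
{
    demo_lock();
    atomic_store(&tb->invalid, true);
    /* ... removal from lookup structures would follow here ... */
    demo_unlock();
}

/* The lookup of next_tb happened without the lock, so it may have been
 * invalidated in the meantime: check again under the lock before linking.
 * last_tb is deliberately not checked, as the commit message explains. */
bool demo_try_link(DemoTB *last_tb, int n, DemoTB *next_tb)
{
    bool linked = false;

    demo_lock();
    if (!atomic_load(&next_tb->invalid)) {
        last_tb->jmp_dest[n] = next_tb;
        linked = true;
    }
    demo_unlock();
    return linked;
}
```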
-
Committed by Sergey Fedorov
Ensure atomicity and ordering of CPU's 'tb_flushed' access for future translation block lookup outside of 'tb_lock'. This field can only be touched from another thread by tb_flush() in user mode emulation. So the only accesses that need to be sequentially atomic are:
  * a single write in tb_flush();
  * reads/writes outside of 'tb_lock'.
In the future, before enabling MTTCG in system mode, tb_flush() must be made safe and this field becomes unnecessary.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160715175852.30749-5-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sergey Fedorov
Ensure atomicity of CPU's 'tb_jmp_cache' access for future translation block lookup outside of 'tb_lock'. Note that this patch does *not* make CPU's TLB invalidation safe if it is done from some other thread while the CPU is in its execution loop. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20160715175852.30749-4-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Sergey Fedorov
This is a small cleanup. tb_find_fast() is the final consumer of this variable, so there is no need to pass it by reference. 'last_tb' is always updated by the subsequent cpu_loop_exec_tb() in cpu_exec(). This change also simplifies calling cpu_exec_nocache() in cpu_handle_exception(). Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <20160715175852.30749-3-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 17 Jul, 2016 1 commit
-
-
Committed by Sergey Fedorov
This will fix a compiler warning with -Wclobbered: http://lists.nongnu.org/archive/html/qemu-devel/2016-07/msg03347.html Reported-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <20160715193123.28113-1-sergey.fedorov@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 12 Jun, 2016 2 commits
-
-
Committed by Emilio G. Cota
Having a fixed-size hash table for keeping track of all translation blocks is suboptimal: some workloads are just too big or too small to get maximum performance from the hash table. The MRU promotion policy helps improve performance when the hash table is a little undersized, but it cannot make up for severely undersized hash tables.

Furthermore, frequent MRU promotions result in writes that are a scalability bottleneck. For scalability, lookups should only perform reads, not writes. This is not a big deal for now, but it will become one once MTTCG matures.

The appended patch fixes these issues by using qht as the implementation of the TB hash table. This solution is superior to other alternatives considered, namely:
  - master: implementation in QEMU before this patchset
  - xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
  - xxhash-rcu: fixed buckets + xxhash + RCU list + MRU. MRU is implemented here by adding an intermediate struct that contains the u32 hash and a pointer to the TB; this allows us, on an MRU promotion, to copy said struct (that is not at the head), and put this new copy at the head. After a grace period, the original non-head struct can be eliminated, and after another grace period, freed.
  - qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize + no MRU for lookups; MRU for inserts.
The appended solution is the following:
  - qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize + no MRU for lookups; MRU for inserts.

The plots below compare the considered solutions. The Y axis shows the boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis sweeps the number of buckets (or initial number of buckets for qht-autoresize). The plots in PNG format (and with errorbars) can be seen here: http://imgur.com/a/Awgnq

Each test runs 5 times, and the entire QEMU process is pinned to a single core for repeatability of results.

[ASCII plot omitted: Host Intel Xeon E5-2690; Y axis boot time (s), 20 to 28; X axis log2 number of buckets, 14 to 24; series master, xxhash, xxhash-rcu, qht-fixed-nomru, qht-dyn-mru, qht-dyn-nomru]

[ASCII plot omitted: Host Intel i7-4790K; Y axis boot time (s), 9.5 to 14.5; X axis log2 number of buckets, 14 to 24; same series]

Note that the original point before this patch series is X=15 for "master"; the low sensitivity to the increased number of buckets is due to the poor hashing function in master.

xxhash-rcu has significant overhead due to the constant churn of allocating and deallocating intermediate structs for implementing MRU. An alternative would be to consider failed lookups as "maybe not there", and then acquire the external lock (tb_lock in this case) to really confirm that there was indeed a failed lookup. This, however, would not be enough to implement dynamic resizing--this is more complex: see "Resizable, Scalable, Concurrent Hash Tables via Relativistic Programming" by Triplett, McKenney and Walpole. This solution was discarded due to the very coarse RCU read critical sections that we have in MTTCG; resizing requires waiting for readers after every pointer update, and resizes require many pointer updates, so this would quickly become prohibitive.

qht-fixed-nomru shows that MRU promotion is advisable for undersized hash tables.

However, qht-dyn-mru shows that MRU promotion is not important if the hash table is properly sized: there is virtually no difference in performance between qht-dyn-nomru and qht-dyn-mru.

Before this patch, we're at X=15 on "xxhash"; after this patch, we're at X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we can achieve with optimum sizing of the hash table, while keeping the hash table scalable for readers.

The improvement we get before and after this patch for booting debian jessie with arm-softmmu is:
  - Intel Xeon E5-2690: 10.5% less time
  - Intel i7-4790K: 5.2% less time

We could get this same improvement _for this particular workload_ by statically increasing the size of the hash table. But this would hurt workloads that do not need a large hash table. The dynamic (upward) resizing allows us to start small and enlarge the hash table as needed.

A quick note on downsizing: the table is resized back to 2**15 buckets on every tb_flush; this makes sense because it is not guaranteed that the table will reach the same number of TBs later on (e.g. most bootup code is thrown away after boot); it makes sense to grow the hash table as more code blocks are translated. This also avoids the complication of having to build downsizing hysteresis logic into qht.

Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Emilio G. Cota
For some workloads such as arm bootup, tb_phys_hash is performance-critical. This is due to the high frequency of accesses to the hash table, originated by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's. More info: https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html

To dig further into this I modified an arm image booting debian jessie to immediately shut down after boot. Analysis revealed that quite a bit of time is unnecessarily spent in tb_phys_hash: the cause is poor hashing that results in very uneven loading of chains in the hash table's buckets; the longest observed chain had ~550 elements.

The appended patch addresses this with two changes:
  1) Use xxhash as the hash table's hash function. xxhash is a fast, high-quality hashing function.
  2) Feed the hashing function with not just tb_phys, but also pc and flags.
This improves performance over using just tb_phys for hashing, since that resulted in some hash buckets having many TBs while others got very few; with these changes, the longest observed chain on a single hash bucket is brought down from ~550 to ~40.

Tests show that the other element checked for in tb_find_physical, cs_base, is always a match when tb_phys+pc+flags are a match, so hashing cs_base is wasteful. It could be that this is an ARM-only thing, though.

UPDATE: On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.

BTW, after this change the hash table should not be called "tb_hash_phys" anymore; this is addressed later in this series.

This change gives consistent bootup time improvements. I tested two host machines:
  - Intel Xeon E5-2690: 11.6% less time
  - Intel i7-4790K: 19.2% less time

Increasing the number of hash buckets yields further improvements. However, using a larger, fixed number of buckets can degrade performance for other workloads that do not translate as many blocks (600K+ for debian-jessie arm bootup). This is dealt with later in this series.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
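The idea of feeding the physical address, pc and flags into one hash can be sketched in a few lines. Note the assumptions: this is not xxhash and not the function added by the commit. `demo_mix32()` is a generic 32-bit finalizer used purely as a stand-in mixer, and `demo_tb_hash()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in mixer (NOT xxhash): a common invertible 32-bit finalizer made of
 * xor-shifts and multiplications by odd constants. */
uint32_t demo_mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hash all three lookup keys instead of the physical address alone, so that
 * blocks sharing a physical page address can still land in different buckets. */
uint32_t demo_tb_hash(uint64_t phys_pc, uint64_t pc, uint32_t flags)
{
    uint32_t h = 1; /* arbitrary seed */

    h = demo_mix32(h ^ (uint32_t)phys_pc);
    h = demo_mix32(h ^ (uint32_t)(phys_pc >> 32));
    h = demo_mix32(h ^ (uint32_t)pc);
    h = demo_mix32(h ^ (uint32_t)(pc >> 32));
    h = demo_mix32(h ^ flags);
    return h;
}
```

A bucket index would then typically be taken as `demo_tb_hash(...) & (n_buckets - 1)` for a power-of-two table size. With phys_pc alone as the key, every TB that differs only in pc or flags would collide by construction.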
-
- 26 May, 2016 1 commit
-
-
Committed by Sergey Fedorov
It is not safe to make a direct jump to a TB spanning two pages in system emulation because the mapping for the second page can get changed but we don't take care of direct jumps in this case. However, in user mode emulation this is not the case, because there's only static address translation and TBs are always invalidated properly. Fixes: 5b053a4a ("tcg: Clean up direct block chaining safety checks") Reported-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Message-id: 1463404380-29302-1-git-send-email-sergey.fedorov@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
- 19 May, 2016 1 commit
-
-
Committed by Paolo Bonzini
exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 13 May, 2016 15 commits
-
-
Committed by Sergey Fedorov
Suggested-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1463071937-26607-1-git-send-email-sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1462962111-32237-6-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Simplify cpu_exec() by extracting TB execution code out of cpu_exec() into a new static inline function cpu_loop_exec_tb(). Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1462962111-32237-5-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Simplify cpu_exec() by extracting interrupt handling code out of cpu_exec() into a new static inline function cpu_handle_interrupt(). Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1462962111-32237-4-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Simplify cpu_exec() by extracting exception handling code out of cpu_exec() into a new static inline function cpu_handle_exception(). Also make cpu_handle_debug_exception() inline as it is used only once. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1462962111-32237-3-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Simplify cpu_exec() by extracting CPU halt state handling code out of cpu_exec() into a new static inline function cpu_handle_halt(). Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1462962111-32237-2-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
This comment should have been deleted by commit 0ac087f1 ("removed unused code") but somehow it is still here. There's no point in keeping it. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1462286050-21778-1-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
This field was used for telling cpu_interrupt() to unlink a chain of TBs being executed, back when it worked that way. Now cpu_interrupt() doesn't do this anymore, so we don't need this field. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Message-Id: <1462273462-14036-1-git-send-email-sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
Move the tb_add_jump() call and surrounding code from cpu_exec() into tb_find_fast(). That simplifies cpu_exec() a little by hiding the direct chaining optimization details in tb_find_fast(). It also allows moving the tb_lock()/tb_unlock() pair into tb_find_fast(), putting it closer to tb_find_slow(), which also manipulates the lock. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net> [rth: Fixed rebase typo in nochain test.]
-
Committed by Sergey Fedorov
'tb_invalidated_flag' was meant to catch two events:
  * some TB has been invalidated by tb_phys_invalidate();
  * the whole translation buffer has been flushed by tb_flush().
Then it was checked:
  * in cpu_exec() to ensure that the last executed TB can be safely linked to directly call the next one;
  * in cpu_exec_nocache() to decide if the original TB should be provided for further possible invalidation along with the temporarily generated TB.

It is always safe to patch an invalidated TB since it is not going to be used anyway. It is also safe to call tb_phys_invalidate() for an already invalidated TB. Thus, setting this flag in tb_phys_invalidate() is simply unnecessary. Moreover, it can prevent perfectly proper linking of TBs if any arbitrary TB has been invalidated. So just don't touch it in tb_phys_invalidate().

Since this flag is then only used to catch whether tb_flush() has been called, rename it to 'tb_flushed'. Declare it as 'bool' and stick to using only 'true' and 'false' to set its value. Also, instead of setting it in tb_gen_code() just after tb_flush() has been called, do it right inside tb_flush().

In cpu_exec(), this flag is used to track whether tb_flush() has been called and has made 'next_tb' (a reference to the last executed TB) invalid for linking it to directly call the next TB. tb_flush() can be called during the CPU execution loop from tb_gen_code(), during TB execution, or by another thread while 'tb_lock' is released. Catch a translation buffer flush reliably by resetting this flag once before the first TB lookup and each time we find it set before trying to add a direct jump. Don't touch it in tb_find_physical().

Each vCPU has its own execution loop in multithreaded mode and thus should have its own copy of the flag, so that it can reset it together with its own 'next_tb' without affecting any other vCPU execution thread. So make this flag per-vCPU and move it to CPUState.

In cpu_exec_nocache(), we only need to check if tb_flush() has been called from tb_gen_code() called by cpu_exec_nocache() itself. To do this reliably, preserve the old value of the flag, reset it before calling tb_gen_code(), check afterwards, and combine the saved value back into the flag.

This patch is based on the patch "tcg: move tb_invalidated_flag to CPUState" from Paolo Bonzini <pbonzini@redhat.com>.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
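The preserve, reset, check and combine sequence described for cpu_exec_nocache() can be written out as a small sketch. The names are hypothetical (`DemoCPUState`, `demo_gen_code()`, `demo_gen_and_detect_flush()`), and the code generator is reduced to a stub that optionally sets the flag, as a flush during generation would.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state carrying only the flag of interest. */
typedef struct {
    bool tb_flushed;
} DemoCPUState;

/* Stub for code generation: when asked, it behaves as if the translation
 * buffer had to be flushed, which sets the per-vCPU flag. */
void demo_gen_code(DemoCPUState *cpu, bool causes_flush)
{
    if (causes_flush) {
        cpu->tb_flushed = true;
    }
}

/* Returns true only if a flush happened during this call's code generation,
 * while leaving any earlier, still unhandled flush indication intact. */
bool demo_gen_and_detect_flush(DemoCPUState *cpu, bool causes_flush)
{
    bool old_flushed = cpu->tb_flushed;     /* preserve the old value */
    bool flushed_here;

    cpu->tb_flushed = false;                /* reset before generating */
    demo_gen_code(cpu, causes_flush);
    flushed_here = cpu->tb_flushed;         /* check afterwards */
    cpu->tb_flushed |= old_flushed;         /* combine the saved value back */
    return flushed_here;
}
```

The combine step matters because the outer execution loop still needs to see a flush that happened before this call, even when the call itself did not trigger one.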
-
Committed by Sergey Fedorov
The value returned from tcg_qemu_tb_exec() is the value passed to the corresponding tcg_gen_exit_tb() at translation time of the last TB attempted to execute. It is a little confusing to store it in a variable named 'next_tb'. In fact, it is a combination of a 4-byte aligned pointer and additional information in its two least significant bits. Break it down right away into two variables named 'last_tb' and 'tb_exit', which are a pointer to the last TB attempted to execute and the TB exit reason, respectively. This simplifies the code and improves its readability. Correct a misleading documentation comment for tcg_qemu_tb_exec() and fix logging in cpu_tb_exec(). Also rename a misleading 'next_tb' in a couple of other places. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
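Splitting such a combined value into a pointer and a two-bit exit reason looks roughly like this. The sketch uses hypothetical names (`PackedTB`, `DEMO_TB_EXIT_MASK`, `demo_pack()`, `demo_last_tb()`, `demo_tb_exit()`) and assumes, as the commit message states, that the pointer is 4-byte aligned so its two low bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* The two least significant bits carry the exit reason. */
#define DEMO_TB_EXIT_MASK ((uintptr_t)3)

/* Hypothetical block type; the uint32_t member gives it 4-byte alignment
 * on common platforms, which the packing below relies on. */
typedef struct PackedTB {
    uint32_t dummy;
} PackedTB;

/* Translation-time side: combine the pointer with an exit reason (0..3). */
uintptr_t demo_pack(PackedTB *tb, unsigned reason)
{
    assert(((uintptr_t)tb & DEMO_TB_EXIT_MASK) == 0); /* low bits must be free */
    return (uintptr_t)tb | ((uintptr_t)reason & DEMO_TB_EXIT_MASK);
}

/* Execution-loop side: recover the pointer to the last TB... */
PackedTB *demo_last_tb(uintptr_t ret)
{
    return (PackedTB *)(ret & ~DEMO_TB_EXIT_MASK);
}

/* ...and the exit reason, as two separate values. */
unsigned demo_tb_exit(uintptr_t ret)
{
    return (unsigned)(ret & DEMO_TB_EXIT_MASK);
}
```

Doing the split once, right after the call returns, means later code never has to remember that the raw value is not a plain pointer.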
-
Committed by Paolo Bonzini
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [Alex Bennée: #ifndef replay code to match elided functions] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Alex Bennée
Add some comments and improve the code structure. This should make the code easier to read. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [Sergey Fedorov: provide commit message; bring back resetting of tb_invalidated_flag] Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Sergey Fedorov
We don't take care of direct jumps when address mapping changes. Thus we must be sure to generate direct jumps so that they always remain valid even if address mapping changes. Luckily, we only allow a TB to be executed if it was generated from pages that match the current mapping. Document the tcg_gen_goto_tb() declaration and note the reason for the destination PC limitations. Some targets with variable length instructions allow a TB to straddle a page boundary. However, we make sure that both of the TB's pages match the current address mapping when looking up TBs. So it is safe to do direct jumps into both pages. Correct the checks for some of those targets. Given that, we can safely patch a TB which spans two pages. Remove the unnecessary check in cpu_exec() and allow such TBs to be patched. Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com> Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-
Committed by Emilio G. Cota
We are inconsistent with the type of tb->flags: usage varies loosely between int and uint64_t. Settle on uint32_t everywhere, which is superior to both: at least one target (aarch64) uses the most significant bit in the u32, and uint64_t is wasteful. Compile-tested for all targets. Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Suggested-by: Richard Henderson <rth@twiddle.net> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net> Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
-
- 23 Mar, 2016 2 commits
-
-
Committed by Alex Bennée
This ensures the code generation debug code will honour -dfilter if set. For the "exec" tracing I've added a new inline macro for efficiency's sake. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Committed by Peter Maydell
Improve the TB execution logging so that it is easier to identify what is happening from trace logs:
  * move the "Trace" logging of executed TBs into cpu_tb_exec() so that it is emitted if and only if we actually execute a TB, and for consistency for the CPU state logging
  * log when we link two TBs together via tb_add_jump()
  * log when cpu_tb_exec() returns early from a chain of TBs

The new style logging looks like this:

  Trace 0x7fb7cc822ca0 [ffffffc0000dce00]
  Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10]
  Trace 0x7fb7cc823110 [ffffffc0000dce10]
  Trace 0x7fb7cc823420 [ffffffc000302688]
  Trace 0x7fb7cc8234a0 [ffffffc000302698]
  Trace 0x7fb7cc823520 [ffffffc0003026a4]
  Trace 0x7fb7cc823560 [ffffffc0000dce44]
  Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70]
  Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
  Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70]
  Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
  Trace 0x7fb7cc822fd0 [ffffffc0000dd52c]

Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [AJB: reword patch title, Abandoned->Stopped] Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-Id: <1458052224-9316-6-git-send-email-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 Feb, 2016 1 commit
-
-
Committed by Paolo Bonzini
Split the bits that require it to exec/log.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-id: 1452174932-28657-8-git-send-email-den@openvz.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-
- 29 Jan, 2016 1 commit
-
-
Committed by Peter Maydell
Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-4-git-send-email-peter.maydell@linaro.org
-
- 06 Nov, 2015 1 commit
-
-
Committed by Pavel Dovgalyuk
This patch includes modifications of common cpu files. All interrupts and exceptions that occurred during recording are written into the replay log. These events allow the execution to be replayed correctly by kicking the CPU thread when one of these events is found in the log. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150917162416.8676.57647.stgit@PASHA-ISP.def.inno> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Nov, 2015 1 commit
-
-
Committed by Pavel Dovgalyuk
This patch is required for deterministic replay to generate an exception by trying to execute an instruction without changing icount. It adds a new flag to TB for disabling icount while translating it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Message-Id: <20150917162359.8676.77011.stgit@PASHA-ISP.def.inno> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 Nov, 2015 1 commit
-
-
Committed by Stefan Weil
Reloading of local variables after sigsetjmp is only needed for some buggy compilers. The code which should reload these variables causes compiler warnings with gcc 4.7 when compiler optimizations are enabled:

  cpu-exec.c:204:15: error: variable ‘cpu’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]
  cpu-exec.c:207:15: error: variable ‘cc’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]
  cpu-exec.c:202:28: error: argument ‘env’ might be clobbered by ‘longjmp’ or ‘vfork’ [-Werror=clobbered]

Now this code is only used for compilers which need it (and gcc 4.5.x, x > 0 which does not need it but won't give warnings). There were bug reports for clang and gcc 4.5.0, while gcc 4.5.1 was reported to work fine without the reload code. For clang it is not clear which versions are affected, so simply keep the status quo for all clang compilations. This can be improved later.

Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <1443266606-21400-1-git-send-email-sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 20 Oct, 2015 1 commit
-
-
Committed by Richard Henderson
Respect it to avoid linking TBs together. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
-