提交 · 3fb53fb4d12f2e7833bd1659e6013237b130ef20 · openeuler / qemu

20 6月, 2017 1 次提交
- R
  tcg/arm: Use indirect branch for goto_tb · 3fb53fb4
  由 Richard Henderson 提交于 6月 05, 2017
```
Signed-off-by: NRichard Henderson <rth@twiddle.net>
```
  3fb53fb4
06 6月, 2017 1 次提交

tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr · cedbcb01

由 Emilio G. Cota 提交于 4月 26, 2017

Instead of exporting goto_ptr directly to TCG frontends, export
tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
returned by the lookup_tb_ptr() helper. This is the only use case
we have for goto_ptr and lookup_tb_ptr, so having this function is
very convenient. Furthermore, it trivially allows us to avoid calling
the lookup helper if goto_ptr is not implemented by the backend.
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org>
Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org>
[rth: Squashed 4 related commits.]
Signed-off-by: NRichard Henderson <rth@twiddle.net>

cedbcb01

24 2月, 2017 5 次提交

cputlb: introduce tlb_flush_*_all_cpus[_synced] · c3b9a07a

由 Alex Bennée 提交于 2月 23, 2017

This introduces support to the cputlb API for flushing all CPUs TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) will cause the source vCPUs
work to be scheduled as "safe work". The result will be all the flush
operations will be complete by the time the originating vCPU executes
its safe work. The calling implementation can either end the TB
straight away (which will then pick up the cpu->exit_request on
entering the next block) or defer the exit until the architectural
sync point (usually a barrier instruction).
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>

c3b9a07a

cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap · 0336cbf8

由 Alex Bennée 提交于 2月 23, 2017

While the vargs approach was flexible the original MTTCG ended up
having munge the bits to a bitmap so the data could be used in
deferred work helpers. Instead of hiding that in cputlb we push the
change to the API to make it take a bitmap of MMU indexes instead.

For ARM some the resulting flushes end up being quite long so to aid
readability I've tended to move the index shifting to a new line so
all the bits being or-ed together line up nicely, for example:

    tlb_flush_page_by_mmuidx(other_cs, pageaddr,
                             (1 << ARMMMUIdx_S1SE1) |
                             (1 << ARMMMUIdx_S1SE0));
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
[AT: SPARC parts only]
Reviewed-by: NArtyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
[PM: ARM parts only]
Reviewed-by: NPeter Maydell <peter.maydell@linaro.org>

0336cbf8

cputlb: introduce tlb_flush_* async work. · e3b9ca81

由 KONRAD Frederic 提交于 2月 23, 2017

Some architectures allow to flush the tlb of other VCPUs. This is not a problem
when we have only one thread for all VCPUs but it definitely needs to be an
asynchronous work when we are in true multithreaded work.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TB's at the same time. The alternative would
be to use proper atomic primitives to clear the tlb entries en-mass.

This patch doesn't do anything to protect other cputlb function being
called in MTTCG mode making cross vCPU changes.
Signed-off-by: NKONRAD Frederic <fred.konrad@greensocs.com>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>

e3b9ca81

tcg: remove global exit_request · e5143e30

由 Alex Bennée 提交于 2月 23, 2017

There are now only two uses of the global exit_request left.

The first ensures we exit the run_loop when we first start to process
pending work and in the kick handler. This is just as easily done by
setting the first_cpu->exit_request flag.

The second use is in the round robin kick routine. The global
exit_request ensured every vCPU would set its local exit_request and
cause a full exit of the loop. Now the iothread isn't being held while
running we can just rely on the kick handler to push us out as intended.

We lightly re-factor the main vCPU thread to ensure cpu->exit_requests
cause us to exit the main loop and process any IO requests that might
come along. As an cpu->exit_request may legitimately get squashed
while processing the EXCP_INTERRUPT exception we also check
cpu->queued_work_first to ensure queued work is expedited as soon as
possible.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>

e5143e30

tcg: rename tcg_current_cpu to tcg_current_rr_cpu · 791158d9

由 Alex Bennée 提交于 2月 23, 2017

..and make the definition local to cpus. In preparation for MTTCG the
concept of a global tcg_current_cpu will no longer make sense. However
we still need to keep track of it in the single-threaded case to be able
to exit quickly when required.

qemu_cpu_kick_no_halt() moves and becomes qemu_cpu_kick_rr_cpu() to
emphasise its use-case. qemu_cpu_kick now kicks the relevant cpu as
well as qemu_kick_rr_cpu() which will become a no-op in MTTCG.

For the time being the setting of the global exit_request remains.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NPranith Kumar <bobby.prani@gmail.com>

791158d9

16 2月, 2017 1 次提交

cpu-exec: fix icount out-of-bounds access · 43d70ddf

由 Paolo Bonzini 提交于 1月 29, 2017

When icount is active, tb_add_jump is surprisingly called with an
out of bounds basic block index.  I have no idea how that can work,
but it does not seem like a good idea.  Clear *last_tb for all
TB_EXIT_ICOUNT_EXPIRED cases, even when all you have to do is
refill icount_extra.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

43d70ddf

13 1月, 2017 1 次提交

cputlb: drop flush_global flag from tlb_flush · d10eb08f

由 Alex Bennée 提交于 11月 14, 2016

We have never has the concept of global TLB entries which would avoid
the flush so we never actually use this flag. Drop it and make clear
that tlb_flush is the sledge-hammer it has always been.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
[DG: ppc portions]
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>

d10eb08f

31 10月, 2016 2 次提交

tcg: comment on which functions have to be called with tb_lock held · 7d7500d9

由 Paolo Bonzini 提交于 10月 27, 2016

softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks.  Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.

This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.

Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock.  The next one will rectify this.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

7d7500d9

translate-all: add DEBUG_LOCKING asserts · 301e40ed

由 Alex Bennée 提交于 10月 27, 2016

This adds asserts to check the locking on the various translation
engines structures. There are two sets of structures that are protected
by locks.

The first the l1map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.

The second case are TB context related structures which are protected by
tb_lock which is also user-mode only.

Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

301e40ed

26 10月, 2016 1 次提交

tcg: Add EXCP_ATOMIC · fdbc2b57

由 Richard Henderson 提交于 6月 29, 2016

When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.
Reviewed-by: NEmilio G. Cota <cota@braap.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

fdbc2b57

25 10月, 2016 1 次提交

exec: move cpu_exec_init() calls to realize functions · ce5b1bbf

由 Laurent Vivier 提交于 10月 20, 2016

Modify all CPUs to call it from XXX_cpu_realizefn() function.

Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)

for arm:

Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid to overwrite an user defined
value, we set it to an invalid value by default, and update
it in realize function only if the value is still invalid.
Signed-off-by: NLaurent Vivier <lvivier@redhat.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Reviewed-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Reviewed-by: NAndrew Jones <drjones@redhat.com>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

ce5b1bbf

27 9月, 2016 1 次提交

cpus-common: move CPU list management to common code · 267f685b

由 Paolo Bonzini 提交于 8月 28, 2016

Add a mutex for the CPU list to system emulation, as it will be used to
manage safe work.  Abstract manipulation of the CPU list in new functions
cpu_list_add and cpu_list_remove.
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

267f685b

16 9月, 2016 1 次提交

tcg: Merge GETPC and GETRA · 01ecaf43

由 Richard Henderson 提交于 7月 26, 2016

The return address argument to the softmmu template helpers was
confused.  In the legacy case, we wanted to indicate that there
is no return address, and so passed in NULL.  However, we then
immediately subtracted GETPC_ADJ from NULL, resulting in a non-zero
value, indicating the presence of an (invalid) return address.

Push the GETPC_ADJ subtraction down to the only point it's required:
immediately before use within cpu_restore_state_from_tb, after all
NULL pointer checks have been completed.

This makes GETPC and GETRA identical.  Remove GETRA as the lesser
used macro, replacing all uses with GETPC.
Signed-off-by: NRichard Henderson <rth@twiddle.net>

01ecaf43

14 9月, 2016 1 次提交

tcg: Prepare TB invalidation for lockless TB lookup · 6d21e420

由 Paolo Bonzini 提交于 7月 19, 2016

When invalidating a translation block, set an invalid flag into the
TranslationBlock structure first. It is also necessary to check whether
the target TB is still valid after acquiring 'tb_lock' but before calling
tb_add_jump() since TB lookup is to be performed out of 'tb_lock' in
future. Note that we don't have to check 'last_tb'; an already invalidated
TB will not be executed anyway and it is thus safe to patch it.
Suggested-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6d21e420

27 7月, 2016 1 次提交

exec: Reduce CONFIG_USER_ONLY ifdeffenery · 1bc7e522

由 Igor Mammedov 提交于 7月 25, 2016

Signed-off-by: NIgor Mammedov <imammedo@redhat.com>
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NEduardo Habkost <ehabkost@redhat.com>

1bc7e522

12 7月, 2016 2 次提交

Clean up ill-advised or unusual header guards · 2a6a4076

由 Markus Armbruster 提交于 6月 29, 2016

Cleaned up with scripts/clean-header-guards.pl.
Signed-off-by: NMarkus Armbruster <armbru@redhat.com>
Reviewed-by: NRichard Henderson <rth@twiddle.net>

2a6a4076

Fix confusing argument names in some common functions · b35399bb

由 Sergey Sorokin 提交于 6月 14, 2016

There are functions tlb_fill(), cpu_unaligned_access() and
do_unaligned_access() that are called with access type and mmu index
arguments. But these arguments are named 'is_write' and 'is_user' in their
declarations. The patches fix the arguments to avoid a confusion.
Signed-off-by: NSergey Sorokin <afarallax@yandex.ru>
Reviewed-by: NEduardo Habkost <ehabkost@redhat.com>
Acked-by: NDavid Gibson <david@gibson.dropbear.id.au>
Message-id: 1465907177-1399402-1-git-send-email-afarallax@yandex.ru
Reviewed-by: NPeter Maydell <peter.maydell@linaro.org>
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>

b35399bb

12 6月, 2016 1 次提交

tb hash: track translated blocks with qht · 909eaac9

由 Emilio G. Cota 提交于 6月 08, 2016

Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.

Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.

The appended fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:

- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
              MRU is implemented here by adding an intermediate struct
              that contains the u32 hash and a pointer to the TB; this
              allows us, on an MRU promotion, to copy said struct (that is not
              at the head), and put this new copy at the head. After a grace
              period, the original non-head struct can be eliminated, and
              after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
                   no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
                 no MRU for lookups; MRU for inserts.

The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
  http://imgur.com/a/Awgnq

Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.

                            Host: Intel Xeon E5-2690

  28 ++------------+-------------+-------------+-------------+------------++
     A*****        +             +             +             master **A*** +
  27 ++    *                                                 xxhash ##B###++
     |      A******A******                               xxhash-rcu $$C$$$ |
  26 C$$                  A******A******            qht-fixed-nomru*%%D%%%++
     D%%$$                              A******A******A*qht-dyn-mru A*E****A
  25 ++ %%$$                                          qht-dyn-nomru &&F&&&++
     B#####%                                                               |
  24 ++    #C$$$$$                                                        ++
     |      B###  $                                                        |
     |          ## C$$$$$$                                                 |
  23 ++           #       C$$$$$$                                         ++
     |             B######       C$$$$$$                                %%%D
  22 ++                  %B######       C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C
     |                    D%%%%%%B######      @E@@@@@@    %%%D%%%@@@E@@@@@@E
  21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B
     +             E@@@   F&&&   +      E@     +      F&&&   +             +
  20 ++------------+-------------+-------------+-------------+------------++
     14            16            18            20            22            24
                             log2 number of buckets

                                 Host: Intel i7-4790K

  14.5 ++------------+------------+-------------+------------+------------++
       A**           +            +             +            master **A*** +
    14 ++ **                                                 xxhash ##B###++
  13.5 ++   **                                           xxhash-rcu $$C$$$++
       |                                            qht-fixed-nomru %%D%%% |
    13 ++     A******                                   qht-dyn-mru @@E@@@++
       |             A*****A******A******             qht-dyn-nomru &&F&&& |
  12.5 C$$                               A******A******A*****A******    ***A
    12 ++ $$                                                        A***  ++
       D%%% $$                                                             |
  11.5 ++  %%                                                             ++
       B###  %C$$$$$$                                                      |
    11 ++  ## D%%%%% C$$$$$                                               ++
       |     #      %      C$$$$$$                                         |
  10.5 F&&&&&&B######D%%%%%       C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$    $$$C
    10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B
       +             F&&          D%%%%%%B######B######B#####B###@@@D%%%   +
   9.5 ++------------+------------+-------------+------------+------------++
       14            16           18            20           22            24
                              log2 number of buckets

Note that the original point before this patch series is X=15 for "master";
the little sensitivity to the increased number of buckets is due to the
poor hashing function in master.

xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be do consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.

qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.

However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.

Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.

The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:

- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time

We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.

A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.
Reviewed-by: NSergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

909eaac9

09 6月, 2016 1 次提交

cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc() · 6886b980

由 Peter Maydell 提交于 5月 17, 2016

The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Reviewed-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: NEduardo Habkost <ehabkost@redhat.com>
Acked-by: NRiku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org

6886b980

19 5月, 2016 2 次提交

cpu: move exec-all.h inclusion out of cpu.h · 63c91552

由 Paolo Bonzini 提交于 3月 15, 2016

exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

63c91552

exec: extract exec/tb-context.h · 00f6da6a

由 Paolo Bonzini 提交于 3月 15, 2016

TCG backends do not need most of exec-all.h; extract what they actually
need to a separate file or move it directly to tcg.h.  The next patch
will stop including exec-all.h from everywhere.
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

00f6da6a

13 5月, 2016 10 次提交

tcg: Rework tb_invalidated_flag · 6f789be5

由 Sergey Fedorov 提交于 4月 13, 2016

'tb_invalidated_flag' was meant to catch two events:
 * some TB has been invalidated by tb_phys_invalidate();
 * the whole translation buffer has been flushed by tb_flush().

Then it was checked:
 * in cpu_exec() to ensure that the last executed TB can be safely
   linked to directly call the next one;
 * in cpu_exec_nocache() to decide if the original TB should be provided
   for further possible invalidation along with the temporarily
   generated TB.

It is always safe to patch an invalidated TB since it is not going to be
used anyway. It is also safe to call tb_phys_invalidate() for an already
invalidated TB. Thus, setting this flag in tb_phys_invalidate() is
simply unnecessary. Moreover, it can prevent from pretty proper linking
of TBs, if any arbitrary TB has been invalidated. So just don't touch it
in tb_phys_invalidate().

If this flag is only used to catch whether tb_flush() has been called
then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using
only 'true' and 'false' to set its value. Also, instead of setting it in
tb_gen_code(), just after tb_flush() has been called, do it right inside
of tb_flush().

In cpu_exec(), this flag is used to track if tb_flush() has been called
and have made 'next_tb' (a reference to the last executed TB) invalid
for linking it to directly call the next TB. tb_flush() can be called
during the CPU execution loop from tb_gen_code(), during TB execution or
by another thread while 'tb_lock' is released. Catch for translation
buffer flush reliably by resetting this flag once before first TB lookup
and each time we find it set before trying to add a direct jump. Don't
touch in in tb_find_physical().

Each vCPU has its own execution loop in multithreaded mode and thus
should have its own copy of the flag to be able to reset it with its own
'next_tb' and don't affect any other vCPU execution thread. So make this
flag per-vCPU and move it to CPUState.

In cpu_exec_nocache(), we only need to check if tb_flush() has been
called from tb_gen_code() called by cpu_exec_nocache() itself. To do
this reliably, preserve the old value of the flag, reset it before
calling tb_gen_code(), check afterwards, and combine the saved value
back to the flag.

This patch is based on the patch "tcg: move tb_invalidated_flag to
CPUState" from Paolo Bonzini <pbonzini@redhat.com>.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

6f789be5

tcg: Clarify thread safety check in tb_add_jump() · 9962c478

由 Sergey Fedorov 提交于 4月 20, 2016

The check is to make sure that another thread hasn't already done the
same while we were outside of tb_lock. Mention this in a comment.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

9962c478

tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB · c37e6d7e

由 Sergey Fedorov 提交于 3月 21, 2016

These fields do not contain pure pointers to a TranslationBlock
structure. So uintptr_t is the most appropriate type for them.
Also put some asserts to assure that the two least significant bits of
the pointer are always zero before assigning it to jmp_list_first.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

c37e6d7e

tcg: Clean up direct block chaining data fields · f309101c

由 Sergey Fedorov 提交于 4月 10, 2016

Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.

Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
   tb_next_offset  =>  jmp_reset_offset
   tb_jmp_offset   =>  jmp_insn_offset
   tb_next         =>  jmp_target_addr
   jmp_next        =>  jmp_list_next
   jmp_first       =>  jmp_list_first

Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

f309101c

tcg: Note requirement on atomic direct jump patching · 10b4f485

由 Sergey Fedorov 提交于 4月 22, 2016

Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

10b4f485

tcg/arm: Make direct jump patching thread-safe · 7d14e0e2

由 Sergey Fedorov 提交于 4月 22, 2016

Ensure direct jump patching in ARM is atomic by using
atomic_read()/atomic_set() for code patching.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

7d14e0e2

tcg/s390: Make direct jump patching thread-safe · ed3d51ec

由 Sergey Fedorov 提交于 4月 22, 2016

Ensure direct jump patching in s390 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

ed3d51ec

tcg/i386: Make direct jump patching thread-safe · 0d07abf0

由 Sergey Fedorov 提交于 4月 22, 2016

Ensure direct jump patching in i386 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

tcg_out_nopn() implementation:
Suggested-by: Richard Henderson <rth@twiddle.net>.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

0d07abf0

tci: Make direct jump patching thread-safe · 76442a93

由 Sergey Fedorov 提交于 4月 22, 2016

Ensure direct jump patching in TCI is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() to load/store the address.
Signed-off-by: NSergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: NSergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>

76442a93

tb: consistently use uint32_t for tb->flags · 89fee74a

由 Emilio G. Cota 提交于 4月 07, 2016

We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.

Compile-tested for all targets.
Suggested-by: NLaurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: NRichard Henderson <rth@twiddle.net>
Tested-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: NLaurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: NEmilio G. Cota <cota@braap.org>
Signed-off-by: NRichard Henderson <rth@twiddle.net>
Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>

89fee74a

23 3月, 2016 2 次提交

qemu-log: dfilter-ise exec, out_asm, op and opt_op · d977e1c2

由 Alex Bennée 提交于 3月 15, 2016

This ensures the code generation debug code will honour -dfilter if set.
For the "exec" tracing I've added a new inline macro for efficiency's
sake.
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
Reviewed-by: NAurelien Jarno <aurelien@aureL32.net>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Message-Id: <1458052224-9316-8-git-send-email-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

d977e1c2

qemu-log: Improve the "exec" TB execution logging · 1a830635

由 Peter Maydell 提交于 3月 15, 2016

Improve the TB execution logging so that it is easier to identify
what is happening from trace logs:
 * move the "Trace" logging of executed TBs into cpu_tb_exec()
   so that it is emitted if and only if we actually execute a TB,
   and for consistency for the CPU state logging
 * log when we link two TBs together via tb_add_jump()
 * log when cpu_tb_exec() returns early from a chain of TBs

The new style logging looks like this:

Trace 0x7fb7cc822ca0 [ffffffc0000dce00]
Linking TBs 0x7fb7cc822ca0 [ffffffc0000dce00] index 0 -> 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823110 [ffffffc0000dce10]
Trace 0x7fb7cc823420 [ffffffc000302688]
Trace 0x7fb7cc8234a0 [ffffffc000302698]
Trace 0x7fb7cc823520 [ffffffc0003026a4]
Trace 0x7fb7cc823560 [ffffffc0000dce44]
Linking TBs 0x7fb7cc823560 [ffffffc0000dce44] index 1 -> 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Stopped execution of TB chain before 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc8235d0 [ffffffc0000dce70]
Trace 0x7fb7cc822fd0 [ffffffc0000dd52c]
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Signed-off-by: NAlex Bennée <alex.bennee@linaro.org>
[AJB: reword patch title, Abandoned->Stopped]
Reviewed-by: NAurelien Jarno <aurelien@aurel32.net>
Reviewed-by: NRichard Henderson <rth@twiddle.net>
Message-Id: <1458052224-9316-6-git-send-email-alex.bennee@linaro.org>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

1a830635

21 1月, 2016 5 次提交

exec.c: Add cpu_get_address_space() · 651a5bc0

由 Peter Maydell 提交于 1月 21, 2016

Add a function to return the AddressSpace for a CPU based on
its numerical index. (Callers outside exec.c don't have access
to the CPUAddressSpace struct so can't just fish it out of the
CPUState struct directly.)
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Acked-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>

651a5bc0

exec.c: Pass MemTxAttrs to iotlb_to_region so it uses the right AS · a54c87b6

由 Peter Maydell 提交于 1月 21, 2016

Pass the MemTxAttrs for the memory access to iotlb_to_region(); this
allows it to determine the correct AddressSpace to use for the lookup.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Acked-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>

a54c87b6

cputlb.c: Use correct address space when looking up MemoryRegionSection · d7898cda

由 Peter Maydell 提交于 1月 21, 2016

When looking up the MemoryRegionSection for the new TLB entry in
tlb_set_page_with_attrs(), use cpu_asidx_from_attrs() to determine
the correct address space index for the lookup, and pass it into
address_space_translate_for_iotlb().
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Acked-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>

d7898cda

exec-all.h: Document tlb_set_page_with_attrs, tlb_set_page · 1787cc8e

由 Peter Maydell 提交于 1月 21, 2016

Add documentation comments for tlb_set_page_with_attrs()
and tlb_set_page().
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Acked-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>

1787cc8e

exec.c: Allow target CPUs to define multiple AddressSpaces · 12ebc9a7

由 Peter Maydell 提交于 1月 21, 2016

Allow multiple calls to cpu_address_space_init(); each
call adds an entry to the cpu->ases array at the specified
index. It is up to the target-specific CPU code to actually use
these extra address spaces.

Since this multiple AddressSpace support won't work with
KVM, add an assertion to avoid confusing failures.
Signed-off-by: NPeter Maydell <peter.maydell@linaro.org>
Acked-by: NEdgar E. Iglesias <edgar.iglesias@xilinx.com>

12ebc9a7