- 16 June 2018, 14 commits
-
-
Committed by Emilio G. Cota

This applies to both user-mode and !user-mode emulation.

Instead of relying on a global lock, protect the list of incoming jumps with tb->jmp_lock. This lock also protects tb->cflags, so update all tb->cflags readers outside tb->jmp_lock to use atomic reads via tb_cflags().

In order to find the destination TB (and therefore its jmp_lock) from the origin TB, we introduce tb->jmp_dest[].

I considered not using a linked list of jumps, which simplifies code and makes the struct smaller. However, it unnecessarily increases memory usage, which results in a performance decrease. See for instance these numbers booting+shutting down debian-arm:

                         Time (s)  Rel. err (%)  Abs. err (s)  Rel. slowdown (%)
    ------------------------------------------------------------------------------
    before               20.88     0.74          0.154512       0.
    after                20.81     0.38          0.079078      -0.33524904
    GTree                21.02     0.28          0.058856       0.67049808
    GHashTable + xxhash  21.63     1.08          0.233604       3.5919540

Using a hash table or a binary tree to keep track of the jumps doesn't really pay off, not only due to the increased memory usage, but also because most TBs have only 0 or 1 jumps to them. The maximum number of jumps when booting debian-arm that I measured is 35, but as we can see in the histogram below a TB with that many incoming jumps is extremely rare; the average TB has 0.80 incoming jumps.

    n_jumps: 379208; avg jumps/tb: 0.801099
    dist: [0.0,1.0)|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁|[34.0,35.0]

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
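The tb_cflags() accessor mentioned above is essentially a one-liner; a minimal sketch, assuming QEMU's atomic_read() wrapper and the tb->cflags field named in the message:

```c
/* Read tb->cflags outside tb->jmp_lock: the field can be written
 * concurrently, so the read must be atomic. */
static inline uint32_t tb_cflags(const TranslationBlock *tb)
{
    return atomic_read(&tb->cflags);
}
```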
-
Committed by Emilio G. Cota

Use the recently-gained QHT feature of returning the matching TB if it already exists. This allows us to get rid of the lookup we perform right after acquiring tb_lock.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
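With that feature, the insert-or-reuse pattern collapses into one call. A sketch, assuming qht_insert() reports a pre-existing match through an out-parameter and that the TB hash table lives in tb_ctx.htable:

```c
void *existing;

/* Try to publish the freshly generated TB; if an equivalent TB was
 * inserted concurrently, drop ours and reuse the winner's. */
if (!qht_insert(&tb_ctx.htable, tb, hash, &existing)) {
    tb = existing;
}
```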
-
Committed by Emilio G. Cota

The appended adds assertions to make sure we do not longjmp with page locks held. Note that user-mode has nothing to check, since page_locks are !user-mode only.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Emilio G. Cota

This is only compiled under CONFIG_DEBUG_TCG to avoid bloating the binary. In user-mode, assert_page_locked is equivalent to assert_mmap_lock.

Note: there are some tb_lock assertions left that will be removed by later patches.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
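The compile-time gating described here typically looks like the sketch below; only the macro name comes from the commit message, and the page_locked() predicate is an assumption for illustration:

```c
#ifdef CONFIG_DEBUG_TCG
static void do_assert_page_locked(const PageDesc *pd)
{
    g_assert(page_locked(pd));  /* predicate name assumed */
}
#define assert_page_locked(pd) do_assert_page_locked(pd)
#else
#define assert_page_locked(pd) /* compiled out: no binary bloat */
#endif
```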
-
Committed by Emilio G. Cota

Groundwork for supporting parallel TCG generation.

Instead of using a global lock (tb_lock) to protect changes to pages, use fine-grained, per-page locks in !user-mode. User-mode stays with mmap_lock.

Sometimes changes need to happen atomically on more than one page (e.g. when a TB that spans across two pages is added/invalidated, or when a range of pages is invalidated). We therefore introduce struct page_collection, which helps us keep track of a set of pages that have been locked in the appropriate locking order (i.e. by ascending page index).

This commit first introduces the structs and the function helpers, to then convert the calling code to use per-page locking. Note that tb_lock is not removed yet.

While at it, rename tb_alloc_page to tb_page_add, which pairs with tb_page_remove.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
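The ascending-index rule is the classic deadlock-avoidance ordering. A minimal sketch of the two-page case (function and parameter names here are hypothetical, not taken from the patch):

```c
/* Lock the two pages a cross-page TB touches, always in ascending
 * page-index order so that concurrent lockers cannot deadlock. */
static void page_lock_pair(PageDesc *p1, tb_page_addr_t idx1,
                           PageDesc *p2, tb_page_addr_t idx2)
{
    if (idx1 < idx2) {
        page_lock(p1);
        page_lock(p2);
    } else {
        page_lock(p2);
        page_lock(p1);
    }
}
```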
-
Committed by Emilio G. Cota

This greatly simplifies next commit's diff.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Emilio G. Cota

So that we pass a same-page range to tb_invalidate_phys_page_range, instead of always passing an end address that could be on a different page.

As discussed with Peter Maydell on the list [1], tb_invalidate_phys_page_range doesn't actually do much with 'end', which explains why we have never hit a bug despite going against what the comment on top of tb_invalidate_phys_page_range requires:

> * Invalidate all TBs which intersect with the target physical address range
> * [start;end[. NOTE: start and end must refer to the *same* physical page.

The appended honours the comment, which avoids confusion. While at it, rework the loop into a for loop, which is less error prone (e.g. "continue" won't result in an infinite loop).

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg09165.html

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
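The reworked caller splits an arbitrary [start, end) range into same-page chunks, roughly like this (a sketch modelled on the description above; MIN and the TARGET_PAGE_* macros are QEMU's usual ones):

```c
tb_page_addr_t next;

/* Walk the range one guest page at a time, so every call sees a
 * start/end pair that lies on the same physical page. */
for (next = (start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
     start < end;
     start = next, next += TARGET_PAGE_SIZE) {
    tb_invalidate_phys_page_range(start, MIN(next, end), 0);
}
```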
-
Committed by Emilio G. Cota

Groundwork for supporting parallel TCG generation.

Move the hole to the end of the struct, so that a u32 field can be added there without bloating the struct.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Emilio G. Cota

Groundwork for supporting parallel TCG generation.

We never remove entries from the radix tree, so we can use cmpxchg to implement lockless insertions.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
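Lockless insertion into an insert-only tree reduces to one compare-and-swap per slot, along these lines (a sketch: lp stands for the pointer slot being filled, and atomic_cmpxchg() is QEMU's wrapper, which returns the old value):

```c
PageDesc *pd = g_new0(PageDesc, 1);
PageDesc *existing;

/* Publish the new node only if the slot is still empty; if we lose
 * the race, free ours and adopt the winner's node. */
existing = atomic_cmpxchg(lp, NULL, pd);
if (existing) {
    g_free(pd);
    pd = existing;
}
```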
-
Committed by Emilio G. Cota

This commit does several things, but to avoid churn I merged them all into the same commit. To wit:

- Use uintptr_t instead of TranslationBlock * for the list of TBs in a page. Just like we did in (c37e6d7e "tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB"), the rationale is the same: these are tagged pointers, not pointers. So use a more appropriate type.

- Only check the least significant bit of the tagged pointers. Masking with 3/~3 is unnecessary and confusing.

- Introduce the TB_FOR_EACH_TAGGED macro, and use it to define PAGE_FOR_EACH_TB, which improves readability. Note that TB_FOR_EACH_TAGGED will gain another user in a subsequent patch.

- Update tb_page_remove to use PAGE_FOR_EACH_TB. In case there is a bug and we attempt to remove a TB that is not in the list, instead of segfaulting (since the list is NULL-terminated) we will reach g_assert_not_reached().

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
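Tagged-pointer iteration of this kind looks roughly as follows (a sketch of the idea, not necessarily the exact macro from the patch): the low bit of each link selects which of the TB's two per-page list fields to follow next.

```c
/* head is a uintptr_t whose low bit tags which field[] slot holds
 * the next link; the list is NULL-terminated. */
#define TB_FOR_EACH_TAGGED(head, tb, n, field)                       \
    for (n = (head) & 1, tb = (TranslationBlock *)((head) & ~1);     \
         tb;                                                         \
         tb = (TranslationBlock *)tb->field[n],                      \
             n = (uintptr_t)tb & 1,                                  \
             tb = (TranslationBlock *)((uintptr_t)tb & ~1))
```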
-
Committed by Emilio G. Cota

Thereby making it per-TCGContext. Once we remove tb_lock, this will avoid an atomic increment every time a TB is invalidated.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Emilio G. Cota

This paves the way for enabling scalable parallel generation of TCG code.

Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region.

The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize.

Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size.

Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info().

As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
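The tc_ptr-to-region mapping described above can be sketched like this (struct layout and field names are assumptions based on the commit message; the real array is padded to the dcache line size):

```c
/* Map a host code pointer back to the region tree covering it. */
static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
{
    size_t i;

    if (p < region.start_aligned) {
        i = 0;                          /* first region may be larger */
    } else {
        ptrdiff_t off = (const char *)p - (const char *)region.start_aligned;

        if (off > (ptrdiff_t)(region.stride * (region.n - 1))) {
            i = region.n - 1;           /* last region may be smaller */
        } else {
            i = off / region.stride;    /* cold path, division is fine */
        }
    }
    return region_trees[i];
}
```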
-
Committed by Emilio G. Cota

The meaning of "existing" is now changed to "matches in hash and ht->cmp result". This is saner than just checking the pointer value.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Emilio G. Cota

qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided.

qht_insert will gain use of the default cmp in the next patch.

Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd):

    0000000000010750 <qht_lookup>:
       10750:  89 d1                  mov    %edx,%ecx
       10752:  48 89 f2               mov    %rsi,%rdx
       10755:  48 8b 77 08            mov    0x8(%rdi),%rsi
       10759:  e9 22 ff ff ff         jmpq   10680 <qht_lookup_custom>
       1075e:  66 90                  xchg   %ax,%ax

We get:

    0000000000010740 <qht_lookup>:
       10740:  48 8b 4f 08            mov    0x8(%rdi),%rcx
       10744:  e9 37 ff ff ff         jmpq   10680 <qht_lookup_custom>
       10749:  0f 1f 80 00 00 00 00   nopl   0x0(%rax)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
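In C, the tail-call wrapper that the disassembly above corresponds to is essentially the following (a sketch, assuming the default comparator is stored in the qht struct as ht->cmp):

```c
void *qht_lookup(const struct qht *ht, const void *userp, uint32_t hash)
{
    /* With @func last, this compiles down to one register load plus
     * a tail jump into qht_lookup_custom(). */
    return qht_lookup_custom(ht, userp, hash, ht->cmp);
}
```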
-
- 15 June 2018, 3 commits
-
-
Committed by Peter Maydell

Currently we don't support board configurations that put an IOMMU in the path of the CPU's memory transactions, and instead just assert() if the memory region found in address_space_translate_for_iotlb() is an IOMMUMemoryRegion.

Remove this limitation by having the function handle IOMMUs. This is mostly straightforward, but we must make sure we have a notifier registered for every IOMMU that a transaction has passed through, so that we can flush the TLB appropriately when any of the IOMMUs change their mappings.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180604152941.20374-5-peter.maydell@linaro.org
-
Committed by Peter Maydell

The API for cpu_transaction_failed() says that it takes the physical address for the failed transaction. However we were actually passing it the offset within the target MemoryRegion. We don't currently have any target CPU implementations of this hook that require the physical address; fix this bug so we don't get confused if we ever do add one.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-3-peter.maydell@linaro.org
-
Committed by Peter Maydell

The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious use; add a comment documenting it (reverse-engineered from what the code that sets it is doing).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180611125633.32755-2-peter.maydell@linaro.org
-
- 01 June 2018, 1 commit
-
-
Committed by Philippe Mathieu-Daudé

Code change produced with:

    $ git grep '#include "exec/address-spaces.h"' accel | \
      cut -d: -f-1 | \
      xargs egrep -L "(get_system_|address_space_)" | \
      xargs sed -i.bak '/#include "exec\/address-spaces.h"/d'

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-3-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 31 May 2018, 2 commits
-
-
Committed by Peter Maydell

As part of plumbing MemTxAttrs down to the IOMMU translate method, add MemTxAttrs as an argument to address_space_translate() and address_space_translate_cached(). Callers either have an attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180521140402.23318-4-peter.maydell@linaro.org
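A typical call site after this change looks like the following (a sketch; the surrounding variables are illustrative):

```c
hwaddr xlat, len;
MemoryRegion *mr;

/* Callers without meaningful attributes pass the catch-all value. */
mr = address_space_translate(as, addr, &xlat, &len,
                             true /* is_write */, MEMTXATTRS_UNSPECIFIED);
```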
-
Committed by Peter Maydell

As part of plumbing MemTxAttrs down to the IOMMU translate method, add MemTxAttrs as an argument to tb_invalidate_phys_addr(). Its callers either have an attrs value to hand, or don't care and can use MEMTXATTRS_UNSPECIFIED.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180521140402.23318-3-peter.maydell@linaro.org
-
- 20 May 2018, 1 commit
-
-
Committed by Laurent Vivier

Re-run Coccinelle script scripts/coccinelle/return_directly.cocci

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
ppc part
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-
- 15 May 2018, 1 commit
-
-
Committed by Peter Maydell

Usually the logging of the CPU state produced by -d cpu is sufficient to diagnose problems, but sometimes you want to see the state of the floating point registers as well. We don't want to enable that by default as it adds a lot of extra data to the log; instead, allow it to be optionally enabled via -d fpu.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180510130024.31678-1-peter.maydell@linaro.org
-
- 11 May 2018, 2 commits
-
-
Committed by Richard Henderson

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180508151437.4232-6-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
Committed by Richard Henderson

Given that this atomic operation will be used by both risc-v and aarch64, let's not duplicate code across the two targets.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180508151437.4232-5-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-
- 10 May 2018, 1 commit
-
-
Committed by Emilio G. Cota

While at it, use int for both num_insns and max_insns to make sure we have same-type comparisons.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael Clark <mjc@sifive.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
- 11 April 2018, 1 commit
-
-
Committed by Pavel Dovgalyuk

In icount mode, instructions that access io memory spaces in the middle of the translation block invoke TB recompilation. After recompilation, such instructions become last in the TB and are allowed to access io memory spaces.

When the code includes an instruction like i386 'xchg eax, 0xffffd080', which accesses the APIC, QEMU goes into an infinite loop of recompilation.

This instruction includes two memory accesses - one read and one write. After the first access, APIC calls cpu_report_tpr_access, which restores the CPU state to get the current eip. But cpu_restore_state_from_tb resets the cpu->can_do_io flag which makes the second memory access invalid. Therefore the second memory access causes a recompilation of the block. Then these operations repeat again and again.

This patch moves resetting the cpu->can_do_io flag from cpu_restore_state_from_tb to the cpu_loop_exit* functions. It also adds a parameter for cpu_restore_state which controls restoring icount. There is no need to restore icount when we only query CPU state without breaking the TB. Restoring it in such cases leads to an incorrect flow of the virtual time. In most cases the new parameter is true (icount should be recalculated). But there are two cases in i386 and openrisc when the CPU state is only queried without the need to break the TB. This patch fixes both of these cases.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180409091320.12504.35329.stgit@pasha-VirtualBox>
[rth: Make can_do_io setting unconditional; move from cpu_exec; make cpu_loop_exit_{noexc,restore} call cpu_loop_exit.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
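At a call site, the new parameter distinguishes the two situations described above. A sketch (the flag's exact name is per the patch; here it is shown as a boolean third argument whose sense is "we will break out of the TB"):

```c
/* Query-only: report the guest PC but keep executing the TB, so do
 * not recompute icount (pass false). */
cpu_restore_state(cs, retaddr, false);

/* Faulting path: we are about to leave the TB, so icount must be
 * brought up to date before the longjmp (pass true). */
cpu_restore_state(cs, retaddr, true);
cpu_loop_exit(cs);
```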
-
- 06 April 2018, 1 commit
-
-
Committed by Richard Henderson

A mistake in the type passed to sizeof, that happens to work when the out-of-line fallback itself is using host vectors, but fails when using only the base types.

Tested-by: Emilio G. Cota <cota@braap.org>
Reported-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
- 26 March 2018, 1 commit
-
-
Committed by Richard Henderson

We have confused the number of instructions that have been executed in the TB with the number of instructions needed to repeat the I/O instruction. We have used cpu_restore_state_from_tb, which means that the guest pc is pointing to the I/O instruction. The only time the answer to the latter question is not 1 is when MIPS or SH4 need to re-execute the branch for the delay slot as well.

We must rely on cpu->cflags_next_tb to generate the next TB, as otherwise we have a race condition with other guest cpus within the TB cache.

Fixes: 0790f868
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180319031545.29359-1-richard.henderson@linaro.org>
Tested-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 13 March 2018, 1 commit
-
-
Committed by Pavel Dovgalyuk

The cpu_io_recompile() function was broken by commit 9b990ee5. Instead of regenerating the block starting from the PC of the original block, it just set the instruction counter for TCG. In most cases this was unnoticed, but in icount mode there was an exception for incorrect usage of the CF_LAST_IO flag.

This patch recovers recompilation of the original block and also configures translation for executing the single IO instruction which caused the recompilation.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20180227095338.1060.27385.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
-
- 12 March 2018, 1 commit
-
-
Committed by Pavel Dovgalyuk

Function cpu_handle_interrupt calls cc->cpu_exec_interrupt to process pending hardware interrupts. Under the hood cpu_exec_interrupt uses cpu->exception_index to pass information to the internal function which is usually common for exception and interrupt processing.

But this value is not reset after return and may be processed again by cpu_handle_exception. This does not happen due to overwriting the exception_index at the end of cpu_handle_interrupt. But this branch may also overwrite the valid exception_index in some cases.

Therefore this patch:
1. resets exception_index just after the call to cpu_exec_interrupt
2. prevents overwriting the meaningful value of exception_index

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095140.1060.61357.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
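The fix in point 1 amounts to a reset right after the hook returns, along these lines (a sketch of the control flow, not the exact diff):

```c
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
    /* The hook may have used exception_index to talk to its
     * internal helpers; clear it so that cpu_handle_exception()
     * does not replay a stale exception on the next iteration. */
    cpu->exception_index = -1;
}
```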
-
- 08 February 2018, 6 commits
-
-
Committed by Richard Henderson

Use dup to convert a non-constant scalar to a third vector.

Add addition, multiplication, and logical operations with an immediate. Add addition, subtraction, multiplication, and logical operations with a non-constant scalar. Allow for the front-end to build operations in which the scalar operand comes first.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
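For instance, a front end could emit a vector add-with-immediate along these lines (a sketch; the (element size, dest offset, src offset, immediate, oprsz, maxsz) argument order is an assumption about the expander this commit describes):

```c
/* d[i] = a[i] + 5 across sixteen bytes of 32-bit lanes */
tcg_gen_gvec_addi(MO_32, dofs, aofs, 5, 16, 16);
```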
-
Committed by Richard Henderson

No vector ops as yet. SSE only has direct support for 8- and 16-bit saturation; handling 32- and 64-bit saturation is much more expensive.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Richard Henderson

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Richard Henderson

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Richard Henderson

Opcodes are added for scalar and vector shifts, but considering the varied semantics of these do not expose them to the front ends. Do go ahead and provide them in case they are needed for backend expansion.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
Committed by Richard Henderson

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
-
- 06 February 2018, 1 commit
-
-
Committed by Peter Maydell

We dropped support for ia64 host CPUs in the 2.11 release (removing the TCG backend for it, and advertising the support as being completely removed in the changelog). However there are a few bits and pieces of code still floating about. Remove those, too.

We can drop the check in configure for "ia64 or hppa host?" entirely, because we don't support hppa hosts either any more.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1516897189-11035-1-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 25 January 2018, 1 commit
-
-
Committed by Laurent Vivier

The MC68040 MMU provides the size of the access that triggers the page fault. This size is set in the Special Status Word which is written in the stack frame of the access fault exception.

So we need the size in m68k_cpu_unassigned_access() and m68k_cpu_handle_mmu_fault(). To be able to do that, this patch modifies the prototype of the handle_mmu_fault handler, tlb_fill() and probe_write(). do_unassigned_access() already includes a size parameter.

This patch also updates the handle_mmu_fault handlers and tlb_fill() of all targets (only parameter, no code change).

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
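After the change, the softmmu fill hook carries the access size explicitly. Roughly (a sketch of the updated prototype; parameter names may differ per target):

```c
/* size is the width in bytes of the access that faulted */
void tlb_fill(CPUState *cs, target_ulong addr, int size,
              MMUAccessType access_type, int mmu_idx, uintptr_t retaddr);
```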
-
- 23 January 2018, 2 commits
-
-
Committed by Peter Maydell

If multiple guest threads in user-mode emulation write to a page which QEMU has marked read-only because of cached TCG translations, the threads can race in page_unprotect:

* threads A & B both try to do a write to a page with code in it at the same time (ie which we've made non-writeable, so SEGV)
* they race into the signal handler with this faulting address
* thread A happens to get to page_unprotect() first and takes the mmap lock, so thread B sits waiting for it to be done
* A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
* A can then continue OK (returns from signal handler to retry the memory access)
* ...but when B gets the mmap lock it finds that the page is already PAGE_WRITE, and so it exits page_unprotect() via the "not due to protected translation" code path, and wrongly delivers the signal to the guest rather than just retrying the access

In particular, this meant that trying to run 'javac' in user-mode emulation would fail with a spurious guest SIGSEGV.

Handle this by making page_unprotect() assume that a call for a page which is already PAGE_WRITE is due to a race of this sort and return a "fault handled" indication.

Since this would cause an infinite loop if we ever called page_unprotect() for some other kind of fault than "write failed due to bad access permissions", tighten the condition in handle_cpu_signal() to check the signal number and si_code, and add a comment so that if somebody does ever find themselves debugging an infinite loop of faults they have some clue about why.

(The trick for identifying the correct setting for current_tb_invalidated for thread B (needed to handle the precise-SMC case) is due to Richard Henderson. Paolo Bonzini suggested just relying on si_code rather than trying anything more complicated.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1511879725-9576-3-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
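The early-out described above is a small guard inside page_unprotect(), something like this (a sketch; the return convention shown is illustrative):

```c
/* Inside page_unprotect(), after taking the mmap lock: if another
 * thread already made the page writable, this fault was the benign
 * race described above, so report it as handled and let the caller
 * simply retry the access. */
if (p->flags & PAGE_WRITE) {
    mmap_unlock();
    return 1;   /* fault handled, no translation was invalidated */
}
```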
-
Committed by Peter Maydell

Currently all the architecture/OS specific cpu_signal_handler() functions call handle_cpu_signal() without passing it the siginfo_t. We're going to want that so we can look at the si_code to determine whether this is a SEGV_ACCERR access violation or some other kind of fault, so change the functions to pass through the pointer to the siginfo_t rather than just the si_addr value.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1511879725-9576-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
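With the full siginfo_t available, the access-violation test that the companion patch relies on becomes a simple predicate. A self-contained sketch (the helper name is hypothetical; the si_code check is the one described above):

```c
#include <signal.h>
#include <stdbool.h>

/* A write fault is only a candidate for the page_unprotect() race
 * if it is an access-permission violation (SEGV_ACCERR), not e.g.
 * a fault on an unmapped page (SEGV_MAPERR). */
static bool is_access_violation(const siginfo_t *info)
{
    return info->si_signo == SIGSEGV && info->si_code == SEGV_ACCERR;
}
```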
-