Commits · 37e29a64254bf82a1901784fcca17c25f8164c2f · openeuler / qemu

15 Sep, 2017 · 2 commits

target/arm: Avoid an extra temporary for store_exclusive · 37e29a64

Committed by Richard Henderson on Sep 14, 2017

Instead of copying addr to a local temp, reuse the value (which we
have just compared as equal) already saved in cpu_exclusive_addr.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20170908163859.29820-1-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

37e29a64

AArch64: Fix single stepping of ERET instruction · dddbba99

Committed by Jaroslaw Pelczar on Sep 14, 2017

Previously, single-stepping through an ERET instruction via GDB
would result in the debugger entering the "next" PC after the ERET instruction.
When debugging in kernel mode, this would also cause unintended behavior,
because the debugger would try to access memory from the EL0 point of view.
Signed-off-by: Jaroslaw Pelczar <j.pelczar@samsung.com>
Message-id: 001c01d32895$483027f0$d89077d0$@samsung.com
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

dddbba99

06 Sep, 2017 · 10 commits

target/arm: [a64] Move page and ss checks to init_disas_context · dcc3a212

Committed by Richard Henderson on Jul 14, 2017

Since AArch64 uses a fixed-width ISA, we can pre-compute the number of
insns remaining on the page.  Also, we can check for single-step once.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

dcc3a212
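
As a rough illustration of the pre-computation described in this commit message (not the code from the commit itself), the number of fixed-width instructions left on a page follows directly from the PC. The 4 KiB page size and the name `insns_left_on_page` are assumptions made for this sketch.

```python
PAGE_SIZE = 4096   # assumed page size for this sketch
INSN_BYTES = 4     # A64 instructions are fixed-width, 4 bytes each


def insns_left_on_page(pc: int) -> int:
    """Return how many whole instructions fit between pc and the page end."""
    offset_in_page = pc & (PAGE_SIZE - 1)
    return (PAGE_SIZE - offset_in_page) // INSN_BYTES
```

Because the instruction width never varies, this bound can be computed once per translation block instead of re-checking the page boundary after every instruction.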

target/arm: [tcg] Port to generic translation framework · 23169224

Committed by Lluís Vilanova on Jul 14, 2017

Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002631325.22386.10348327185029496649.stgit@frigg.lan>
Signed-off-by: Richard Henderson <rth@twiddle.net>

23169224

target/arm: [tcg,a64] Port to disas_log · 58350fa4

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002606914.22386.15524101311003685068.stgit@frigg.lan>
[rth: Move tb->size computation and use that result.]
Signed-off-by: Richard Henderson <rth@twiddle.net>

58350fa4

target/arm: [tcg,a64] Port to tb_stop · be407964

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002558503.22386.1149037590886263349.stgit@frigg.lan>
Signed-off-by: Richard Henderson <rth@twiddle.net>

be407964

target/arm: [tcg,a64] Port to translate_insn · 24299c89

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002510079.22386.10164419868911710218.stgit@frigg.lan>
[rth: Adjust for translate_insn interface change.]
Signed-off-by: Richard Henderson <rth@twiddle.net>

24299c89

target/arm: [tcg,a64] Port to breakpoint_check · 0cb56b37

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002461630.22386.14827196109258040543.stgit@frigg.lan>
[rth: Use DISAS_TOO_MANY for "execute only one more" after bp.]
Signed-off-by: Richard Henderson <rth@twiddle.net>

0cb56b37

target/arm: [tcg,a64] Port to insn_start · a68956ad

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Benneé <alex.benee@linaro.org>
Message-Id: <150002413187.22386.156315485813606121.stgit@frigg.lan>
[rth: Use DISAS_TOO_MANY for "execute only one more" after bp.]
Signed-off-by: Richard Henderson <rth@twiddle.net>

a68956ad

target/arm: [tcg,a64] Port to init_disas_context · 5c039906

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic instruction translation
loop.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Benneé <alex.benee@linaro.org>
Message-Id: <150002340430.22386.10889954302345646107.stgit@frigg.lan>
[rth: Adjust for max_insns interface change.]
Signed-off-by: Richard Henderson <rth@twiddle.net>

5c039906

target/arm: [tcg] Port to DisasContextBase · dcba3a8d

Committed by Lluís Vilanova on Jul 14, 2017

Incrementally paves the way towards using the generic
instruction translation loop.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Benneé <alex.benee@linaro.org>
Message-Id: <150002291931.22386.11441154993010495674.stgit@frigg.lan>
Signed-off-by: Richard Henderson <rth@twiddle.net>

dcba3a8d

target/arm: Use DISAS_NORETURN · a0c231e6

Committed by Richard Henderson on Jul 14, 2017

Fold DISAS_EXC and DISAS_TB_JUMP into DISAS_NORETURN.

In both cases all following code is dead.  In the first
case because we have exited the TB via exception; in the
second case because we have exited the TB via goto_tb
and its associated machinery.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

a0c231e6

05 Sep, 2017 · 1 commit

target/arm: Fix aa64 ldp register writeback · 3e4d91b9

Committed by Richard Henderson on Sep 04, 2017

For "ldp x0, x1, [x0]", if the second load is on a second page and
the second page is unmapped, the exception would be raised with x0
already modified.  This means the instruction couldn't be restarted.

Cc: qemu-arm@nongnu.org
Cc: qemu-stable@nongnu.org
Reported-by: Andrew <andrew@fubar.geek.nz>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170825224833.4463-1-richard.henderson@linaro.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1713066
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[PMM: tweaked comment format]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

3e4d91b9
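
The restart problem described in this commit message can be modelled in a few lines. This is a toy sketch, not the translator code: `PageFault`, `load64`, `ldp_buggy`, and `ldp_fixed` are hypothetical names, and memory is a plain dictionary in which a missing key stands for an unmapped page.

```python
class PageFault(Exception):
    """Raised when a load touches an unmapped address (toy model)."""


def load64(mem: dict, addr: int) -> int:
    if addr not in mem:
        raise PageFault(hex(addr))
    return mem[addr]


def ldp_buggy(regs: list, mem: dict, rt: int, rt2: int, rn: int) -> None:
    addr = regs[rn]
    regs[rt] = load64(mem, addr)        # may overwrite the base register
    regs[rt2] = load64(mem, addr + 8)   # a fault here leaves regs[rt] modified


def ldp_fixed(regs: list, mem: dict, rt: int, rt2: int, rn: int) -> None:
    addr = regs[rn]
    first = load64(mem, addr)           # hold the first value in a temporary
    second = load64(mem, addr + 8)      # any fault happens before writeback
    regs[rt], regs[rt2] = first, second
```

With `rt == rn`, the buggy variant loses the base address when the second load faults, so re-executing the instruction after the page is mapped would read from the wrong place. The fixed variant leaves the registers untouched until both loads have succeeded.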

16 Aug, 2017 · 2 commits

target/arm: Require alignment for load exclusive · 4a2fdb78

Committed by Alistair Francis on Aug 15, 2017

According to the ARM ARM, exclusive loads require the same alignment as
exclusive stores. Let's update the memops used for the load to match
that of the store. This adds the alignment requirement to the memops.
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170815145714.17635-4-richard.henderson@linaro.org
[rth: Require 16-byte alignment for 64-bit LDXP.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

4a2fdb78

target/arm: Correct load exclusive pair atomicity · 19514cde

Committed by Richard Henderson on Aug 15, 2017

We are not providing the required single-copy atomic semantics for
the 64-bit operation that is the 32-bit paired load.

At the same time, leave the entire 64-bit value in cpu_exclusive_val
and stop writing to cpu_exclusive_high.  This means that we do not
have to re-assemble the 64-bit quantity when it comes time to store.

At the same time, drop a redundant temporary and perform all loads
directly into the cpu_exclusive_* globals.
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170815145714.17635-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

19514cde

15 Aug, 2017 · 1 commit

target/arm: Correct exclusive store cmpxchg memop mask · 955fd0ad

Committed by Alistair Francis on Aug 15, 2017

When we perform the atomic_cmpxchg operation we want to perform the
operation on a pair of 32-bit registers. Previously we were just passing
in the register size, which was set to MO_32. This would result in the
high register being ignored. To fix this issue we hardcode the size to
be 64 bits when operating on 32-bit pairs.
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Portia Stephens <portia.stephens@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170815145714.17635-2-richard.henderson@linaro.org
Message-Id: <bc18dddca56e8c2ea4a3def48d33ceb5d21d1fff.1502488636.git.alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

955fd0ad
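
To see why the operand size matters here, consider a simplified value-level model of a compare-and-swap whose width is a parameter. The helpers `cmpxchg` and `pack_pair` are hypothetical and exist only for this sketch; they are not QEMU functions.

```python
def pack_pair(lo: int, hi: int) -> int:
    """Combine two 32-bit register values into one 64-bit quantity."""
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)


def cmpxchg(mem: dict, addr: int, expected: int, new: int, size_bits: int) -> bool:
    """Compare-and-swap that only looks at the low size_bits of the value."""
    mask = (1 << size_bits) - 1
    if (mem[addr] & mask) != (expected & mask):
        return False
    mem[addr] = (mem[addr] & ~mask) | (new & mask)
    return True
```

With `size_bits=32` the high half of the pair takes no part in either the comparison or the store, which matches the "high register ignored" symptom in the commit message. Passing 64 makes both halves participate.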

25 Jul, 2017 · 1 commit

target/arm: fix TCG temp leak in aarch64 rev16 · e4256c3c

Committed by Emilio G. Cota on Jul 24, 2017

Fix a TCG temporary leak in the new aarch64 rev16 handling.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

e4256c3c

20 Jul, 2017 · 2 commits

tcg: Pass generic CPUState to gen_intermediate_code() · 9c489ea6

Committed by Lluís Vilanova on Jul 14, 2017

Needed to implement a target-agnostic gen_intermediate_code()
in the future.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Benneé <alex.benee@linaro.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002025498.22386.18051908483085660588.stgit@frigg.lan>
Signed-off-by: Richard Henderson <rth@twiddle.net>

9c489ea6

target/arm: Optimize aarch64 rev16 · abb1066d

Committed by Richard Henderson on Jul 17, 2017

It is much shorter to reverse all 4 half-words in parallel
than extract, reverse, and deposit each in turn.
Suggested-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>

abb1066d
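
The two strategies contrasted in this commit message can be compared on plain 64-bit integers. The sketch below is an assumption about the general technique (mask-and-shift byte swapping), not a transcription of the TCG ops the commit emits; both function names are made up for illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
LOW_BYTES = 0x00FF00FF00FF00FF  # selects the low byte of each half-word


def rev16_parallel(x: int) -> int:
    """Swap the two bytes of all four half-words with two masks and two shifts."""
    x &= MASK64
    return ((x & LOW_BYTES) << 8) | ((x >> 8) & LOW_BYTES)


def rev16_per_halfword(x: int) -> int:
    """Reference version: extract, reverse, and deposit each half-word in turn."""
    x &= MASK64
    result = 0
    for i in range(4):
        half = (x >> (16 * i)) & 0xFFFF
        swapped = ((half & 0xFF) << 8) | (half >> 8)
        result |= swapped << (16 * i)
    return result
```

The parallel form needs a fixed handful of operations regardless of how many half-words the register holds, whereas the reference form repeats its extract, swap, and deposit sequence four times.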

17 Jul, 2017 · 3 commits

target/arm: use DISAS_EXIT for eret handling · b29fd33d

Committed by Alex Bennée on Jul 17, 2017

Previously DISAS_JUMP did ensure this but with the optimisation of
8a6b28c7 (optimize indirect branches) we might not leave the loop.
This means if any pending interrupts are cleared by changing IRQ flags
we might never get around to servicing them. You usually notice this
by seeing the lookup_tb_ptr() helper gainfully chaining TBs together
while cpu->interrupt_request remains high and the exit_request has not
been set.

This breaks amongst other things the OPTEE test suite which executes
an eret from the secure world after a non-secure world IRQ has gone
pending which then never gets serviced.

Instead of using the previously implied semantics of DISAS_JUMP we use
DISAS_EXIT which will always exit the run-loop.

CC: Etienne Carriere <etienne.carriere@linaro.org>
CC: Joakim Bech <joakim.bech@linaro.org>
CC: Jaroslaw Pelczar <j.pelczar@samsung.com>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 20170713141928.25419-7-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

b29fd33d

target/arm: use gen_goto_tb for ISB handling · 0b609cc1

Committed by Alex Bennée on Jul 17, 2017

While an ISB will ensure any raised IRQs happen on the next
instruction, it doesn't cause any to get raised by itself. We can
therefore use a simple tb exit for ISB instructions and rely on the
exit_request check at the top of each TB to deal with exiting if
needed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 20170713141928.25419-6-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

0b609cc1

target/arm/translate: make DISAS_UPDATE match declared semantics · e8d52302

Committed by Alex Bennée on Jul 17, 2017

DISAS_UPDATE should be used when the wider CPU state other than just
the PC has been updated and we should therefore exit the TCG runtime
and return to the main execution loop rather than assuming DISAS_JUMP
would do that.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 20170713141928.25419-3-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

e8d52302

20 Jun, 2017 · 1 commit

target/arm: Exit after clearing aarch64 interrupt mask · 8da54b25

Committed by Richard Henderson on Jun 14, 2017

Exit to cpu loop so we reevaluate cpu_arm_hw_interrupts.
Tested-by: Emilio G. Cota <cota@braap.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

8da54b25

06 Jun, 2017 · 2 commits

target/aarch64: optimize indirect branches · e75449a3

Committed by Emilio G. Cota on Apr 28, 2017

Measurements:

[Baseline performance is that before applying this and the previous commit]

- NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz
  [Bar chart: speedup over baseline (y-axis, 0.9x to 1.7x) for the series "cross" and "cross+jr"; x-axis: ASSIGNMENT, BITFIELD, FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET, NUMERIC SORT, STRING SORT, hmean]
  png: http://imgur.com/qO9ubtk
NB. cross here represents the previous commit.

- SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz
  [Bar chart: speedup over baseline (y-axis, 0.9x to 1.5x) for the series "jr"; x-axis: astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean]
  png: http://imgur.com/3Dp4vvq

- SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz
  [Bar chart: speedup over baseline (y-axis, 1x to 1.7x) for the series "jr"; x-axis: astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk, hmean]
  png: http://imgur.com/vRrdc9j
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

e75449a3

target/aarch64: optimize cross-page direct jumps in softmmu · e7872236

Committed by Emilio G. Cota on Apr 28, 2017

Perf numbers in next commit's log.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

e7872236

02 Jun, 2017 · 1 commit

arm: Add support for M profile CPUs having different MMU index semantics · 8bd5c820

Committed by Peter Maydell on Jun 02, 2017

The M profile CPU's MPU has an awkward corner case which we
would like to implement with a different MMU index.

We can avoid having to bump the number of MMU modes ARM
uses, because some of our existing MMU indexes are only
used by non-M-profile CPUs, so we can borrow one.
To avoid that getting too confusing, clean up the code
to try to keep the two meanings of the index separate.

Instead of ARMMMUIdx enum values being identical to core QEMU
MMU index values, they are now the core index values with some
high bits set. Any particular CPU always uses the same high
bits (so eventually A profile cores and M profile cores will
use different bits). New functions arm_to_core_mmu_idx()
and core_to_arm_mmu_idx() convert between the two.

In general core index values are stored in 'int' types, and
ARM values are stored in ARMMMUIdx types.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1493122030-32191-3-git-send-email-peter.maydell@linaro.org

8bd5c820
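
The split between "core index" and "ARM index" described in this commit message can be sketched as a tagged integer. Only the two conversion function names come from the commit message; their signatures here, the mask and tag values, and the enum members are assumptions chosen for illustration and may not match the QEMU source.

```python
from enum import IntEnum

COREIDX_MASK = 0x7   # assumed: low bits carry the core QEMU MMU index
TYPE_A = 0x10        # assumed tag bits for A-profile indexes
TYPE_M = 0x40        # assumed tag bits for M-profile indexes


class ARMMMUIdx(IntEnum):
    # Hypothetical members: a core index with profile tag bits set on top.
    A_EXAMPLE_0 = 0 | TYPE_A
    A_EXAMPLE_1 = 1 | TYPE_A
    M_EXAMPLE_0 = 0 | TYPE_M


def arm_to_core_mmu_idx(idx: ARMMMUIdx) -> int:
    """Strip the tag bits, leaving the plain core index."""
    return int(idx) & COREIDX_MASK


def core_to_arm_mmu_idx(is_m_profile: bool, core_idx: int) -> ARMMMUIdx:
    """Re-attach the tag bits that this CPU always uses."""
    tag = TYPE_M if is_m_profile else TYPE_A
    return ARMMMUIdx(core_idx | tag)
```

Two ARM-level values can share one core index while remaining distinct, which is what lets an M-profile meaning borrow a slot that only A-profile CPUs used before.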

28 Feb, 2017 · 1 commit

Add missing fp_access_check() to aarch64 crypto instructions · a4f5c5b7

Committed by Nick Reilly on Feb 28, 2017

The aarch64 crypto instructions for AES and SHA are missing the
check for whether the FPU is enabled.
Signed-off-by: Nick Reilly <nreilly@blackberry.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

a4f5c5b7

24 Feb, 2017 · 1 commit

target-arm: don't generate WFE/YIELD calls for MTTCG · c22edfeb

Committed by Alex Bennée on Feb 23, 2017

The WFE and YIELD instructions are really only hints and in TCG's case
they were useful to move the scheduling on from one vCPU to the next. In
the parallel context (MTTCG) this just causes an unnecessary cpu_exit
and contention of the BQL.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

c22edfeb

08 Feb, 2017 · 1 commit

target/arm: A32, T32: Create Instruction Syndromes for Data Aborts · 9bb6558a

Committed by Peter Maydell on Feb 07, 2017

Add support for generating the ISS (Instruction Specific Syndrome)
for Data Abort exceptions taken from AArch32. These syndromes are
used by hypervisors, for example, to trap and emulate memory accesses.

This is the equivalent for AArch32 guests of the work done for AArch64
guests in commit aaa1f954.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

9bb6558a

14 Jan, 2017 · 1 commit

target/arm: Fix ubfx et al for aarch64 · 86c9ab27

Committed by Richard Henderson on Jan 13, 2017

The patch in 59a71b4c suffered from a merge failure
when compared to the original patch in

http://lists.nongnu.org/archive/html/qemu-devel/2016-12/msg00137.html
Signed-off-by: Richard Henderson <rth@twiddle.net>

86c9ab27

11 Jan, 2017 · 3 commits

target-arm: Use clrsb helper · bc21dbcc

Committed by Richard Henderson on Nov 16, 2016

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

bc21dbcc

target-arm: Use clz opcode · 7539a012

Committed by Richard Henderson on Nov 16, 2016

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

7539a012

target-arm: Use new deposit and extract ops · 59a71b4c

Committed by Richard Henderson on Oct 15, 2016

Use the new primitives for UBFX and SBFX.
Signed-off-by: Richard Henderson <rth@twiddle.net>

59a71b4c
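
For readers unfamiliar with bitfield extraction, the following sketch shows the arithmetic that an unsigned (UBFX-style) and a signed (SBFX-style) extract perform on plain integers. The names `extract` and `sextract` are illustrative helpers for this sketch, not the TCG API, and the semantics shown are an assumption about the general operation rather than a copy of the commit.

```python
def extract(value: int, pos: int, length: int) -> int:
    """Unsigned bitfield extract: length bits starting at bit pos."""
    return (value >> pos) & ((1 << length) - 1)


def sextract(value: int, pos: int, length: int) -> int:
    """Signed bitfield extract: same field, sign-extended from its top bit."""
    field = extract(value, pos, length)
    sign_bit = 1 << (length - 1)
    return (field ^ sign_bit) - sign_bit
```

For example, bits 4 to 11 of `0xABCD` are `0xBC`; read as a signed 8-bit field the same bits give a negative number because the field's top bit is set.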

27 Dec, 2016 · 2 commits

target-arm: Fix aarch64 disas_ldst_single_struct · 0a97c40f

Committed by Richard Henderson on Dec 27, 2016

We add s->be_data within do_vec_ld/st.  Adding it here means that
we have the wrong bits set in SIZE for a big-endian host, leading
to g_assert_not_reached in write_vec_element and read_vec_element.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1481085020-2614-3-git-send-email-rth@twiddle.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

0a97c40f

target-arm: Fix aarch64 vec_reg_offset · 416d72b9

Committed by Richard Henderson on Dec 27, 2016

Since CPUARMState.vfp.regs is not 16 byte aligned, the ^ 8 fixup used
for a big-endian host doesn't do what's intended.  Fix this by adding
in the vfp.regs offset after computing the inter-register offset.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1481085020-2614-2-git-send-email-rth@twiddle.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

416d72b9
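
The alignment issue described in this commit message comes down to the order of an XOR and an addition. The sketch below is a simplified model with made-up function names and example base offsets; it is not the QEMU implementation, which also accounts for element size.

```python
REG_BYTES = 16  # each vector register occupies 128 bits


def offset_buggy(base: int, regno: int, byte_offs: int) -> int:
    # XOR applied after adding the base: only correct when base % 16 == 0.
    return (base + regno * REG_BYTES + byte_offs) ^ 8


def offset_fixed(base: int, regno: int, byte_offs: int) -> int:
    # XOR applied to the register-relative offset; the base is added afterwards.
    return base + ((regno * REG_BYTES + byte_offs) ^ 8)
```

With a base of `0x20` both versions agree. With a base of `0x18`, which is 8-byte but not 16-byte aligned, the buggy version clears bit 3 instead of setting it and returns `0x10`, an offset that lies before the start of the register array.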

21 Dec, 2016 · 1 commit

Move target-* CPU file into a target/ folder · fcf5ef2a

Committed by Thomas Huth on Oct 11, 2016

We've currently got 18 architectures in QEMU, and thus 18 target-xxx
folders in the root folder of the QEMU source tree. More architectures
(e.g. RISC-V, AVR) are likely to be included soon, too, so the main
folder of the QEMU sources slowly gets quite overcrowded with the
target-xxx folders.
To disburden the main folder a little bit, let's move the target-xxx
folders into a dedicated target/ folder, so that target-xxx/ simply
becomes target/xxx/ instead.

Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [cris&microblaze part]
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
Signed-off-by: Thomas Huth <thuth@redhat.com>

fcf5ef2a

06 Dec, 2016 · 1 commit

target-arm/translate-a64: fix gen_load_exclusive · 5460da50

Committed by Alex Bennée on Dec 02, 2016

While testing rth's latest TCG patches with risu I found ldaxp was
broken. Investigating further I found it was broken by 1dd089d0 when
the cmpxchg atomic work was merged. As part of that change the code
attempted to be clever by doing a single 64 bit load and then shuffle
the data around to set the two 32 bit registers.

As I couldn't quite follow the endian magic I've simply partially
reverted the change to the original gen_load_exclusive code. This
doesn't affect the cmpxchg functionality as that is all done in the
gen_store_exclusive part, which is untouched.

I've also restored the comment that was removed (with a slight tweak
to mention cmpxchg).
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-id: 20161202173454.19179-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

5460da50

02 Nov, 2016 · 1 commit

log: Add locking to large logging blocks · 1ee73216

Committed by Richard Henderson on Sep 22, 2016

Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.

While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.

For mingw32, we appear to have no viable solution for this.  The locking
functions are not properly exported from the system runtime library.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>

1ee73216

26 Oct, 2016 · 1 commit

target-arm: emulate aarch64's LL/SC using cmpxchg helpers · 1dd089d0

Committed by Emilio G. Cota on Jun 27, 2016

Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.

The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/JVc8Y

                atomic_add-bench: 1000000 ops/thread, [0,1] range

  18 ++---------+----------+---------+----------+----------+----------+---++
     +cmpxchg +-E--+       +         +          +          +          +    |
  16 ++master +-H--+                                                      ++
     ||                                                                    |
  14 ++                                                                   ++
     | |                                                                   |
  12 ++|                                                                  ++
     | |                                                                   |
  10 ++++                                                                 ++
   8 ++E                                                                  ++
     |+++                                                                  |
   6 ++ |                                                                 ++
     |  |                                                                  |
   4 ++ |                                                                 ++
     |   |                                                                 |
   2 +H++E+---                                                            ++
     + |     +E++----+E+---+--+E+----++E+------+E+------+E++----+E+---+--+E|
   0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
     0          10         20        30         40         50         60
                                Number of threads

                atomic_add-bench: 1000000 ops/thread, [0,2] range

  18 ++---------+----------+---------+----------+----------+----------+---++
     +cmpxchg +-E--+       +         +          +          +          +    |
  16 ++master +-H--+                                                      ++
     | |                                                                   |
  14 ++E                                                                  ++
     | |                                                                   |
  12 ++|                                                                  ++
     |+++                                                                  |
  10 ++ |                                                                 ++
   8 ++ |                                                                 ++
     |  |                                                                  |
   6 ++ |                                                                 ++
     |   |                                                                 |
   4 ++  |                                                                ++
     |  +E+---                                                             |
   2 +H+     +E+-----+++              +++      +++   ---+E+-----+E+------+++
     +++        +    +E+---+--+E+----++E+------+E+---   ++++    +++   +  +E|
   0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
     0          10         20        30         40         50         60
                                Number of threads

               atomic_add-bench: 1000000 ops/thread, [0,128] range

  70 ++---------+----------+---------+----------+----------+----------+---++
     +cmpxchg +-E--+       +         +          +          +          +    |
  60 ++master +-H--+                  +++            ---+E+-----+E+------+E+
     |                        +E+------E-------+E+---                      |
     |                     ---        +++                                  |
  50 ++              +++---                                               ++
     |              -+E+                                                   |
  40 ++      +++----                                                      ++
     |        E-                                                           |
     |      --|                                                            |
  30 ++   -- +++                                                          ++
     |  +E+                                                                |
  20 ++E+                                                                 ++
     |E+                                                                   |
     |                                                                     |
  10 ++                                                                   ++
     +          +          +         +          +          +          +    |
   0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
     0          10         20        30         40         50         60
                                Number of threads

              atomic_add-bench: 1000000 ops/thread, [0,1024] range

  160 ++---------+---------+----------+---------+----------+----------+---++
      +cmpxchg +-E--+      +          +         +          +          +    |
  140 ++master +-H--+                                           +++      +++
      |                                                -+E+-----+E+-------E|
  120 ++                                       +++ ----                  +++
      |                                +++  ----E--                        |
  100 ++                              --E---   +++                        ++
      |                       +++ ---- +++                                 |
   80 ++                     --E--                                        ++
      |                  ---- +++                                          |
      |              -+E+                                                  |
   60 ++         ---- +++                                                 ++
      |      +E+-                                                          |
   40 ++   --                                                             ++
      |  +E+                                                               |
   20 +EE+                                                                ++
      +++        +         +          +         +          +          +    |
    0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
      0          10        20         30        40         50         60
                                Number of threads

[rth: Rearrange 128-bit cmpxchg helper.  Enforce alignment on LL.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>

1dd089d0
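
The trade-off described at the top of the message above (cmpxchg-based LL/SC emulation is viable but exposed to the ABA problem) can be sketched in portable C11. This is an illustration under stated assumptions, not the code from the patch: `ExclusiveMonitor`, `load_exclusive` and `store_exclusive` are hypothetical names, and the sketch handles only a single 64-bit location.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state: the address and value seen by the last LL. */
typedef struct {
    _Atomic uint64_t *addr;
    uint64_t val;
    bool valid;
} ExclusiveMonitor;

/* Load-exclusive: remember where we loaded from and what we saw. */
static uint64_t load_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr)
{
    m->addr = addr;
    m->val = atomic_load(addr);
    m->valid = true;
    return m->val;
}

/*
 * Store-exclusive emulated with compare-and-swap.
 * Returns 0 on success and 1 on failure.
 * The store succeeds whenever memory still holds the remembered value,
 * even if other writers changed it and changed it back in between
 * (the ABA case). Real LL/SC would fail the store in that situation.
 */
static int store_exclusive(ExclusiveMonitor *m, _Atomic uint64_t *addr,
                           uint64_t newval)
{
    int fail = 1;

    if (m->valid && m->addr == addr) {
        uint64_t expected = m->val;
        if (atomic_compare_exchange_strong(addr, &expected, newval)) {
            fail = 0;
        }
    }
    m->valid = false; /* the monitor is cleared either way */
    return fail;
}
```

A caller would typically retry the load/store pair in a loop until `store_exclusive` returns 0, which is the usual pattern for code written against cmpxchg.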

18 Oct, 2016 1 commit

target-arm: Comments added to identify cases in a switch · 957956b3

Authored by Thomas Hanson on Oct 17, 2016

3 cases in a switch in disas_exc() require reference to the
ARM ARM spec in order to determine what case they're handling.
Signed-off-by: Thomas Hanson <thomas.hanson@linaro.org>
Message-id: 1476301853-15774-5-git-send-email-thomas.hanson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

957956b3