1. 06 Sep, 2017 3 commits
  2. 05 Sep, 2017 1 commit
  3. 16 Aug, 2017 2 commits
  4. 15 Aug, 2017 1 commit
  5. 25 Jul, 2017 1 commit
  6. 20 Jul, 2017 2 commits
  7. 17 Jul, 2017 3 commits
  8. 20 Jun, 2017 1 commit
  9. 06 Jun, 2017 2 commits
    • target/aarch64: optimize indirect branches · e75449a3
      Emilio G. Cota committed
      Measurements:
      
      [Baseline performance is that before applying this and the previous commit]
      
      -                                    NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz
      
       1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
            |                                                                                                                  |
            |   cross                                                                                                          |
       1.6x +cross+jr.................................................####...................................................+-+
            |                                                         #++#                                                     |
            |                                                         #  #                                                     |
       1.5x +-+...................................................*****..#...................................................+-+
            |                                                     *+++*  #                                                     |
            |                                                     *   *  #                                                     |
       1.4x +-+...................................................*...*..#...................................................+-+
            |                                                     *   *  #                                                     |
            |                                     #####           *   *  #                                                     |
       1.3x +-+................................****+++#...........*...*..#...................................................+-+
            |                                  *++*   #           *   *  #                                                     |
            |                                  *  *   #           *   *  #                                                     |
       1.2x +-+................................*..*...#...........*...*..#...................................................+-+
            |                                  *  *   #           *   *  #                                                     |
            |                            ####  *  *   #           *   *  #                                                     |
       1.1x +-+.......................+++#..#..*..*...#...........*...*..#...................................................+-+
            |                         ****  #  *  *   #           *   *  #                                        ****####     |
            |                         *  *  #  *  *   #           *   *  #  ****###   +++####            ****###  *  *   #     |
         1x +-++-++++++-++++****###++-*++*++#++*++*+-+#++****+++++*+++*++#++*++*-+#++*****++#++****###-++*++*-+#++*+-*+++#+-++-+
            |     *****###  *  *  #   *  *  #  *  *   #  *++*###  *   *  #  *  *  #  *   *  #  *  *++#   *  *  #  *  *   #     |
            |     *   *++#  *  *  #   *  *  #  *  *   #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #   *  *  #  *  *   #     |
       0.9x +-+---*****###--****###---****###--****####--****###--*****###--****###--*****###--****###---****###--****####---+-+
            ASSIGNMENT BITFIELD   FOURFP EMULATION   HUFFMAN   LU DECOMPOSITIONNEURAL NUMERIC SORSTRING SORT    hmean
        png: http://imgur.com/qO9ubtk
      NB. cross here represents the previous commit.
      
      -                            SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz
      
       1.5x +-+--------------------------------------------------------------------------------------------------------------+-+
            |                                                                       *****                                      |
            |                                                                       *+++*                           jr         |
            |                                                                       *   *                                      |
       1.4x +-+.....................................................................*...*.....................+++............+-+
            |                                                                       *   *                      |               |
            |                                      *****                            *   *                      |               |
            |                                      *   *                            *   *                    *****             |
       1.3x +-+....................................*...*............................*...*....................*.|.*...........+-+
            |                       +++            *   *                            *   *                    * | *             |
            |                      *****           *   *                            *   *                    *+++*             |
            |                      *   *           *   *                            *   *                    *   *             |
       1.2x +-+....................*...*...........*...*............................*...*...........*****....*...*...........+-+
            |     *****            *   *           *   *                            *   *           *   *    *   *    +++      |
            |     *   *            *   *           *   *                            *   *           *   *    *   *   *****     |
            |     *   *            *   *   *****   *   *                            *   *           *   *    *   *   *   *     |
       1.1x +-+...*...*............*...*...*...*...*...*............................*...*....+++....*...*....*...*...*...*...+-+
            |     *   *            *   *   *   *   *   *                            *   *   *****   *   *    *   *   *   *     |
            |     *   *            *   *   *   *   *   *   *****                    *   *   *   *   *   *    *   *   *   *     |
            |     *   *   *****    *   *   *   *   *   *   *   *   ******           *   *   *   *   *   *    *   *   *   *     |
         1x +-++-+*+++*-++*+++*++++*+-+*+++*-++*+++*-++*+++*+++*++-*++++*-++*****+++*++-*+++*++-*+++*+-+*++++*+++*++-*+++*+-++-+
            |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *+++*   *   *   *   *   *   *    *   *   *   *     |
            |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
            |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
       0.9x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
               astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
        png: http://imgur.com/3Dp4vvq
      
      -                           SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz
      
       1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
            |                                                                                                                  |
            |                                                                                                       jr         |
       1.6x +-+...............................................................................................+++............+-+
            |                                                                                                *****             |
            |                                                                                                *+++*             |
            |                                                                                                *   *             |
       1.5x +-+..............................................................................................*...*...........+-+
            |                                                                        +++                     *   *             |
            |                                                                       *****                    *   *             |
       1.4x +-+.....................................................................*+++*....................*...*...........+-+
            |                                                                       *   *                    *   *             |
            |                                      *****                            *   *                    *   *             |
            |                                      *   *                            *   *   *****            *   *             |
       1.3x +-+....................................*...*............................*...*...*...*............*...*...........+-+
            |                       +++            *   *                            *   *   *   *            *   *             |
            |                      *****           *   *                            *   *   *   *   *****    *   *             |
       1.2x +-+....................*...*...........*...*............................*...*...*...*...*+++*....*...*...*****...+-+
            |                      *   *           *   *                            *   *   *   *   *   *    *   *   *+++*     |
            |     *****            *   *   *****   *   *                            *   *   *   *   *   *    *   *   *   *     |
            |     *   *            *   *   *+++*   *   *                            *   *   *   *   *   *    *   *   *   *     |
       1.1x +-+...*...*............*...*...*...*...*...*............................*...*...*...*...*...*....*...*...*...*...+-+
            |     *   *   *****    *   *   *   *   *   *                    *****   *   *   *   *   *   *    *   *   *   *     |
            |     *   *   *   *    *   *   *   *   *   *    +++    ******   *+++*   *   *   *   *   *   *    *   *   *   *     |
         1x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
               astar   bzip2      gcc   gobmk h264ref   hmmlibquantum      mcf omnetpperlbench   sjengxalancbmk   hmean
        png: http://imgur.com/vRrdc9j
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • target/aarch64: optimize cross-page direct jumps in softmmu · e7872236
      Emilio G. Cota committed
      Perf numbers in next commit's log.
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
  10. 02 Jun, 2017 1 commit
    • arm: Add support for M profile CPUs having different MMU index semantics · 8bd5c820
      Peter Maydell committed
      The M profile CPU's MPU has an awkward corner case which we
      would like to implement with a different MMU index.
      
      We can avoid having to bump the number of MMU modes ARM
      uses, because some of our existing MMU indexes are only
      used by non-M-profile CPUs, so we can borrow one.
      To avoid that getting too confusing, clean up the code
      to try to keep the two meanings of the index separate.
      
      Instead of ARMMMUIdx enum values being identical to core QEMU
      MMU index values, they are now the core index values with some
      high bits set. Any particular CPU always uses the same high
      bits (so eventually A profile cores and M profile cores will
      use different bits). New functions arm_to_core_mmu_idx()
      and core_to_arm_mmu_idx() convert between the two.
      
      In general core index values are stored in 'int' types, and
      ARM values are stored in ARMMMUIdx types.
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Message-id: 1493122030-32191-3-git-send-email-peter.maydell@linaro.org
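The index scheme described in this commit message can be sketched as follows. This is a minimal illustration, not QEMU's code: the mask, the profile bit values, the enum members, and the `sketch_*` function names are all hypothetical stand-ins for the real `ARMMMUIdx`, `arm_to_core_mmu_idx()` and `core_to_arm_mmu_idx()`.

```c
#include <assert.h>

/* Hypothetical constants for illustration; not QEMU's actual definitions. */
#define SKETCH_CORE_IDX_MASK 0x0fu  /* low bits: core MMU index */
#define SKETCH_PROFILE_A     0x10u  /* high bit carried by A-profile values */
#define SKETCH_PROFILE_M     0x40u  /* high bit carried by M-profile values */

/* Architecture-level index: the core index plus profile-specific high bits. */
typedef enum SketchMMUIdx {
    SketchMMUIdx_A_User = 0 | SKETCH_PROFILE_A,
    SketchMMUIdx_A_Priv = 1 | SKETCH_PROFILE_A,
    SketchMMUIdx_M_User = 0 | SKETCH_PROFILE_M,
    SketchMMUIdx_M_Priv = 1 | SKETCH_PROFILE_M,
} SketchMMUIdx;

/* Drop the high bits to get the plain 'int' index used by core code. */
static inline int sketch_to_core_mmu_idx(SketchMMUIdx idx)
{
    return (int)((unsigned)idx & SKETCH_CORE_IDX_MASK);
}

/* Re-attach the high bits that a given CPU always uses. */
static inline SketchMMUIdx sketch_from_core_mmu_idx(unsigned profile_bits,
                                                    int core_idx)
{
    assert(core_idx >= 0 && (unsigned)core_idx <= SKETCH_CORE_IDX_MASK);
    return (SketchMMUIdx)(profile_bits | (unsigned)core_idx);
}
```

Because two profiles can share a core index (here, 0 and 1) while keeping distinct enum values, an existing core index can be borrowed without growing the number of MMU modes, and the two meanings stay separate at the type level.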
  11. 28 Feb, 2017 1 commit
  12. 24 Feb, 2017 1 commit
  13. 08 Feb, 2017 1 commit
  14. 14 Jan, 2017 1 commit
  15. 11 Jan, 2017 3 commits
  16. 27 Dec, 2016 2 commits
  17. 21 Dec, 2016 1 commit
    • Move target-* CPU file into a target/ folder · fcf5ef2a
      Thomas Huth committed
      We've currently got 18 architectures in QEMU, and thus 18 target-xxx
      folders in the root folder of the QEMU source tree. More architectures
      (e.g. RISC-V, AVR) are likely to be included soon, too, so the main
      folder of the QEMU sources slowly gets quite overcrowded with the
      target-xxx folders.
      To disburden the main folder a little bit, let's move the target-xxx
      folders into a dedicated target/ folder, so that target-xxx/ simply
      becomes target/xxx/ instead.
      
      Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
      Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
      Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
      Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
      Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
      Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
      Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
      Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
      Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [cris&microblaze part]
      Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
      Signed-off-by: Thomas Huth <thuth@redhat.com>
  18. 06 Dec, 2016 1 commit
    • target-arm/translate-a64: fix gen_load_exclusive · 5460da50
      Alex Bennée committed
      While testing rth's latest TCG patches with risu I found ldaxp was
      broken. Investigating further I found it was broken by 1dd089d0 when
      the cmpxchg atomic work was merged. As part of that change the code
      attempted to be clever by doing a single 64-bit load and then shuffling
      the data around to set the two 32-bit registers.
      
      As I couldn't quite follow the endian magic, I've simply partially
      reverted the change to the original gen_load_exclusive code. This
      doesn't affect the cmpxchg functionality, as that is all done in the
      gen_store_exclusive part, which is untouched.
      
      I've also restored the comment that was removed (with a slight tweak
      to mention cmpxchg).
      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
      Acked-by: Richard Henderson <rth@twiddle.net>
      Message-id: 20161202173454.19179-1-alex.bennee@linaro.org
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
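To make the "endian magic" mentioned above concrete, here is a minimal sketch of why splitting one 64-bit load into a pair of 32-bit registers depends on byte order. It is not the code from either commit: the `sketch_*` helpers are hypothetical, and the byte-wise load only models a guest memory access in a given endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model a 64-bit guest load from 8 bytes in the given byte order. */
static uint64_t sketch_load64(const uint8_t *p, bool big_endian)
{
    uint64_t v = 0;
    assert(p != NULL);
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        v |= (uint64_t)p[i] << shift;
    }
    return v;
}

/*
 * Split the 64-bit value into the two 32-bit words of a register pair.
 * 'first' must hold the word at the lower address, so which half of the
 * 64-bit value it comes from flips with the byte order.
 */
static void sketch_split_pair(uint64_t v, bool big_endian,
                              uint32_t *first, uint32_t *second)
{
    if (big_endian) {
        *first = (uint32_t)(v >> 32);
        *second = (uint32_t)v;
    } else {
        *first = (uint32_t)v;
        *second = (uint32_t)(v >> 32);
    }
}
```

Doing two separate 32-bit loads, as the original code did, avoids this case split entirely, at the cost of an extra memory access.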
  19. 02 Nov, 2016 1 commit
    • log: Add locking to large logging blocks · 1ee73216
      Richard Henderson committed
      Reuse the existing locking provided by stdio to keep in_asm, cpu,
      op, op_opt, op_ind, and out_asm as contiguous blocks.
      
      While it isn't possible to interleave e.g. in_asm or op_opt logs
      because of the TB lock protecting all code generation, it is
      possible to interleave cpu logs, or to interleave a cpu dump with
      an out_asm dump.
      
      For mingw32, we appear to have no viable solution for this.  The locking
      functions are not properly exported from the system runtime library.
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
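A minimal sketch of the idea in this commit, assuming the "locking provided by stdio" refers to POSIX `flockfile()`/`funlockfile()`. The `sketch_log_block()` helper and its parameters are hypothetical and are not QEMU's logging API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Write a title plus a multi-line dump as one contiguous block.
 * Holding the stdio lock on 'logfile' across all the fprintf() calls keeps
 * other threads' output to the same stream from being interleaved.
 * Returns the number of lines written.
 */
static int sketch_log_block(FILE *logfile, const char *title,
                            const char *const *lines, int nlines)
{
    assert(logfile != NULL);
    flockfile(logfile);            /* recursive lock owned by this thread */
    fprintf(logfile, "%s\n", title);
    for (int i = 0; i < nlines; i++) {
        fprintf(logfile, "  %s\n", lines[i]);
    }
    funlockfile(logfile);
    return nlines + 1;
}
```

Each individual `fprintf()` already takes this lock internally, so without the outer `flockfile()` only single calls are atomic, not the whole block. These functions are POSIX, which is consistent with the note above that mingw32 has no equivalent available.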
  20. 26 Oct, 2016 1 commit
    • target-arm: emulate aarch64's LL/SC using cmpxchg helpers · 1dd089d0
      Emilio G. Cota committed
      Emulating LL/SC with cmpxchg is not correct, since it can
      suffer from the ABA problem. Portable parallel code, however,
      is written assuming only cmpxchg--and not LL/SC--is available.
      This means that in practice emulating LL/SC with cmpxchg is
      a viable alternative.
      
      The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
      This works in both user and system mode. In usermode, it avoids
      pausing all other CPUs to perform the LL/SC pair. The subsequent
      performance and scalability improvement is significant, as the
      plots below show. They plot the throughput of atomic_add-bench
      compiled for ARM and executed on a 64-core x86 machine.
      
      Hi-res plots: http://imgur.com/a/JVc8Y
      
                      atomic_add-bench: 1000000 ops/thread, [0,1] range
      
        18 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        16 ++master +-H--+                                                      ++
           ||                                                                    |
        14 ++                                                                   ++
           | |                                                                   |
        12 ++|                                                                  ++
           | |                                                                   |
        10 ++++                                                                 ++
         8 ++E                                                                  ++
           |+++                                                                  |
         6 ++ |                                                                 ++
           |  |                                                                  |
         4 ++ |                                                                 ++
           |   |                                                                 |
         2 +H++E+---                                                            ++
           + |     +E++----+E+---+--+E+----++E+------+E+------+E++----+E+---+--+E|
         0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                      atomic_add-bench: 1000000 ops/thread, [0,2] range
      
        18 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        16 ++master +-H--+                                                      ++
           | |                                                                   |
        14 ++E                                                                  ++
           | |                                                                   |
        12 ++|                                                                  ++
           |+++                                                                  |
        10 ++ |                                                                 ++
         8 ++ |                                                                 ++
           |  |                                                                  |
         6 ++ |                                                                 ++
           |   |                                                                 |
         4 ++  |                                                                ++
           |  +E+---                                                             |
         2 +H+     +E+-----+++              +++      +++   ---+E+-----+E+------+++
           +++        +    +E+---+--+E+----++E+------+E+---   ++++    +++   +  +E|
         0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                     atomic_add-bench: 1000000 ops/thread, [0,128] range
      
        70 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        60 ++master +-H--+                  +++            ---+E+-----+E+------+E+
           |                        +E+------E-------+E+---                      |
           |                     ---        +++                                  |
        50 ++              +++---                                               ++
           |              -+E+                                                   |
        40 ++      +++----                                                      ++
           |        E-                                                           |
           |      --|                                                            |
        30 ++   -- +++                                                          ++
           |  +E+                                                                |
        20 ++E+                                                                 ++
           |E+                                                                   |
           |                                                                     |
        10 ++                                                                   ++
           +          +          +         +          +          +          +    |
         0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                    atomic_add-bench: 1000000 ops/thread, [0,1024] range
      
        160 ++---------+---------+----------+---------+----------+----------+---++
            +cmpxchg +-E--+      +          +         +          +          +    |
        140 ++master +-H--+                                           +++      +++
            |                                                -+E+-----+E+-------E|
        120 ++                                       +++ ----                  +++
            |                                +++  ----E--                        |
        100 ++                              --E---   +++                        ++
            |                       +++ ---- +++                                 |
         80 ++                     --E--                                        ++
            |                  ---- +++                                          |
            |              -+E+                                                  |
         60 ++         ---- +++                                                 ++
            |      +E+-                                                          |
         40 ++   --                                                             ++
            |  +E+                                                               |
         20 +EE+                                                                ++
            +++        +         +          +         +          +          +    |
          0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
            0          10        20         30        40         50         60
                                      Number of threads
      
      [rth: Rearrange 128-bit cmpxchg helper.  Enforce alignment on LL.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
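The approach described in this commit, and the ABA caveat it opens with, can be sketched with C11 atomics. This is an illustration under stated assumptions, not the helpers added by the patch: `SketchMonitor` and the `sketch_*` functions are hypothetical names, and only a single 64-bit access size is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-CPU record of the last load-exclusive: its address and the value seen. */
typedef struct SketchMonitor {
    _Atomic uint64_t *addr;
    uint64_t val;
} SketchMonitor;

/* LL: remember where we loaded from and what we saw. */
static uint64_t sketch_load_exclusive(SketchMonitor *m, _Atomic uint64_t *addr)
{
    assert(m != NULL && addr != NULL);
    m->addr = addr;
    m->val = atomic_load(addr);
    return m->val;
}

/*
 * SC: succeed only if memory still holds the remembered value.
 * Returns 0 on success and 1 on failure, mirroring a store-exclusive
 * status result. No other CPU has to be paused.
 */
static int sketch_store_exclusive(SketchMonitor *m, _Atomic uint64_t *addr,
                                  uint64_t newval)
{
    assert(m != NULL && addr != NULL);
    if (m->addr != addr) {
        return 1;
    }
    uint64_t expected = m->val;
    bool ok = atomic_compare_exchange_strong(addr, &expected, newval);
    m->addr = NULL;                /* the monitor is consumed either way */
    return ok ? 0 : 1;
}
```

If another thread changes the location from A to B and back to A between the two calls, the compare-exchange still succeeds, whereas a true LL/SC pair would fail. That is the ABA case the message refers to, and code written against cmpxchg-only primitives already has to tolerate it.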
  21. 18 Oct, 2016 3 commits
  22. 04 Oct, 2016 1 commit
  23. 16 Sep, 2016 1 commit
  24. 06 Jun, 2016 1 commit
  25. 19 May, 2016 1 commit
  26. 13 May, 2016 1 commit
  27. 12 May, 2016 2 commits