1. 28 Oct, 2016 11 commits
  2. 27 Oct, 2016 3 commits
    • Merge remote-tracking branch 'remotes/rth/tags/pull-atomic-20161026' into staging · 5929d7e8
      Committed by Peter Maydell
      cmpxchg emulation of atomics, v8
      
      # gpg: Signature made Wed 26 Oct 2016 16:30:03 BST
      # gpg:                using RSA key 0xAD1270CC4DD0279B
      # gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
      # gpg:                 aka "Richard Henderson <rth@redhat.com>"
      # gpg:                 aka "Richard Henderson <rth@twiddle.net>"
      # Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B
      
      * remotes/rth/tags/pull-atomic-20161026: (37 commits)
        target-alpha: Emulate LL/SC using cmpxchg helpers
        target-alpha: Introduce MMU_PHYS_IDX
        target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}
        linux-user: remove handling of aarch64's EXCP_STREX
        linux-user: remove handling of ARM's EXCP_STREX
        target-arm: emulate aarch64's LL/SC using cmpxchg helpers
        target-arm: emulate SWP with atomic_xchg helper
        target-arm: emulate LL/SC using cmpxchg helpers
        target-arm: Rearrange aa32 load and store functions
        tests: add atomic_add-bench
        target-i386: remove helper_lock()
        target-i386: emulate XCHG using atomic helper
        target-i386: emulate LOCK'ed BTX ops using atomic helpers
        target-i386: emulate LOCK'ed XADD using atomic helper
        target-i386: emulate LOCK'ed NEG using cmpxchg helper
        target-i386: emulate LOCK'ed NOT using atomic helper
        target-i386: emulate LOCK'ed INC using atomic helper
        target-i386: emulate LOCK'ed OP instructions using atomic helpers
        target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers
        tcg: Emit barriers with parallel_cpus
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging · 8f9d84df
      Committed by Peter Maydell
      # gpg: Signature made Wed 26 Oct 2016 03:19:06 BST
      # gpg:                using RSA key 0xEF04965B398D6211
      # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
      # gpg: WARNING: This key is not certified with sufficiently trusted signatures!
      # gpg:          It is not certain that the signature belongs to the owner.
      # Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211
      
      * remotes/jasowang/tags/net-pull-request:
        colo-proxy: fix memory leak
        net: rtl8139: limit processing of ring descriptors
        net: vmxnet: initialise local tx descriptor
        e1000e: Don't zero out buffer address in rx descriptor
        net: rocker: set limit to DMA buffer size
        net: eepro100: fix memory leak in device uninit
        tap-bsd: OpenBSD uses tap(4) now
        net: pcnet: fix source formatting and indentation
        net: pcnet: check rx/tx descriptor ring length
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
    • Merge remote-tracking branch 'remotes/vivier/tags/m68k-part1-pull-request' into staging · 991a97ac
      Committed by Peter Maydell
      # gpg: Signature made Tue 25 Oct 2016 19:58:46 BST
      # gpg:                using RSA key 0xF30C38BD3F2FBE3C
      # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
      # gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
      # gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
      # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C
      
      * remotes/vivier/tags/m68k-part1-pull-request: (23 commits)
        target-m68k: Optimize gen_flush_flags
        target-m68k: Optimize some comparisons
        target-m68k: Use setcond for scc
        target-m68k: Introduce DisasCompare
        target-m68k: Reorg flags handling
        target-m68k: Remove incorrect clearing of cc_x
        target-m68k: Some fixes to SR and flags management
        target-m68k: Print flags properly
        target-m68k: update CPU flags management
        target-m68k: don't update cc_dest in helpers
        target-m68k: update move to/from ccr/sr
        target-m68k: remove m68k_cpu_exec_enter() and m68k_cpu_exec_exit()
        target-m68k: Replace helper_xflag_lt with setcond
        target-m68k: allow to update flags with operation on words and bytes
        target-m68k: REG() macro cleanup
        target-m68k: set PAGE_BITS to 12 for m68k
        target-m68k: define operand sizes
        target-m68k: set disassembler mode to 680x0 or coldfire
        target-m68k: introduce read_imXX() functions
        target-m68k: manage scaled index
        ...
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
  3. 26 Oct, 2016 26 commits
    • target-alpha: Emulate LL/SC using cmpxchg helpers · ed283916
      Committed by Richard Henderson
      Emulating LL/SC with cmpxchg is not strictly correct, since it can
      suffer from the ABA problem.  However, portable parallel
      code is written assuming only cmpxchg is available, which means
      that in practice this is a viable alternative.
      Signed-off-by: Richard Henderson <rth@twiddle.net>
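To make the ABA caveat above concrete, here is a minimal sketch in portable C11 of how a load-linked/store-conditional pair can be approximated with compare-and-swap. It is not QEMU's code: `ExclState`, `emu_ll`, `emu_sc` and `aba_goes_unnoticed` are hypothetical names chosen for illustration, and the sketch assumes a 32-bit access and a single-slot monitor per CPU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state: address and value seen by the last LL. */
typedef struct {
    _Atomic uint32_t *excl_addr;
    uint32_t excl_val;
} ExclState;

/* Load-linked: load the value and remember it for the paired SC. */
static uint32_t emu_ll(ExclState *s, _Atomic uint32_t *addr)
{
    s->excl_addr = addr;
    s->excl_val = atomic_load(addr);
    return s->excl_val;
}

/*
 * Store-conditional: succeeds only if memory still holds the value
 * that the LL observed.  A real SC would also fail if the location
 * had been written in between, even with the same value.
 */
static bool emu_sc(ExclState *s, _Atomic uint32_t *addr, uint32_t newval)
{
    if (s->excl_addr != addr) {
        return false;               /* no matching LL for this address */
    }
    uint32_t expected = s->excl_val;
    bool ok = atomic_compare_exchange_strong(addr, &expected, newval);
    s->excl_addr = NULL;            /* the monitor is consumed either way */
    return ok;
}

/* Returns true when the emulated SC succeeds despite an A -> B -> A change. */
static bool aba_goes_unnoticed(void)
{
    _Atomic uint32_t mem = 1;
    ExclState s = {0};

    emu_ll(&s, &mem);               /* observes A (1) */
    atomic_store(&mem, 2);          /* another writer stores B */
    atomic_store(&mem, 1);          /* ... and then restores A */
    return emu_sc(&s, &mem, 5);     /* cmpxchg cannot tell the difference */
}
```

In this sketch `aba_goes_unnoticed()` returns true, which is exactly the case where hardware LL/SC would have failed the store. Code written against cmpxchg alone already has to tolerate this, which is the argument the commit message makes.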
    • target-alpha: Introduce MMU_PHYS_IDX · 6a73ecf5
      Committed by Richard Henderson
      Rather than using helpers for physical accesses, use a mmu index.
      The primary cleanup is with store-conditional on physical addresses.
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • target-arm: remove EXCP_STREX + cpu_exclusive_{test, info} · 05188cc7
      Committed by Emilio G. Cota
      The exception is not emitted anymore; remove it and the associated
      TCG variables.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Message-Id: <1467054136-10430-31-git-send-email-cota@braap.org>
    • linux-user: remove handling of aarch64's EXCP_STREX · f4e6eb7f
      Committed by Emilio G. Cota
      The exception is not emitted anymore.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Message-Id: <1467054136-10430-30-git-send-email-cota@braap.org>
    • linux-user: remove handling of ARM's EXCP_STREX · b50b82fc
      Committed by Emilio G. Cota
      The exception is not emitted anymore.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Message-Id: <1467054136-10430-29-git-send-email-cota@braap.org>
    • target-arm: emulate aarch64's LL/SC using cmpxchg helpers · 1dd089d0
      Committed by Emilio G. Cota
      Emulating LL/SC with cmpxchg is not correct, since it can
      suffer from the ABA problem. Portable parallel code, however,
      is written assuming only cmpxchg--and not LL/SC--is available.
      This means that in practice emulating LL/SC with cmpxchg is
      a viable alternative.
      
      The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
      This works in both user and system mode. In user mode, it avoids
      pausing all other CPUs to perform the LL/SC pair. The resulting
      performance and scalability improvement is significant, as the
      plots below show. They plot the throughput of atomic_add-bench
      compiled for ARM and executed on a 64-core x86 machine.
      
      Hi-res plots: http://imgur.com/a/JVc8Y
      
                      atomic_add-bench: 1000000 ops/thread, [0,1] range
      
        18 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        16 ++master +-H--+                                                      ++
           ||                                                                    |
        14 ++                                                                   ++
           | |                                                                   |
        12 ++|                                                                  ++
           | |                                                                   |
        10 ++++                                                                 ++
         8 ++E                                                                  ++
           |+++                                                                  |
         6 ++ |                                                                 ++
           |  |                                                                  |
         4 ++ |                                                                 ++
           |   |                                                                 |
         2 +H++E+---                                                            ++
           + |     +E++----+E+---+--+E+----++E+------+E+------+E++----+E+---+--+E|
         0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                      atomic_add-bench: 1000000 ops/thread, [0,2] range
      
        18 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        16 ++master +-H--+                                                      ++
           | |                                                                   |
        14 ++E                                                                  ++
           | |                                                                   |
        12 ++|                                                                  ++
           |+++                                                                  |
        10 ++ |                                                                 ++
         8 ++ |                                                                 ++
           |  |                                                                  |
         6 ++ |                                                                 ++
           |   |                                                                 |
         4 ++  |                                                                ++
           |  +E+---                                                             |
         2 +H+     +E+-----+++              +++      +++   ---+E+-----+E+------+++
           +++        +    +E+---+--+E+----++E+------+E+---   ++++    +++   +  +E|
         0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                     atomic_add-bench: 1000000 ops/thread, [0,128] range
      
        70 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        60 ++master +-H--+                  +++            ---+E+-----+E+------+E+
           |                        +E+------E-------+E+---                      |
           |                     ---        +++                                  |
        50 ++              +++---                                               ++
           |              -+E+                                                   |
        40 ++      +++----                                                      ++
           |        E-                                                           |
           |      --|                                                            |
        30 ++   -- +++                                                          ++
           |  +E+                                                                |
        20 ++E+                                                                 ++
           |E+                                                                   |
           |                                                                     |
        10 ++                                                                   ++
           +          +          +         +          +          +          +    |
         0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                    atomic_add-bench: 1000000 ops/thread, [0,1024] range
      
        160 ++---------+---------+----------+---------+----------+----------+---++
            +cmpxchg +-E--+      +          +         +          +          +    |
        140 ++master +-H--+                                           +++      +++
            |                                                -+E+-----+E+-------E|
        120 ++                                       +++ ----                  +++
            |                                +++  ----E--                        |
        100 ++                              --E---   +++                        ++
            |                       +++ ---- +++                                 |
         80 ++                     --E--                                        ++
            |                  ---- +++                                          |
            |              -+E+                                                  |
         60 ++         ---- +++                                                 ++
            |      +E+-                                                          |
         40 ++   --                                                             ++
            |  +E+                                                               |
         20 +EE+                                                                ++
            +++        +         +          +         +          +          +    |
          0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
            0          10        20         30        40         50         60
                                      Number of threads
      
      [rth: Rearrange 128-bit cmpxchg helper.  Enforce alignment on LL.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-28-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • target-arm: emulate SWP with atomic_xchg helper · cf12bce0
      Committed by Emilio G. Cota
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-25-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • target-arm: emulate LL/SC using cmpxchg helpers · 354161b3
      Committed by Emilio G. Cota
      Emulating LL/SC with cmpxchg is not correct, since it can
      suffer from the ABA problem. Portable parallel code, however,
      is written assuming only cmpxchg--and not LL/SC--is available.
      This means that in practice emulating LL/SC with cmpxchg is
      a viable alternative.
      
      The appended patch emulates LL/SC pairs in ARM with cmpxchg helpers.
      This works in both user and system mode. In user mode, it avoids
      pausing all other CPUs to perform the LL/SC pair. The resulting
      performance and scalability improvement is significant, as the
      plots below show. They plot the throughput of atomic_add-bench
      compiled for ARM and executed on a 64-core x86 machine.
      
      Hi-res plots: http://imgur.com/a/aNQpB
      
                     atomic_add-bench: 1000000 ops/thread, [0,1] range
      
        9 ++---------+----------+----------+----------+----------+----------+---++
          +cmpxchg +-E--+       +          +          +          +          +    |
        8 +Emaster +-H--+                                                       ++
          | |                                                                    |
        7 ++E                                                                   ++
          | |                                                                    |
        6 ++++                                                                  ++
          |  |                                                                   |
        5 ++ |                                                                  ++
        4 ++ |                                                                  ++
          |  |                                                                   |
        3 ++ |                                                                  ++
          |   |                                                                  |
        2 ++  |                                                                 ++
          |H++E+---                                  +++  ---+E+------+E+------+E|
        1 +++     +E+-----+E+------+E+------+E+------+E+--   +++      +++       ++
          ++H+       +    +++   +  +++     ++++       +          +          +    |
        0 ++--H----H-+-----H----+----------+----------+----------+----------+---++
          0          10         20         30         40         50         60
                                     Number of threads
      
                      atomic_add-bench: 1000000 ops/thread, [0,2] range
      
        16 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +          +          +    |
        14 ++master +-H--+                                                      ++
           | |                                                                   |
        12 ++|                                                                  ++
           | E                                                                   |
        10 ++|                                                                  ++
           | |                                                                   |
         8 ++++                                                                 ++
           |E+|                                                                  |
           |  |                                                                  |
         6 ++ |                                                                 ++
           |   |                                                                 |
         4 ++  |                                                                ++
           |  +E+---       +++      +++              +++           ---+E+------+E|
         2 +H+     +E+------E-------+E+-----+E+------+E+------+E+--            +++
           + |        +    +++   +         ++++       +          +          +    |
         0 ++H-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                     atomic_add-bench: 1000000 ops/thread, [0,128] range
      
        70 ++---------+----------+---------+----------+----------+----------+---++
           +cmpxchg +-E--+       +         +          +       ++++          +    |
        60 ++master +-H--+                                 ----E------+E+-------++
           |                                        -+E+---   +++     +++      +E|
           |                                +++ ---- +++                       ++|
        50 ++                       +++  ---+E+-                                ++
           |                        -E---                                        |
        40 ++                    ---+++                                         ++
           |               +++---                                                |
           |              -+E+                                                   |
        30 ++      +++----                                                      ++
           |       +E+                                                           |
        20 ++ +++--                                                             ++
           |  +E+                                                                |
           |+E+                                                                  |
        10 +E+                                                                  ++
           +          +          +         +          +          +          +    |
         0 +HH-H----H-+-----H----+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                    atomic_add-bench: 1000000 ops/thread, [0,1024] range
      
        120 ++---------+---------+----------+---------+----------+----------+---++
            +cmpxchg +-E--+      +          +         +          +          +    |
            | master +-H--+                                                    ++|
        100 ++                                                              ----E+
            |                                                 +++  ---+E+---   ++|
            |                                                --E---   +++        |
         80 ++                                           ---- +++               ++
            |                                     ---+E+-                        |
         60 ++                              -+E+--                              ++
            |                       +++ ---- +++                                 |
            |                      -+E+-                                         |
         40 ++              +++----                                             ++
            |      +++   ---+E+                                                  |
            |     -+E+---                                                        |
         20 ++ +E+                                                              ++
            |+E+++                                                               |
            +E+        +         +          +         +          +          +    |
          0 +HH-H---H--+-----H---+----------+---------+----------+----------+---++
            0          10        20         30        40         50         60
                                      Number of threads
      
      [rth: Enforce alignment for ldrexd.]
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-23-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • target-arm: Rearrange aa32 load and store functions · 7f5616f5
      Committed by Richard Henderson
      Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
      a temp and expand with tcg_gen_extu_i32_tl.  Split out gen_aa32_addr,
      gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
    • tests: add atomic_add-bench · 070e3edc
      Committed by Emilio G. Cota
      With this microbenchmark we can measure the overhead of emulating atomic
      instructions with a configurable degree of contention.
      
      The benchmark spawns $n threads, each performing $o atomic ops (additions)
      in a loop. Each atomic operation is performed on a different cache line
      (assuming lines are 64 bytes long) that is randomly selected from a range [0, $r).
      
      [ Note: each $foo corresponds to a -foo flag ]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
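The benchmark structure described in this commit can be sketched in a few lines of C with POSIX threads. This is an illustrative approximation rather than the contents of tests/atomic_add-bench: `Slot`, `Worker`, `worker_fn` and `run_bench` are hypothetical names, the random number generator is a simple LCG picked for brevity, and timing (needed to report throughput) is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per cache line, so distinct slots never share a line. */
typedef struct {
    _Alignas(CACHE_LINE) _Atomic uint64_t count;
} Slot;

typedef struct {
    Slot *slots;
    unsigned range;        /* slots are picked from [0, range) */
    unsigned long n_ops;   /* atomic additions per thread */
    unsigned seed;         /* per-thread LCG state */
} Worker;

static void *worker_fn(void *arg)
{
    Worker *w = arg;
    for (unsigned long i = 0; i < w->n_ops; i++) {
        w->seed = w->seed * 1103515245u + 12345u;
        unsigned idx = (w->seed >> 16) % w->range;
        atomic_fetch_add(&w->slots[idx].count, 1);
    }
    return NULL;
}

/*
 * Runs n_threads workers and returns the sum of all counters.
 * range must be at least 1.  Returns 0 if allocation fails.
 */
static uint64_t run_bench(unsigned n_threads, unsigned long n_ops,
                          unsigned range)
{
    Slot *slots = aligned_alloc(CACHE_LINE, (size_t)range * sizeof(Slot));
    pthread_t *tids = malloc(n_threads * sizeof(*tids));
    Worker *ws = malloc(n_threads * sizeof(*ws));
    uint64_t total = 0;
    unsigned started = 0;

    if (slots && tids && ws) {
        for (unsigned i = 0; i < range; i++) {
            atomic_init(&slots[i].count, 0);
        }
        for (unsigned i = 0; i < n_threads; i++) {
            ws[i] = (Worker){ slots, range, n_ops, i + 1 };
            if (pthread_create(&tids[i], NULL, worker_fn, &ws[i]) != 0) {
                break;
            }
            started++;
        }
        for (unsigned i = 0; i < started; i++) {
            pthread_join(tids[i], NULL);
        }
        for (unsigned i = 0; i < range; i++) {
            total += atomic_load(&slots[i].count);
        }
    }
    free(ws);
    free(tids);
    free(slots);
    return total;
}
```

A small `range` concentrates all threads on a few cache lines (high contention), while a large one spreads them out. When every thread starts successfully, the returned total equals `n_threads * n_ops`, which doubles as a correctness check on the atomic additions.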
    • target-i386: remove helper_lock() · 37b995f6
      Committed by Emilio G. Cota
      It's been superseded by the atomic helpers.
      
      The use of the atomic helpers provides a significant performance and scalability
      improvement. Below is the result of running the atomic_add-bench microbenchmark with:
       $ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 5000000 -r $r -n $n
      where $n is the number of threads and $r is the allowed range for the additions.
      
      The scenarios measured are:
      - atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
      - cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
      - master: before this patchset
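
The difference between the "atomic" and "cmpxchg" scenarios can be illustrated outside of TCG with plain C11 atomics. The sketch below is only an analogy to the two code shapes, not the helpers themselves; `add_with_fetch_add` and `add_with_cmpxchg_loop` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* "atomic" shape: a single atomic read-modify-write operation. */
static uint32_t add_with_fetch_add(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v) + v;   /* returns the new value */
}

/* "cmpxchg" shape: read, compute, and retry until the CAS succeeds. */
static uint32_t add_with_cmpxchg_loop(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load(p);
    while (!atomic_compare_exchange_weak(p, &old, old + v)) {
        /* On failure, old now holds the current value; try again. */
    }
    return old + v;                      /* returns the new value */
}
```

Both functions produce the same result. Under contention the loop variant may retry many times, which is consistent with the cmpxchg curves sitting below the atomic ones in the plots.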
      
      Results are sorted by ascending range, i.e. descending degree of contention.
      The Y axis is throughput in Mops/s. Tests were run on an AMD machine with 64
      Opteron 6376 cores.
      
                      atomic_add-bench: 5000000 ops/thread, [0,1] range
      
        25 ++---------+----------+---------+----------+----------+----------+---++
           + atomic +-E--+       +         +          +          +          +    |
           |cmpxchg +-H--+                                                       |
        20 +Emaster +-N--+                                                      ++
           ||                                                                    |
           |++                                                                   |
           ||                                                                    |
        15 +++                                                                  ++
           |N|                                                                   |
           |+|                                                                   |
        10 ++|                                                                  ++
           |+|+                                                                  |
           | |    -+E+------        +++  ---+E+------+E+------+E+-----+E+------+E|
           |+E+E+- +++     +E+------+E+--                                        |
         5 ++|+                                                                 ++
           |+N+H+---                                 +++                         |
           ++++N+--+H++----+++   +  +++  --++H+------+H+------+H++----+H+---+--- |
         0 ++---------+-----H----+---H-----+----------+----------+----------+---H+
           0          10         20        30         40         50         60
                                      Number of threads
      
                      atomic_add-bench: 5000000 ops/thread, [0,2] range
      
        25 ++---------+----------+---------+----------+----------+----------+---++
           ++atomic +-E--+       +         +          +          +          +    |
           |cmpxchg +-H--+                                                       |
        20 ++master +-N--+                                                      ++
           |E|                                                                   |
           |++                                                                   |
           ||E                                                                   |
        15 ++|                                                                  ++
           |N||                                                                  |
           |+||                                   ---+E+------+E+-----+E+------+E|
        10 ++| |        ---+E+------+E+-----+E+---                    +++      +++
           ||H+E+--+E+--                                                         |
           |+++++                                                                |
           | ||                                                                  |
         5 ++|+H+--                                  +++                        ++
           |+N+    -                              ---+H+------+H+------          |
           +  +N+--+H++----+H+---+--+H+----++H+---    +          +    +H+---+--+H|
         0 ++---------+----------+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                      atomic_add-bench: 5000000 ops/thread, [0,8] range
      
        40 ++---------+----------+---------+----------+----------+----------+---++
           ++atomic +-E--+       +         +          +          +          +    |
        35 +cmpxchg +-H--+                                                      ++
           | master +-N--+               ---+E+------+E+------+E+-----+E+------+E|
        30 ++|                   ---+E+--   +++                                 ++
           | |            -+E+---                                                |
        25 ++E        ---- +++                                                  ++
           |+++++ -+E+                                                           |
        20 +E+ E-- +++                                                          ++
           |H|+++                                                                |
           |+|                                       +H+-------                  |
        15 ++H+                                   ---+++      +H+------         ++
           |N++H+--                         +++---                    +H+------++|
        10 ++ +++  -       +++           ---+H+                       +++      +H+
           | |     +H+-----+H+------+H+--                                        |
         5 ++|                      +++                                         ++
           ++N+N+--+N++          +         +          +          +          +    |
         0 ++---------+----------+---------+----------+----------+----------+---++
           0          10         20        30         40         50         60
                                      Number of threads
      
                     atomic_add-bench: 5000000 ops/thread, [0,128] range
      
        160 ++---------+---------+----------+---------+----------+----------+---++
            + atomic +-E--+      +          +         +          +          +    |
        140 +cmpxchg +-H--+                          +++      +++               ++
            | master +-N--+                           E--------E------+E+------++|
        120 ++                                      --|        |      +++       E+
            |                                     -- +++      +++              ++|
        100 ++                                   -                              ++
            |                                +++-                     +++      ++|
         80 ++                              -+E+    -+H+------+H+------H--------++
            |                           ----    ----                  +++       H|
            |            ---+E+-----+E+-  ---+H+                               ++|
         60 ++     +E+---   +++  ---+H+---                                      ++
            |    --+++   ---+H+--                                                |
         40 ++ +E+-+H+---                                                       ++
            |  +H+                                                               |
         20 +EE+                                                                ++
            +N+        +         +          +         +          +          +    |
          0 ++N-N---N--+---------+----------+---------+----------+----------+---++
            0          10        20         30        40         50         60
                                      Number of threads
      
                    atomic_add-bench: 5000000 ops/thread, [0,1024] range
      
        350 ++---------+---------+----------+---------+----------+----------+---++
            + atomic +-E--+      +          +         +          +          +    |
        300 +cmpxchg +-H--+                                                    +++
            | master +-N--+                                           +++       ||
            |                                                 +++      |    ----E|
        250 ++                                                 |   ----E----    ++
            |                                              ----E---    |    ---+H|
        200 ++                                      -+E+---   +++  ---+H+---    ++
            |                                   ----         -+H+--              |
            |                                +E+     +++ ---- +++                |
        150 ++                            ---+++  ---+H+-                       ++
            |                          ---  -+H+--                               |
        100 ++                   ---+E+ ---- +++                                ++
            |      +++   ---+E+-----+H+-                                         |
            |     -+E+------+H+--                                                |
         50 ++ +E+                                                              ++
            +EE+       +         +          +         +          +          +    |
          0 ++N-N---N--+---------+----------+---------+----------+----------+---++
            0          10        20         30        40         50         60
                                      Number of threads
      
        hi-res: http://imgur.com/a/fMRmq
      
      For master I stopped measuring after 8 threads, because there is little
      point in measuring the well-known performance collapse of a contended lock.
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-21-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      37b995f6
    • E
      target-i386: emulate XCHG using atomic helper · ea97ebe8
      Emilio G. Cota committed
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-19-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      ea97ebe8
    • E
      target-i386: emulate LOCK'ed BTX ops using atomic helpers · cfe819d3
      Emilio G. Cota committed
      [rth: Avoid redundant qemu_ld in locked case.  Fix previously unnoticed
      incorrect zero-extension of address in register-offset case.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-18-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      cfe819d3
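A locked bit-test instruction maps naturally onto an atomic fetch-and-op: the old value supplies the carry flag, and the operation itself updates the bit. The following is a minimal sketch in C11, not the QEMU translator code. The function names `atomic_bts_u32`, `atomic_btr_u32`, and `atomic_btc_u32` are hypothetical, and the sketch only covers a bit offset inside a single 32-bit word.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not taken from QEMU. */

/* BTS: set the bit, return its previous state (what x86 puts in CF). */
static bool atomic_bts_u32(_Atomic uint32_t *p, unsigned bit)
{
    /* Offset reduced modulo 32, as in the immediate-operand form. */
    uint32_t mask = UINT32_C(1) << (bit & 31);
    return (atomic_fetch_or(p, mask) & mask) != 0;
}

/* BTR: clear the bit, return its previous state. */
static bool atomic_btr_u32(_Atomic uint32_t *p, unsigned bit)
{
    uint32_t mask = UINT32_C(1) << (bit & 31);
    return (atomic_fetch_and(p, ~mask) & mask) != 0;
}

/* BTC: complement the bit, return its previous state. */
static bool atomic_btc_u32(_Atomic uint32_t *p, unsigned bit)
{
    uint32_t mask = UINT32_C(1) << (bit & 31);
    return (atomic_fetch_xor(p, mask) & mask) != 0;
}
```

Because each fetch-and-op already returns the old memory value, no separate load is needed to compute the flag, which appears to be the point of the "redundant qemu_ld" remark.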
    • E
      target-i386: emulate LOCK'ed XADD using atomic helper · f53b0181
      Emilio G. Cota committed
      [rth: Move load of reg value to common location.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-17-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      f53b0181
    • E
      target-i386: emulate LOCK'ed NEG using cmpxchg helper · 8eb8c738
      Emilio G. Cota committed
      [rth: Move redundant qemu_load out of cmpxchg loop.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-16-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      8eb8c738
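Negation has no dedicated atomic read-modify-write primitive in C11, so a compare-and-swap retry loop is the usual way to build one. The sketch below illustrates that general technique, with the initial load hoisted out of the loop in the spirit of the rth note. It is not the QEMU code, and `atomic_neg_u32` is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically replace *p with -*p, return the old value. */
static uint32_t atomic_neg_u32(_Atomic uint32_t *p)
{
    /* Load once, before the loop. */
    uint32_t old = atomic_load(p);

    /*
     * On failure, atomic_compare_exchange_weak writes the value it found
     * into `old`, so the loop never needs to reload memory explicitly.
     */
    while (!atomic_compare_exchange_weak(p, &old, 0u - old)) {
        /* Another thread changed *p (or the weak CAS failed spuriously); retry. */
    }
    return old;
}
```

The returned old value is what a translator would need to derive the flags of the emulated instruction.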
    • E
      target-i386: emulate LOCK'ed NOT using atomic helper · 2a5fe8ae
      Emilio G. Cota committed
      [rth: Avoid qemu_load that's redundant with the atomic op.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-15-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      2a5fe8ae
    • E
      target-i386: emulate LOCK'ed INC using atomic helper · 60e57346
      Emilio G. Cota committed
      [rth: Merge gen_inc_locked back into gen_inc to share cc update.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-14-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      60e57346
    • E
      target-i386: emulate LOCK'ed OP instructions using atomic helpers · a7cee522
      Emilio G. Cota committed
      [rth: Eliminate some unnecessary temporaries.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-13-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      a7cee522
    • E
      target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers · ae03f8de
      Emilio G. Cota committed
      The diff here is uglier than necessary. All this does is to turn
      
      FOO
      
      into:
      
      if (s->prefix & PREFIX_LOCK) {
        BAR
      } else {
        FOO
      }
      
      where FOO is the original implementation of an unlocked cmpxchg.
      
      [rth: Adjust unlocked cmpxchg to use movcond instead of branches.
      Adjust helpers to use atomic helpers.]
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      ae03f8de
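The FOO/BAR split described in this commit can be pictured at the level of plain C rather than TCG ops. The sketch below is an assumption-laden illustration, not the translator code: `emulate_cmpxchg_u32` and its `lock_prefix` parameter are hypothetical, the locked path stands in for BAR, and the unlocked path stands in for FOO, written with a conditional select in the manner of the movcond adjustment.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of 32-bit x86 CMPXCHG.
 * Compares *mem with *acc. If equal, stores src to *mem and returns true
 * (ZF set). Otherwise loads *mem into *acc and returns false (ZF clear).
 */
static bool emulate_cmpxchg_u32(_Atomic uint32_t *mem, uint32_t *acc,
                                uint32_t src, bool lock_prefix)
{
    if (lock_prefix) {
        /* BAR: a single atomic compare-and-swap. On failure the
         * current memory value is written back into *acc. */
        return atomic_compare_exchange_strong(mem, acc, src);
    } else {
        /* FOO: separate load, select, and store; not atomic as a whole. */
        uint32_t old = atomic_load_explicit(mem, memory_order_relaxed);
        bool equal = (old == *acc);
        /* Select the value to store instead of branching around the store. */
        uint32_t newval = equal ? src : old;
        atomic_store_explicit(mem, newval, memory_order_relaxed);
        /* When equal, old == *acc already, so this assignment is harmless. */
        *acc = old;
        return equal;
    }
}
```

Both paths produce the same architectural result when only one thread touches the location; they differ only in whether a concurrent writer can slip in between the load and the store.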
    • R
      91682118
    • R
      tcg: Add CONFIG_ATOMIC64 · df79b996
      Richard Henderson committed
      Allow qemu to build on 32-bit hosts without 64-bit atomic ops.
      
      Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
      guests, we still need some way to handle the 32-bit guest using a
      64-bit atomic operation.  Do so by dropping back to single-step.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      df79b996
    • R
      tcg: Add atomic128 helpers · 7ebee43e
      Richard Henderson committed
      Force the use of cmpxchg16b on x86_64.
      
      Wikipedia suggests that only very old AMD64 (circa 2004) did not have
      this instruction.  Further, it's required by Windows 8 so no new cpus
      will ever omit it.
      
      If we truly care about these, then we could check this at startup time
      and then avoid executing paths that use it.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      7ebee43e
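The startup-time check mentioned as a possible follow-up could look roughly like the sketch below. This is not something the commit implements. It assumes a GCC or Clang toolchain that ships `<cpuid.h>`, the function name `host_has_cmpxchg16b` is hypothetical, and on non-x86 hosts it simply reports false.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid() */
#endif

/* Hypothetical probe: does the host CPU advertise CMPXCHG16B? */
static bool host_has_cmpxchg16b(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    /* CPUID leaf 1, ECX bit 13 is the CMPXCHG16B feature flag. */
    return (ecx & (1u << 13)) != 0;
#else
    return false;
#endif
}
```

A caller would run this once during initialization and route 128-bit atomic operations to a fallback path when it returns false.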
    • R
      tcg: Add atomic helpers · c482cb11
      Richard Henderson committed
      Add all of cmpxchg, op_fetch, fetch_op, and xchg.
      Handle both endiannesses, and sizes up to 8 bytes.
      Handle expanding non-atomically, when emulating in serial.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      c482cb11
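The difference between the fetch_op and op_fetch families is only which value is returned: the one before the operation or the one after it. A small sketch using the GCC/Clang `__atomic` builtins makes this concrete. The wrapper names below are hypothetical and are not the helper names added by this commit.

```c
#include <stdint.h>

/* Hypothetical wrappers around GCC/Clang __atomic builtins. */

/* fetch_op: perform the add, return the value *before* it. */
static uint32_t fetch_add_u32(uint32_t *p, uint32_t v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* op_fetch: perform the add, return the value *after* it. */
static uint32_t add_fetch_u32(uint32_t *p, uint32_t v)
{
    return __atomic_add_fetch(p, v, __ATOMIC_SEQ_CST);
}

/* xchg: store a new value, return the one it replaced. */
static uint32_t xchg_u32(uint32_t *p, uint32_t v)
{
    return __atomic_exchange_n(p, v, __ATOMIC_SEQ_CST);
}
```

For example, starting from 1, `fetch_add_u32(&x, 2)` returns 1 and leaves 3 in memory, while a following `add_fetch_u32(&x, 2)` returns 5.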
    • R
      cputlb: Tidy some macros · c86c6e4c
      Richard Henderson committed
      TGT_LE and TGT_BE are not size dependent and do not need to be
      redefined.  The others are no longer used at all.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      c86c6e4c
    • R
      cputlb: Move most of iotlb code out of line · 82a45b96
      Richard Henderson committed
      Saves 2k code size off of a cold path.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      82a45b96
    • R
      cputlb: Remove includes from softmmu_template.h · 40978428
      Richard Henderson committed
      We already include exec/address-spaces.h and exec/memory.h in
      cputlb.c; the include of qemu/timer.h appears to be a fossil.
      Reviewed-by: Emilio G. Cota <cota@braap.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: Richard Henderson <rth@twiddle.net>
      40978428