1. 17 Dec, 2018 1 commit
  2. 03 Oct, 2018 1 commit
    • qsp: use atomic64 accessors · ac8c7748
      Emilio G. Cota authored
      With the seqlock, we either have to use atomics to remain
      within defined behaviour (and note that 64-bit atomics aren't
      always guaranteed to compile, irrespective of __nocheck), or
      drop the atomics and be in undefined behaviour territory.
      
      Fix it by dropping the seqlock and using atomic64 accessors.
      This will limit scalability when !CONFIG_ATOMIC64, but those
      machines (1) don't have many users and (2) are unlikely to
      have many cores.
      
      - With CONFIG_ATOMIC64:
      $ tests/atomic_add-bench -n 1 -m -p
       Throughput:         13.00 Mops/s
      
      - Forcing !CONFIG_ATOMIC64:
      $ tests/atomic_add-bench -n 1 -m -p
       Throughput:         10.89 Mops/s
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Message-Id: <20180910232752.31565-5-cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
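The accessor approach described in this commit can be sketched roughly as follows. This is a minimal illustration using C11 `<stdatomic.h>`, not QEMU's own atomic helpers; `SketchEntry`, `sketch_read_u64`, `sketch_set_u64` and `sketch_entry_update` are hypothetical names, and the sketch assumes each entry has a single writer thread, so a relaxed load followed by a relaxed store is enough to keep concurrent readers within defined behaviour.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-entry counters; not the actual QSP layout. */
typedef struct {
    _Atomic uint64_t n_acqs; /* number of acquisitions */
    _Atomic uint64_t ns;     /* accumulated wait time in nanoseconds */
} SketchEntry;

/* Relaxed 64-bit read: readers never observe a torn value. */
static inline uint64_t sketch_read_u64(_Atomic uint64_t *p)
{
    return atomic_load_explicit(p, memory_order_relaxed);
}

/* Relaxed 64-bit write. */
static inline void sketch_set_u64(_Atomic uint64_t *p, uint64_t v)
{
    atomic_store_explicit(p, v, memory_order_relaxed);
}

/*
 * Assumption: only the owning thread updates an entry, so a
 * load + store pair is sufficient and no read-modify-write is needed.
 */
static inline void sketch_entry_update(SketchEntry *e, uint64_t delta_ns)
{
    sketch_set_u64(&e->ns, sketch_read_u64(&e->ns) + delta_ns);
    sketch_set_u64(&e->n_acqs, sketch_read_u64(&e->n_acqs) + 1);
}
```

On targets without native 64-bit atomics, such accessors typically fall back to a lock, which is presumably where the scalability limit mentioned in the message comes from.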
  3. 26 Sep, 2018 1 commit
  4. 24 Aug, 2018 5 commits
    • qsp: track BQL callers explicitly · cb764d06
      Emilio G. Cota authored
      The BQL is acquired via qemu_mutex_lock_iothread(), which makes
      the profiler assign the associated wait time (i.e. most of
      BQL wait time) entirely to that function. This loses the original
      call site information, which does not help diagnose BQL contention.
      Fix it by tracking the callers explicitly.
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
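One common way to keep the original call site when a lock is taken through a wrapper is to turn the wrapper into a macro that forwards `__FILE__` and `__LINE__`. The sketch below illustrates that idea only; `big_lock_acquire`, `big_lock_acquire_impl` and `last_site` are hypothetical names, and the actual locking and profiling steps are left out.

```c
#include <stddef.h>

/* Hypothetical record of where the big lock was last requested. */
typedef struct {
    const char *file;
    int line;
} SketchCallSite;

static SketchCallSite last_site;

/*
 * The implementation receives the caller's location explicitly, so a
 * profiler can attribute wait time to the caller, not to this function.
 */
static void big_lock_acquire_impl(const char *file, int line)
{
    last_site.file = file;
    last_site.line = line;
    /* A real implementation would acquire the lock and profile it here. */
}

/* Callers keep using the short name; the macro captures their location. */
#define big_lock_acquire() big_lock_acquire_impl(__FILE__, __LINE__)
```

With this shape, two callers in different files produce two distinct call sites even though they share the same wrapper.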
    • qsp: support call site coalescing · d557de4a
      Emilio G. Cota authored
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • qsp: add qsp_reset · 996e8d9a
      Emilio G. Cota authored
      I first implemented this by deleting all entries in the global
      hash table. But doing that safely slows down profiling, since
      we'd need to introduce rcu_read_lock/unlock in the fast path.
      
      What's implemented here avoids touching the thread-local
      data in the global hash table. Instead, it takes a snapshot
      of the current state, so that subsequent reports present the
      delta with respect to the snapshot.
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
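The snapshot-based reset described in this commit can be illustrated with a single pair of counters. This is a simplified sketch, not the QSP implementation: `SketchCounters`, `live`, `snapshot`, `sketch_reset` and `sketch_report` are hypothetical names, and a real profiler would snapshot every entry, not just one.

```c
#include <stdint.h>

/* Hypothetical counters; the live copy is only ever incremented. */
typedef struct {
    uint64_t n_acqs;
    uint64_t ns;
} SketchCounters;

static SketchCounters live;     /* updated on the profiling fast path */
static SketchCounters snapshot; /* state captured at the last reset */

/* "Reset" leaves the live data untouched and just records a baseline. */
static void sketch_reset(void)
{
    snapshot = live;
}

/* Reports show only what accumulated since the last reset. */
static SketchCounters sketch_report(void)
{
    SketchCounters delta = {
        .n_acqs = live.n_acqs - snapshot.n_acqs,
        .ns = live.ns - snapshot.ns,
    };
    return delta;
}
```

Because the fast path never sees entries being deleted, it needs no extra synchronization; the cost of a reset moves to the reporting side.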
    • 0a22777c
    • qsp: QEMU's Synchronization Profiler · fe9959a2
      Emilio G. Cota authored
      The goal of this module is to profile synchronization primitives (i.e.
      mutexes, recursive mutexes and condition variables) so that scalability
      issues can be quickly diagnosed.
      
      Sync primitives are profiled by QSP based on the vaddr of the object accessed
      as well as the call site (file:line_nr). That means the same object called
      from two different call sites will be tracked in separate entries, which
      might be reported together or separately (see subsequent commit on
      call site coalescing).
      
      Some perf numbers:
      
      Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
      Command: taskset -c 0 tests/atomic_add-bench -d 5 -m
      
      - Before: 54.80 Mops/s
      - After:  54.75 Mops/s
      
      That is, a negligible slowdown due to the now indirect call to
      qemu_mutex_lock. Note that using a branch instead of an indirect
      call introduces a more severe slowdown (53.65 Mops/s, i.e. 2% slowdown).
      
      Enabling the profiler (with -p, added in this series) is more interesting:
      
      - No profiling: 54.75 Mops/s
      - W/ profiling: 12.53 Mops/s
      
      That is, a 4.36X slowdown.
      
      We can break down this slowdown by removing the get_clock calls or
      the entry lookup:
      
      - No profiling:     54.75 Mops/s
      - W/o get_clock:    25.37 Mops/s
      - W/o entry lookup: 19.30 Mops/s
      - W/ profiling:     12.53 Mops/s
      Signed-off-by: Emilio G. Cota <cota@braap.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
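The keying scheme from this commit message (object address plus file:line call site) can be sketched with a small lookup table. This is an illustration only: `SketchKey`, `SketchSlot`, `sketch_key_equal` and `sketch_lookup` are hypothetical names, and a fixed-size array with linear search stands in for the hash table a real profiler would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key: which object was accessed, and from where. */
typedef struct {
    const void *obj;
    const char *file;
    int line;
} SketchKey;

typedef struct {
    SketchKey key;
    unsigned long n_acqs;
} SketchSlot;

#define SKETCH_MAX 16
static SketchSlot slots[SKETCH_MAX];
static size_t n_slots;

/* Two keys match only if both the object and the call site match. */
static bool sketch_key_equal(const SketchKey *a, const SketchKey *b)
{
    return a->obj == b->obj && a->line == b->line &&
           strcmp(a->file, b->file) == 0;
}

/* Find the entry for (obj, file, line), creating it on first use. */
static SketchSlot *sketch_lookup(const void *obj, const char *file, int line)
{
    SketchKey k = { obj, file, line };

    for (size_t i = 0; i < n_slots; i++) {
        if (sketch_key_equal(&slots[i].key, &k)) {
            return &slots[i];
        }
    }
    if (n_slots == SKETCH_MAX) {
        return NULL; /* table full */
    }
    slots[n_slots].key = k;
    slots[n_slots].n_acqs = 0;
    return &slots[n_slots++];
}
```

Under this key, the same mutex locked from two different lines yields two separate entries, which matches the behaviour the message describes.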