1. 09 Dec, 2014 2 commits
  2. 30 Sep, 2014 2 commits
  3. 18 Sep, 2014 1 commit
  4. 18 Jul, 2014 1 commit
  5. 20 Jun, 2014 5 commits
  6. 14 Apr, 2014 1 commit
  7. 14 Mar, 2014 4 commits
    • perf bench: Add futex-requeue microbenchmark · 0fb298cf
      Davidlohr Bueso committed
      Block a bunch of threads on a futex and requeue them on another, N at a
      time.
      
      This program is particularly useful to measure the latency of nthread
      requeues without waking up any tasks -- thus mimicking a regular
      futex_wait.
      
      An example run:
      
        $ perf bench futex requeue -r 100 -t 64
        Run summary [PID 151011]: Requeuing 64 threads (from 0x7d15c4 to 0x7d15c8), 1 at a time.
      
        [Run 1]: Requeued 64 of 64 threads in 0.0400 ms
        [Run 2]: Requeued 64 of 64 threads in 0.0390 ms
        [Run 3]: Requeued 64 of 64 threads in 0.0400 ms
        ...
        [Run 100]: Requeued 64 of 64 threads in 0.0390 ms
        Requeued 64 of 64 threads in 0.0399 ms (+-0.37%)
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Darren Hart <dvhart@linux.intel.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Link: http://lkml.kernel.org/r/1387081917-9102-4-git-send-email-davidlohr@hp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
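The requeue operation described above can be sketched as a thin wrapper around the raw futex system call. This is a minimal illustration, not the code from the commit: the helper name `requeue_waiters` is hypothetical, and it assumes a Linux system with the `FUTEX_CMP_REQUEUE` operation from `<linux/futex.h>`. Passing 0 as the wake count is what lets waiters move to the second futex without being woken.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from perf): move up to nr_requeue waiters
 * from futex `from` to futex `to` without waking any of them (nr_wake = 0).
 * The kernel only requeues if *from still equals `expected`; otherwise the
 * call fails and errno is set to EAGAIN.
 * Returns the number of affected waiters, or -1 with errno set.
 */
static long requeue_waiters(uint32_t *from, uint32_t expected,
                            uint32_t *to, int nr_requeue)
{
    /* For FUTEX_CMP_REQUEUE the 4th futex argument carries nr_requeue. */
    return syscall(SYS_futex, from, FUTEX_CMP_REQUEUE, 0 /* nr_wake */,
                   (unsigned long)nr_requeue, to, expected);
}
```

With no threads blocked on `from`, the call simply reports zero affected waiters; a benchmark would first park threads in `FUTEX_WAIT` on `from` and then time repeated calls.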
    • perf bench: Add futex-wake microbenchmark · 27db7830
      Davidlohr Bueso committed
      Block a bunch of threads on a futex and wake them up, N at a time.
      
      This program is particularly useful to measure the latency of nthread
      wakeups in non-error situations: all waiters are queued and all wake
      calls wake up one or more tasks.
      
      An example run:
      
        $ perf bench futex wake -t 512 -r 100
        Run summary [PID 27823]: blocking on 512 threads (at futex 0x7e10d4), waking up 1 at a time.
      
        [Run 1]: Wokeup 512 of 512 threads in 6.0080 ms
        [Run 2]: Wokeup 512 of 512 threads in 5.2280 ms
        [Run 3]: Wokeup 512 of 512 threads in 4.8300 ms
        ...
        [Run 100]: Wokeup 512 of 512 threads in 5.0100 ms
        Wokeup 512 of 512 threads in 5.0109 ms (+-2.25%)
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Darren Hart <dvhart@linux.intel.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Link: http://lkml.kernel.org/r/1387081917-9102-3-git-send-email-davidlohr@hp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
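Waking "N at a time" comes down to calling `FUTEX_WAKE` repeatedly with a batch size and summing the return values. The sketch below assumes Linux and uses hypothetical helper names (`wake_waiters`, `wake_batched`); it is not the benchmark's own code, and it bounds the number of calls so that it cannot spin forever when nothing is waiting.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wake at most nr_wake threads blocked on `futex`; returns how many woke. */
static long wake_waiters(uint32_t *futex, int nr_wake)
{
    return syscall(SYS_futex, futex, FUTEX_WAKE, nr_wake, NULL, NULL, 0);
}

/*
 * Hypothetical driver: issue up to max_calls wake calls, `batch` waiters
 * per call, and return the total number of threads woken (or -1 on error).
 */
static long wake_batched(uint32_t *futex, int batch, int max_calls)
{
    long total = 0;

    for (int i = 0; i < max_calls; i++) {
        long n = wake_waiters(futex, batch);

        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

A latency measurement would wrap the loop in a pair of timestamps and stop once the total reaches the number of blocked threads.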
    • perf bench: Add futex-hash microbenchmark · a0439711
      Davidlohr Bueso committed
      Introduce futexes to perf-bench and add a program that stresses and
      measures the kernel's implementation of the hash table.
      
      This is a multi-threaded program that simply measures the amount of
      failed futex wait calls - we only want to deal with the hashing
      overhead, so a negative return of futex_wait_setup() is enough to do the
      trick.
      
      An example run:
      
        $ perf bench futex hash -t 32
        Run summary [PID 10989]: 32 threads, each operating on 1024 [private] futexes for 10 secs.
      
        [thread  0] futexes: 0x19d9b10 ... 0x19dab0c [ 418713 ops/sec ]
        [thread  1] futexes: 0x19daca0 ... 0x19dbc9c [ 469913 ops/sec ]
        [thread  2] futexes: 0x19dbe30 ... 0x19dce2c [ 479744 ops/sec ]
        ...
        [thread 31] futexes: 0x19fbb80 ... 0x19fcb7c [ 464179 ops/sec ]
      
        Averaged 454310 operations/sec (+- 0.84%), total secs = 10
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Darren Hart <dvhart@linux.intel.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Link: http://lkml.kernel.org/r/1387081917-9102-2-git-send-email-davidlohr@hp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
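The "failed futex wait" trick can be illustrated as follows: a `FUTEX_WAIT` whose expected value does not match the futex word returns immediately with `EAGAIN`, after the kernel has already hashed the address. This is a sketch under stated assumptions (Linux, private futexes, hypothetical names `failing_wait` and `hammer_futexes`), not the benchmark's implementation.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Issue a wait that is meant to fail: `mismatched` must differ from the
 * current futex value, so the kernel rejects the wait with EAGAIN instead
 * of blocking. Refuses up front (EINVAL) if the call would block.
 */
static long failing_wait(uint32_t *futex, uint32_t mismatched)
{
    if (__atomic_load_n(futex, __ATOMIC_RELAXED) == mismatched) {
        errno = EINVAL;
        return -1;
    }
    return syscall(SYS_futex, futex, FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
                   mismatched, NULL, NULL, 0);
}

/* Hypothetical loop: count failed waits across an array of futexes. */
static unsigned long hammer_futexes(uint32_t *futexes, size_t nfutexes,
                                    unsigned int rounds)
{
    unsigned long ops = 0;

    for (unsigned int r = 0; r < rounds; r++)
        for (size_t i = 0; i < nfutexes; i++)
            if (failing_wait(&futexes[i], 1234) == -1 && errno == EAGAIN)
                ops++;
    return ops;
}
```

Dividing the returned count by the elapsed time gives an operations-per-second figure of the kind shown in the example run.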
    • perf bench numa: Make no args mean 'run all tests' · 0fae799e
      Arnaldo Carvalho de Melo committed
      If we call just:
      
        perf bench numa mem
      
      it will present the same output as:
      
        perf bench numa mem -h
      
      i.e. ask for instructions about what to run.
      
      While that is kinda ok, using 'run all tests' as the default, i.e.
      making 'no parms' be equivalent to:
      
        perf bench numa mem -a
      
      Will allow:
      
        perf bench numa all
      
      to actually do what is asked: i.e. run all the 'bench' tests, instead of
      responding to that by asking what to do.
      
      That, in turn, allows:
      
        perf bench all
      
      to actually complete, for the same reasons.
      
      And after that, the tests that come after that, and that at some point
      hit a NULL deref, will run, allowing me to reproduce a recently reported
      problem.
      
      That is, provided you have the needed numa libraries, which wasn't the
      case for the reporter; that left me a bit confused when I tried to
      reproduce his report.
      
      So make no parms mean -a.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Patrick Palka <patrick@parcs.ath.cx>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-x7h0ghx4pef4n0brywg21krk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 01 Nov, 2013 1 commit
  9. 21 Oct, 2013 1 commit
  10. 11 Oct, 2013 2 commits
  11. 09 Oct, 2013 1 commit
  12. 22 Jul, 2013 1 commit
  13. 09 Jul, 2013 1 commit
  14. 14 Mar, 2013 1 commit
  15. 30 Jan, 2013 1 commit
    • perf: Add 'perf bench numa mem' NUMA performance measurement suite · 1c13f3c9
      Ingo Molnar committed
      Add a suite of NUMA performance benchmarks.
      
      The goal was to simulate the behavior and access patterns of real NUMA
      workloads, via a wide range of parameters, so this tool goes well
      beyond simple bzero() measurements that most NUMA micro-benchmarks use:
      
       - It processes the data and creates a chain of data dependencies,
         like a real workload would. Neither the compiler, nor the
         kernel (via KSM and other optimizations) nor the CPU can
         eliminate parts of the workload.
      
       - It randomizes the initial state and also randomizes the target
         addresses of the processing - it's not a simple forward scan
         of addresses.
      
       - It provides flexible options to set process, thread and memory
         relationship information: -G sets "global" memory shared between
         all test processes, -P sets "process" memory shared by all
         threads of a process and -T sets "thread" private memory.
      
       - There's a NUMA convergence monitoring and convergence latency
         measurement option via -c and -m.
      
       - Micro-sleeps and synchronization can be injected to provoke lock
         contention and scheduling, via the -u and -S options. This simulates
         IO and contention.
      
       - The -x option instructs the workload to 'perturb' itself artificially
         every N seconds, by moving to the first and last CPU of the system
         periodically. This way the stability of convergence equilibrium and
         the number of steps taken for the scheduler to reach equilibrium again
         can be measured.
      
       - The amount of work can be specified via the -l loop count, and/or
         via a -s seconds-timeout value.
      
       - CPU and node memory binding options, to test hard binding scenarios.
         THP can be turned on and off via madvise() calls.
      
       - Live reporting of convergence progress in an 'at a glance' output format.
         Printing of convergence and deconvergence events.
      
      The 'perf bench numa mem -a' option will start an array of about 30
      individual tests that will each output such measurements:
      
       # Running  5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp  1"
        5x5-bw-thread,                         20.276, secs,           runtime-max/thread
        5x5-bw-thread,                         20.004, secs,           runtime-min/thread
        5x5-bw-thread,                         20.155, secs,           runtime-avg/thread
        5x5-bw-thread,                          0.671, %,              spread-runtime/thread
        5x5-bw-thread,                         21.153, GB,             data/thread
        5x5-bw-thread,                        528.818, GB,             data-total
        5x5-bw-thread,                          0.959, nsecs,          runtime/byte/thread
        5x5-bw-thread,                          1.043, GB/sec,         thread-speed
        5x5-bw-thread,                         26.081, GB/sec,         total-speed
      
      See the help text and the code for more details.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  16. 11 Sep, 2012 1 commit
    • perf tools: Use __maybe_used for unused variables · 1d037ca1
      Irina Tirdea committed
      perf defines both __used and __unused variables to use for marking
      unused variables. The variable __used is defined to
      __attribute__((__unused__)), which contradicts the kernel definition to
      __attribute__((__used__)) for new gcc versions. On Android, __used is
      also defined in system headers and this leads to warnings like: warning:
      '__used__' attribute ignored
      
      __unused is not defined in the kernel and is not a standard definition.
      If __unused is used everywhere instead of __used, this leads to
      conflicts with glibc headers, since glibc has variables with this name
      in its headers.
      
      The best approach is to use __maybe_unused, the definition used in the
      kernel for __attribute__((unused)). In this way there is only one
      definition in perf sources (instead of 2 definitions that point to the
      same thing: __used and __unused) and it works on both Linux and Android.
      This patch simply replaces all instances of __used and __unused with
      __maybe_unused.
      Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com
      [ committer note: fixed up conflict with a116e05d in builtin-sched.c ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
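As a small illustration of the annotation discussed above, the sketch below defines `__maybe_unused` the way the commit message describes it, as `__attribute__((unused))`, and applies it to a parameter and a local variable. The function `cmd_example` is hypothetical and only exists to show where the attribute goes; GCC or Clang is assumed.

```c
#include <stdio.h>

/* Same meaning the commit message gives: a wrapper for the unused attribute. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

/*
 * Hypothetical command handler: `argv` is part of the expected signature
 * but not needed here, so it is annotated to silence unused warnings.
 */
static int cmd_example(int argc, const char **argv __maybe_unused)
{
    /* The return value is stored but never read; annotate it as well. */
    int ret __maybe_unused = printf("argc=%d\n", argc);

    return argc;
}
```

Compiling this with `-Wall -Wextra -Werror` should not trigger unused-parameter or unused-but-set-variable diagnostics for the annotated names.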
  17. 09 Sep, 2012 1 commit
  18. 03 Jul, 2012 1 commit
  19. 28 Jun, 2012 1 commit
  20. 07 Feb, 2012 2 commits
  21. 31 Jan, 2012 1 commit
  22. 25 Jan, 2012 4 commits
  23. 07 Feb, 2011 1 commit
    • perf tool: Fix gcc 4.6.0 issues · fb7d0b3c
      Kyle McMartin committed
      GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
      due to the -Werror=unused-but-set-variable flag.
      
      I've gone through and annotated some of the assignments that had side
      effects (ie: return value from a function) with the __used annotation,
      and in some cases, just removed unused code.
      
      In a few cases, we were assigning something useful, but not using it in
      later parts of the function.
      
      kyle@dreadnought:~/src% gcc --version
      gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)
      
      Cc: Ingo Molnar <mingo@redhat.com>
      LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
      Signed-off-by: Kyle McMartin <kyle@redhat.com>
      [ committer note: Fixed up the annotation fixes, as that code moved recently ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  24. 26 Nov, 2010 2 commits
    • perf bench: Add feature that measures the performance of the arch/x86/lib/memcpy_64.S memcpy routines via 'perf bench mem' · ea7872b9
      Hitoshi Mitake committed
      
      This patch ports arch/x86/lib/memcpy_64.S to perf bench mem
      memcpy for benchmarking memcpy() in userland in a tricky and
      dirty way.
      
      util/include/asm/cpufeature.h, util/include/asm/dwarf2.h, and
      util/include/linux/linkage.h are mostly dummy files with small
      wrappers, so that we are able to include memcpy_64.S
      unmodified.
      Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: h.mitake@gmail.com
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ma Ling <ling.ma@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1290668693-27068-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf bench: Print both of prefaulted and no prefaulted results by default · 49ce8fc6
      Hitoshi Mitake committed
      After applying this patch, perf bench mem memcpy prints
      both the prefaulted and the non-prefaulted score of memcpy().
      
      New options --no-prefault and --only-prefault are added
      to print a single result, mainly for scripting usage.
      
      Usage example:
      
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |      634.969014 MB/Sec
       |        4.828062 GB/Sec (with prefault)
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --only-prefault
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |        4.705192 GB/Sec (with prefault)
       | mitake@X201i:~/linux/.../tools/perf% ./perf bench mem memcpy -l 500MB --no-prefault
       | # Running mem/memcpy benchmark...
       | # Copying 500MB Bytes ...
       |
       |      642.725568 MB/Sec
      Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: h.mitake@gmail.com
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ma Ling <ling.ma@intel.com>
      Cc: Zhao Yakui <yakui.zhao@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      LKML-Reference: <1290668693-27068-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  25. 18 May, 2010 1 commit
    • perf options: Check v type in OPT_U?INTEGER · 1967936d
      Arnaldo Carvalho de Melo committed
      To avoid problems like the one fixed by Stephane Eranian in 3de29cab, now
      we'll get this instead:
      
      	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
      	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’
      
      Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
      hackers should be already used to this.
      
      With it in place found some problems, fixed by changing the affected
      variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER.
      
      Next csets will go thru converting each of the remaining OPT_ so that
      review can be made easier by grouping changes per type per patch.
      
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
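The "negative width in bit-field" error comes from the way `BUILD_BUG_ON_ZERO` turns a true condition into an invalid bit-field declaration. The sketch below shows the general technique with GCC/Clang builtins. The macro `CHECK_VTYPE` and the function `read_checked` are illustrative names chosen here, and the exact macros in perf may differ.

```c
/*
 * Evaluates to 0 (as a size_t) when `e` is false; when `e` is true the
 * anonymous bit-field gets width -1 and compilation fails with
 * "negative width in bit-field".
 */
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int : -!!(e); }))

/*
 * Illustrative type check: yields `v` unchanged, but refuses to compile
 * unless the type of `v` is compatible with `type`.
 */
#define CHECK_VTYPE(v, type) \
    (BUILD_BUG_ON_ZERO(!__builtin_types_compatible_p(__typeof__(v), type)) + (v))

/* Hypothetical user: an integer option must be backed by an int variable. */
static int read_checked(void)
{
    int nr_loops = 10;
    int *value = CHECK_VTYPE(&nr_loops, int *);

    /*
     * unsigned int bad;
     * CHECK_VTYPE(&bad, int *);   <- would fail to compile
     */
    return *value;
}
```

Because the check adds a compile-time zero to the pointer, it costs nothing at run time; the price is the cryptic diagnostic quoted in the commit message.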