1. 22 5月, 2011 6 次提交
  2. 19 5月, 2011 3 次提交
    • I
      perf stat: Add more cache-miss percentage printouts · c3305257
      Ingo Molnar 提交于
      Print out the cache-miss percentage as well if the cache refs were
      collected, for all the generic cache event types.
      
      Before:
      
         11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
             87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )
      
      After:
      
         11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
            113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )
      
      Also ASCII color highlight too high percentages, them when it's executed on the console.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c3305257
    • I
      perf stat: Add -d -d and -d -d -d options to show more CPU events · 2cba3ffb
      Ingo Molnar 提交于
      Print even more detailed statistics if requested via perf stat -d:
      
             -d:          detailed events, L1 and LLC data cache
          -d -d:     more detailed events, dTLB and iTLB events
       -d -d -d:     very detailed events, adding prefetch events
      
      Full output looks like this now:
      
       Performance counter stats for '/home/mingo/hackbench 10' (5 runs):
      
             1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
                  49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
                   8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
                  17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
           2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
           1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
             743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
           1,314,416,379 instructions              #    0.56  insns per cycle
                                                   #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
             272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
               3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
             449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
              22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
               6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
               1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
             411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
              27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
             464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
              10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
           1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
                 117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
               4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
               1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]
      
              0.195622057  seconds time elapsed  ( +-  6.84% )
      
      Also clean up the attribute construction code to be appending, and factor
      it out into add_default_attributes().
      
      Tweak the coverage percentage printout a bit, so that it's easier to view it
      alongside the +- sttddev colum.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      2cba3ffb
    • I
      perf bench, x86: Add alternatives-asm.h wrapper · b3132072
      Ingo Molnar 提交于
      perf bench needs this to build the kernel's memcpy routine:
      
      In file included from bench/mem-memcpy-x86-64-asm.S:2:0:
      bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      b3132072
  3. 18 5月, 2011 1 次提交
    • S
      perf: Fix multi-event parsing bug · 94692349
      Stephane Eranian 提交于
      This patch fixes an issue with event parsing.
      The following commit appears to have broken the
      ability to specify a comma separated list of events:
      
         commit ceb53fbf
         Author: Ingo Molnar <mingo@elte.hu>
         Date:   Wed Apr 27 04:06:33 2011 +0200
      
             perf stat: Fail more clearly when an invalid modifier is specified
      
      This patch fixes this while preserving the desired effect:
      
      $ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null
      
       Performance counter stats for 'ls /dev/null':
      
                  365956 instructions:u           #    0.00  insns per cycle
                  731806 instructions:k           #    0.00  insns per cycle
      
              0.001108862  seconds time elapsed
      
      $ perf stat -e task-clock-msecs true
      invalid event modifier: '-msecs'
      Run 'perf list' for a list of valid events and modifiers
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: acme@redhat.com
      Cc: peterz@infradead.org
      Cc: fweisbec@gmail.com
      Link: http://lkml.kernel.org/r/20110517133619.GA6999@quadSigned-off-by: NIngo Molnar <mingo@elte.hu>
      94692349
  4. 10 5月, 2011 1 次提交
  5. 07 5月, 2011 1 次提交
  6. 30 4月, 2011 2 次提交
  7. 29 4月, 2011 5 次提交
  8. 28 4月, 2011 2 次提交
    • I
      perf stat: Fix compatibility behavior · ede70290
      Ingo Molnar 提交于
      Instead of failing on an unknown event, when new perf stat is run on
      older kernels:
      
        $ ./perf stat true
        Error: open_counter returned with 22 (Invalid argument). /bin/dmesg
        may provide additional information.
      
        Fatal: Not all events could be opened.
      
      Just ignore EINVAL and ENOSYS, we'll print the results as not counted:
      
       Performance counter stats for 'true':
      
                0.239483 task-clock               #    0.493 CPUs utilized
                       0 context-switches         #    0.000 M/sec
                       0 CPU-migrations           #    0.000 M/sec
                      86 page-faults              #    0.359 M/sec
                 704,766 cycles                   #    2.943 GHz
           <not counted> stalled-cycles
                 381,961 instructions             #    0.54  insns per cycle
                  69,626 branches                 #  290.735 M/sec
                   4,594 branch-misses            #    6.60% of all branches
      
              0.000485883  seconds time elapsed
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      ede70290
    • I
      perf stat: Add --sync/-S option · f9cef0a9
      Ingo Molnar 提交于
      --sync will tell perf stat to run sync() before starting a command.
      
      This allows IO-heavy tests to be used with --repeat, without one
      iteration impacting the other.
      
      Elapsed time will stabilize for example:
      
        before:        3.971525714  seconds time elapsed  ( +-  8.56% )
        after:         3.211098537  seconds time elapsed  ( +-  1.52% )
      
      So measurements will be more accurate.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
      f9cef0a9
  9. 27 4月, 2011 15 次提交
  10. 20 4月, 2011 1 次提交
  11. 19 4月, 2011 3 次提交