1. 06 Sep, 2012 1 commit
  2. 15 Aug, 2012 3 commits
    • perf evlist: Introduce evsel list accessors · 0c21f736
      Arnaldo Carvalho de Melo committed
      To replace the longer list_entry constructs for things that are widely
      used:
      
      	perf_evlist__{first,last}(evlist)
      	perf_evsel__next(evsel)
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ng7azq26wg1jd801qqpcozwp@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
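
As a rough illustration of what such accessors hide, here is a minimal Python model of a kernel-style circular list with a sentinel head. The class and function names are hypothetical stand-ins chosen for this sketch, not the perf C API.

```python
class Node:
    """Circular doubly linked list node; a bare Node also serves as the list head."""
    def __init__(self, name=None):
        self.name = name
        self.prev = self
        self.next = self

def list_add_tail(new, head):
    # Link `new` just before the sentinel head, i.e. at the end of the list.
    new.prev, new.next = head.prev, head
    head.prev.next = new
    head.prev = new

def evlist_first(head):
    return head.next   # the first entry follows the sentinel

def evlist_last(head):
    return head.prev   # the last entry precedes the sentinel

def evsel_next(node):
    return node.next   # the caller must stop when it reaches the sentinel

entries = Node()  # sentinel head of the hypothetical event list
for name in ("cycles", "instructions", "branches"):
    list_add_tail(Node(name), entries)
```

Callers then write `evlist_first(entries)` instead of spelling out the pointer arithmetic at every use site.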
    • perf evlist: Rename __group method to __set_leader · 63dab225
      Arnaldo Carvalho de Melo committed
      Just as was done for parse_events__set_leader.
      
      Also we need to have the list_entry set_leader method in evlist.c so that we
      don't grow another dep in the python binding:
      
       # ~acme/git/linux/tools/perf/python/twatch.py
       Traceback (most recent call last):
         File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module>
           import perf
       ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: parse_events__set_leader
      
      And also remove a pr_debug from evsel.c so that we avoid this one too:
      
       # ~acme/git/linux/tools/perf/python/twatch.py
       Traceback (most recent call last):
         File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module>
           import perf
       ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-0hk9dazg9pora9jylkqngovm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Enable grouping logic for parsed events · 6a4bb04c
      Jiri Olsa committed
      This patch adds functionality that allows event groups to be created
      based on the way they are specified on the command line. It builds on
      the '{}' group syntax introduced in an earlier patch.
      
      The current '--group/-g' option behaviour remains intact. If you
      specify it for the record/stat/top commands, all the specified events
      become members of a single group with the first event as the group
      leader.
      
      With the new '{}' group syntax you can create a group like:
        # perf record -e '{cycles,faults}' ls
      
      resulting in a single event group containing the 'cycles' and 'faults'
      events, with the 'cycles' event as group leader.
      
      All groups are created with regard to threads and CPUs. Thus
      recording an event group for 2 threads on a server with
      4 CPUs will create 8 separate groups.
      
      Examples (first event in brackets is group leader):
      
        # 1 group (cpu-clock,task-clock)
        perf record --group -e cpu-clock,task-clock ls
        perf record -e '{cpu-clock,task-clock}' ls
      
        # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
        perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls
      
        # 1 group (cpu-clock,task-clock,minor-faults,major-faults)
        perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls
        perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls
      
        # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults)
        perf record -e '{cpu-clock,task-clock}' -e '{minor-faults,major-faults}' \
         -e instructions ls
      
        # 1 group
        # (cpu-clock,task-clock,minor-faults,major-faults,instructions)
        perf record --group -e cpu-clock,task-clock \
         -e minor-faults,major-faults -e instructions ls
        perf record -e '{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls
      
      It's possible to use a standard event modifier for a group; it spans
      all events in the group and updates each event's modifier settings,
      for example:
      
        # perf record -e '{faults:k,cache-references}:p'
      
      resulting in the ':kp' modifier being used for the 'faults' event and
      the ':p' modifier being used for the 'cache-references' event.
      Reviewed-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Drepper <drepper@gmail.com>
      Link: http://lkml.kernel.org/n/tip-ho42u0wcr8mn1otkalqi13qp@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
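
The grouping rules described in this message can be sketched in a few lines of Python. This is a simplified model, not the real perf event parser: `parse_event_groups` and `group_instances` are hypothetical names, and only the `{a,b}` syntax with an optional `:modifier` suffix is handled.

```python
import re

def parse_event_groups(spec):
    """Split an event spec into groups; ungrouped events become one-element lists."""
    groups = []
    # Match either a braced group with an optional modifier, or a single bare event.
    for m in re.finditer(r"\{([^}]*)\}(?::(\w+))?|([^,{}]+)", spec):
        body, group_mod, single = m.group(1), m.group(2), m.group(3)
        if single is not None:
            groups.append([single])
            continue
        members = []
        for ev in body.split(","):
            name, _, mod = ev.partition(":")
            mod += group_mod or ""   # the group modifier is appended to each member
            members.append(f"{name}:{mod}" if mod else name)
        groups.append(members)       # the first member is the group leader
    return groups

def group_instances(nr_threads, nr_cpus):
    # One group instance per (thread, CPU) pair, as stated in the message.
    return nr_threads * nr_cpus
```

With this model, `'{faults:k,cache-references}:p'` yields one group whose members carry the `:kp` and `:p` modifiers, and 2 threads on 4 CPUs give 8 group instances.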
  3. 20 Jun, 2012 1 commit
  4. 11 Jun, 2012 1 commit
    • perf stat: Fix default output file · fc3e4d07
      Stephane Eranian committed
      The following commit:
      
      commit 56f3bae7
      Author: Jim Cromie <jim.cromie@gmail.com>
      Date:   Wed Sep 7 17:14:00 2011 -0600
      
          perf stat: Add --log-fd <N> option to redirect stderr elsewhere
      
      introduced a bug in the way perf stat outputs the results by default,
      i.e., without the --log-fd or --output option. It would default to
      writing to file descriptor 0, i.e., stdin. Writing to stdin is allowed
      and is equivalent to writing to stdout. However, there is a major
      difference for any script that was already capturing the output of perf
      stat via redirection:
      
          perf stat >/tmp/log .... or perf stat 2>/tmp/log ....
      
      They would not capture anything anymore. They would have to do:
          perf stat 0>/tmp/log ...
      
      This breaks compatibility with existing scripts and does not look very
      natural.
      
      This patch fixes the problem by looking at output_fd only when it was
      modified by the user (> 0). It also checks that the value is positive.
      Passing --log-fd 0 is ignored.
      
      I would also argue that defaulting to stderr for the results is not the
      right thing to do, though this patch does not address this specific
      issue.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jim Cromie <jim.cromie@gmail.com>
      Link: http://lkml.kernel.org/r/20120515111111.GA9870@quad
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
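
A minimal sketch of the selection rule described above, assuming the descriptor defaults to 0 when the user passed nothing. The function name `pick_output_fd` is hypothetical, and the interaction with --output is left out.

```python
STDERR_FD = 2

def pick_output_fd(log_fd=0):
    """Return the descriptor results are written to.

    log_fd models the --log-fd value; 0 means "not set by the user".
    """
    if log_fd < 0:
        raise ValueError("--log-fd value must be positive")
    # Honour the descriptor only when the user supplied one (> 0);
    # otherwise fall back to stderr instead of fd 0 (stdin).
    return log_fd if log_fd > 0 else STDERR_FD
```

Under this rule `perf stat 2>/tmp/log` captures the results again, because the default path never touches descriptor 0.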
  5. 31 May, 2012 1 commit
    • perf stat: Initialize default events wrt exclude_{guest,host} · 79695e1b
      Arnaldo Carvalho de Melo committed
      When no event is specified, the tools use perf_evlist__add_default(), which
      calls event_attr_init to initialize the KVM exclusion bits.
      
      When the change was made to the tools so that by default guest samples would be
      excluded, the changes were made just to the parsing routines and to
      perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far
      just by perf stat to add multiple events, according to the level of detail
      specified.
      
      Recently the tools were changed to reconstruct the event name from all the
      details in perf_event_attr, not just from .type and .config, but taking into
      account all the feature bits (.exclude_{guest,host,user,kernel,etc},
      .precise_ip, etc).
      
      That is when we noticed that the default for perf stat wasn't the one for the
      rest of the tools, i.e. the .exclude_guest bit wasn't being set.
      
      I.e. the default path, which doesn't call event_attr_init, was showing the :HG
      modifier:
      
        $ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  0.942119 task-clock                #    0.454 CPUs utilized
                         1 context-switches          #    0.001 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       126 page-faults               #    0.134 M/sec
                   693,193 cycles:HG                 #    0.736 GHz                     [40.11%]
                   407,461 stalled-cycles-frontend:HG #   58.78% frontend cycles idle    [72.29%]
                   365,403 stalled-cycles-backend:HG #   52.71% backend  cycles idle
                   465,982 instructions:HG           #    0.67  insns per cycle
                                                     #    0.87  stalled cycles per insn
                    89,760 branches:HG               #   95.275 M/sec
                     6,178 branch-misses:HG          #    6.88% of all branches
      
               0.002077228 seconds time elapsed
      
      Whereas if one explicitly specifies the same events, the parsing code is
      called and thus event_attr_init is called:
      
        $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1
      
         Performance counter stats for 'usleep 1':
      
                  1.040349 task-clock                #    0.500 CPUs utilized
                         2 context-switches          #    0.002 M/sec
                         0 CPU-migrations            #    0.000 K/sec
                       127 page-faults               #    0.122 M/sec
                   587,966 cycles                    #    0.565 GHz                     [13.18%]
                   459,167 stalled-cycles-frontend   #   78.09% frontend cycles idle
                   390,249 stalled-cycles-backend    #   66.37% backend  cycles idle
                   504,006 instructions              #    0.86  insns per cycle
                                                     #    0.91  stalled cycles per insn
                    96,455 branches                  #   92.714 M/sec
                     6,522 branch-misses             #    6.76% of all branches         [96.12%]
      
               0.002078681 seconds time elapsed
      
      Fix it by introducing a perf_evlist__add_default_attrs method that will call
      event_attr_init on all the perf_event_attr entries before adding the events.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4eysr236r0pgiyum9epwxw7s@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
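
The shape of the fix can be sketched as follows, with attrs modelled as plain dicts. The `perf_host` and `perf_guest` parameters and their defaults are assumptions made for this sketch; only the idea of initialising every attr before adding it comes from the message.

```python
def event_attr_init(attr, perf_host=True, perf_guest=False):
    """Set the KVM exclusion bits: exclude whichever side was not requested."""
    if not perf_host:
        attr["exclude_host"] = 1
    if not perf_guest:
        attr["exclude_guest"] = 1
    return attr

def add_default_attrs(evlist, attrs):
    # Initialise a copy of every attr before adding it, so that default
    # events get the same exclusion bits as events that went through parsing.
    for attr in attrs:
        evlist.append(event_attr_init(dict(attr)))
    return evlist

default_attrs = [{"name": "cycles"}, {"name": "instructions"}]
evlist = add_default_attrs([], default_attrs)
```

Because both paths now run the same initialiser, the reconstructed event names no longer differ between the default and the explicitly specified case.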
  6. 16 May, 2012 1 commit
  7. 10 May, 2012 1 commit
    • perf stat: handle ENXIO error for perf_event_open · 20d23aaa
      David Ahern committed
      perf stat on PPC currently fails to run:
      
      $ perf stat -- sleep 1
        Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information.
      
        Fatal: Not all events could be opened.
      
      The problem is that until 2.6.37 (behavior changed with commit b0a873eb)
      perf on PPC returns ENXIO when hw_perf_event_init() fails. With this
      patch we get the expected behavior:
      
      $ perf stat -v -- sleep 1
      cycles event is not supported by the kernel.
      stalled-cycles-frontend event is not supported by the kernel.
      stalled-cycles-backend event is not supported by the kernel.
      instructions event is not supported by the kernel.
      branches event is not supported by the kernel.
      branch-misses event is not supported by the kernel.
      
      ...
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
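
A sketch of the soft-failure idea, assuming the set of tolerated errno values is ENXIO from this patch plus the ENOENT and EOPNOTSUPP cases mentioned in an older entry of this log. `open_counters` and `fake_open` are hypothetical helpers, not perf functions.

```python
import errno

# Errno values treated as "event not supported" rather than as a fatal error.
SOFT_ERRNOS = {errno.ENOENT, errno.EOPNOTSUPP, errno.ENXIO}

def open_counters(events, open_fn):
    """Try each event; return (opened, unsupported) instead of aborting on soft errors."""
    opened, unsupported = [], []
    for ev in events:
        try:
            opened.append((ev, open_fn(ev)))
        except OSError as e:
            if e.errno in SOFT_ERRNOS:
                unsupported.append(ev)   # report it and keep going
            else:
                raise                    # anything else is still fatal
    return opened, unsupported

def fake_open(ev):
    # Stand-in for perf_event_open() on a kernel without hardware counters.
    if ev != "task-clock":
        raise OSError(errno.ENXIO, "No such device or address")
    return 3
```

Calling `open_counters(["task-clock", "cycles"], fake_open)` keeps the software event and lists `cycles` as unsupported instead of stopping the whole run.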
  8. 09 May, 2012 1 commit
    • perf stat: handle ENXIO error for perf_event_open · 979987a5
      David Ahern committed
      perf stat on PPC currently fails to run:
      
      $ perf stat -- sleep 1
        Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information.
      
        Fatal: Not all events could be opened.
      
      The problem is that until 2.6.37 (behavior changed with commit b0a873eb)
      perf on PPC returns ENXIO when hw_perf_event_init() fails. With this
      patch we get the expected behavior:
      
      $ perf stat -v -- sleep 1
      cycles event is not supported by the kernel.
      stalled-cycles-frontend event is not supported by the kernel.
      stalled-cycles-backend event is not supported by the kernel.
      instructions event is not supported by the kernel.
      branches event is not supported by the kernel.
      branch-misses event is not supported by the kernel.
      
      ...
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 08 May, 2012 2 commits
  10. 03 May, 2012 2 commits
  11. 02 May, 2012 1 commit
    • perf stat: Fix case where guest/host monitoring is not supported by kernel · 5622c07b
      Stephane Eranian committed
      By default, perf stat sets exclude_guest = 1. But when you run perf on a
      kernel which does not support host/guest filtering, you get an
      error saying the event is unsupported. This comes from the fact that
      when the perf_event_attr struct passed by the user is larger than the
      one known to the kernel, there is a safety check which ensures that all
      unknown bits are zero. But here, exclude_guest is 1 (part of the unknown
      bits) and thus the perf_event_open() syscall returns EINVAL.
      
      To my surprise, running perf record on the same kernel did not exhibit
      the problem. The reason is that perf record handles the problem by
      catching the error and retrying with guest/host excludes set to zero.
      For some reason, this was not done with perf stat. This patch fixes this
      problem.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Link: http://lkml.kernel.org/r/20120427124538.GA7230@quad
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
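
The catch-and-retry behaviour described above might look roughly like this. The attr is modelled as a dict, and `open_with_fallback` and `old_kernel_open` are hypothetical names used only for this sketch.

```python
import errno

def open_with_fallback(attr, open_fn):
    """Open an event; on EINVAL, retry once with the guest/host exclude bits cleared."""
    try:
        return open_fn(attr)
    except OSError as e:
        has_excludes = attr.get("exclude_guest") or attr.get("exclude_host")
        if e.errno != errno.EINVAL or not has_excludes:
            raise
    # An older kernel rejects unknown non-zero bits, so clear them and retry.
    retry = dict(attr, exclude_guest=0, exclude_host=0)
    return open_fn(retry)

def old_kernel_open(attr):
    # Stand-in for a kernel that does not know the exclude_guest/exclude_host bits.
    if attr.get("exclude_guest") or attr.get("exclude_host"):
        raise OSError(errno.EINVAL, "Invalid argument")
    return 4
```

The retry is attempted only when an exclude bit was actually set, so an EINVAL with another cause still propagates to the caller.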
  12. 12 Apr, 2012 1 commit
  13. 17 Mar, 2012 1 commit
    • perf stat: Fix event grouping on forked task · 4c19ea45
      Namhyung Kim committed
      When an event group is enabled for a forked task (i.e. no target task was
      specified), all events were disabled and marked ->enable_on_exec.
      However, they are not counted at all since only the group leader will
      actually be enabled on exec. So the result looked like below:
      
       $ ./perf stat --group -- sleep 1
      
       Performance counter stats for 'sleep 1':
      
                0.554926 task-clock                #    0.001 CPUs utilized
           <not counted> context-switches
           <not counted> CPU-migrations
           <not counted> page-faults
           <not counted> cycles
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
           <not counted> instructions
           <not counted> branches
           <not counted> branch-misses
      
             1.001228093 seconds time elapsed
      
      Fix it by disabling the group leader only.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1331887340-32448-1-git-send-email-namhyung.kim@lge.com
      Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
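
A small sketch of the fixed behaviour, assuming attrs are dicts listed in group order with the leader first. `prepare_for_fork` is a hypothetical name.

```python
def prepare_for_fork(group):
    """Configure a group's attrs for a workload that perf forks and execs itself."""
    for idx, attr in enumerate(group):
        is_leader = idx == 0
        # Only the leader starts disabled and is enabled on exec;
        # the other members are left enabled and follow the leader.
        attr["disabled"] = 1 if is_leader else 0
        attr["enable_on_exec"] = 1 if is_leader else 0
    return group

group = prepare_for_fork(
    [{"name": "task-clock"}, {"name": "cycles"}, {"name": "instructions"}]
)
```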
  14. 14 Feb, 2012 1 commit
  15. 07 Feb, 2012 2 commits
  16. 25 Jan, 2012 1 commit
  17. 04 Jan, 2012 1 commit
  18. 06 Dec, 2011 1 commit
    • perf stat: Failure with "Operation not supported" · 38f6ae1e
      Anton Blanchard committed
      perf stat is failing on PowerPC:
      
        Error: open_counter returned with 95 (Operation not supported). /bin/dmesg may provide additional information.
      
        Fatal: Not all events could be opened.
      
      commit 370faf1d (perf stat: Fail softly on unsupported events)
      added a check for failure returning ENOENT, but the POWER backend
      returns EOPNOTSUPP. It looks like alpha, blackfin and mips do the
      same.
      
      With the patch applied, things work as expected:
      
       Performance counter stats for '/bin/true':
      
                0.362176 task-clock                #    0.623 CPUs utilized
                       0 context-switches          #    0.000 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                      28 page-faults               #    0.077 M/sec
               1,677,020 cycles                    #    4.630 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
                 431,220 instructions              #    0.26  insns per cycle
                 101,889 branches                  #  281.325 M/sec
                   4,145 branch-misses             #    4.07% of all branches
      
             0.000581361 seconds time elapsed
      
      Cc: <stable@kernel.org> # 3.0+
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111202093833.5fef7226@kryten
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  19. 29 Nov, 2011 1 commit
  20. 28 Nov, 2011 1 commit
  21. 26 Oct, 2011 1 commit
  22. 30 Sep, 2011 6 commits
  23. 18 Aug, 2011 2 commits
  24. 21 Jul, 2011 1 commit
  25. 01 Jul, 2011 1 commit
    • perf stat: Add noise output for csv mode · 3ae9a34d
      Zhengyu He committed
      Previously, when perf stat was asked to output the statistics in
      CSV mode, no noise information was printed.
      
      For example right now we output this --repeat information:
      
       ./perf stat -r3 -x, sleep 1
       1.164789,task-clock
       8,context-switches
       0,CPU-migrations
       219,page-faults
       3337800,cycles
      
      With this patch, the output will be appended with an additional
      entry for the noise value:
      
       ./perf stat -r3 -x, sleep 1
       1.164789,task-clock,3.75%
       8,context-switches,75.00%
       0,CPU-migrations,100.00%
       219,page-faults,0.00%
       3337800,cycles,3.36%
      Signed-off-by: Zhengyu He <zhengyuh@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Venkatesh Pallipadi <venki@google.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
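
For scripts that consume this output, a parser along the following lines handles both the old two-column and the new three-column form. It assumes only the `value,event[,noise%]` layout shown in the examples above; `parse_stat_csv` is a hypothetical helper.

```python
import csv
import io

def parse_stat_csv(text, sep=","):
    """Parse 'value,event[,noise%]' rows; noise is None when the column is absent."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=sep):
        if len(fields) < 2:
            continue   # skip blank or malformed lines
        noise = float(fields[2].rstrip("%")) if len(fields) > 2 else None
        rows.append((float(fields[0]), fields[1], noise))
    return rows

sample = "1.164789,task-clock,3.75%\n8,context-switches,75.00%\n3337800,cycles,3.36%\n"
rows = parse_stat_csv(sample)
```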
  26. 03 Jun, 2011 1 commit
    • perf stat: clarify unsupported events from uncounted events · 2cee77c4
      David Ahern committed
      perf stat continues running even if the event list contains counters
      that are not supported. The resulting output then contains <not counted>
      for those events, which makes it hard to tell which events are supported
      but not counted and which are not supported at all.
      
      Before:
      
      perf stat -ddd -- sleep 1
      
            Performance counter stats for 'sleep 1':
      
                0.571283 task-clock                #    0.001 CPUs utilized
                       1 context-switches          #    0.002 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                     157 page-faults               #    0.275 M/sec
               1,037,707 cycles                    #    1.816 GHz
           <not counted> stalled-cycles-frontend
           <not counted> stalled-cycles-backend
                 654,499 instructions              #    0.63  insns per cycle
                 136,129 branches                  #  238.286 M/sec
           <not counted> branch-misses
           <not counted> L1-dcache-loads
           <not counted> L1-dcache-load-misses
           <not counted> LLC-loads
           <not counted> LLC-load-misses
           <not counted> L1-icache-loads
           <not counted> L1-icache-load-misses
           <not counted> dTLB-loads
           <not counted> dTLB-load-misses
           <not counted> iTLB-loads
           <not counted> iTLB-load-misses
           <not counted> L1-dcache-prefetches
           <not counted> L1-dcache-prefetch-misses
      
             1.001004836 seconds time elapsed
      
      After:
      
      perf stat -ddd -- sleep 1
      
       Performance counter stats for 'sleep 1':
      
                1.350326 task-clock                #    0.001 CPUs utilized
                       2 context-switches          #    0.001 M/sec
                       0 CPU-migrations            #    0.000 M/sec
                     157 page-faults               #    0.116 M/sec
                  11,986 cycles                    #    0.009 GHz
         <not supported> stalled-cycles-frontend
         <not supported> stalled-cycles-backend
                 496,986 instructions              #   41.46  insns per cycle
                 138,065 branches                  #  102.246 M/sec
                   7,245 branch-misses             #    5.25% of all branches
           <not counted> L1-dcache-loads
           <not counted> L1-dcache-load-misses
           <not counted> LLC-loads
           <not counted> LLC-load-misses
           <not counted> L1-icache-loads
           <not counted> L1-icache-load-misses
           <not counted> dTLB-loads
           <not counted> dTLB-load-misses
           <not counted> iTLB-loads
           <not counted> iTLB-load-misses
           <not counted> L1-dcache-prefetches
         <not supported> L1-dcache-prefetch-misses
      
             1.002397333 seconds time elapsed
      
      v1->v2:
      changed supported type from int to bool
      
      v2->v3:
      fixed vertical alignment of new struct element
      
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
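
The distinction can be sketched with a boolean `supported` flag per event, as the v1->v2 note suggests. The `run_count` parameter and the column width are assumptions made for this sketch.

```python
def format_count(name, supported, run_count, value=None):
    """Render one output line, distinguishing unsupported from uncounted events."""
    if not supported:
        shown = "<not supported>"   # the open failed: the event cannot be used here
    elif run_count == 0:
        shown = "<not counted>"     # opened fine, but the counter never ran
    else:
        shown = f"{value:,}"
    return f"{shown:>18} {name}"
```

For example, `format_count("cycles", True, 1, 11986)` renders the count with thousands separators, while an event whose open failed is labelled `<not supported>`.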
  27. 19 May, 2011 2 commits
    • perf stat: Add more cache-miss percentage printouts · c3305257
      Ingo Molnar committed
      Print out the cache-miss percentage as well if the cache refs were
      collected, for all the generic cache event types.
      
      Before:
      
         11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
             87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )
      
      After:
      
         11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
            113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )
      
      Also highlight too-high percentages in ASCII color when it's executed on the console.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
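
The percentage in the "After" output is simply misses divided by references. A minimal sketch, with hypothetical function names and without the color highlighting:

```python
def miss_percentage(misses, refs):
    """Return misses as a percentage of references, or None if refs were not collected."""
    if not refs:
        return None
    return 100.0 * misses / refs

def format_miss(misses, refs, kind):
    pct = miss_percentage(misses, refs)
    if pct is None:
        return ""   # no reference count, so nothing to compare against
    return f"{pct:6.2f}% of all {kind} cache hits"
```

Feeding in the dTLB numbers from the "After" example (113,393,472 misses against 11,353,713,242 loads) gives the 1.00% shown there.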
    • perf stat: Add -d -d and -d -d -d options to show more CPU events · 2cba3ffb
      Ingo Molnar committed
      Print even more detailed statistics if requested via perf stat -d:
      
             -d:          detailed events, L1 and LLC data cache
          -d -d:     more detailed events, dTLB and iTLB events
       -d -d -d:     very detailed events, adding prefetch events
      
      Full output looks like this now:
      
       Performance counter stats for '/home/mingo/hackbench 10' (5 runs):
      
             1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
                  49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
                   8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
                  17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
           2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
           1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
             743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
           1,314,416,379 instructions              #    0.56  insns per cycle
                                                   #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
             272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
               3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
             449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
              22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
               6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
               1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
             411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
              27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
             464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
              10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
           1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
                 117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
               4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
               1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]
      
              0.195622057  seconds time elapsed  ( +-  6.84% )
      
      Also clean up the attribute construction code to be appending, and factor
      it out into add_default_attributes().
      
      Tweak the coverage percentage printout a bit, so that it's easier to view it
      alongside the +- stddev column.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
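
The appending construction mentioned above can be sketched as a list of event sets indexed by detail level. Event names are taken from the sample output; placing the L1-icache events in the second level is an assumption, since the message names only dTLB and iTLB there, and this Python `add_default_attributes` is only a model of the C function of the same name.

```python
# Event sets appended for each -d given on the command line.
DETAIL_LEVELS = [
    ["L1-dcache-loads", "L1-dcache-load-misses", "LLC-loads", "LLC-load-misses"],
    ["L1-icache-loads", "L1-icache-load-misses",
     "dTLB-loads", "dTLB-load-misses", "iTLB-loads", "iTLB-load-misses"],
    ["L1-dcache-prefetches", "L1-dcache-prefetch-misses"],
]

def add_default_attributes(base_events, detailed_run):
    """Append one extra event set per -d, capped at three levels."""
    events = list(base_events)
    for level in DETAIL_LEVELS[:max(0, detailed_run)]:
        events.extend(level)
    return events
```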
  28. 30 Apr, 2011 1 commit