1. 21 Jul, 2015 2 commits
    • perf record: Allow filtering perf's pid via --exclude-perf · 4ba1faa1
      Committed by Wang Nan
      This patch allows 'perf record' to exclude events issued by perf itself
      via the '--exclude-perf' option.
      
      Before this patch, when doing something like:
      
       # perf record -a -e syscalls:sys_enter_write <cmd>
      
      one could easily get a result like this:
      
       # /tmp/perf report --stdio
       ...
        # Overhead  Command  Shared Object       Symbol
        # ........  .......  ..................  ....................
        #
            99.99%  perf     libpthread-2.18.so  [.] __write_nocancel
            0.01%   ls       libc-2.18.so        [.] write
            0.01%   sshd     libc-2.18.so        [.] write
       ...
      
      Where most events are generated by perf itself.
      
      A shell trick can be done to filter perf itself out:
      
       # cat << EOF > ./tmp
       > #!/bin/sh
       > exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
       > EOF
       # chmod a+x ./tmp
       # ./tmp
      
      However, doing so is not user-friendly.
      
      This patch extracts the evsel iteration framework introduced by the patch
      'perf record: Apply filter to all events in a glob matching' into
      foreach_evsel_in_last_glob(), and makes the exclude_perf() function append
      a new filter expression to each evsel selected by a '-e' selector.
      
      To avoid losing filters if the user passes '--filter' after '--exclude-perf',
      this patch uses perf_evsel__append_filter() in both cases, instead of
      perf_evsel__set_filter(), which removes the old filter. As a side effect, it
      is now possible to use multiple '--filter' options for one selector. They
      are combined with '&&'.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1436513770-8896-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
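The filter-combining behaviour described in the commit above can be illustrated with a minimal Python sketch (perf itself is written in C). The helper name `append_filter`, the example pid, and the parenthesisation of each side are assumptions made for illustration; the commit message only states that multiple filters are combined with '&&'.

```python
from typing import Optional


def append_filter(existing: Optional[str], new: str) -> str:
    """Return `new` ANDed onto `existing` instead of overwriting it.

    Hypothetical helper: wrapping both sides in parentheses is an
    assumption, made so operator precedence inside each filter survives.
    """
    if not existing:
        return new
    return f"({existing}) && ({new})"


# Simulate '--exclude-perf' followed by a user-supplied '--filter'.
perf_pid = 4242  # hypothetical pid of the perf process itself
combined = append_filter(None, f"common_pid != {perf_pid}")
combined = append_filter(combined, "common_pid != 464")
print(combined)
```

Appending rather than setting is what keeps the '--exclude-perf' expression alive when a later '--filter' arrives for the same selector.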
    • perf record: Apply filter to all events in a glob matching · 15bfd2cc
      Committed by Wang Nan
      There is an old problem in perf's filter handling, first reported in
      Sep. 2014 at https://lkml.org/lkml/2014/9/9/944: if multiple events
      are passed through a glob matching expression on the command line and
      '--filter' is added after them, the filter is applied only to the last one.
      
      For example:
      
       # dd if=/dev/zero of=/dev/null &
       [1] 464
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.239 MB perf.data (2094 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 2K of event 'syscalls:sys_enter_read'
       # Event count (approx.): 2092
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      
      In this example, the filter is applied only to 'syscalls:sys_exit_read', and
      there's no way to set a filter for 'syscalls:sys_enter_read'.
      
      This patch adds a 'cmdline_group_boundary' field to 'struct evsel', and
      applies the filter to all events between two boundary marks.
      
      After applying this patch:
      
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.031 MB perf.data (3 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 1  of event 'syscalls:sys_enter_read'
       # Event count (approx.): 1
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1436513770-8896-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
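The boundary-mark idea from the commit above can be sketched in Python (perf's real code is C). Only the field name `cmdline_group_boundary` comes from the commit message; the `Evsel` class, both helper functions, and the choice to mark the last event of each '-e' group are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evsel:
    name: str
    cmdline_group_boundary: bool = False
    filters: List[str] = field(default_factory=list)


def add_selector(evlist: List[Evsel], expanded_names: List[str]) -> None:
    """Append the events one '-e' selector expanded to and mark the group end."""
    group = [Evsel(name) for name in expanded_names]
    if group:
        group[-1].cmdline_group_boundary = True  # assumption: mark the last event
    evlist.extend(group)


def apply_filter_to_last_group(evlist: List[Evsel], expr: str) -> None:
    """Walk backwards from the newest event until the previous boundary mark."""
    last = len(evlist) - 1
    for i in range(last, -1, -1):
        if i != last and evlist[i].cmdline_group_boundary:
            break  # reached the end of the previous '-e' group
        evlist[i].filters.append(expr)


evlist: List[Evsel] = []
add_selector(evlist, ["cycles"])
add_selector(evlist, ["syscalls:sys_enter_read", "syscalls:sys_exit_read"])
apply_filter_to_last_group(evlist, "common_pid != 464")
for evsel in evlist:
    print(evsel.name, evsel.filters)
```

In this sketch the filter reaches both events produced by the glob but leaves the earlier `cycles` selector untouched.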
  2. 06 Jul, 2015 1 commit
  3. 26 Jun, 2015 1 commit
  4. 27 May, 2015 2 commits
  5. 29 Apr, 2015 7 commits
  6. 08 Apr, 2015 1 commit
  7. 28 Feb, 2015 2 commits
    • perf list: Clean up the printing functions of hardware/software events · 705750f2
      Committed by Yunlong Song
      print_events_type and __print_events_type are not needed for listing hw/sw
      events; let print_symbol_events do the job instead. Moreover,
      print_symbol_events can also handle event_glob and name_only.
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-4-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Sort the output of 'perf list' to view more clearly · ab0e4800
      Committed by Yunlong Song
      Sort the output in ASCII order (using strcmp), which handles both
      digits and letters.
      
      Example:
      
      Before this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         cache-references                                   [Hardware event]
         cache-misses                                       [Hardware event]
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         ...                                                ...
      
         jbd2:jbd2_start_commit                             [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         ...                                                ...
      
      After this patch:
      
       $ perf list
      
       List of pre-defined events (to be used in -e):
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         ...                                                ...
      
         block:block_bio_backmerge                          [Tracepoint event]
         block:block_bio_complete                           [Tracepoint event]
         block:block_getrq                                  [Tracepoint event]
         block:block_rq_issue                               [Tracepoint event]
         jbd2:jbd2_commit_locking                           [Tracepoint event]
         jbd2:jbd2_run_stats                                [Tracepoint event]
         jbd2:jbd2_start_commit                             [Tracepoint event]
         ...                                                ...
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1425032491-20224-2-git-send-email-yunlong.song@huawei.com
      [ Don't forget closedir({sys,evt}_dir) when handling errors ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
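The strcmp-based ordering from the commit above can be reproduced in a short Python sketch. It relies on the fact that Python compares `str` values by code point, which for pure-ASCII names gives the same order as C's `strcmp()`. The function name `sort_event_names` is hypothetical; the event names are the ones quoted in the commit message.

```python
def sort_event_names(names):
    """Sort event names the way a strcmp()-based comparison would.

    For ASCII-only strings, Python's default str ordering (by code point)
    matches byte-wise strcmp() ordering.
    """
    return sorted(names)


hardware = [
    "cpu-cycles OR cycles",
    "instructions",
    "cache-references",
    "cache-misses",
    "branch-instructions OR branches",
    "branch-misses",
    "bus-cycles",
]
tracepoints = [
    "jbd2:jbd2_start_commit",
    "jbd2:jbd2_commit_locking",
    "jbd2:jbd2_run_stats",
    "block:block_rq_issue",
    "block:block_bio_complete",
    "block:block_bio_backmerge",
    "block:block_getrq",
]

for name in sort_event_names(hardware) + sort_event_names(tracepoints):
    print(name)
```

Each event type is sorted on its own here, mirroring the grouped listing shown in the "After this patch" output.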
  8. 13 Feb, 2015 1 commit
    • perf list: Place the header text in its right position · 619a303c
      Committed by Yunlong Song
      The header text 'List of pre-defined events (to be used in -e):' is
      placed in an improper function, which causes an abnormal output, e.g.
      'perf list hw' shows no guiding text at all, and 'perf list hw
      L1-dcache*' shows the guiding text incorrectly in the middle of the
      output.
      
      Example
      Before this patch:
      
       $ perf list hw L1-dcache*
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
       List of pre-defined events (to be used in -e):              <-- incorrect position
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      
      After this patch:
      
       $ perf list hw L1-dcache*
      
       List of pre-defined events (to be used in -e):              <-- correct position
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 07 Feb, 2015 1 commit
  10. 22 Jan, 2015 1 commit
  11. 03 Dec, 2014 1 commit
  12. 25 Nov, 2014 2 commits
  13. 16 Oct, 2014 2 commits
  14. 02 Oct, 2014 1 commit
  15. 30 Sep, 2014 1 commit
  16. 18 Sep, 2014 2 commits
    • perf tools: Let default config be defined for a PMU · dc0a6202
      Committed by Adrian Hunter
      This allows default config terms to be provided for a PMU. So, for
      example, when the Intel PT PMU is added, it will be possible to specify:
      
      	intel_pt//
      
      which will be the same as:
      
      	intel_pt/tsc=1,noretcomp=0/
      
      meaning that the trace should contain TSC timestamps and perform 'return
      compression'.
      
      An important consideration of this patch is that it must be possible to
      overwrite the default values.  That has meant changing the logic so that
      a zero value can replace a non-zero value.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1406786474-9306-7-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
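The remark in the commit above that "a zero value can replace a non-zero value" can be illustrated with a small Python sketch. The commit message does not show the code, so this is one plausible reading: a term's bits are cleared before being written instead of being ORed into the config word. The function names, the OR-based variant, and the bit position used for 'tsc' are all assumptions for illustration.

```python
def or_format_field(config: int, shift: int, width: int, value: int) -> int:
    """OR-only update (assumed earlier behaviour): can set bits, never clear them."""
    mask = ((1 << width) - 1) << shift
    return config | ((value << shift) & mask)


def set_format_field(config: int, shift: int, width: int, value: int) -> int:
    """Clear the field first, then write the value, so zero can override a default."""
    mask = ((1 << width) - 1) << shift
    return (config & ~mask) | ((value << shift) & mask)


TSC_SHIFT = 10  # hypothetical bit position of the 'tsc' term
TSC_WIDTH = 1

# Default config with tsc=1, standing in for what 'intel_pt//' would expand to.
default_config = set_format_field(0, TSC_SHIFT, TSC_WIDTH, 1)

# The user explicitly asks for tsc=0.
with_or = or_format_field(default_config, TSC_SHIFT, TSC_WIDTH, 0)
with_set = set_format_field(default_config, TSC_SHIFT, TSC_WIDTH, 0)
print(hex(default_config), hex(with_or), hex(with_set))
```

With the OR-only update the default bit survives the user's `tsc=0`; clearing the field first is what lets the explicit zero win.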
    • perf tools: Let a user specify a PMU event without any config terms · ad962273
      Committed by Adrian Hunter
      This enables a PMU event to be specified in the form:
      
      	pmu//
      
      which is effectively the same as:
      
      	pmu/config=0/
      
      This patch is a precursor to defining default config for a PMU.
      
      Further explanation extracted from lkml thread:
      
      Imagine that the 'tsc' term did not exist.
      
      Intel PT trace data would not contain TSC packets, and the decoder would
      not know how to decode them.
      
      Then imagine that a new version of the hardware adds 'tsc'.
      
      It is such a useful feature that we want it by default, but older
      versions of the tools don't know how to decode it, so the kernel cannot
      turn it on by default.
      
      It is similar to why the kernel does not select perf_event_attr.mmap2 by
      default.
      
      The kernel doesn't know whether the tool supports it.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1408129739-17368-6-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 15 Aug, 2014 1 commit
  18. 10 Feb, 2014 1 commit
  19. 21 Jan, 2014 1 commit
  20. 13 Jan, 2014 1 commit
  21. 28 Dec, 2013 1 commit
  22. 17 Dec, 2013 1 commit
    • tools/: Convert to new topic libraries · 553873e1
      Committed by Borislav Petkov
      Move debugfs.* to api/fs/. We have a common tools/lib/api/ place where
      the Makefile lives and then we place the headers in subdirs.
      
      For example, all the fs-related stuff goes to tools/lib/api/fs/ from
      which we get libapikfs.a (acme got almost the naming he wanted :-)) and
      we link it into the tools which need it - in this case perf and
      tools/vm/page-types.
      
      acme:
      
      "Looking at the implementation, I think some tools can even link
      directly to the .o files, avoiding the .a file altogether.
      
      But that is just an optimization/finer granularity tools/lib/
      cherrypicking that toolers can make use of."
      
      Fixup documentation cleaning target while at it.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rric@kernel.org>
      Cc: Stanislav Fomichev <stfomichev@yandex-team.ru>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/1386605664-24041-2-git-send-email-bp@alien8.de
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  23. 27 Nov, 2013 1 commit
    • tools/perf/stat: Add event unit and scale support · 410136f5
      Committed by Stephane Eranian
      This patch adds perf stat support for handling event units and
      scales as exported by the kernel.
      
      The kernel can export a PMU event's actual unit and scaling factor
      via sysfs:
      
        $ ls -1 /sys/devices/power/events/energy-*
        /sys/devices/power/events/energy-cores
        /sys/devices/power/events/energy-cores.scale
        /sys/devices/power/events/energy-cores.unit
        /sys/devices/power/events/energy-pkg
        /sys/devices/power/events/energy-pkg.scale
        /sys/devices/power/events/energy-pkg.unit
        $ cat /sys/devices/power/events/energy-cores.scale
        2.3283064365386962890625e-10
        $ cat /sys/devices/power/events/energy-cores.unit
        Joules
      
      This patch modifies the pmu event alias code to check
      for the presence of the .unit and .scale files to load
      the corresponding values. They are then used by perf stat
      transparently:
      
         # perf stat -a -e power/energy-pkg/,power/energy-cores/,cycles -I 1000 sleep 1000
         #          time             counts   unit events
             1.000214717               3.07 Joules power/energy-pkg/         [100.00%]
             1.000214717               0.53 Joules power/energy-cores/
             1.000214717           12965028        cycles                    [100.00%]
             2.000749289               3.01 Joules power/energy-pkg/
             2.000749289               0.52 Joules power/energy-cores/
             2.000749289           15817043        cycles
      
      When the event does not have an explicit unit exported by
      the kernel, nothing is printed. In csv output mode, there
      will be an empty field.
      
      Special thanks to Jiri for providing the supporting code
      in the parser to trigger reading of the scale and unit files.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
      Cc: maria.n.dimakopoulou@gmail.com
      Cc: acme@redhat.com
      Link: http://lkml.kernel.org/r/1384275531-10892-3-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
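How the `.scale` and `.unit` files from the commit above might be consumed can be sketched in Python (perf's own implementation is C). The function names, the fallback of scale 1.0 and an empty unit when a file is missing, the output formatting, and the raw counter value are assumptions for illustration; a temporary directory stands in for `/sys/devices/power/events/`.

```python
import os
import tempfile


def load_scale_and_unit(events_dir, event):
    """Read '<event>.scale' and '<event>.unit' if they exist.

    Assumption for this sketch: a missing file means scale 1.0 / no unit.
    """
    scale, unit = 1.0, ""
    scale_path = os.path.join(events_dir, event + ".scale")
    unit_path = os.path.join(events_dir, event + ".unit")
    if os.path.exists(scale_path):
        with open(scale_path) as f:
            scale = float(f.read().strip())
    if os.path.exists(unit_path):
        with open(unit_path) as f:
            unit = f.read().strip()
    return scale, unit


def format_count(raw, scale, unit):
    """Apply the scale; print two decimals plus the unit when one is defined."""
    value = raw * scale
    if unit:
        return f"{value:.2f} {unit}"
    return f"{value:.0f}"


# Stand-in for the sysfs directory, populated with the values quoted above.
with tempfile.TemporaryDirectory() as events_dir:
    with open(os.path.join(events_dir, "energy-cores.scale"), "w") as f:
        f.write("2.3283064365386962890625e-10\n")
    with open(os.path.join(events_dir, "energy-cores.unit"), "w") as f:
        f.write("Joules\n")

    energy_scale, energy_unit = load_scale_and_unit(events_dir, "energy-cores")
    cycles_scale, cycles_unit = load_scale_and_unit(events_dir, "cycles")

raw_energy = 1 << 32  # hypothetical raw counter value
print(format_count(raw_energy, energy_scale, energy_unit))
print(format_count(12965028, cycles_scale, cycles_unit))
```

The scale quoted in the commit message is exactly 2^-32, so a raw count of 2^32 corresponds to one Joule in this sketch.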
  24. 12 Nov, 2013 1 commit
    • perf evsel: Remove idx parm from constructor · ef503831
      Committed by Arnaldo Carvalho de Melo
      Most uses of the evsel constructor are followed by a call to
      perf_evlist__add with an idx of evlist->nr_entries, so rename
      the current constructor to perf_evsel__new_idx and remove the need
      for passing the idx to the constructor in the common case.
      
      We still need the new_idx variant because of the way groups are handled,
      with evsel->nr_members holding the number of entries in an evlist,
      partitioning the evlist into sublists inside a single linked list.
      
      This asks for a clarifying refactoring, but for now simplify the non
      parser cases, so that tool writers don't have to bother with evsel idx
      setting.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  25. 22 Oct, 2013 1 commit
  26. 03 Sep, 2013 2 commits
  27. 08 Aug, 2013 1 commit
    • perf tools: Add support for pinned modifier · e9a7c414
      Committed by Michael Ellerman
      This commit adds support for a new modifier "D", which requests that the
      event, or group of events, be pinned to the PMU.
      
      The "p" modifier is already taken for precise, and "P" may be used in
      future to mean "fully precise".
      
      So we use "D", which stands for pinneD - and looks like a padlock, or if
      you're using the ":D" syntax perf smiles at you.
      
      This is an oft-requested feature from our HW folks, who want to be able
      to run a large number of events, but also want 100% accurate results for
      instructions per cycle.
      
      Comparison of results with and without pinning:
      
      $ perf stat -e '{cycles,instructions}:D' -e cycles,instructions,...
      
        79,590,480,683 cycles         #  0.000 GHz
       166,123,716,524 instructions   #  2.09  insns per cycle
                                      #  0.11  stalled cycles per insn
      
        79,352,134,463 cycles         #  0.000 GHz                     [11.11%]
       165,178,301,818 instructions   #  2.08  insns per cycle
                                      #  0.11  stalled cycles per insn [11.13%]
      
      As you can see, although perf does a very good job of scaling the values
      in the non-pinned case, there is a small discrepancy.
      
      The patch is fairly straightforward; the one detail is that we need to
      make sure we only request pinning for the group leader when we have a
      group.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1375795686-4226-1-git-send-email-michael@ellerman.id.au
      [ Use perf_evsel__is_group_leader instead of open coded equivalent, as
        suggested by Jiri Olsa ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
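The group-leader detail mentioned in the commit above can be sketched in Python (perf's real code is C). The `EventAttr` class and `apply_modifiers` function are hypothetical names; treating the first event of a group as its leader is an assumption stated here for the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EventAttr:
    name: str
    is_group_leader: bool
    pinned: bool = False


def apply_modifiers(group_names: List[str], modifiers: str) -> List[EventAttr]:
    """Build attrs for one event group, requesting pinning only on the leader.

    Assumption: the first event listed in the group is the group leader.
    """
    want_pinned = "D" in modifiers
    attrs = []
    for index, name in enumerate(group_names):
        leader = index == 0
        attrs.append(EventAttr(name, leader, pinned=want_pinned and leader))
    return attrs


# Corresponds to '{cycles,instructions}:D' from the commit message.
pinned_group = apply_modifiers(["cycles", "instructions"], "D")
plain_group = apply_modifiers(["cycles", "instructions"], "")
for attr in pinned_group:
    print(attr.name, "pinned" if attr.pinned else "not pinned")
```

Only the leader carries the pinned request in this sketch, which matches the commit's note that pinning is requested for the group leader alone when events are grouped.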