1. 30 Oct, 2015 1 commit
    • perf record: Add clang options for compiling BPF scripts · 71dc2326
      Committed by Wang Nan
      Although the previous patch allows setting BPF compiler related options in
      perfconfig, some ad-hoc situations still require passing options through
      the command line. This patch introduces two options to 'perf record' for
      this purpose: --clang-path and --clang-opt.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-9-git-send-email-wangnan0@huawei.com
      [ Add the new options to the 'record' man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 23 Oct, 2015 1 commit
    • perf tools: Improve call graph documents and help messages · 76a26549
      Committed by Namhyung Kim
      The --call-graph option is complex, so we should provide a better guide for
      users.  Also change the help message to be consistent with the config option
      names.  Now perf top will show help like below:
      
        $ perf top --call-graph
          Error: option `call-graph' requires a value
      
         Usage: perf top [<options>]
      
            --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
                 setup and enables call-graph (stack chain/backtrace):
      
      		record_mode:	call graph recording mode (fp|dwarf|lbr)
      		record_size:	if record_mode is 'dwarf', max size of stack recording (<bytes>)
      				default: 8192 (bytes)
      		print_type:	call graph printing style (graph|flat|fractal|none)
      		threshold:	minimum call graph inclusion threshold (<percent>)
      		print_limit:	maximum number of call graph entry (<number>)
      		order:		call graph order (caller|callee)
      		sort_key:	call graph sort key (function|address)
      		branch:		include last branch info to call graph (branch)
      
      		Default: fp,graph,0.5,caller,function
      Requested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
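As a rough illustration of the option format in the help text above, the following shell sketch splits a --call-graph value that contains only the five mandatory fields, such as the documented default. The function name split_call_graph is hypothetical and not part of perf, and the optional fields (record_size, print_limit, branch) are deliberately not handled.

```shell
#!/bin/sh
# Hypothetical helper, not part of perf: split a --call-graph value that
# uses only the mandatory fields, in the order given by the help text:
#   record_mode,print_type,threshold,order,sort_key
split_call_graph() {
    # $1: option value, e.g. "fp,graph,0.5,caller,function"
    old_ifs=$IFS
    IFS=,
    # Unquoted on purpose: word splitting on ',' fills $1..$5.
    set -- $1
    IFS=$old_ifs
    printf 'record_mode=%s print_type=%s threshold=%s order=%s sort_key=%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# The default from the help message.
split_call_graph "fp,graph,0.5,caller,function"
```

Run on the default value, this prints each field next to the name the help message gives it, which makes it easier to see which position a value such as 0.5 occupies.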
  3. 20 Oct, 2015 1 commit
  4. 01 Sep, 2015 1 commit
    • perf record: Add ability to name registers to record · bcc84ec6
      Committed by Stephane Eranian
      This patch modifies the -I/--intr-regs option to enable passing the names
      of the registers to sample on interrupt. Registers can be specified by
      their symbolic names. For instance on x86, --intr-regs=ax,si.
      
      The motivation is to reduce the size of the perf.data file and the
      overhead of sampling by only collecting the registers useful to a
      specific analysis. For instance, for value profiling, sampling only the
      registers used to pass arguments to functions.
      
      With no parameter, --intr-regs still records all possible registers
      based on the architecture.
      
      To name registers, it is necessary to use the long form of the option,
      i.e., --intr-regs:
      
        $ perf record --intr-regs=si,di,r8,r9 .....
      
      To record any possible registers:
      
        $ perf record -I .....
        $ perf record --intr-regs ...
      
      To display the registers, one can use perf report -D.
      
      To list the available registers:
      
        $ perf record --intr-regs=\?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
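The register list printed by --intr-regs=\? can be used to sanity-check a selection before recording. The sketch below is only an illustration: the check_regs function and the AVAILABLE variable are hypothetical names, the list is copied from the x86 output quoted in the commit message, and the case-insensitive comparison is an assumption inferred from the lower-case names used in the examples.

```shell
#!/bin/sh
# Register names as listed by 'perf record --intr-regs=\?' in the commit
# message above (x86). Other architectures will differ.
AVAILABLE="AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15"

# Hypothetical helper: verify a comma-separated register list and print the
# matching --intr-regs argument. Comparison is case-insensitive (assumption).
check_regs() {
    # $1: comma-separated list, e.g. "si,di,r8,r9"
    for reg in $(printf '%s' "$1" | tr ',a-z' ' A-Z'); do
        case " $AVAILABLE " in
            *" $reg "*) ;;  # known register, keep going
            *) printf 'unknown register: %s\n' "$reg"; return 1 ;;
        esac
    done
    printf '%s\n' "--intr-regs=$1"
}

check_regs "si,di,r8,r9"
```

The long form of the option is emitted because, as the commit message notes, naming registers requires --intr-regs rather than -I.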
  5. 13 Aug, 2015 2 commits
    • perf callchain: Allow disabling call graphs per event · f9db0d0f
      Committed by Kan Liang
      This patch introduces "call-graph=no" to disable the call graph per event.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,call-graph=fp/,cpu/instructions,call-graph=no/' sleep 1
      
        perf report --stdio
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 6  of event 'cpu/cpu-cycles,call-graph=fp/'
        # Event count (approx.): 774218
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  ........................................
        #
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--97.30%-- __brk
                       |
                        --2.70%-- mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
      
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] perf_event_mmap
                    |
                    ---perf_event_mmap
                       |
                       |--97.30%-- do_brk
                       |          sys_brk
                       |          entry_SYSCALL_64_fastpath
                       |          __brk
                       |
                        --2.70%-- mmap_region
                                  do_mmap_pgoff
                                  vm_mmap_pgoff
                                  sys_mmap_pgoff
                                  sys_mmap
                                  entry_SYSCALL_64_fastpath
                                  mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
        ......
      
        # Samples: 6  of event 'cpu/instructions,call-graph=no/'
        # Event count (approx.): 359692
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  .................................
        #
           89.03%     0.00%  sleep    [unknown]         [.] 0xffff6598ffff6598
           89.03%     0.00%  sleep    ld-2.17.so        [.] _dl_resolve_conflicts
           89.03%     0.00%  sleep    [kernel.vmlinux]  [k] page_fault
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-2-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Per-event type selection support · d457c963
      Committed by Kan Liang
      This patchkit adds the ability to set the callgraph mode (fp, dwarf, lbr) per
      event. This in turn can reduce sampling overhead and the size of
      perf.data.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
      
       perf evlist -v
       cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
       config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
       inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
       1, exclude_guest: 1, mmap2: 1, comm_exec: 1
       cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
       sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
       disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
       exclude_guest: 1
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
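The two commits above describe a per-event "call-graph" term that accepts fp, dwarf, lbr, or no. The following shell sketch shows one way to assemble such an event list; event_term is a hypothetical helper, the cpu PMU prefix is taken from the examples in the commit messages, and the script only prints the resulting command line instead of running perf.

```shell
#!/bin/sh
# Hypothetical helper: build one 'cpu/<event>,call-graph=<mode>/' term.
# Accepted modes are the ones named in the two commit messages above.
event_term() {
    # $1: event name under the cpu PMU, $2: call-graph mode
    case "$2" in
        fp|dwarf|lbr|no)
            printf 'cpu/%s,call-graph=%s/\n' "$1" "$2" ;;
        *)
            printf 'unsupported call-graph mode: %s\n' "$2" >&2
            return 1 ;;
    esac
}

# Call graphs via frame pointers for cycles, none for instructions.
events="$(event_term cpu-cycles fp),$(event_term instructions no)"

# Print the command rather than executing it, so the sketch runs anywhere.
printf "perf record -e '%s' sleep 1\n" "$events"
```

The printed command matches the example given in the "call-graph=no" commit message.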
  6. 11 Aug, 2015 1 commit
  7. 05 Aug, 2015 1 commit
  8. 30 Jul, 2015 1 commit
    • perf tools: Force period term to overload global settings · ee4c7588
      Committed by Jiri Olsa
      Currently the command-line option setting beats the per-event period
      setting:
      
      With no global settings, we get per-event configuration:
      
        $ perf record -e 'cpu/instructions,period=20000/' sleep 1
        $ perf evlist -v
        ... { sample_period, sample_freq }: 20000 ...
      
      With a period set through the '-c' option, we get the '-c' option value:
        $ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
        $ perf evlist -v
        ... { sample_period, sample_freq }: 1000 ...
      
      This patch makes the per-event settings override the global '-c' option
      setup:
      
        $ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
        $ perf evlist -v
        ... { sample_period, sample_freq }: 20000 ...
      
      I think making the per-event settings override any other config
      makes more sense than the current state. However, it breaks the current
      'period' term handling, which might cause some noise, so let's see ;-).
      
      Also fix the parse event tests for the new behaviour.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1438162936-59698-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
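The precedence rule this commit introduces can be summarised in a few lines of shell. This is only a model of the behaviour described in the message, not perf's implementation; effective_period is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical model of the rule described above: a per-event 'period='
# term, when present, wins over the global '-c' value.
effective_period() {
    # $1: per-event period term (empty if not given)
    # $2: global '-c' value (empty if not given)
    printf '%s\n' "${1:-$2}"
}

effective_period 20000 1000   # per-event term and -c given: prints 20000
effective_period ""    1000   # only -c given: prints 1000
```

The first call corresponds to the last example in the commit message, where 'period=20000' is kept even though '-c 1000' is passed.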
  9. 24 Jul, 2015 1 commit
  10. 21 Jul, 2015 1 commit
    • perf record: Allow filtering perf's pid via --exclude-perf · 4ba1faa1
      Committed by Wang Nan
      This patch allows 'perf record' to exclude events issued by perf itself
      with the '--exclude-perf' option.
      
      Before this patch, when doing something like:
      
       # perf record -a -e syscalls:sys_enter_write <cmd>
      
      One could easily get a result like this:
      
       # /tmp/perf report --stdio
       ...
        # Overhead  Command  Shared Object       Symbol
        # ........  .......  ..................  ....................
        #
            99.99%  perf     libpthread-2.18.so  [.] __write_nocancel
            0.01%   ls       libc-2.18.so        [.] write
            0.01%   sshd     libc-2.18.so        [.] write
       ...
      
      Where most events are generated by perf itself.
      
      A shell trick can be done to filter perf itself out:
      
       # cat << EOF > ./tmp
       > #!/bin/sh
       > exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
       > EOF
       # chmod a+x ./tmp
       # ./tmp
      
      However, doing so is user unfriendly.
      
      This patch extracts evsel iteration framework introduced by patch 'perf
      record: Apply filter to all events in a glob matching' into
      foreach_evsel_in_last_glob(), and makes exclude_perf() function append
      new filter expression to each evsel selected by a '-e' selector.
      
      To avoid losing filters if the user passes '--filter' after '--exclude-perf',
      this patch uses perf_evsel__append_filter() in both cases, instead of
      perf_evsel__set_filter(), which removes the old filter. As a side effect, it
      is now possible to use multiple '--filter' options for one selector. They
      are combined with '&&'.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1436513770-8896-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
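The commit message says that several filters for one selector are combined with '&&'. The shell sketch below models that behaviour outside of perf. The append_filter function is hypothetical, wrapping each side in parentheses is an assumption made to keep operator precedence unambiguous, and the "count > 64" expression is just an illustrative second filter.

```shell
#!/bin/sh
# Hypothetical model of appending a filter instead of replacing it.
# The parentheses around each side are an assumption, not taken from
# the commit message.
append_filter() {
    # $1: existing filter (may be empty), $2: filter to add
    if [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '(%s) && (%s)\n' "$1" "$2"
    fi
}

# Same idea as the shell trick above: exclude this shell's own PID,
# then add a second, illustrative condition.
filter=$(append_filter "" "common_pid != $$")
filter=$(append_filter "$filter" "count > 64")
printf '%s\n' "$filter"
```

With an empty existing filter the new expression is used as-is, which mirrors the first '--filter' on a selector; later ones are chained onto it.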
  11. 14 Jul, 2015 1 commit
  12. 20 Jun, 2015 1 commit
  13. 10 Jun, 2015 1 commit
  14. 12 May, 2015 1 commit
  15. 06 May, 2015 1 commit
  16. 29 Apr, 2015 1 commit
  17. 08 Apr, 2015 1 commit
  18. 02 Mar, 2015 2 commits
  19. 25 Feb, 2015 1 commit
  20. 19 Feb, 2015 1 commit
    • perf tools: Enable LBR call stack support · aad2b21c
      Committed by Kan Liang
      Currently, there are two call chain recording options, fp and dwarf.
      
      Haswell has a new feature that utilizes the existing LBR facility to
      record call chains. Kernel side LBR support code provides this as a
      third option to record call chains. This patch enables the lbr call
      stack support on the tooling side.
      
      LBR call stack has some limitations:
      
       - It reuses current LBR facility, so LBR call stack and branch record
         can not be enabled at the same time.
      
       - It is only available for user-space callchains.
      
      However, it also offers some advantages:
      
       - LBR call stack can work on user apps which don't have frame-pointers
         or dwarf debug info compiled. It is a good alternative when nothing
         else works.
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Jacob Shin <jacob.w.shin@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masanari Iida <standby24x7@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  21. 22 Jan, 2015 1 commit
  22. 03 Dec, 2014 1 commit
  23. 16 Nov, 2014 1 commit
  24. 16 Oct, 2014 1 commit
  25. 05 Jun, 2014 1 commit
  26. 15 Jan, 2014 2 commits
  27. 13 Jan, 2014 1 commit
  28. 28 Nov, 2013 2 commits
  29. 15 Nov, 2013 1 commit
  30. 29 Oct, 2013 1 commit
  31. 09 Oct, 2013 1 commit
  32. 04 Oct, 2013 2 commits
  33. 09 Jul, 2013 2 commits
  34. 01 Apr, 2013 1 commit
    • perf tools: Add support for weight v7 (modified) · 05484298
      Committed by Andi Kleen
      perf record has a new option -W that enables weighted sampling.
      
      Add sorting support in top/report for the average weight per sample and the
      total weight sum. This allows comparing both the relative cost per event
      and the total cost over the measurement period.
      
      Add the necessary glue to perf report, record and the library.
      
      v2: Merge with new hist refactoring.
      v3: Fix manpage. Remove value check.
      Rename global_weight to weight and weight to local_weight.
      v4: Re-add sort keys to manpage
      v5: Move weight to end
      v6: Move weight to template
      v7: Rename weight key.
      
      Original patch from Andi modified by Stephane Eranian <eranian@google.com>
      to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
      [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>