1. 10 10月, 2014 1 次提交
    • A
      perf evsel: Add hists helper · 4ea062ed
      Arnaldo Carvalho de Melo 提交于
      Not all tools need a hists instance per perf_evsel, so lets pave the way
      to remove evsel->hists while leaving a way to access the hists from a
      specially allocated evsel, one that comes with space at the end where
      lives the evsel.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4ea062ed
  2. 18 9月, 2014 1 次提交
    • J
      perf tools: Add +field argument support for --sort option · 1a1c0ffb
      Jiri Olsa 提交于
      Adding support to add field(s) to default sort order via using the '+'
      prefix, like for report:
      
        $ perf report
        Samples: 2K of event 'cycles', Event count (approx.): 882172583
        Overhead  Command  Shared Object        Symbol
           7.39%  swapper  [kernel.kallsyms]    [k] intel_idle
           1.97%  firefox  libpthread-2.17.so   [.] pthread_mutex_lock
           1.39%  firefox  [snd_hda_intel]      [k] azx_get_position
           1.11%  firefox  libpthread-2.17.so   [.] pthread_mutex_unlock
      
        $ perf report -s +cpu
        Samples: 2K of event 'cycles', Event count (approx.): 882172583
        Overhead  Command  Shared Object        Symbol                  CPU
           2.89%  swapper  [kernel.kallsyms]    [k] intel_idle          000
           2.61%  swapper  [kernel.kallsyms]    [k] intel_idle          002
           1.20%  swapper  [kernel.kallsyms]    [k] intel_idle          001
           0.82%  firefox  libpthread-2.17.so   [.] pthread_mutex_lock  002
      
      Works in general for commands using --sort option.
      
      v2 with changes suggested:
        - Use dynamic memory instead static buffer
        - Fix error message typo
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20140823125948.GA1193@krava.brq.redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1a1c0ffb
  3. 24 8月, 2014 1 次提交
    • J
      perf tools: Add +field argument support for --field option · 2f3f9bcf
      Jiri Olsa 提交于
      Adding support to add field(s) to default field order via using the '+'
      prefix, like for report:
      
        $ perf report
        Samples: 10  of event 'cycles', Event count (approx.): 4463799
        Overhead  Command  Shared Object      Symbol
          32.40%  ls       [kernel.kallsyms]  [k] filemap_fault
          28.19%  ls       [kernel.kallsyms]  [k] get_page_from_freelist
          23.38%  ls       [kernel.kallsyms]  [k] enqueue_entity
          15.04%  ls       [kernel.kallsyms]  [k] mmap_region
      
        $ perf report -F +period,sample
        Samples: 10  of event 'cycles', Event count (approx.): 4463799
        Overhead        Period       Samples  Command  Shared Object      Symbol
          32.40%       1446493             1  ls       [kernel.kallsyms]  [k] filemap_fault
          28.19%       1258486             1  ls       [kernel.kallsyms]  [k] get_page_from_freelist
          23.38%       1043754             1  ls       [kernel.kallsyms]  [k] enqueue_entity
          15.04%        671160             1  ls       [kernel.kallsyms]  [k] mmap_region
      
      Works in general for commands using --field option.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1408715919-25990-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2f3f9bcf
  4. 12 8月, 2014 4 次提交
  5. 08 7月, 2014 1 次提交
  6. 09 6月, 2014 1 次提交
    • D
      perf tools: Add dcacheline sort · 9b32ba71
      Don Zickus 提交于
      In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
      particular sample instruction.  A bunch of those details relate to the data
      address.
      
      One interesting thing you can do with data addresses is to convert them into a unique
      cacheline they belong too.  Organizing these data cachelines into similar groups and sorting
      them can reveal cache contention.
      
      This patch creates an alogorithm based on various sample details that can help group
      entries together into data cachelines and allows 'perf report' to sort on it.
      
      The algorithm relies on having proper mmap2 support in the kernel to help determine
      if the memory map the data address belongs to is private to a pid or globally shared.
      
      The alogortithm is as follows:
      
      o group cpumodes together
      o group entries with discovered maps together
      o sort on major, minor, inode and inode generation numbers
      o if userspace anon, then sort on pid
      o sort on cachelines based on data addresses
      
      The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.
      
      Sample output:
      
       #
       # Samples: 206  of event 'cpu/mem-loads/pp'
       # Total weight : 2534
       # Sort order   : dcacheline,pid
       #
       # Overhead       Samples                                                          Data Cacheline       Command:  Pid
       # ........  ............  ......................................................................  ..................
       #
          13.22%             1  [k] 0xffff88042f08ebc0                                                       swapper:    0
           9.27%             1  [k] 0xffff88082e8cea80                                                       swapper:    0
           3.59%             2  [k] 0xffffffff819ba180                                                       swapper:    0
           0.32%             1  [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0       swapper:    0
           0.32%             1  [k] timekeeper_seq+0xfffffffffffffff8                                        swapper:    0
      
      Note:  Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
      from prematurely tabbing over and mis-aligning.  Not sure what the problem is.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      9b32ba71
  7. 04 6月, 2014 2 次提交
    • J
      perf tools: Move elide bool into perf_hpp_fmt struct · f2998422
      Jiri Olsa 提交于
      After output/sort fields refactoring, it's expensive
      to check the elide bool in its current location inside
      the 'struct sort_entry'.
      
      The perf_hpp__should_skip function gets highly noticable in
      workloads with high number of output/sort fields, like for:
      
        $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
      
      Performance report:
         9.70%  perf  [.] perf_hpp__should_skip
      
      Moving the elide bool into the 'struct perf_hpp_fmt', which
      makes the perf_hpp__should_skip just single struct read.
      
      Got speedup of around 22% for my test perf.data workload.
      The change should not harm any other workload types.
      
      Performance counter stats for (10 runs):
        before:
         358,319,732,626      cycles                    ( +-  0.55% )
         467,129,581,515      instructions              #    1.30  insns per cycle          ( +-  0.00% )
      
           150.943975206 seconds time elapsed           ( +-  0.62% )
      
        now:
         278,785,972,990      cycles                    ( +-  0.12% )
         370,146,797,640      instructions              #    1.33  insns per cycle          ( +-  0.00% )
      
           116.416670507 seconds time elapsed           ( +-  0.31% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      f2998422
    • J
      perf tools: Remove elide setup for SORT_MODE__MEMORY mode · 2ec85c62
      Jiri Olsa 提交于
      There's no need to setup elide of sort_dso sort entry again
      with symbol_conf.dso_list list.
      
      The only difference were list names of memory mode data,
      which does not make much sense to me.
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1400858147-7155-2-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      2ec85c62
  8. 01 6月, 2014 2 次提交
  9. 21 5月, 2014 9 次提交
  10. 19 12月, 2013 2 次提交
  11. 12 11月, 2013 1 次提交
  12. 04 11月, 2013 3 次提交
  13. 23 10月, 2013 1 次提交
  14. 22 10月, 2013 1 次提交
  15. 10 10月, 2013 5 次提交
  16. 04 10月, 2013 2 次提交
  17. 23 7月, 2013 1 次提交
  18. 13 7月, 2013 2 次提交