1. 03 2月, 2016 8 次提交
  2. 08 1月, 2016 1 次提交
  3. 07 1月, 2016 3 次提交
    • N
      perf tools: Skip dynamic fields not defined for current event · 361459f1
      Namhyung Kim 提交于
      When there are multiple events, each dynamic sort key is defined just
      for one event.  In this case other events will always show "N/A" for
      those fields.  But they are meaningless and consume precious screen
      width.
      
      Let's skip those undefined dynamic fields.
      
        $ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1
      
        $ perf report -s 'comm,kmalloc.*' --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 20K of event 'kmem:kmalloc'
        # Event count (approx.): 20533
        #
        # Overhead  Command           call_site                 ptr  bytes_req  bytes_alloc            gfp_flags
        # ........  .......  ..................  ..................  .........  ...........  ...................
        #
            99.89%  perf       ffffffffa01d4396  0xffff8803ffb79720         96           96    GFP_NOFS|GFP_ZERO
             0.06%  sleep      ffffffff8114e1cd  0xffff8803d228a000       4096         4096           GFP_KERNEL
             0.03%  perf       ffffffff811d6ae6  0xffff8803f7678f00        240          256  GFP_KERNEL|GFP_ZERO
             0.00%  perf       ffffffff812263c1  0xffff880406172380        128          128           GFP_KERNEL
             0.00%  perf       ffffffff812264b9  0xffff8803ffac1600        504          512           GFP_KERNEL
             0.00%  perf       ffffffff81226634  0xffff880401dc5280         28           32           GFP_KERNEL
             0.00%  sleep      ffffffff81226da9  0xffff8803ffac3a00        392          512           GFP_KERNEL
      
        # Samples: 20K of event 'kmem:kfree'
        # Event count (approx.): 20597
        #
        # Overhead  Command
        # ........  ..............
        #
            99.63%  perf
             0.14%  sleep
             0.11%  irq/36-iwlwifi
             0.11%  kworker/u16:0
             0.01%  Xorg
             0.00%  firefox
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-12-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      361459f1
    • N
      perf tools: Add 'trace' sort key · a34bb6a0
      Namhyung Kim 提交于
      The 'trace' sort key is to show tracepoint event output using either
      print fmt or plugin.  For example sched_switch event (using plugin) will
      show output like below:
      
        # perf record -e sched:sched_switch -a usleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.197 MB perf.data (69 samples) ]
        #
      
        $ perf report -s trace --stdio
        ...
        # Overhead  Trace output
        # ........  ...................................................
        #
             9.48%  swapper/0:0 [120] R ==> transmission-gt:17773 [120]
             9.48%  transmission-gt:17773 [120] S ==> swapper/0:0 [120]
             9.04%  swapper/2:0 [120] R ==> transmission-gt:17773 [120]
             8.92%  transmission-gt:17773 [120] S ==> swapper/2:0 [120]
             5.25%  swapper/0:0 [120] R ==> kworker/0:1H:109 [100]
             5.21%  kworker/0:1H:109 [100] S ==> swapper/0:0 [120]
             1.78%  swapper/3:0 [120] R ==> transmission-gt:17773 [120]
             1.78%  transmission-gt:17773 [120] S ==> swapper/3:0 [120]
             1.53%  Xephyr:6524 [120] S ==> swapper/0:0 [120]
             1.53%  swapper/0:0 [120] R ==> Xephyr:6524 [120]
             1.17%  swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49]
             1.13%  irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120]
      
      Note that the 'trace' sort key works only for tracepoint events.  If
      it's used to other type of events, just "N/A" will be printed.
      Suggested-and-acked-by: NJiri Olsa <jolsa@redhat.com>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-8-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a34bb6a0
    • N
      perf hist: Pass struct sample to __hists__add_entry() · fd36f3dd
      Namhyung Kim 提交于
      This is a preparation to add more info into the hist_entry.  Also it
      already passes too many argument, so passing sample directly will reduce
      the overhead of the function call.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fd36f3dd
  4. 06 10月, 2015 1 次提交
  5. 29 9月, 2015 1 次提交
  6. 15 9月, 2015 1 次提交
  7. 14 9月, 2015 2 次提交
    • K
      perf report: Introduce --socket-filter option · 21394d94
      Kan Liang 提交于
      Introduce --socket-filter option for 'perf report' to only show entries
      for a processor socket that match this filter.
      
        $ perf report --socket-filter 1 --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 752  of event 'cycles'
        # Event count (approx.): 350995599
        # Processor Socket: 1
        #
        # Overhead  Command    Shared Object     Symbol
        # ........  .........  ................  .................................
        #
            97.02%  test       test              [.] plusB_c
             0.97%  test       test              [.] plusA_c
             0.23%  swapper    [kernel.vmlinux]  [k] acpi_idle_do_entry
             0.09%  rcu_sched  [kernel.vmlinux]  [k] dyntick_save_progress_counter
             0.01%  swapper    [kernel.vmlinux]  [k] task_waking_fair
             0.00%  swapper    [kernel.vmlinux]  [k] run_timer_softirq
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1441377946-44429-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      21394d94
    • K
      perf tools: Introduce new sort type "socket" for the processor socket · 2e7ea3ab
      Kan Liang 提交于
      This patch enable perf report to sort by processor socket:
      
        $ perf report --stdio --sort socket,comm,dso,symbol
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 686  of event 'cycles'
        # Event count (approx.): 349215462
        #
        # Overhead SOCKET Command Shared Object    Symbol
        # ........ ...... ....... ................ ............................
        #
          97.05%    000   test    test             [.] plusB_c
           0.98%    000   test    test             [.] plusA_c
           0.93%    001   perf    [kernel.vmlinux] [k] smp_call_function_single
           0.19%    001   perf    [kernel.vmlinux] [k] page_fault
           0.19%    001   swapper [kernel.vmlinux] [k] pm_qos_request
           0.16%    000   test    [kernel.vmlinux] [k] add_mm_counter_fast
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1441377946-44429-2-git-send-email-kan.liang@intel.com
      [ Fix col calc, un-allcapsify col header & read the topology when not using perf.data ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2e7ea3ab
  8. 29 8月, 2015 1 次提交
  9. 11 8月, 2015 1 次提交
    • A
      perf report: Add support for srcfile sort key · 31191a85
      Andi Kleen 提交于
      In some cases it's useful to characterize samples by file. This is
      useful to get a higher level categorization, for example to map cost to
      subsystems.
      
      Add a srcfile sort key to perf report. It builds on top of the existing
      srcline support.
      
      Commiter notes:
      
      E.g.:
      
        # perf record -F 10000 usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ]
        [root@zoo ~]# perf report -s srcfile --stdio
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'cycles'
        # Event count (approx.): 869878
        #
        # Overhead  Source File
        # ........  ...........
            60.99%  .
            20.62%  paravirt.h
            14.23%  rmap.c
             4.04%  signal.c
             0.11%  msr.h
      
        #
      
      The first line is collecting all the files for which srcfiles couldn't somehow
      get resolved to:
      
        # perf report -s srcfile,dso --stdio
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'cycles'
        # Event count (approx.): 869878
        #
        # Overhead  Source File  Shared Object
        # ........  ...........  ................
            40.97%  .            ld-2.20.so
            20.62%  paravirt.h   [kernel.vmlinux]
            20.02%  .            libc-2.20.so
            14.23%  rmap.c       [kernel.vmlinux]
             4.04%  signal.c     [kernel.vmlinux]
             0.11%  msr.h        [kernel.vmlinux]
      
        #
      
      XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't
           seen this on Fedora 22.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org
      [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      31191a85
  10. 07 8月, 2015 2 次提交
  11. 27 5月, 2015 1 次提交
  12. 18 3月, 2015 1 次提交
  13. 03 3月, 2015 1 次提交
    • A
      perf tools: Reference count struct thread · f3b623b8
      Arnaldo Carvalho de Melo 提交于
      We need to do that to stop accumulating entries in the dead_threads
      linked list, i.e. we were keeping references to threads in struct hists
      that continue to exist even after a thread exited and was removed from
      the machine threads rbtree.
      
      We still keep the dead_threads list, but just for debugging, allowing us
      to iterate at any given point over the threads that still are referenced
      by things like struct hist_entry.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-3ejvfyed0r7ue61dkurzjux4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f3b623b8
  14. 22 1月, 2015 2 次提交
  15. 23 12月, 2014 1 次提交
  16. 15 10月, 2014 2 次提交
    • A
      perf tools: Remove hists from evsel · a635fc51
      Arnaldo Carvalho de Melo 提交于
      Now tools that deals want to have an hists per evsel need to call
      hists__init() before creating any evsels, which can be as early as when
      parsing the command line, so do it before calling parse_options().
      
      The current tools using hists/hist_entries are report, top and annotate,
      change them to request per evsel hists.
      
      This is in preparation for making evsels usable by 3rd party tools, that
      not necessarily live in perf's source code repository.
      Acked-by: NBorislav Petkov <bp@suse.de>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-usjx2la743f10ippj7p1b20x@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a635fc51
    • A
      perf callchain: Move the callchain_param extern to callchain.h · 8f651eae
      Arnaldo Carvalho de Melo 提交于
      It was lost in hist.h, move it to where it belongs, callchain.h, as
      there are places that gets hist.h by means of evsel.h, and since evsel.h
      is being untangled from hist.h...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-0rg7ji1jnbm6q6gj35j37jby@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8f651eae
  17. 14 10月, 2014 1 次提交
    • A
      perf session: Remove last reference to hists struct · 2a1731fb
      Arnaldo Carvalho de Melo 提交于
      Now perf_session doesn't require that the evsels in its evlist are hists
      containing ones.
      
      Tools that are hists based and want to do per evsel events_stats
      updates, if at some point this turns into a necessity, should do it in
      the tool specific code, keeping the session class hists agnostic.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jean Pihet <jean.pihet@linaro.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-cli1bgwpo82mdikuhy3djsuy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2a1731fb
  18. 11 10月, 2014 1 次提交
  19. 14 8月, 2014 1 次提交
  20. 12 8月, 2014 4 次提交
  21. 09 6月, 2014 1 次提交
    • D
      perf tools: Add dcacheline sort · 9b32ba71
      Don Zickus 提交于
      In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
      particular sample instruction.  A bunch of those details relate to the data
      address.
      
      One interesting thing you can do with data addresses is to convert them into a unique
      cacheline they belong too.  Organizing these data cachelines into similar groups and sorting
      them can reveal cache contention.
      
      This patch creates an alogorithm based on various sample details that can help group
      entries together into data cachelines and allows 'perf report' to sort on it.
      
      The algorithm relies on having proper mmap2 support in the kernel to help determine
      if the memory map the data address belongs to is private to a pid or globally shared.
      
      The alogortithm is as follows:
      
      o group cpumodes together
      o group entries with discovered maps together
      o sort on major, minor, inode and inode generation numbers
      o if userspace anon, then sort on pid
      o sort on cachelines based on data addresses
      
      The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.
      
      Sample output:
      
       #
       # Samples: 206  of event 'cpu/mem-loads/pp'
       # Total weight : 2534
       # Sort order   : dcacheline,pid
       #
       # Overhead       Samples                                                          Data Cacheline       Command:  Pid
       # ........  ............  ......................................................................  ..................
       #
          13.22%             1  [k] 0xffff88042f08ebc0                                                       swapper:    0
           9.27%             1  [k] 0xffff88082e8cea80                                                       swapper:    0
           3.59%             2  [k] 0xffffffff819ba180                                                       swapper:    0
           0.32%             1  [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0       swapper:    0
           0.32%             1  [k] timekeeper_seq+0xfffffffffffffff8                                        swapper:    0
      
      Note:  Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
      from prematurely tabbing over and mis-aligning.  Not sure what the problem is.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      9b32ba71
  22. 04 6月, 2014 1 次提交
    • J
      perf tools: Move elide bool into perf_hpp_fmt struct · f2998422
      Jiri Olsa 提交于
      After output/sort fields refactoring, it's expensive
      to check the elide bool in its current location inside
      the 'struct sort_entry'.
      
      The perf_hpp__should_skip function gets highly noticable in
      workloads with high number of output/sort fields, like for:
      
        $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
      
      Performance report:
         9.70%  perf  [.] perf_hpp__should_skip
      
      Moving the elide bool into the 'struct perf_hpp_fmt', which
      makes the perf_hpp__should_skip just single struct read.
      
      Got speedup of around 22% for my test perf.data workload.
      The change should not harm any other workload types.
      
      Performance counter stats for (10 runs):
        before:
         358,319,732,626      cycles                    ( +-  0.55% )
         467,129,581,515      instructions              #    1.30  insns per cycle          ( +-  0.00% )
      
           150.943975206 seconds time elapsed           ( +-  0.62% )
      
        now:
         278,785,972,990      cycles                    ( +-  0.12% )
         370,146,797,640      instructions              #    1.33  insns per cycle          ( +-  0.00% )
      
           116.416670507 seconds time elapsed           ( +-  0.31% )
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      f2998422
  23. 01 6月, 2014 2 次提交