1. 23 5月, 2016 1 次提交
    • A
      perf report: Add srcline_from/to branch sort keys · 508be0df
      Andi Kleen 提交于
      Add "srcline_from" and "srcline_to" branch sort keys that allow to show
      the source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate it with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
      
      Open issues:
      
      - Some kernel symbols get misresolved.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      508be0df
  2. 06 5月, 2016 1 次提交
  3. 26 4月, 2016 1 次提交
    • K
      perf hists: Clear dummy entry accumulated period · 09623d79
      Kan Liang 提交于
      The accumulated period for dummy entry should also be 0.  Otherwise, the
      total overhead could be overcounted.
      
        $ perf record -e '{LLC-load-misses,cpu/instructions/}' --call-graph=lbr ./tchain
        $ perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 21K of event 'anon group { LLC-load-misses, cpu/instructions/ }'
        # Event count (approx.): 16313667937
        #
        #         Children              Self  Command      Shared Object     Symbol
        # ................  ................  ...........  ................  ............................
        #
          4769.98%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] update_fast_timekeeper
          4356.18%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] trigger_load_balance
          3181.12%   0.01%     0.00%   0.01%  tchain_edit  [kernel.vmlinux]  [k] irq_work_tick
          1592.37%   0.00%     0.00%   0.00%  tchain_edit  [kernel.vmlinux]  [k] cpu_needs_another_gp
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1461565689-5862-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      09623d79
  4. 15 4月, 2016 1 次提交
    • A
      perf callchain: Start moving away from global per thread cursors · 91d7b2de
      Arnaldo Carvalho de Melo 提交于
      The recent perf_evsel__fprintf_callchain() move to evsel.c added several
      new symbol requirements to the python binding, for instance:
      
        # perf test -v python
        16: Try 'import perf' in python, checking link problems      :
        --- start ---
        test child forked, pid 18030
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
        callchain_cursor
        test child finished with -1
        ---- end ----
        Try 'import perf' in python, checking link problems: FAILED!
        #
      
      This would require linking against callchain.c to access to the global
      callchain_cursor variables.
      
      Since lots of functions already receive as a parameter a
      callchain_cursor struct pointer, make that be the case for some more
      function so that we can start phasing out usage of yet another global
      variable.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      91d7b2de
  5. 30 3月, 2016 1 次提交
  6. 23 3月, 2016 1 次提交
  7. 11 3月, 2016 2 次提交
    • N
      perf tools: Recalc total periods using top-level entries in hierarchy · f7fb538a
      Namhyung Kim 提交于
      When hierarchy mode is enabled, each entry in a hierarchy level shares
      the period.  IOW an upper level entry's period is the sum of lower level
      entries.  Thus perf uses only one of them to calculate the total period
      of hists.  It was lowest-level (leaf) entries but it has a problem when
      it comes to filters.
      
      If a filter is applied, entries in the same level will be filtered or
      not.  But upper level entries still have period of their sum including
      filtered one.  So total sum of upper level entries will not be same as
      sum of lower level entries.
      
      This resulted in entries having more than 100% of overhead and it can be
      produced using perf top with filter(s).
      Reported-and-Tested-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1457531222-18130-8-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f7fb538a
    • N
      perf tools: Fix command line filters in hierarchy mode · aec13a7e
      Namhyung Kim 提交于
      When a command-line filter is applied in hierarchy mode, output is
      broken especially when filtering on lower level.  The higher level
      entries doesn't show up so it's hard to see the results.
      
      Also it needs to handle multi sort keys in a single hierarchy level.
      
      Before:
      
        $ perf report --hierarchy -s 'cpu,{dso,comm}' --comms swapper --stdio
        ...
        #    Overhead  CPU / Shared Object+Command
        # ...........  ...........................
        #
               13.79%     [kernel.vmlinux]  swapper
            31.71%     000
               13.80%     [kernel.vmlinux]  swapper
                0.43%     [e1000e]          swapper
               11.89%     [kernel.vmlinux]  swapper
                9.18%     [kernel.vmlinux]  swapper
      
      After:
      
        #    Overhead  CPU / Shared Object+Command
        # ...........  ...............................
        #
            33.09%     003
               13.79%     [kernel.vmlinux]  swapper
            31.71%     000
               13.80%     [kernel.vmlinux]  swapper
                0.43%     [e1000e]          swapper
            21.90%     002
               11.89%     [kernel.vmlinux]  swapper
            13.30%     001
                9.18%     [kernel.vmlinux]  swapper
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1457531222-18130-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      aec13a7e
  8. 08 3月, 2016 2 次提交
  9. 27 2月, 2016 3 次提交
    • N
      perf report: Update column width of dynamic entries · abab5e7f
      Namhyung Kim 提交于
      The column width of dynamic entries is updated when comparing hist
      entries.  However some unique entries can miss the chance to update.  So
      move the update to output resort stage to make sure every entry will get
      called before display.
      
      To do that, abuse ->sort callback to update the width when the third
      argument is NULL.  When resorting entries in normal path, it never be
      NULL so it should be fine IMHO.
      
      Before:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP   <-- here
      
      After:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  .....................................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      abab5e7f
    • N
      perf hists: Fix dynamic entry display in hierarchy · e049d4a3
      Namhyung Kim 提交于
      When dynamic sort key is used it might not show pretty printed output.
      This is because the trace output was not set only for the first dynamic
      sort key.  During hierarchy_insert_entry() it missed to pass the
      trace_output to dynamic entries.  Also even if it did, only first entry
      will have it.  Subsequent entries might set it during collapsing stage
      but it's not guaranteed.
      
      Before:
      
        $ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        66080
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        66080
                2.08%        512
                   2.08%        67280
      
      After:
      
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e049d4a3
    • N
      perf hists: Fix comparing of dynamic entries · 84b6ee8e
      Namhyung Kim 提交于
      When hist_entry__cmp() and hist_entry__collapse() are called, they
      should check if the dynamic entry is comparing matching hists only.
      
      Otherwise it might access different hists resulting in incorrect output.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      84b6ee8e
  10. 26 2月, 2016 2 次提交
  11. 25 2月, 2016 6 次提交
  12. 22 2月, 2016 1 次提交
  13. 20 2月, 2016 2 次提交
  14. 12 2月, 2016 1 次提交
    • A
      perf hists: Do column alignment on the format iterator · 89fee709
      Arnaldo Carvalho de Melo 提交于
      We were doing column alignment in the format function for each cell,
      returning a string padded with spaces so that when the next column is
      printed the cursor is at its column alignment.
      
      This ends up needlessly printing trailing spaces, do it at the format
      iterator, that is where we know if it is needed, i.e. if there is more
      columns to be printed.
      
      This eliminates the need for triming lines when doing a dump using 'P'
      in the TUI browser and also produces far saner results with things like
      piping 'perf report' to 'less'.
      
      Right now only the formatters for sym->name and the 'locked' column
      (perf mem report), that are the ones that end up at the end of lines
      in the default 'perf report', 'perf top' and 'perf mem report' tools,
      the others will be done in a subsequent patch.
      
      In the end the 'width' parameter for the formatters now mean, in
      'printf' terms, the 'precision', where before it was the field 'width'.
      Reported-by: NDave Jones <davej@codemonkey.org.uk>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      89fee709
  15. 03 2月, 2016 6 次提交
  16. 02 2月, 2016 2 次提交
  17. 26 1月, 2016 3 次提交
  18. 08 1月, 2016 2 次提交
  19. 07 1月, 2016 2 次提交
    • N
      perf tools: Skip dynamic fields not defined for current event · 361459f1
      Namhyung Kim 提交于
      When there are multiple events, each dynamic sort key is defined just
      for one event.  In this case other events will always show "N/A" for
      those fields.  But they are meaningless and consume precious screen
      width.
      
      Let's skip those undefined dynamic fields.
      
        $ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1
      
        $ perf report -s 'comm,kmalloc.*' --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 20K of event 'kmem:kmalloc'
        # Event count (approx.): 20533
        #
        # Overhead  Command           call_site                 ptr  bytes_req  bytes_alloc            gfp_flags
        # ........  .......  ..................  ..................  .........  ...........  ...................
        #
            99.89%  perf       ffffffffa01d4396  0xffff8803ffb79720         96           96    GFP_NOFS|GFP_ZERO
             0.06%  sleep      ffffffff8114e1cd  0xffff8803d228a000       4096         4096           GFP_KERNEL
             0.03%  perf       ffffffff811d6ae6  0xffff8803f7678f00        240          256  GFP_KERNEL|GFP_ZERO
             0.00%  perf       ffffffff812263c1  0xffff880406172380        128          128           GFP_KERNEL
             0.00%  perf       ffffffff812264b9  0xffff8803ffac1600        504          512           GFP_KERNEL
             0.00%  perf       ffffffff81226634  0xffff880401dc5280         28           32           GFP_KERNEL
             0.00%  sleep      ffffffff81226da9  0xffff8803ffac3a00        392          512           GFP_KERNEL
      
        # Samples: 20K of event 'kmem:kfree'
        # Event count (approx.): 20597
        #
        # Overhead  Command
        # ........  ..............
        #
            99.63%  perf
             0.14%  sleep
             0.11%  irq/36-iwlwifi
             0.11%  kworker/u16:0
             0.01%  Xorg
             0.00%  firefox
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-12-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      361459f1
    • N
      perf tools: Try to show pretty printed output for dynamic sort keys · 60517d28
      Namhyung Kim 提交于
      Each tracepoint event has format string for print to improve
      readability.  Try to parse the output and match the field name.  If it
      finds one, use that for the result.  If not, fallbacks to the original
      output.
      
      For example, sort on kmem:kmalloc.gfp_flags looks like below:
      (Note: libtraceevent plugins are not installed on my system.  They might
      affect the output below)
      
      Before:
        # Overhead  Command   gfp_flags
        # ........  .......  ..........
        #
            99.89%  perf          32848
             0.06%  sleep           208
             0.03%  perf          32976
             0.01%  perf            208
      
      After:
        # Overhead  Command            gfp_flags
        # ........  .......  ...................
        #
            99.89%  perf       GFP_NOFS|GFP_ZERO
             0.06%  sleep             GFP_KERNEL
             0.03%  perf     GFP_KERNEL|GFP_ZERO
             0.01%  perf              GFP_KERNEL
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1450804030-29193-7-git-send-email-namhyung@kernel.org
      [ Fixed clash with earlier, updated patch in this patchkit ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      60517d28