1. 15 Mar, 2017 1 commit
    • perf tools: Add 'cgroup_id' sort order keyword · d890a98c
      Hari Bathini committed
      This patch introduces a cgroup identifier entry field in perf report to
      identify or distinguish data of different cgroups. It uses the device
      number and inode number of cgroup namespace, included in perf data with
      the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
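
A minimal sketch of how such a device/inode pair can be read from userspace and rendered in the `dev/inode` form shown in the report column. This is an illustration in Python, not perf's own C code; the helper names `format_cgroup_id` and `cgroup_ns_id` are hypothetical, and the device number seen through `stat()` is only assumed to correspond to the one perf records.

```python
import os


def format_cgroup_id(dev: int, ino: int) -> str:
    """Render an identifier as 'dev/inode', with the inode in hex."""
    return f"{dev}/{ino:#x}"


def cgroup_ns_id(pid: str = "self") -> str:
    """Stat a task's cgroup namespace link (Linux only) and format the result."""
    st = os.stat(f"/proc/{pid}/ns/cgroup")
    return format_cgroup_id(st.st_dev, st.st_ino)


if __name__ == "__main__":
    # Only attempt the lookup where cgroup namespaces are exposed in procfs.
    if os.path.exists("/proc/self/ns/cgroup"):
        print(cgroup_ns_id())
```

Two tasks that print the same value here share a cgroup namespace, which is the property the sort key relies on.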
      
      Assuming that each container is created with its own cgroup
      namespace, this allows assessment/analysis of multiple containers at
      once.
      
      A simple test for this would be to clone a few processes, passing the
      SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute a shell and run
      different workloads in each of those contexts, while running the perf
      record command with the --namespaces option.
      
      Shown below is the output of perf report, sorted with cgroup identifier,
      on perf.data generated with the above test scenario, clearly indicating
      one context's considerable use of kernel memory in comparison with
      others:
      
      	$ perf report -s cgroup_id,sample --stdio
      	#
      	# Total Lost Samples: 0
      	#
      	# Samples: 5K of event 'kmem:kmalloc'
      	# Event count (approx.): 5965
      	#
      	# Overhead  cgroup id (dev/inode)       Samples
      	# ........  .....................  ............
      	#
      	    81.27%  3/0xeffffffb                   4848
      	    16.24%  3/0xf00000d0                    969
      	     1.16%  3/0xf00000ce                     69
      	     0.82%  3/0xf00000cf                     49
      	     0.50%  0/0x0                            30
      
      While this is a start, there is scope for further improvement. For
      example, instead of the cgroup namespace's device and inode numbers
      alone, the device and inode numbers of some or all namespaces may be
      used to distinguish which processes are running in a given container
      context.
      
      Also, scripts to map device and inode info to containers sound
      plausible for better tracing of containers.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 13 Mar, 2017 1 commit
  3. 04 Mar, 2017 1 commit
    • perf tools: Allow sorting by symbol size · 7768f8da
      Charles Baylis committed
      Add new sort key 'symbol_size' to allow user to sort by symbol size, or
      (more usefully) display the symbol size using --fields=...,symbol_size.
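
A minimal sketch of the idea behind the sort key, assuming a symbol is described by a start address and an exclusive end address, so that its size is the difference of the two. It is written in Python for illustration rather than in perf's C; the `Symbol` class and the addresses are hypothetical, with sizes borrowed from the sample output in the committer note.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    start: int  # first address covered by the symbol
    end: int    # one past the last address (assumed exclusive)


def symbol_size(sym: Symbol) -> int:
    """Size in bytes, derived from the address range."""
    return sym.end - sym.start


# Hypothetical addresses; only the resulting sizes mirror the report.
syms = [
    Symbol("intel_idle", 0x1000, 0x1000 + 270),
    Symbol("g_slist_find", 0x2000, 0x2000 + 37),
    Symbol("kvm_set_msr_common", 0x3000, 0x3000 + 3046),
]

# Sorting by size, largest first, as a size-based sort key would.
by_size = sorted(syms, key=symbol_size, reverse=True)
for sym in by_size:
    print(f"{sym.name:<20} {symbol_size(sym):>5}")
```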
      
      Committer note:
      
      Testing it together with the recently added -q, to remove the headers,
      and using the '+' sign with -s, to add the symbol_size sort order to
      the default, which is '-s/--sort comm,dso,symbol':
      
        # perf report -q -s +symbol_size | head -10
        10.39%  swapper       [kernel.vmlinux] [k] intel_idle               270
         3.45%  swapper       [kernel.vmlinux] [k] update_blocked_averages 1546
         2.61%  swapper       [kernel.vmlinux] [k] update_load_avg         1292
         2.36%  swapper       [kernel.vmlinux] [k] update_cfs_shares        240
         1.83%  swapper       [kernel.vmlinux] [k] __hrtimer_run_queues     606
         1.74%  swapper       [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
         1.66%  swapper       [kernel.vmlinux] [k] apic_timer_interrupt     152
         1.60%  CPU 0/KVM     [kvm]            [k] kvm_set_msr_common      3046
         1.60%  gnome-shell   libglib-2.0.so.0 [.] g_slist_find              37
         1.46%  gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup      370
        #
      Signed-off-by: Charles Baylis <charles.baylis@linaro.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
      [ Use symbol__size(), remove needless %lld + (long long) casting ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 20 Feb, 2017 1 commit
  5. 20 Oct, 2016 1 commit
  6. 23 Sep, 2016 3 commits
  7. 24 Aug, 2016 3 commits
  8. 09 Aug, 2016 1 commit
  9. 15 Jul, 2016 1 commit
  10. 23 Jun, 2016 1 commit
  11. 22 Jun, 2016 1 commit
    • perf hists: Enlarge pid sort entry size · 89c7cb2c
      Jiri Olsa committed
      The pid sort entry currently aligns pids to 5 digits, which is not
      enough for the current limit of 4 million pids.
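
The arithmetic behind the width change can be sketched as follows, assuming the "4 million" limit means 4 * 1024 * 1024. This is a Python illustration, not perf's C formatting code; `pid_column_width` and `format_entry` are hypothetical helper names.

```python
PID_MAX_LIMIT = 4 * 1024 * 1024  # assumed value of the "4 million pids" limit


def pid_column_width(max_pid: int) -> int:
    """Number of decimal digits needed to print the largest pid."""
    return len(str(max_pid))


def format_entry(pid: int, comm: str, width: int) -> str:
    """Right-align the pid in a field of the given width, then ':' and the command."""
    return f"{pid:>{width}}:{comm}"


if __name__ == "__main__":
    for width in (5, pid_column_width(PID_MAX_LIMIT)):
        print(f"width {width}:")
        print("  " + format_entry(42, "sh", width))
        print("  " + format_entry(2052894, "krava", width))
```

With a width of 5 the ':' of a 7-digit pid lands two columns to the right of the others; with a width of 7 every ':' lines up.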
      
      This leads to a misaligned ':' between header and data when a 7-digit
      pid is displayed:
      
        # Children      Self  Symbol                    Pid:Command
        # ........  ........  ......................  .....................
        #
             0.12%     0.12%  [.] 0x0000000000147e0f  2052894:krava
        ...
      
      Add 2 more digits so that pids up to the limit align properly:
      
        # Children      Self  Symbol                      Pid:Command
        # ........  ........  ......................  .......................
        #
             0.12%     0.12%  [.] 0x0000000000147e0f  2052894:krava
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1466459899-1166-9-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 15 Jun, 2016 3 commits
  13. 23 May, 2016 1 commit
    • perf report: Add srcline_from/to branch sort keys · 508be0df
      Andi Kleen committed
      Add "srcline_from" and "srcline_to" branch sort keys that show the
      source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate it with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
      
      Open issues:
      
      - Some kernel symbols get misresolved.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 11 May, 2016 1 commit
    • perf diff: Fix duplicated output column · e9d848cb
      Namhyung Kim committed
      Commit b97511c5 ("perf tools: Add overhead/overhead_children
      keys defaults via string") moved the initialization of column headers
      but failed to check sort__mode.  As 'perf diff' doesn't call
      perf_hpp__init(), setup_overhead() should not be called either.
      
      Before:
      
        # Baseline    Delta  Children  Overhead  Shared Object        Symbol
        # ........  .......  ........  ........  ...................  .......................
        #
            28.48%  -28.47%    28.48%    28.48%  [kernel.vmlinux ]    [k] intel_idle
            11.51%  -11.47%    11.51%    11.51%  libxul.so            [.] 0x0000000001a360f7
             3.49%   -3.49%     3.49%     3.49%  [kernel.vmlinux]     [k] generic_exec_single
             2.91%   -2.89%     2.91%     2.91%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
             2.86%   -2.85%     2.86%     2.86%  libxcb.so.1.1.0      [.] 0x000000000000c890
             2.44%   -2.39%     2.44%     2.44%  [kernel.vmlinux]     [k] perf_event_aux_ctx
      
      After:
      
        # Baseline    Delta  Shared Object        Symbol
        # ........  .......  ...................  .......................
        #
            28.48%  -28.47%  [kernel.vmlinux]     [k] intel_idle
            11.51%  -11.47%  libxul.so            [.] 0x0000000001a360f7
             3.49%   -3.49%  [kernel.vmlinux]     [k] generic_exec_single
             2.91%   -2.89%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
             2.86%   -2.85%  libxcb.so.1.1.0      [.] 0x000000000000c890
             2.44%   -2.39%  [kernel.vmlinux]     [k] perf_event_aux_ctx
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: <stable@vger.kernel.org> # 4.5+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: b97511c5 ("perf tools: Add overhead/overhead_children keys defaults via string")
      Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. 06 May, 2016 7 commits
  16. 23 Mar, 2016 1 commit
  17. 11 Mar, 2016 4 commits
  18. 09 Mar, 2016 1 commit
  19. 08 Mar, 2016 3 commits
  20. 03 Mar, 2016 1 commit
    • perf test: Fix hists related entries · 9b240637
      Arnaldo Carvalho de Melo committed
      That got broken by d3a72fd8 ("perf report: Fix indentation of
      dynamic entries in hierarchy"), by using the evlist in setup_sorting()
      without checking if it is NULL, as done in some 'perf test' entries:
      
        $ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);'
        tools/perf/tests/hists_output.c:      setup_sorting(NULL);
        tools/perf/tests/hists_output.c:      setup_sorting(NULL);
        tools/perf/tests/hists_output.c:      setup_sorting(NULL);
        tools/perf/tests/hists_output.c:      setup_sorting(NULL);
        tools/perf/tests/hists_output.c:      setup_sorting(NULL);
        tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
        tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
        tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
        tools/perf/tests/hists_cumulate.c:    setup_sorting(NULL);
        $
      
      Fix it.
      
      Before:
      
        [root@jouet ~]# perf test
        <SNIP>
        15: Test matching and linking multiple hists                 : FAILED!
        16: Try 'import perf' in python, checking link problems      : Ok
        17: Test breakpoint overflow signal handler                  : Ok
        18: Test breakpoint overflow sampling                        : Ok
        19: Test number of exit event of a simple workload           : Ok
        20: Test software clock events have valid period values      : Ok
        21: Test object code reading                                 : Ok
        22: Test sample parsing                                      : Ok
        23: Test using a dummy software event to keep tracking       : Ok
        24: Test parsing with no sample_id_all bit set               : Ok
        25: Test filtering hist entries                              : FAILED!
        26: Test mmap thread lookup                                  : Ok
        27: Test thread mg sharing                                   : Ok
        28: Test output sorting of hist entries                      : FAILED!
        29: Test cumulation of child hist entries                    : FAILED!
        <SNIP>
      
      After the patch the above failed tests complete successfully.
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: d3a72fd8 ("perf report: Fix indentation of dynamic entries in hierarchy")
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  21. 27 Feb, 2016 3 commits
    • perf report: Update column width of dynamic entries · abab5e7f
      Namhyung Kim committed
      The column width of dynamic entries is updated when comparing hist
      entries.  However, some unique entries can miss the chance to update
      it.  So move the update to the output resort stage to make sure every
      entry is visited before display.
      
      To do that, abuse the ->sort callback to update the width when the
      third argument is NULL.  When resorting entries in the normal path, it
      is never NULL, so this should be fine IMHO.
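
The effect of visiting every entry before display can be sketched as below. This is a simplified Python illustration, not the ->sort callback mechanism the commit uses; `column_width` and `render` are hypothetical helpers, and the values are taken from the sample output of this commit.

```python
def column_width(header, values):
    """Width that fits the header and every value, not just the compared ones."""
    return max([len(header)] + [len(v) for v in values])


def render(header, values):
    """Pad the header, a dotted rule, and each value to one common width."""
    width = column_width(header, values)
    lines = [header.ljust(width), "." * width]
    lines += [v.ljust(width) for v in values]
    return lines


gfp_values = [
    "GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC",
    "GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC",
]

if __name__ == "__main__":
    for line in render("gfp_flags", gfp_values):
        print(line)
```

Because the width is derived from all values in one pass, the longest flag string is never truncated.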
      
      Before:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP   <-- here
      
      After:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  .....................................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists: Fix dynamic entry display in hierarchy · e049d4a3
      Namhyung Kim committed
      When a dynamic sort key is used, the output might not be
      pretty-printed.  This is because the trace output was not set
      reliably for dynamic sort keys: hierarchy_insert_entry() failed to
      pass trace_output to dynamic entries, and even if it had, only the
      first entry would have it.  Subsequent entries might set it during the
      collapsing stage, but that is not guaranteed.
      
      Before:
      
        $ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        66080
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        66080
                2.08%        512
                   2.08%        67280
      
      After:
      
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix indentation of dynamic entries in hierarchy · d3a72fd8
      Namhyung Kim committed
      When dynamic entries are used in hierarchy mode with multiple
      events, the output might not be aligned properly.  In hierarchy
      mode, each sort column is indented using the total number of sort keys,
      so perf keeps track of the number of sort keys when adding them.  However,
      a dynamic sort key can be added more than once when multiple events have
      the same field names.  This results in unnecessarily long indentation in
      the output.
      
      For example, perf kmem records the following events:
      
        $ perf evlist --trace-fields -i perf.data.kmem
        kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kfree: trace_fields: call_site,ptr
        kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kmem_cache_free: trace_fields: call_site,ptr
        kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
        kmem:mm_page_free: trace_fields: page,order
      
      As you can see, many field names are shared between kmem events.  So
      adding the 'ptr' dynamic sort key alone will set nr_sort_keys to 6,
      which adds many unnecessary spaces between columns.
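
The count of 6 can be reproduced from the event list above with a small sketch. This is Python for illustration only, not perf's C code; the function names are hypothetical, and counting each key name once is an assumption about the intended result rather than a description of how the fix is implemented.

```python
# Trace fields per event, copied from the 'perf evlist --trace-fields' output.
TRACE_FIELDS = {
    "kmem:kmalloc": ["call_site", "ptr", "bytes_req", "bytes_alloc", "gfp_flags"],
    "kmem:kmalloc_node": ["call_site", "ptr", "bytes_req", "bytes_alloc", "gfp_flags", "node"],
    "kmem:kfree": ["call_site", "ptr"],
    "kmem:kmem_cache_alloc": ["call_site", "ptr", "bytes_req", "bytes_alloc", "gfp_flags"],
    "kmem:kmem_cache_alloc_node": ["call_site", "ptr", "bytes_req", "bytes_alloc", "gfp_flags", "node"],
    "kmem:kmem_cache_free": ["call_site", "ptr"],
    "kmem:mm_page_alloc": ["page", "order", "gfp_flags", "migratetype"],
    "kmem:mm_page_free": ["page", "order"],
}


def per_event_key_count(keys):
    """Count a key once for every event that has a field of that name."""
    return sum(1 for key in keys for fields in TRACE_FIELDS.values() if key in fields)


def unique_key_count(keys):
    """Count each key name once, however many events share it."""
    return sum(1 for key in dict.fromkeys(keys)
               if any(key in fields for fields in TRACE_FIELDS.values()))


if __name__ == "__main__":
    print(per_event_key_count(["ptr"]), unique_key_count(["ptr"]))
```

Six of the eight events carry a `ptr` field, which is where the inflated indentation in the "Before" output comes from.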
      
      Before:
      
        $ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
        ...
        #                Overhead                 ptr
        # .......................  ...................................
        #
            99.89%                 0xffff8803ffb79720
             0.06%                 0xffff8803d228a000
             0.03%                 0xffff8803f7678f00
             0.00%                 0xffff880401dc5280
             0.00%                 0xffff880406172380
             0.00%                 0xffff8803ffac3a00
             0.00%                 0xffff8803ffac1600
      
      After:
      
        # Overhead                 ptr
        # ........  ....................
        #
            99.89%  0xffff8803ffb79720
             0.06%  0xffff8803d228a000
             0.03%  0xffff8803f7678f00
             0.00%  0xffff880401dc5280
             0.00%  0xffff880406172380
             0.00%  0xffff8803ffac3a00
             0.00%  0xffff8803ffac1600
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>