1. 20 4月, 2017 4 次提交
  2. 27 3月, 2017 1 次提交
    • M
      perf report: Enable sorting by srcline as key · 5dfa210e
      Milian Wolff 提交于
      Often it is interesting to know how costly a given source line is in
      total. Previously, one had to build these sums manually based on all
      addresses that pointed to the same source line. This patch introduces
      srcline as a sort key, which will do the aggregation for us.
      
      Paired with the recent addition of showing inline frames, this makes
      perf report much more useful for many C++ work loads.
      
      The following shows the new feature in action. First, let's show the
      status quo output when we sort by address. The result contains many hist
      entries that generate the same output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g address
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--60.31%--hypot +20
                  |          |          |
                  |          |          |--8.52%--__hypot_finite +273
                  |          |          |
                  |          |          |--7.32%--__hypot_finite +411
      ...
                   --35.34%--_start +4194346
                             __libc_start_main +241
                             |
                             |--6.65%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--2.70%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--1.69%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      
      With this patch and `-g srcline` we instead get the following output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g srcline
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--64.02%--hypot
                  |          |          |
                  |          |           --59.81%--__hypot_finite
                  |          |
                  |           --0.53%--cabs
                  |
                   --35.34%--_start
                             __libc_start_main
                             |
                             |--12.48%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5dfa210e
  3. 15 3月, 2017 1 次提交
    • H
      perf tools: Add 'cgroup_id' sort order keyword · d890a98c
      Hari Bathini 提交于
      This patch introduces a cgroup identifier entry field in perf report to
      identify or distinguish data of different cgroups. It uses the device
      number and inode number of cgroup namespace, included in perf data with
      the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
      
      With the assumption that each container is created with it's own cgroup
      namespace,  this allows assessment/analysis of multiple containers at
      once.
      
      A simple test for this would be to clone a few processes passing
      SIGCHILD & CLONE_NEWCROUP flags to each of them, execute shell and run
      different workloads  on each of those contexts,  while running perf
      record command with --namespaces option.
      
      Shown below is the output of perf report, sorted with cgroup identifier,
      on perf.data generated with the above test scenario, clearly indicating
      one context's considerable use of kernel memory in comparison with
      others:
      
      	$ perf report -s cgroup_id,sample --stdio
      	#
      	# Total Lost Samples: 0
      	#
      	# Samples: 5K of event 'kmem:kmalloc'
      	# Event count (approx.): 5965
      	#
      	# Overhead  cgroup id (dev/inode)       Samples
      	# ........  .....................  ............
      	#
      	    81.27%  3/0xeffffffb                   4848
      	    16.24%  3/0xf00000d0                    969
      	     1.16%  3/0xf00000ce                     69
      	     0.82%  3/0xf00000cf                     49
      	     0.50%  0/0x0                            30
      
      While this is a start, there is further scope of improving this. For
      example, instead of cgroup namespace's device and inode numbers, dev
      and inode numbers of some or all namespaces may be used to distinguish
      which processes are running in a given container context.
      
      Also, scripts to map device and inode info to containers sounds
      plausible for better tracing of containers.
      Signed-off-by: NHari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d890a98c
  4. 13 3月, 2017 1 次提交
  5. 04 3月, 2017 1 次提交
    • C
      perf tools: Allow sorting by symbol size · 7768f8da
      Charles Baylis 提交于
      Add new sort key 'symbol_size' to allow user to sort by symbol size, or
      (more usefully) display the symbol size using --fields=...,symbol_size.
      
      Committer note:
      
      Testing it together with the recently added -q, to remove the headers,
      and using the '+' sign with -s, to add the symbol_size sort order to
      the default, which is '-s/--sort comm,dso,symbol':
      
        # perf report -q -s +symbol_size | head -10
        10.39%  swapper       [kernel.vmlinux] [k] intel_idle               270
         3.45%  swapper       [kernel.vmlinux] [k] update_blocked_averages 1546
         2.61%  swapper       [kernel.vmlinux] [k] update_load_avg         1292
         2.36%  swapper       [kernel.vmlinux] [k] update_cfs_shares        240
         1.83%  swapper       [kernel.vmlinux] [k] __hrtimer_run_queues     606
         1.74%  swapper       [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
         1.66%  swapper       [kernel.vmlinux] [k] apic_timer_interrupt     152
         1.60%  CPU 0/KVM     [kvm]            [k] kvm_set_msr_common      3046
         1.60%  gnome-shell   libglib-2.0.so.0 [.] g_slist_find              37
         1.46%  gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup      370
        #
      Signed-off-by: NCharles Baylis <charles.baylis@linaro.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
      [ Use symbol__size(), remove needless %lld + (long long) casting ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7768f8da
  6. 20 2月, 2017 1 次提交
  7. 20 10月, 2016 1 次提交
  8. 23 9月, 2016 3 次提交
  9. 24 8月, 2016 3 次提交
  10. 09 8月, 2016 1 次提交
  11. 15 7月, 2016 1 次提交
  12. 23 6月, 2016 1 次提交
  13. 22 6月, 2016 1 次提交
    • J
      perf hists: Enlarge pid sort entry size · 89c7cb2c
      Jiri Olsa 提交于
      The pid sort entry currently aligns pids with 5 digits, which is not
      enough for current 4 million pids limit.
      
      This leads to unaligned ':' header-data output when we display 7 digits
      pid:
      
        # Children      Self  Symbol                    Pid:Command
        # ........  ........  ......................  .....................
        #
             0.12%     0.12%  [.] 0x0000000000147e0f  2052894:krava
        ...
      
      Adding 2 more digit to properly align the pid limit:
      
        # Children      Self  Symbol                      Pid:Command
        # ........  ........  ......................  .......................
        #
             0.12%     0.12%  [.] 0x0000000000147e0f  2052894:krava
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1466459899-1166-9-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      89c7cb2c
  14. 15 6月, 2016 3 次提交
  15. 23 5月, 2016 1 次提交
    • A
      perf report: Add srcline_from/to branch sort keys · 508be0df
      Andi Kleen 提交于
      Add "srcline_from" and "srcline_to" branch sort keys that allow to show
      the source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate it with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
      
      Open issues:
      
      - Some kernel symbols get misresolved.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      508be0df
  16. 11 5月, 2016 1 次提交
    • N
      perf diff: Fix duplicated output column · e9d848cb
      Namhyung Kim 提交于
      The commit b97511c5 ("perf tools: Add overhead/overhead_children
      keys defaults via string") moved initialization of column headers but it
      missed to check the sort__mode.  As 'perf diff' doesn't call
      perf_hpp__init(), the setup_overhead() also should not be called.
      
      Before:
      
        # Baseline    Delta  Children  Overhead  Shared Object        Symbol
        # ........  .......  ........  ........  ...................  .......................
        #
            28.48%  -28.47%    28.48%    28.48%  [kernel.vmlinux ]    [k] intel_idle
            11.51%  -11.47%    11.51%    11.51%  libxul.so            [.] 0x0000000001a360f7
             3.49%   -3.49%     3.49%     3.49%  [kernel.vmlinux]     [k] generic_exec_single
             2.91%   -2.89%     2.91%     2.91%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
             2.86%   -2.85%     2.86%     2.86%  libxcb.so.1.1.0      [.] 0x000000000000c890
             2.44%   -2.39%     2.44%     2.44%  [kernel.vmlinux]     [k] perf_event_aux_ctx
      
      After:
      
        # Baseline    Delta  Shared Object        Symbol
        # ........  .......  ...................  .......................
        #
            28.48%  -28.47%  [kernel.vmlinux]     [k] intel_idle
            11.51%  -11.47%  libxul.so            [.] 0x0000000001a360f7
             3.49%   -3.49%  [kernel.vmlinux]     [k] generic_exec_single
             2.91%   -2.89%  libdbus-1.so.3.8.11  [.] 0x000000000000cdc2
             2.86%   -2.85%  libxcb.so.1.1.0      [.] 0x000000000000c890
             2.44%   -2.39%  [kernel.vmlinux]     [k] perf_event_aux_ctx
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: <stable@vger.kernel.org> # 4.5+
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: b97511c5 ("perf tools: Add overhead/overhead_children keys defaults via string")
      Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      e9d848cb
  17. 06 5月, 2016 7 次提交
  18. 23 3月, 2016 1 次提交
  19. 11 3月, 2016 4 次提交
  20. 09 3月, 2016 1 次提交
  21. 08 3月, 2016 2 次提交
    • N
      perf hists: Support multiple sort keys in a hierarchy level · a23f37e8
      Namhyung Kim 提交于
      This implements having multiple sort keys in a single hierarchy level.
      Originally only single sort key is supported for each level, but now
      using the group syntax with '{ }', it can set more than one sort key in
      one level.  Note that now it needs to quote in order to prevent shell
      interpretation.
      
      For example:
      
        $ perf report --hierarchy -s '{comm,dso},sym'
        ...
        #       Overhead  Command / Shared Object / Symbol
        # ..............  ..........................................
        #
            48.67%        swapper          [kernel.vmlinux]
               34.42%        [k] intel_idle
                1.30%        [k] __tick_nohz_idle_enter
                1.03%        [k] cpuidle_reflect
             8.87%        firefox          libpthread-2.22.so
                6.60%        [.] __GI___libc_recvmsg
                1.18%        [.] pthread_cond_signal@@GLIBC_2.3.2
                1.09%        [.] 0x000000000000ff4b
             6.11%        Xorg             libc-2.22.so
                5.27%        [.] __memcpy_sse2_unaligned
      
      In the above example, the command name and the shared object name are
      shown on the same line but the symbol name is on the different line.
      Since the first two are grouped by '{}', they are in the same level.
      
      Suggested-and-Tested=by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1457361308-514-4-git-send-email-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a23f37e8
    • N
      perf hists: Introduce perf_hpp__setup_hists_formats() · c3bc0c43
      Namhyung Kim 提交于
      The perf_hpp__setup_hists_formats() is to build hists-specific output
      formats (and sort keys).  Currently it's only used in order to build the
      output format in a hierarchy with same sort keys, but it could be used
      with different sort keys in non-hierarchy mode later.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1457361308-514-2-git-send-email-namhyung@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c3bc0c43