1. 04 May, 2017 1 commit
  2. 13 Apr, 2017 1 commit
    • perf trace: Add usage of --no-syscalls in man page · 739cf305
      Ravi Bangoria committed
      perf trace supports --no-syscalls option but it's not listed in the man
      page. (Though, I see an example using --no-syscalls in EXAMPLES
      section.)
      
      Committer note:
      
      The --no-syscalls option tells 'perf trace' not to automagically ask for
      raw_syscalls:sys_{enter,exit} to then format it in a strace like way.
      
      This became more widely used as 'perf trace' got support for arbitrary
      events, such as tracepoints, so more and more we use:
      
        # perf trace --no-syscalls -e nmi:*
           0.000 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 36649 handled: 1)
           0.019 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 2907 handled: 0)
           0.676 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 9401 handled: 1)
           0.680 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 288 handled: 0)
           0.701 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 4977 handled: 1)
           0.703 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 67 handled: 0)
           0.736 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 8549 handled: 1)
        ^C#
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1492063332-5745-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 12 Apr, 2017 1 commit
  4. 27 Mar, 2017 3 commits
    • perf report: Enable sorting by srcline as key · 5dfa210e
      Milian Wolff committed
      Often it is interesting to know how costly a given source line is in
      total. Previously, one had to build these sums manually based on all
      addresses that pointed to the same source line. This patch introduces
      srcline as a sort key, which will do the aggregation for us.
      
      Paired with the recent addition of showing inline frames, this makes
      perf report much more useful for many C++ workloads.
      
      The following shows the new feature in action. First, let's show the
      status quo output when we sort by address. The result contains many hist
      entries that generate the same output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g address
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--60.31%--hypot +20
                  |          |          |
                  |          |          |--8.52%--__hypot_finite +273
                  |          |          |
                  |          |          |--7.32%--__hypot_finite +411
      ...
                   --35.34%--_start +4194346
                             __libc_start_main +241
                             |
                             |--6.65%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--2.70%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--1.69%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      
      With this patch and `-g srcline` we instead get the following output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g srcline
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--64.02%--hypot
                  |          |          |
                  |          |           --59.81%--__hypot_finite
                  |          |
                  |           --0.53%--cabs
                  |
                   --35.34%--_start
                             __libc_start_main
                             |
                             |--12.48%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
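The aggregation that the srcline sort key performs can be sketched in a few lines of Python. This is only an illustration of the idea described in the commit message (summing the cost of all addresses that map to the same source line), not perf's implementation; the addresses in the sample data are hypothetical, and the percentages are taken from the address-sorted output above.

```python
from collections import defaultdict


def aggregate_by_srcline(samples):
    """Sum the overhead of all addresses that resolve to the same source line.

    `samples` is an iterable of (address, srcline, percent) tuples.
    Returns (srcline, total_percent) pairs, most expensive first.
    """
    totals = defaultdict(float)
    for _address, srcline, percent in samples:
        totals[srcline] += percent
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical addresses; percentages copied from the "-g address" output.
SAMPLES = [
    (0x400A10, "random.tcc:3326", 6.65),
    (0x400A1C, "random.tcc:3326", 2.70),
    (0x400A2B, "random.tcc:3326", 1.69),
    (0x400B00, "complex:655", 64.55),
]

if __name__ == "__main__":
    for srcline, total in aggregate_by_srcline(SAMPLES):
        print(f"{total:6.2f}%  {srcline}")
```

With `-g srcline`, perf does this grouping itself, so the separate `random.tcc:3326` entries collapse into one.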
    • perf report: Introduce --inline option · f3a60646
      Jin Yao committed
      Looking up the inline stack for callgraph addresses takes some time, so
      this patch adds a new "--inline" option that lets the user decide
      whether to enable the feature.
      
        --inline:
      
        If a callgraph address belongs to an inlined function, the inline stack
        will be printed. Each entry is the inline function name or file/line.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Tested-by: Milian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list sdt: Show option in man page · 6963d3c3
      Ravi Bangoria committed
      Commit 40218dae ("perf list: Show SDT and pre-cached events") added
      sdt support in perf list, but did not update the documentation.
      
      Show sdt option in man perf-list.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20170327025538.1753-1-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 23 Mar, 2017 1 commit
  6. 22 Mar, 2017 1 commit
    • perf stat: Collapse identically named events · 430daf2d
      Andi Kleen committed
      The uncore PMU has a lot of duplicated PMUs for different subsystems.
      When expanding an uncore alias we usually end up with a large
      number of identically named aliases, which makes perf stat
      output difficult to read.
      
      Automatically sum them up in perf stat, unless --no-merge is specified.
      
      This can be default because only the uncores generally have duplicated
      aliases. Other PMUs have unique names.
      
      Before:
      
        % perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1
      
        Performance counter stats for 'system wide':
      
                 694,976 Bytes unc_c_llc_lookup.any
                 706,304 Bytes unc_c_llc_lookup.any
                 956,608 Bytes unc_c_llc_lookup.any
                 782,720 Bytes unc_c_llc_lookup.any
                 605,696 Bytes unc_c_llc_lookup.any
                 442,816 Bytes unc_c_llc_lookup.any
                 659,328 Bytes unc_c_llc_lookup.any
                 509,312 Bytes unc_c_llc_lookup.any
                 263,936 Bytes unc_c_llc_lookup.any
                 592,448 Bytes unc_c_llc_lookup.any
                 672,448 Bytes unc_c_llc_lookup.any
                 608,640 Bytes unc_c_llc_lookup.any
                 641,024 Bytes unc_c_llc_lookup.any
                 856,896 Bytes unc_c_llc_lookup.any
                 808,832 Bytes unc_c_llc_lookup.any
                 684,864 Bytes unc_c_llc_lookup.any
                 710,464 Bytes unc_c_llc_lookup.any
                 538,304 Bytes unc_c_llc_lookup.any
      
             1.002577660 seconds time elapsed
      
      After:
      
        % perf stat -a -e unc_c_llc_lookup.any sleep 1
      
        Performance counter stats for 'system wide':
      
               2,685,120 Bytes unc_c_llc_lookup.any
      
             1.002648032 seconds time elapsed
      
      v2: Split collect_aliases. Rename alias flag.
      v3: Make sure unsupported/not counted is always printed.
      v4: Factor out callback change into separate patch.
      v5: Move check for bad results here
          Move merged check into collect_data
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
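For readers who only have `--no-merge` style output at hand, the summing that this commit makes the default can be approximated after the fact. The following Python sketch assumes the plain-text layout shown above (a count with thousands separators, an optional unit, then the event name); `merge_counts` is a hypothetical helper, not part of perf.

```python
def merge_counts(lines):
    """Sum the counts of identically named events from perf stat text output.

    Lines that do not start with an integer count (headers, the elapsed-time
    line, blank lines) are skipped.
    """
    merged = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            value = int(parts[0].replace(",", ""))
        except ValueError:
            continue  # e.g. "1.002577660 seconds time elapsed"
        name = parts[-1]
        merged[name] = merged.get(name, 0) + value
    return merged


# First three counter lines of the "Before" output, plus non-counter lines.
SAMPLE = """\
 Performance counter stats for 'system wide':

         694,976 Bytes unc_c_llc_lookup.any
         706,304 Bytes unc_c_llc_lookup.any
         956,608 Bytes unc_c_llc_lookup.any

     1.002577660 seconds time elapsed
"""

if __name__ == "__main__":
    for name, total in merge_counts(SAMPLE.splitlines()).items():
        print(f"{total:,} {name}")
```

Note that the "Before" and "After" listings above come from two separate runs, so their totals are not expected to match.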
  7. 21 Mar, 2017 1 commit
  8. 16 Mar, 2017 1 commit
    • perf script: Add 'brstackinsn' for branch stacks · 48d02a1d
      Andi Kleen committed
      Implement printing instruction sequences as hex dump for branch stacks.
      
      This relies on the x86 instruction decoder used by the PT decoder to
      find the lengths of instructions to dump them individually.
      
      This is good enough for pattern matching.
      
      This allows studying hot paths for individual samples, together with
      branch misprediction and cycle count / IPC information if available (on
      Skylake systems).
      
        % perf record -b ...
        % perf script -F brstackinsn
        ...
          read_hpet+67:
                ffffffff9905b843        insn: 74 ea                     # PRED
                ffffffff9905b82f        insn: 85 c9
                ffffffff9905b831        insn: 74 12
                ffffffff9905b833        insn: f3 90
                ffffffff9905b835        insn: 48 8b 0f
                ffffffff9905b838        insn: 48 89 ca
                ffffffff9905b83b        insn: 48 c1 ea 20
                ffffffff9905b83f        insn: 39 f2
                ffffffff9905b841        insn: 89 d0
                ffffffff9905b843        insn: 74 ea                     # PRED
      
      Only works when no special branch filters are specified.
      
      Occasionally the path does not reach up to the sample IP, as the LBRs
      may be frozen before executing a final jump. In this case we print a
      special message.
      
      The instruction dumper piggy backs on the existing infrastructure from
      the IP PT decoder.
      
      An earlier iteration of this patch relied on a disassembler, but this
      version only uses the existing instruction decoder.
      
      Committer note:
      
      Added hint about how to get suitable perf.data files for use with
      '-F brstackinsn':
      
        $ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
        $
        $ perf script -F brstackinsn
        Display of branch stack assembler requested, but non all-branch filter set
        Hint: run 'perf record -b ...'
        $
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
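Because each dumped line carries an address and the raw instruction bytes, the output lends itself to simple post-processing. The Python sketch below assumes the `address  insn: bytes` layout shown in the commit message; `parse_insn_lines` and `find_discontinuities` are hypothetical helper names. It checks where an instruction is not followed by the next sequential address, which here marks the taken branch.

```python
import re

# Matches lines such as "ffffffff9905b843  insn: 74 ea   # PRED".
INSN_RE = re.compile(r"^\s*([0-9a-f]+)\s+insn:\s+((?:[0-9a-f]{2}\s?)+)")


def parse_insn_lines(text):
    """Return a list of (address, instruction_bytes) from brstackinsn output."""
    entries = []
    for line in text.splitlines():
        match = INSN_RE.match(line)
        if match:
            address = int(match.group(1), 16)
            raw = bytes.fromhex("".join(match.group(2).split()))
            entries.append((address, raw))
    return entries


def find_discontinuities(entries):
    """Indices i where entry i is not directly followed in memory by entry i+1."""
    return [
        i
        for i in range(len(entries) - 1)
        if entries[i][0] + len(entries[i][1]) != entries[i + 1][0]
    ]


SAMPLE = """\
  read_hpet+67:
        ffffffff9905b843        insn: 74 ea                     # PRED
        ffffffff9905b82f        insn: 85 c9
        ffffffff9905b831        insn: 74 12
        ffffffff9905b833        insn: f3 90
        ffffffff9905b835        insn: 48 8b 0f
        ffffffff9905b838        insn: 48 89 ca
        ffffffff9905b83b        insn: 48 c1 ea 20
        ffffffff9905b83f        insn: 39 f2
        ffffffff9905b841        insn: 89 d0
        ffffffff9905b843        insn: 74 ea                     # PRED
"""

if __name__ == "__main__":
    entries = parse_insn_lines(SAMPLE)
    print("instruction lengths:", [len(raw) for _addr, raw in entries])
    print("non-sequential after index:", find_discontinuities(entries))
```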
  9. 15 Mar, 2017 3 commits
    • perf sched timehist: Add --next option · 292c4a8f
      Brendan Gregg committed
      The --next option shows the next task for each context switch, providing
      more context for the sequence of scheduler events.
      
        $ perf sched timehist --next | head
        Samples do not have callchains.
             time  cpu task name  waittime schdelay run time
                       [tid/pid]     (msec) (msec) (msec)
        ---------- --- ---------- --------- ------ -----
        374.793792 [0] <idle>         0.000  0.000 0.000 next: rngd[1524]
        374.793801 [0] rngd[1524]     0.000  0.000 0.009 next: swapper/0[0]
        374.794048 [7] <idle>         0.000  0.000 0.000 next: yes[30884]
        374.794066 [7] yes[30884]     0.000  0.000 0.018 next: swapper/7[0]
        374.794126 [2] <idle>         0.000  0.000 0.000 next: rngd[1524]
        374.794140 [2] rngd[1524]     0.325  0.006 0.013 next: swapper/2[0]
        374.794281 [3] <idle>         0.000  0.000 0.000 next: perf[31070]
      Signed-off-by: Brendan Gregg <bgregg@netflix.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1489456589-32555-1-git-send-email-bgregg@netflix.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add 'cgroup_id' sort order keyword · d890a98c
      Hari Bathini committed
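The `next:` column makes it straightforward to reconstruct task-to-task transitions from the text output. The Python sketch below assumes the column layout shown in the sample above and task names without embedded spaces; `parse_timehist_next` is a hypothetical helper, not part of perf.

```python
import re

# time, [cpu], task[tid], wait, sched delay, run time, "next: task[tid]"
LINE_RE = re.compile(
    r"^\s*(?P<time>\d+\.\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<task>\S+)\s+"
    r"(?P<wait>\d+\.\d+)\s+(?P<delay>\d+\.\d+)\s+(?P<run>\d+\.\d+)\s+"
    r"next:\s+(?P<next>\S+)\s*$"
)


def parse_timehist_next(text):
    """Parse 'perf sched timehist --next' rows into dictionaries."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # header, separator, or unrelated line
        records.append(
            {
                "time": float(match.group("time")),
                "cpu": int(match.group("cpu")),
                "task": match.group("task"),
                "run_ms": float(match.group("run")),
                "next": match.group("next"),
            }
        )
    return records


SAMPLE = """\
     time  cpu task name  waittime schdelay run time
               [tid/pid]     (msec) (msec) (msec)
---------- --- ---------- --------- ------ -----
374.793792 [0] <idle>         0.000  0.000 0.000 next: rngd[1524]
374.793801 [0] rngd[1524]     0.000  0.000 0.009 next: swapper/0[0]
374.794048 [7] <idle>         0.000  0.000 0.000 next: yes[30884]
"""

if __name__ == "__main__":
    for rec in parse_timehist_next(SAMPLE):
        print(f"cpu {rec['cpu']}: {rec['task']} -> {rec['next']}")
```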
      This patch introduces a cgroup identifier entry field in perf report to
      identify or distinguish data of different cgroups. It uses the device
      number and inode number of cgroup namespace, included in perf data with
      the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
      
      With the assumption that each container is created with its own cgroup
      namespace, this allows assessment/analysis of multiple containers at
      once.
      
      A simple test for this would be to clone a few processes passing
      SIGCHLD & CLONE_NEWCGROUP flags to each of them, execute shell and run
      different workloads on each of those contexts, while running perf
      record command with --namespaces option.
      
      Shown below is the output of perf report, sorted with cgroup identifier,
      on perf.data generated with the above test scenario, clearly indicating
      one context's considerable use of kernel memory in comparison with
      others:
      
      	$ perf report -s cgroup_id,sample --stdio
      	#
      	# Total Lost Samples: 0
      	#
      	# Samples: 5K of event 'kmem:kmalloc'
      	# Event count (approx.): 5965
      	#
      	# Overhead  cgroup id (dev/inode)       Samples
      	# ........  .....................  ............
      	#
      	    81.27%  3/0xeffffffb                   4848
      	    16.24%  3/0xf00000d0                    969
      	     1.16%  3/0xf00000ce                     69
      	     0.82%  3/0xf00000cf                     49
      	     0.50%  0/0x0                            30
      
      While this is a start, there is further scope of improving this. For
      example, instead of cgroup namespace's device and inode numbers, dev
      and inode numbers of some or all namespaces may be used to distinguish
      which processes are running in a given container context.
      
      Also, scripts that map device and inode info to containers sound
      plausible for better tracing of containers.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Add script print support for namespace events · 96a44bbc
      Hari Bathini committed
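To relate an identifier such as `3/0xeffffffb` to a live process, one can look at the process's cgroup namespace link under `/proc`. The Python sketch below is an assumption-laden illustration: it assumes a Linux system exposing `/proc/<pid>/ns/cgroup`, and it assumes that the `st_dev` value returned by `stat` corresponds to the device number perf prints, which is not confirmed by the commit message. `format_ns_id` and `cgroup_ns_id` are hypothetical helpers.

```python
import os


def format_ns_id(dev, ino):
    """Format a namespace identifier in the 'dev/0xinode' style used above."""
    return f"{dev}/{ino:#x}"


def cgroup_ns_id(pid="self"):
    """Return the cgroup namespace id of a process, or None if unavailable.

    Assumption: st_dev/st_ino of /proc/<pid>/ns/cgroup match what perf
    records in PERF_RECORD_NAMESPACES.
    """
    try:
        st = os.stat(f"/proc/{pid}/ns/cgroup")
    except OSError:
        return None  # not Linux, no cgroup namespace support, or no access
    return format_ns_id(st.st_dev, st.st_ino)


if __name__ == "__main__":
    print("current cgroup namespace:", cgroup_ns_id())
```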
      Introduce a new option to display events of type PERF_RECORD_NAMESPACES
      and update perf-script documentation accordingly.
      
      Shown below is output (trimmed) of perf script command with the newly
      introduced option, on perf.data generated with perf record command using
      --namespaces option.
      
        $ perf script --show-namespace-events
            swapper   0 [000]     0.000000: PERF_RECORD_NAMESPACES 1/1 - nr_namespaces: 7
                      [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
            swapper   0 [000]     0.000000: PERF_RECORD_NAMESPACES 2/2 - nr_namespaces: 7
                      [0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
      Committer notes:
      
      Testing it:
      
      Investigating that double PERF_RECORD_NAMESPACES for the 19155
      pid/tid... It's more than that, there are two PERF_RECORD_COMM as well,
      and with zeroed timestamps, so probably a synthesizing artifact...
      
        # perf script --show-task --show-namespace
        <SNIP>
            perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19154/19154
            perf     0 [000]     0.000000: PERF_RECORD_FORK(19155:19155):(19154:19154)
            perf     0 [000]     0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
                [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
            perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19155/19155
            perf     0 [000]     0.000000: PERF_RECORD_COMM: perf:19155/19155
            perf     0 [000]     0.000000: PERF_RECORD_NAMESPACES 19155/19155 - nr_namespaces: 7
                [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
         swapper     0 [000]  3110.881834:          1 cycles:  ffffffffa7060bf6 native_write_msr (/lib/modules/4.11.0-rc1+/build/vmlinux)
      
        <SNIP>
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891932627.25309.1941587059154176221.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 14 Mar, 2017 1 commit
    • perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info · f3b3614a
      Hari Bathini committed
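The bracketed namespace list printed for each PERF_RECORD_NAMESPACES event has a regular `index/name: dev/inode` shape, so it can be turned into a lookup table with one regular expression. The Python sketch below assumes exactly the text layout shown in the output above; `parse_namespaces` is a hypothetical helper, not part of perf.

```python
import re

# Matches entries such as "6/cgroup: 3/0xeffffffb".
NS_RE = re.compile(r"(\d+)/(\w+):\s*(\d+)/(0x[0-9a-fA-F]+)")


def parse_namespaces(text):
    """Map namespace name -> (device number, inode number)."""
    return {
        name: (int(dev), int(ino, 16))
        for _index, name, dev, ino in NS_RE.findall(text)
    }


SAMPLE = """\
[0/net: 3/0xf000001c, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
 4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
"""

if __name__ == "__main__":
    for name, (dev, ino) in parse_namespaces(SAMPLE).items():
        print(f"{name:7s} {dev}/{ino:#x}")
```

Two tasks whose `cgroup` entries differ were, under the assumption stated in the cgroup_id commit, started in different containers.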
      Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
      by the kernel when fork, clone, setns or unshare are invoked. And update
      perf-record documentation with the new option to record namespace
      events.
      
      Committer notes:
      
      Combined it with a later patch to allow printing it via 'perf report -D'
      and be able to test the feature introduced in this patch. Had to move
      here also perf_ns__name(), that was introduced in another later patch.
      
      Also used PRIu64 and PRIx64 to fix the build in some environments wrt:
      
        util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
           ret  += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
                                               ^
      Testing it:
      
        # perf record --namespaces -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
        #
        # perf report -D
        <SNIP>
        3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
                      [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
                       4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
      
        0x1151e0 [0x30]: event: 9
        .
        . ... raw event: size 48 bytes
        .  0000:  09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00  ......0..q.h....
        .  0010:  a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00  .9...9...(.c....
        .  0020:  03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00  ................
        <SNIP>
              NAMESPACES events:          1
        <SNIP>
        #
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sargun Dhillon <sargun@sargun.me>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 13 Mar, 2017 1 commit
  12. 04 Mar, 2017 3 commits
    • perf ftrace: Add support for -a and -C option · dc231032
      Namhyung Kim committed
      The -a/--all-cpus and -C/--cpu options control which CPUs are traced.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170224011251.14946-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Add support for --pid option · a9af6be5
      Namhyung Kim committed
      The -p (--pid) option traces an existing process by its PID.
      
      Committer notes:
      
      Testing it:
      
      Use the function_graph tracer on a process that is just waiting for user
      input (here a mutt session), so 'perf ftrace' sits there waiting too. Then
      press any key in that mutt session and see what happens:
      
        # perf ftrace -t function_graph -p `pidof mutt` | head -40
        2)   1.038 us    |  switch_mm_irqs_off();
        ------------------------------------------
        2)    <idle>-0    =>   mutt-3595
        ------------------------------------------
      
        2)               |              finish_task_switch() {
        2)               |                smp_irq_work_interrupt() {
        2)               |                  irq_enter() {
        2)   0.180 us    |                    rcu_irq_enter();
        2)   1.248 us    |                  }
        2)               |                  __wake_up() {
        2)   0.126 us    |                    _raw_spin_lock_irqsave();
        2)               |                    __wake_up_common() {
        2)               |                      pollwake() {
        2)               |                        default_wake_function() {
        2)               |                          try_to_wake_up() {
        2)   0.662 us    |                            _raw_spin_lock_irqsave();
        2)               |                            select_task_rq_fair() {
        2)   1.719 us    |                              effective_load.isra.41();
        2)   1.343 us    |                              effective_load.isra.41();
        2)               |                              select_idle_sibling() {
        2)   0.331 us    |                                idle_cpu();
        2)   1.458 us    |                              }
        2)   8.350 us    |                            }
        2)   0.200 us    |                            _raw_spin_lock();
        2)               |                            ttwu_do_activate() {
        2)               |                              activate_task() {
        2)   0.136 us    |                                update_rq_clock.part.77();
        2)               |                                enqueue_task_fair() {
        2)               |                                  enqueue_entity() {
        2)   0.146 us    |                                    update_curr();
        2)   0.330 us    |                                    account_entity_enqueue();
        2)   0.280 us    |                                    update_cfs_shares();
        2)   0.321 us    |                                    place_entity();
        2)   0.206 us    |                                    __enqueue_entity();
        2)   6.926 us    |                                  }
        2)               |                                  enqueue_entity() {
        2)   0.105 us    |                                    update_curr();
        2)   0.175 us    |                                    account_entity_enqueue();
        2)   0.531 us    |                                    update_cfs_shares();
       #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170224011251.14946-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Allow sorting by symbol size · 7768f8da
      Charles Baylis committed
      Add new sort key 'symbol_size' to allow user to sort by symbol size, or
      (more usefully) display the symbol size using --fields=...,symbol_size.
      
      Committer note:
      
      Testing it together with the recently added -q, to remove the headers,
      and using the '+' sign with -s, to add the symbol_size sort order to
      the default, which is '-s/--sort comm,dso,symbol':
      
        # perf report -q -s +symbol_size | head -10
        10.39%  swapper       [kernel.vmlinux] [k] intel_idle               270
         3.45%  swapper       [kernel.vmlinux] [k] update_blocked_averages 1546
         2.61%  swapper       [kernel.vmlinux] [k] update_load_avg         1292
         2.36%  swapper       [kernel.vmlinux] [k] update_cfs_shares        240
         1.83%  swapper       [kernel.vmlinux] [k] __hrtimer_run_queues     606
         1.74%  swapper       [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
         1.66%  swapper       [kernel.vmlinux] [k] apic_timer_interrupt     152
         1.60%  CPU 0/KVM     [kvm]            [k] kvm_set_msr_common      3046
         1.60%  gnome-shell   libglib-2.0.so.0 [.] g_slist_find              37
         1.46%  gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup      370
        #
      Signed-off-by: Charles Baylis <charles.baylis@linaro.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
      [ Use symbol__size(), remove needless %lld + (long long) casting ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 28 Feb, 2017 1 commit
  14. 20 Feb, 2017 3 commits
    • perf annotate: Add -q/--quiet option · eddaef88
      Namhyung Kim committed
      The -q/--quiet option suppresses all messages.  Sometimes users just
      want to see the numbers, and it can be used for that case.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170217081742.17417-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf diff: Add -q/--quiet option · 63b42fce
      Namhyung Kim committed
      The -q/--quiet option suppresses all messages.  Sometimes users just
      want to see the numbers, and it can be used for that case.
      
      Committer notes:
      
      Before:
      
        # perf diff | head -10
        Failed to open /tmp/perf-6678.map, continuing without symbols
        Failed to open /tmp/perf-6678.map, continuing without symbols
        Failed to open /tmp/perf-2646.map, continuing without symbols
        # Event 'cycles'
        #
        # Baseline  Delta Abs  Shared Object               Symbol
        # ........  .........  ..........................  ............................................
        #
             5.36%     -1.76%  [kernel.vmlinux]            [k] intel_idle
             2.80%     +1.48%  firefox                     [.] 0x00000000000101fe
            57.12%     -1.25%  libxul.so                   [.] 0x00000000009bea92
             1.36%     -1.11%  [kernel.vmlinux]            [k] __schedule
             4.26%     -1.00%  perf-6678.map               [.] 0x00007fac4b0e9320
      
      After:
      
        # perf diff -q | head -10
             5.36%     -1.76%  [kernel.vmlinux]            [k] intel_idle
             2.80%     +1.48%  firefox                     [.] 0x00000000000101fe
            57.12%     -1.25%  libxul.so                   [.] 0x00000000009bea92
             1.36%     -1.11%  [kernel.vmlinux]            [k] __schedule
             4.26%     -1.00%  perf-6678.map               [.] 0x00007fac4b0e9320
             1.86%     +0.95%  [kernel.vmlinux]            [k] update_blocked_averages
             0.80%     -0.70%  [kernel.vmlinux]            [k] native_sched_clock
             0.74%     -0.58%  [kernel.vmlinux]            [k] native_write_msr
             0.76%     -0.56%  qemu-system-x86_64          [.] 0x00000000002395c0
                       +0.54%  libpulsecommon-10.0.so      [.] 0x000000000002d91b
        #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170217081742.17417-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63b42fce
    • N
      perf report: Add -q/--quiet option · 27fafab5
      Namhyung Kim committed
      The -q/--quiet option suppresses all messages.  It is useful when users
      just want to see the numbers.
      
      Before:
      
        $ perf report | head -15
        Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/ext4/ext4.ko.gz, continuing without symbols
        Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/jbd2/jbd2.ko.gz, continuing without symbols
        Failed to open /tmp/perf-14507.map, continuing without symbols
        ...
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 39K of event 'cycles'
        # Event count (approx.): 30444796573
        #
        # Overhead  Command      Shared Object        Symbol
        # ........  ...........  ...................  .........................
        #
             9.28%  swapper      [kernel.vmlinux]     [k] intel_idle
             5.64%  swapper      [kernel.vmlinux]     [k] native_write_msr_safe
             1.93%  swapper      [kernel.vmlinux]     [k] __switch_to
             1.89%  swapper      [kernel.vmlinux]     [k] menu_select
             1.75%  sched-pipe   [kernel.vmlinux]     [k] __switch_to
      
      After:
      
        $ perf report -q | head
             9.28%  swapper      [kernel.vmlinux]     [k] intel_idle
             5.64%  swapper      [kernel.vmlinux]     [k] native_write_msr_safe
             1.93%  swapper      [kernel.vmlinux]     [k] __switch_to
             1.89%  swapper      [kernel.vmlinux]     [k] menu_select
             1.75%  sched-pipe   [kernel.vmlinux]     [k] __switch_to
             1.67%  swapper      [kernel.vmlinux]     [k] cpu_startup_entry
             1.48%  sched-pipe   [kernel.vmlinux]     [k] enqueue_entity
             1.46%  swapper      [kernel.vmlinux]     [k] __schedule
             1.36%  swapper      [kernel.vmlinux]     [k] native_read_tsc
             1.34%  sched-pipe   [kernel.vmlinux]     [k] __schedule
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170217081742.17417-4-namhyung@kernel.org
      [ Removed builtin-report.c verbose > 0 hunk added to the previous patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      27fafab5
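The Before/After listings above differ only in the warning lines and the `#` header lines. As a rough illustration of that difference, the Python sketch below post-processes captured report text and keeps just the data rows. It is not how perf implements -q/--quiet; the helper name `strip_noise` and the list of warning prefixes are assumptions made for this example.

```python
def strip_noise(lines, noise_prefixes=("Failed to open",)):
    """Yield only the data rows of a perf-style text report.

    Hypothetical helper (not part of perf): drops blank lines, '#' header
    lines, and warning lines that start with one of noise_prefixes.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or header/comment line
        if stripped.startswith(noise_prefixes):
            continue  # warning message such as a missing symbol file
        yield line.rstrip("\n")


# Sample text modeled on the 'Before' listing in the commit message.
sample = [
    "Failed to open /tmp/perf-14507.map, continuing without symbols\n",
    "# Total Lost Samples: 0\n",
    "#\n",
    "     9.28%  swapper      [kernel.vmlinux]     [k] intel_idle\n",
    "     1.75%  sched-pipe   [kernel.vmlinux]     [k] __switch_to\n",
]
rows = list(strip_noise(sample))
for row in rows:
    print(row)
```

With -q available in the tool itself, a filter like this is only needed for output captured from a perf version that predates the option.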
  15. 18 Feb, 2017 2 commits
  16. 14 Feb, 2017 4 commits
  17. 26 Jan, 2017 1 commit
    • N
      perf ftrace: Introduce new 'ftrace' tool · d01f4e8d
      Namhyung Kim committed
      The 'perf ftrace' command is a simple wrapper around the kernel's ftrace
      functionality.  It currently supports only single-thread tracing: it
      reads trace_pipe as text and writes it to stdout.
      
      Committer notes:
      
      Testing it:
      
        # perf ftrace -f function_graph usleep 123456
        <SNIP>
        2)               |  SyS_nanosleep() {
        2)               |    _copy_from_user() {
        <SNIP>
        2)   0.900 us    |      }
        2)   1.354 us    |    }
        2)               |    hrtimer_nanosleep() {
        2)   0.062 us    |      __hrtimer_init();
        2)               |      do_nanosleep() {
        2)               |        hrtimer_start_range_ns() {
        <SNIP>
        2)   5.025 us    |        }
        2)               |        schedule() {
        2)   0.125 us    |          rcu_note_context_switch();
        2)   0.057 us    |          _raw_spin_lock();
        2)               |          deactivate_task() {
        2)   0.369 us    |            update_rq_clock.part.77();
        2)               |            dequeue_task_fair() {
        <SNIP>
        2) + 22.453 us   |            }
        2) + 23.736 us   |          }
        2)               |          pick_next_task_fair() {
        <SNIP>
        2) + 47.167 us   |          }
        2)               |          pick_next_task_idle() {
        <SNIP>
        2)   4.462 us    |          }
        ------------------------------------------
        2)  usleep-20387  =>    <idle>-0
        ------------------------------------------
      
        2)   0.806 us    |  switch_mm_irqs_off();
        ------------------------------------------
        2)    <idle>-0    =>  usleep-20387
        ------------------------------------------
      
        2)   0.151 us    |          finish_task_switch();
        2) @ 123597.2 us |        }
        2)   0.037 us    |        _cond_resched();
        2)               |        hrtimer_try_to_cancel() {
        2)   0.064 us    |          hrtimer_active();
        2)   0.353 us    |        }
        2) @ 123605.3 us |      }
        2) @ 123606.2 us |    }
        2) @ 123608.3 us |  } /* SyS_nanosleep */
        2)               |  __do_page_fault() {
       <SNIP>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeremy Eder <jeder@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/n/tip-r1hgmsj4dxny8arn3o9mw512@git.kernel.org
      [ Various forward port fixes, add man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d01f4e8d
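Because 'perf ftrace' writes the trace as plain text, its output can be post-processed with ordinary text tools. The Python sketch below parses function_graph lines like those in the test run above and picks the entry with the longest duration. The regular expression is an assumption derived only from the sample output shown here, and `parse_graph_line` is a hypothetical helper, not part of perf.

```python
import re

# Pattern inferred from the sample output above (an assumption, not a spec):
#   "<cpu>) [marker] [<duration> us] | <function text>"
LINE_RE = re.compile(
    r"^\s*(?P<cpu>\d+)\)\s+"
    r"(?:(?P<flag>[+!#*@$])\s+)?"          # optional overhead marker
    r"(?:(?P<dur>\d+(?:\.\d+)?)\s+us\s+)?"  # optional duration in microseconds
    r"\|(?P<body>.*)$"
)


def parse_graph_line(line):
    """Return a dict for a function_graph line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # e.g. <SNIP>, separators, context-switch banners
    dur = m.group("dur")
    return {
        "cpu": int(m.group("cpu")),
        "duration_us": float(dur) if dur is not None else None,
        "text": m.group("body").strip(),
    }


# Lines copied from the sample output in the commit message.
sample = [
    "  2)               |  SyS_nanosleep() {",
    "  2)   0.062 us    |      __hrtimer_init();",
    "  2) + 22.453 us   |            }",
    "  ------------------------------------------",
    "  2)  usleep-20387  =>    <idle>-0",
    "  2) @ 123608.3 us |  } /* SyS_nanosleep */",
]
parsed = [p for p in map(parse_graph_line, sample) if p is not None]
timed = [p for p in parsed if p["duration_us"] is not None]
slowest = max(timed, key=lambda p: p["duration_us"])
print(slowest["text"], slowest["duration_us"])
```

Lines that carry no duration (function entries) are kept with `duration_us` set to `None`, so a caller can still rebuild the call nesting from them if needed.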
  18. 21 Jan, 2017 1 commit
  19. 17 Jan, 2017 2 commits
  20. 12 Jan, 2017 4 commits
  21. 03 Jan, 2017 1 commit
  22. 16 Dec, 2016 1 commit
    • N
      perf sched timehist: Add -I/--idle-hist option · 07235f84
      Namhyung Kim committed
      The --idle-hist option analyzes the system idle state, showing which
      process causes a CPU to go idle.  When this option is specified,
      non-idle events are skipped and processes switching to/from idle are
      shown.
      
      This option is mostly useful together with the --summary(-only) option.
      In the idle-time summary view, idle time is accounted to the previous
      thread, i.e. the one that ran just before the idle task.
      
      Example output:
      
        Idle-time summary
                        comm parent sched-out idle-time min-idle avg-idle max-idle stddev migrations
                                      (count)    (msec)   (msec)   (msec)   (msec)      %
        --------------------------------------------------------------------------------------------
              rcu_preempt[7]      2        95   550.872    0.011    5.798   23.146   7.63      0
             migration/1[16]      2         1    15.558   15.558   15.558   15.558   0.00      0
              khugepaged[39]      2         1     3.062    3.062    3.062    3.062   0.00      0
           kworker/0:1H[124]      2         2     4.728    0.611    2.364    4.116  74.12      0
        systemd-journal[167]      1         1     4.510    4.510    4.510    4.510   0.00      0
          kworker/u16:3[558]      2        13    74.737    0.080    5.749   12.960  21.96      0
         irq/34-iwlwifi[628]      2        21   118.403    0.032    5.638   23.990  24.00      0
          kworker/u17:0[673]      2         1     3.523    3.523    3.523    3.523   0.00      0
            dbus-daemon[722]      1         1     6.743    6.743    6.743    6.743   0.00      0
                ifplugd[741]      1         1    58.826   58.826   58.826   58.826   0.00      0
        wpa_supplicant[1490]      1         1    13.302   13.302   13.302   13.302   0.00      0
           wpa_actiond[1492]      1         2     4.064    0.168    2.032    3.896  91.72      0
               dockerd[1500]      1         1     0.055    0.055    0.055    0.055   0.00      0
        ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20161208144755.16673-6-namhyung@kernel.org
      Link: http://lkml.kernel.org/r/20161213080632.19099-2-namhyung@kernel.org
      [ Merged fix sent by Namhyung, as posted in the second Link: tag ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      07235f84
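The accounting rule described above (idle time is charged to the thread that ran just before the idle task) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not perf's implementation: the event tuple layout, the `IDLE` name, and the function `idle_time_by_previous_thread` are all hypothetical, and timestamps are plain integers in an arbitrary unit.

```python
from collections import defaultdict

IDLE = "<idle>"  # assumed name of the idle task in the event stream


def idle_time_by_previous_thread(switches):
    """Charge each idle period to the thread that ran right before it.

    switches: iterable of (time, cpu, prev_comm, next_comm) tuples in time
    order (a hypothetical layout chosen for this sketch).
    """
    periods = defaultdict(list)
    pending = {}  # cpu -> (thread that ran before idle, time idle began)
    for time, cpu, prev, nxt in switches:
        if nxt == IDLE and prev != IDLE:
            pending[cpu] = (prev, time)          # CPU is going idle
        elif prev == IDLE and cpu in pending:
            owner, started = pending.pop(cpu)    # CPU is leaving idle
            periods[owner].append(time - started)
    return {
        comm: {
            "sched-out": len(vals),
            "idle-time": sum(vals),
            "min-idle": min(vals),
            "avg-idle": sum(vals) / len(vals),
            "max-idle": max(vals),
        }
        for comm, vals in periods.items()
    }


events = [
    (0, 0, "dockerd", IDLE),
    (55, 0, IDLE, "rcu_preempt"),
    (60, 0, "rcu_preempt", IDLE),
    (71, 0, IDLE, "dockerd"),
    (80, 0, "dockerd", IDLE),
    (100, 0, IDLE, "bash"),
]
summary = idle_time_by_previous_thread(events)
for comm, stats in summary.items():
    print(comm, stats)
```

The dictionary keys mirror the column names of the summary table above; the stddev and migrations columns are left out of this sketch.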
  23. 07 Dec, 2016 1 commit
  24. 02 Dec, 2016 1 commit
    • D
      perf report: Add option to specify time window of interest · 46690a80
      David Ahern committed
      Add an option that lets the user control the analysis window, e.g.,
      collect data for a time window and then analyze a segment of interest
      within that window.
      
      Committer notes:
      
      Testing it:
      
      Using the perf.data file captured via 'perf kmem record':
      
        # perf report --header-only
        # ========
        # captured on: Tue Nov 29 16:01:53 2016
        # hostname : jouet
        # os release : 4.8.8-300.fc25.x86_64
        # perf version : 4.9.rc6.g5a6aca
        # arch : x86_64
        # nrcpus online : 4
        # nrcpus avail : 4
        # cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
        # cpuid : GenuineIntel,6,61,4
        # total memory : 20254660 kB
        # cmdline : /home/acme/bin/perf kmem record usleep 1
        # event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
        # event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
        # event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
        # event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
        # event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
        # event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
        # HEADER_CPU_TOPOLOGY info available, use -I to display
        # HEADER_NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
        # HEADER_CACHE info available, use -I to display
        # missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
        # ========
        #
        # # Looking at just the histogram entries for the first event:
        #
        # perf report  | head -33
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 40  of event 'kmem:kmalloc'
        # Event count (approx.): 40
        #
        # Overhead  Trace output
        # ........  ...............................................................................................................
        #
          37.50%  call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
          10.00%  call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
           7.50%  call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
           2.50%  call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
           2.50%  call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
           2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
           2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
           2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
      
        #
        # # And then limiting using the example for 'perf kmem stat --time' used
        # # in the previous changeset committer note we see that there were no
        # # kmem:kmalloc in that last part of the file, but there were some
        # # kmem:kmem_cache_alloc ones:
        #
        # perf report --time 20119.782088, --stdio
        #
        # Total Lost Samples: 0
        #
        # Samples: 0  of event 'kmem:kmalloc'
        # Event count (approx.): 0
        #
        # Overhead  Trace output
        # ........  ............
        #
      
        # Samples: 0  of event 'kmem:kmalloc_node'
        # Event count (approx.): 0
        #
        # Overhead  Trace output
        # ........  ............
        #
      
        # Samples: 0  of event 'kmem:kfree'
        # Event count (approx.): 0
        #
        # Overhead  Trace output
        # ........  ............
        #
      
        # Samples: 8  of event 'kmem:kmem_cache_alloc'
        # Event count (approx.): 8
        #
        # Overhead  Trace output
        # ........  ..................................................................................................................
        #
          75.00%  call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
          12.50%  call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
          12.50%  call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
        #
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      46690a80
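The test run above passes `--time 20119.782088,`, i.e. a start time with an open-ended stop. The Python sketch below shows one way such a `<start>,<stop>` window could be parsed and applied to sample timestamps. It is an illustration only: `parse_time_window` and `in_window` are hypothetical helpers, and requiring the comma and treating both bounds as inclusive are assumptions of this sketch rather than statements about perf's behavior.

```python
from decimal import Decimal


def parse_time_window(spec):
    """Parse '<start>,<stop>'; an empty side means 'unbounded' (None)."""
    start_text, sep, stop_text = spec.partition(",")
    if not sep:
        raise ValueError("expected '<start>,<stop>'")
    start = Decimal(start_text.strip()) if start_text.strip() else None
    stop = Decimal(stop_text.strip()) if stop_text.strip() else None
    if start is not None and stop is not None and start > stop:
        raise ValueError("start is after stop")
    return start, stop


def in_window(timestamp, window):
    """True if timestamp lies inside the window (bounds assumed inclusive)."""
    start, stop = window
    if start is not None and timestamp < start:
        return False
    if stop is not None and timestamp > stop:
        return False
    return True


# Window string taken from the test run; the sample timestamps are made up.
window = parse_time_window("20119.782088,")
timestamps = [
    Decimal("20119.700000"),
    Decimal("20119.782088"),
    Decimal("20119.790000"),
]
kept = [t for t in timestamps if in_window(t, window)]
print(kept)
```

`Decimal` is used instead of `float` so that timestamps written as seconds.microseconds compare exactly at the window boundaries.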