1. 23 3月, 2022 1 次提交
    • K
      perf script: Add 'brstackinsnlen' for branch stacks · 6f680c6a
      Kan Liang 提交于
      When analyzing with 'perf script', it's useful to understand the
      captured instruction and the next sequential instruction.
      
      To calculate the address of the next sequential instruction, the length
      of the captured instruction is required.
      
      For example, you can’t know the next sequential instruction after an
      unconditional branch unless you calculate that based on its length.
      
      For branch stacks, 'perf script' only prints the instruction bytes with
      'brstackinsn', but lacks the instruction length.
      
      Add 'brstackinsnlen' to print the instruction length.
      
        $ perf script -F ip,brstackinsn,brstackinsnlen --xed
           7fa555be8f75
              _start:
              00007fa555be8090    mov %rsp, %rdi              ilen: 3
              00007fa555be8093    callq  0x7fa555be8ea0       ilen: 5 # PRED 102 cycles [102] 0.02 IPC
              _dl_start+38:
              00007fa555be8ec6    movq  %rdx,0x227853(%rip)   ilen: 7
              00007fa555be8ecd    leaq  0x227f94(%rip),%rdx   ilen: 7
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/1647871212-184070-1-git-send-email-kan.liang@linux.intel.com
      [ Added the new field to tools/perf/Documentation/perf-script.txt ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6f680c6a
  2. 16 2月, 2022 1 次提交
  3. 28 10月, 2021 1 次提交
  4. 11 8月, 2021 1 次提交
  5. 02 7月, 2021 3 次提交
  6. 22 5月, 2021 1 次提交
  7. 09 2月, 2021 1 次提交
    • J
      perf script: Support filtering by hex address · 61d9fc44
      Jin Yao 提交于
      'perf script' supports '-S' or '--symbol' options to only list the
      records for these symbols. A symbol is typically a name or hex address.
      If it's hex address, it is the start address of one symbol.
      
      While it would be useful if we can filter trace records by any hex
      address (not only the start address of symbol). So now we support
      filtering trace records by more conditions, such as:
      
      - symbol name
      - start address of symbol
      - any hexadecimal address
      - address range
      
      The comparison order is defined as:
      
      1. symbol name comparison
      2. symbol start address comparison.
      3. any hexadecimal address comparison.
      4. address range comparison.
      
      The idea is if we can get a valid address from -S list, we add the
      address to addr_list for address comparison otherwise we still leave
      it to sym_list for symbol comparison.
      
      Some examples:
      
        root@kbl-ppc:~# ./perf script -S ffffffff9a477308
                  perf  8562 [000] 347303.578858:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [000] 347303.578860:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [000] 347303.578861:         11   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [001] 347303.578903:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [001] 347303.578905:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [001] 347303.578906:         15   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [002] 347303.578952:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf  8562 [002] 347303.578953:          1   cycles:  ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
      
      Filter the traced records by hex address ffffffff9a477308.
      
        root@kbl-ppc:~# ./perf script -S ffffffff9a4dd4ce,ffffffff9a4d2de9,ffffffff9a6bf9f4
                  perf  8562 [001] 347303.578911:     311706   cycles:  ffffffff9a6bf9f4 __kmalloc_node+0x204 ([kernel.kallsyms])
                  perf  8562 [002] 347303.578960:     354477   cycles:  ffffffff9a4d2de9 sched_setaffinity+0x49 ([kernel.kallsyms])
                  perf  8562 [003] 347303.579015:     450958   cycles:  ffffffff9a4dd4ce dequeue_task_fair+0x1ae ([kernel.kallsyms])
      
      Filter the traced records by hex address ffffffff9a4dd4ce, ffffffff9a4d2de9, ffffffff9a6bf9f4.
      
        root@kbl-ppc:~# ./perf script -S ffffffff9a477309 --addr-range 16
                  perf  8562 [000] 347303.578863:        291   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
                  perf  8562 [001] 347303.578907:        411   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
                  perf  8562 [002] 347303.578956:        462   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
                  perf  8562 [003] 347303.579010:        497   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
                  perf  8562 [004] 347303.579059:        429   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
                  perf  8562 [005] 347303.579109:        408   cycles:  ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
                  perf  8562 [006] 347303.579159:        460   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
                  perf  8562 [007] 347303.579213:        436   cycles:  ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
      
      Filter the traced records from address range [ffffffff9a477309, ffffffff9a477309 + 15].
      
        root@kbl-ppc:~# ./perf script -S "ffffffff9b163046,rcu_nmi_exit"
                  perf  8562 [004] 347303.579060:      12013   cycles:  ffffffff9b163046 exc_nmi+0x166 ([kernel.kallsyms])
                  perf  8562 [007] 347303.579214:      12138   cycles:  ffffffff9b165944 rcu_nmi_exit+0x34 ([kernel.kallsyms])
      
      Filter by address + symbol
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210207080935.31784-2-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      61d9fc44
  8. 04 2月, 2021 1 次提交
    • J
      perf script: Support DSO filter like in other perf tools · 4b799a9b
      Jin Yao 提交于
      Other perf tool builtins already supported a DSO filter.
      
      For example:
      
        $ perf report --dsos a,b,c
      
      which only considers symbols in these dsos.
      
      Now the DSO filter is supported in 'perf script':
      
        root@kbl-ppc:~# ./perf script --dsos "[kernel.kallsyms]"
                  perf 18123 [000] 6142863.075104:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf 18123 [000] 6142863.075107:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf 18123 [000] 6142863.075108:         10   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf 18123 [000] 6142863.075109:        273   cycles:  ffffffff9ca7730a native_write_msr+0xa ([kernel.kallsyms])
                  perf 18123 [000] 6142863.075110:       7684   cycles:  ffffffff9ca3c9c0 native_sched_clock+0x50 ([kernel.kallsyms])
                  perf 18123 [000] 6142863.075112:     213017   cycles:  ffffffff9d765a92 syscall_exit_to_user_mode+0x32 ([kernel.kallsyms])
                  perf 18123 [001] 6142863.075156:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf 18123 [001] 6142863.075158:          1   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
                  perf 18123 [001] 6142863.075159:         17   cycles:  ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
      
      Committer testing:
      
        $ perf script
                      ls 2364888 29303.010949:          1 cycles:u:  ffffffffa4bbc6a9 [unknown] ([unknown])
                      ls 2364888 29303.010957:          1 cycles:u:  ffffffffa429ef48 [unknown] ([unknown])
                      ls 2364888 29303.010961:          1 cycles:u:  ffffffffa4260133 [unknown] ([unknown])
                      ls 2364888 29303.010964:          5 cycles:u:  ffffffffa429efad [unknown] ([unknown])
                      ls 2364888 29303.010967:         41 cycles:u:  ffffffffa42a4586 [unknown] ([unknown])
                      ls 2364888 29303.010972:        435 cycles:u:  ffffffffa429efe0 [unknown] ([unknown])
                      ls 2364888 29303.010978:       5142 cycles:u:      7f9b95bc2abf __GI___tunables_init+0x11f (/usr/lib64/ld-2.32.so)
                      ls 2364888 29303.011006:      38551 cycles:u:  ffffffffa4290f61 [unknown] ([unknown])
                      ls 2364888 29303.011486:     238234 cycles:u:      7f9b95bb7741 _dl_relocate_object+0xa71 (/usr/lib64/ld-2.32.so)
                      ls 2364888 29303.011937:     415870 cycles:u:      7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
        $
      
      Before:
      
        $ perf script --dsos /usr/lib64/libc-2.32.so |& head -5
          Error: unknown option `dsos'
      
         Usage: perf script [<options>]
            or: perf script [<options>] record <script> [<record-options>] <command>
            or: perf script [<options>] report <script> [script-args]
        $
      
      After:
      
        $ perf script --dsos /usr/lib64/libc-2.32.so
                      ls 2364888 29303.011937:     415870 cycles:u:      7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
        $
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210124232750.19170-2-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4b799a9b
  9. 21 1月, 2021 1 次提交
  10. 20 12月, 2020 1 次提交
  11. 10 7月, 2020 1 次提交
  12. 18 4月, 2020 1 次提交
    • K
      perf script: Add option to enable the LBR stitching approach · 680d125c
      Kan Liang 提交于
      With the LBR stitching approach, the reconstructed LBR call stack can
      break the HW limitation. However, it may reconstruct invalid call stacks
      in some cases, e.g. exception handing such as setjmp/longjmp.  Also, it
      may impact the processing time especially when the number of samples
      with stitched LBRs are huge.
      
      Add an option to enable the approach.
      
      Committer testing:
      
      Using the same perf.data as with the latest cset committer testing
      section:
      
        $ perf script --stitch-lbr
        <SNIP>
        tchain_edit 11131 15164.984292:     437491 cycles:u:
                          401106 f43+0x0 (/wb/tchain_edit)
                          40114c f42+0x18 (/wb/tchain_edit)
                          401172 f41+0xe (/wb/tchain_edit)
                          401194 f40+0x0 (/wb/tchain_edit)
                          40119b f39+0x0 (/wb/tchain_edit)
                          4011a2 f38+0x0 (/wb/tchain_edit)
                          4011a9 f37+0x0 (/wb/tchain_edit)
                          4011b0 f36+0x0 (/wb/tchain_edit)
                          4011b7 f35+0x0 (/wb/tchain_edit)
                          4011be f34+0x0 (/wb/tchain_edit)
                          4011c5 f33+0x0 (/wb/tchain_edit)
                          4011cc f32+0x0 (/wb/tchain_edit)
                          401207 f31+0x34 (/wb/tchain_edit)
                          401212 f30+0x0 (/wb/tchain_edit)
                          401219 f29+0x0 (/wb/tchain_edit)
                          401220 f28+0x0 (/wb/tchain_edit)
                          401227 f27+0x0 (/wb/tchain_edit)
                          40122e f26+0x0 (/wb/tchain_edit)
                          401235 f25+0x0 (/wb/tchain_edit)
                          40123c f24+0x0 (/wb/tchain_edit)
                          401243 f23+0x0 (/wb/tchain_edit)
                          40124a f22+0x0 (/wb/tchain_edit)
                          401251 f21+0x0 (/wb/tchain_edit)
                          401258 f20+0x0 (/wb/tchain_edit)
                          40125f f19+0x0 (/wb/tchain_edit)
                          401266 f18+0x0 (/wb/tchain_edit)
                          40126d f17+0x0 (/wb/tchain_edit)
                          401274 f16+0x0 (/wb/tchain_edit)
                          40127b f15+0x0 (/wb/tchain_edit)
                          401282 f14+0x0 (/wb/tchain_edit)
                          401289 f13+0x0 (/wb/tchain_edit)
                          401290 f12+0x0 (/wb/tchain_edit)
                          401297 f11+0x0 (/wb/tchain_edit)
                          40129e f10+0x0 (/wb/tchain_edit)
                          4012a5 f9+0x0 (/wb/tchain_edit)
                          4012ac f8+0x0 (/wb/tchain_edit)
                          4012b3 f7+0x0 (/wb/tchain_edit)
                          4012ba f6+0x0 (/wb/tchain_edit)
                          4012c1 f5+0x0 (/wb/tchain_edit)
                          4012c8 f4+0x0 (/wb/tchain_edit)
                          4012cf f3+0x0 (/wb/tchain_edit)
                          4012d6 f2+0x0 (/wb/tchain_edit)
                          4012dd f1+0x0 (/wb/tchain_edit)
                          4012e4 main+0x0 (/wb/tchain_edit)
                    7f41a5016f41 __libc_start_main+0xf1 (/usr/lib64/libc-2.29.so)
        <SNIP>
        $
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200319202517.23423-15-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      680d125c
  13. 03 4月, 2020 2 次提交
    • I
      perf script: add -S/--symbols documentation · 628d736d
      Ian Rogers 提交于
      Capture both that this option exists and that symbols can be hexadecimal
      addresses.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200402174130.140319-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      628d736d
    • N
      perf script: Add --show-cgroup-events option · 160d4af9
      Namhyung Kim 提交于
      The --show-cgroup-events option is to print CGROUP events in the
      output like others.
      
      Committer testing:
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ]
        [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2
                 swapper     0     0.000000: PERF_RECORD_CGROUP cgroup: 1 /
                    perf 12145 11200.440730:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                    perf 12145 11200.440733:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12145 11200.440739:     193472 cycles:  ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12145 11200.440790:    2691608 cycles:      7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so)
                  cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub
                  cgtest 12147 11200.441054:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441057:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12148 11200.441103:      10227 cycles:  ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441106:     273295 cycles:  ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1
                  cgtest 12147 11200.441143:    2788845 cycles:  ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2
                  cgtest 12148 11200.441182:    2669546 cycles:            401020 _init+0x20 (/wb/cgtest)
                  cgtest 12149 11200.441247:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        [root@seventh ~]#
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      160d4af9
  14. 27 3月, 2020 1 次提交
    • H
      perf script: Introduce --deltatime option · 26567ed7
      Hagen Paul Pfeifer 提交于
      For some kind of analysis a deltatime output is more human friendly and
      reduce the cognitive load for further analysis.
      
      The following output demonstrate the new option "deltatime": calculate
      the time difference in relation to the previous event.
      
        $ perf script --deltatime
        test  2525 [001]     0.000000:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000091:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000051: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000685:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000048:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000104: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.003895:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.996034: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000058:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000004: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000064:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999934: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000056:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999930: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
      
      Committer testing:
      
      So go from default output to --reltime and then this new --deltatime, to
      contrast the various timestamp presentation modes for a random perf.data file I
      had laying around:
      
        [root@five ~]# perf script --reltime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000004:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000006:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000009: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000036:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000038:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000040:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000041:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000044: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script --deltatime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000027:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script | head
           perf 442394 [000]  7600.157861:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157864:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157866:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157867:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157870: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157897:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157900:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157901:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157903:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157906: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]#
      
      Andi suggested we better implement it as a new field, i.e. -F deltatime, like:
      
        [root@five ~]# perf script -F deltatime
        Invalid field requested.
      
         Usage: perf script [<options>]
            or: perf script [<options>] record <script> [<record-options>] <command>
            or: perf script [<options>] report <script> [script-args]
            or: perf script [<options>] <script> [<record-options>] <command>
            or: perf script [<options>] <top-script> [script-args]
      
            -F, --fields <str>    comma separated output fields prepend with 'type:'. +field to add and -field to remove.Valid types: hw,sw,trace,raw,synth. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,srcline,period,iregs,uregs,brstack,brstacksym,flags,bpf-output,brstackinsn,brstackoff,callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc
        [root@five ~]#
      
      I.e. we have -F for maximum flexibility:
      
        [root@five ~]# perf script -F comm,pid,cpu,time | head
                  perf 442394 [000]  7600.157861:
                  perf 442394 [000]  7600.157864:
                  perf 442394 [000]  7600.157866:
                  perf 442394 [000]  7600.157867:
                  perf 442394 [000]  7600.157870:
                  perf 442394 [001]  7600.157897:
                  perf 442394 [001]  7600.157900:
                  perf 442394 [001]  7600.157901:
                  perf 442394 [001]  7600.157903:
                  perf 442394 [001]  7600.157906:
        [root@five ~]#
      
      But since we already have --reltime, having --deltatime, documented one after
      the other is sensible.
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20200204173709.489161-1-hagen@jauu.net
      [ Added 'perf script' man page entry for --deltatime ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      26567ed7
  15. 11 3月, 2020 1 次提交
  16. 15 8月, 2019 3 次提交
    • A
      perf script: Allow specifying event to switch off processing of other events · dd41f660
      Arnaldo Carvalho de Melo 提交于
      Counterpart of --switch-on:
      
        # perf record -e sched:*,syscalls:sys_*_nanosleep sleep 1
        [ perf record: Woken up 36 times to write data ]
        [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
        #
        # perf script
            :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
            :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
             sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
        #
        # perf script --switch-off syscalls:sys_exit_nanosleep
            :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
            :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep --show-on-off
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
        #
      
      Now think about using this together with 'perf probe' to create custom on/off
      events in your app :-)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-li3j01c4tmj9kw6ydsl8swej@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd41f660
    • A
      perf script: Allow showing the --switch-on event · 6469eb6d
      Arnaldo Carvalho de Melo 提交于
      One may want to see the --switch-on event as well, allow for that, using
      the previous cset example:
      
        # perf script --switch-on syscalls:sys_enter_nanosleep --show-on-off
              sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
              sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
              sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
            swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
            swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
              sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
              sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep
              sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
              sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
            swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
            swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
              sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
              sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Link: https://lkml.kernel.org/n/tip-0omwwoywj1v63gu8cz0tr0cy@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6469eb6d
    • A
      perf script: Allow specifying event to switch on processing of other events · f90a2417
      Arnaldo Carvalho de Melo 提交于
      Sometime we want to only consider events after something happens, so
      allow discarding events till such events is found, e.g.:
      
      Record all scheduler tracepoints and the sys_enter_nanosleep syscall
      event for the 'sleep 1' workload:
      
        # perf record -e sched:*,syscalls:sys_enter_nanosleep sleep 1
        [ perf record: Woken up 31 times to write data ]
        [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
        #
      
      So we have these events in the generated perf data file:
      
        # perf evlist
        sched:sched_kthread_stop
        sched:sched_kthread_stop_ret
        sched:sched_waking
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_switch
        sched:sched_migrate_task
        sched:sched_process_free
        sched:sched_process_exit
        sched:sched_wait_task
        sched:sched_process_wait
        sched:sched_process_fork
        sched:sched_process_exec
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_blocked
        sched:sched_stat_runtime
        sched:sched_pi_setprio
        sched:sched_move_numa
        sched:sched_stick_numa
        sched:sched_swap_numa
        sched:sched_wake_idle_without_ipi
        syscalls:sys_enter_nanosleep
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
        #
      
      Then show all of the events that actually took place in this 'perf record' session:
      
        # perf script
                :13637 13637 [002] 108237.581529:            sched:sched_waking: comm=perf pid=13638 prio=120 target_cpu=001
                :13637 13637 [002] 108237.581537:            sched:sched_wakeup: perf:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108237.581992:      sched:sched_process_exec: filename=/usr/bin/sleep pid=13638 old_pid=13638
                 sleep 13638 [001] 108237.582286:  syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
                 sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
                 sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
               swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
               swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
                 sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      
      Now lets see only the ones that took place after a certain "marker":
      
        # perf script --switch-on syscalls:sys_enter_nanosleep
                 sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
                 sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
               swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
               swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
                 sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-f1oo0ufdhrkx6nhy2lj1ierm@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f90a2417
  17. 23 7月, 2019 2 次提交
  18. 11 6月, 2019 2 次提交
  19. 05 6月, 2019 1 次提交
    • A
      perf script: Add output of IPC ratio · 68fb45bf
      Adrian Hunter 提交于
      Add field 'ipc' to display instructions-per-cycle.
      
      Example:
      
       perf record -e intel_pt/cyc/u ls
       perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
      
       ls  2670177.697113434:  7f0dfdbcd090 _start+0x0      mov %rsp, %rdi   IPC: 0.00 (1/877)
       ls  2670177.697113434:  7f0dfdbcd093 _start+0x3      callq  0x7f0dfdbce030
       ls  2670177.697113434:  7f0dfdbce030 _dl_start+0x0   pushq  %rbp
       ls  2670177.697113434:  7f0dfdbce031 _dl_start+0x1   mov %rsp, %rbp
       ls  2670177.697113434:  7f0dfdbce034 _dl_start+0x4   pushq  %r15
       ls  2670177.697113434:  7f0dfdbce036 _dl_start+0x6   pushq  %r14
       ls  2670177.697113434:  7f0dfdbce038 _dl_start+0x8   pushq  %r13
       ls  2670177.697113434:  7f0dfdbce03a _dl_start+0xa   pushq  %r12
       ls  2670177.697113434:  7f0dfdbce03c _dl_start+0xc   mov %rdi, %r12
       ls  2670177.697113434:  7f0dfdbce03f _dl_start+0xf   pushq  %rbx
       ls  2670177.697113434:  7f0dfdbce040 _dl_start+0x10  sub $0x38, %rsp
       ls  2670177.697113434:  7f0dfdbce044 _dl_start+0x14  rdtsc
       ls  2670177.697113434:  7f0dfdbce046 _dl_start+0x16  mov %eax, %eax
       ls  2670177.697113434:  7f0dfdbce048 _dl_start+0x18  shl $0x20, %rdx
       ls  2670177.697113434:  7f0dfdbce04c _dl_start+0x1c  or %rax, %rdx
       ls  2670177.697114471:  7f0dfdbce04f _dl_start+0x1f  movq  0x27e22(%rip), %rax        IPC: 0.00 (15/1685)
       ls  2670177.697116177:  7f0dfdbce056 _dl_start+0x26  movq  %rdx, 0x27683(%rip)        IPC: 0.00 (1/881)
      
      Note, the IPC values are low due to page faults at the beginning of
      execution. The additional cycles are due to the time to enter the
      kernel, not the actual kernel page fault handler.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/20190520113728.14389-9-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      68fb45bf
  20. 29 5月, 2019 1 次提交
    • J
      perf script: Add --show-bpf-events to show eBPF related events · 490c8cc9
      Jiri Olsa 提交于
      Add the --show-bpf-events command line option to show the eBPF related events:
      
        PERF_RECORD_KSYMBOL
        PERF_RECORD_BPF_EVENT
      
      Usage:
      
        # perf record -a
        ...
        # perf script --show-bpf-events
        ...
        swapper     0 [000]     0.000000: PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0ef971d len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
        swapper     0 [000]     0.000000: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 36
        ...
      
      Committer testing:
      
        # perf script --show-bpf-events | egrep -i 'PERF_RECORD_(BPF|KSY)'
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029a6c3 len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 47
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc029c1ae len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 48
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02ddd1c len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 49
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc02dfc11 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 50
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc045da0a len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 51
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc04ef4b4 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 52
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc09e15da len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 53
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0d2b1a3 len 229 type 1 flags 0x0 name bpf_prog_2a142ef67aaad174
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 54
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0fd9850 len 381 type 1 flags 0x0 name bpf_prog_819967866022f1e1_sys_enter
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 179
          0 PERF_RECORD_KSYMBOL ksymbol event with addr ffffffffc0feb1ec len 191 type 1 flags 0x0 name bpf_prog_c1bd85c092d6e4aa_sys_exit
          0 PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 180
        ^C[root@quaco pt]# perf evlist
        intel_pt//ku
        dummy:u
        #
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190508132010.14512-11-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      490c8cc9
  21. 20 3月, 2019 1 次提交
    • A
      perf script: Support relative time · 90b10f47
      Andi Kleen 提交于
      When comparing time stamps in 'perf script' traces it can be annoying to
      work with the full perf time stamps.
      
      Add a --reltime option that displays time stamps relative to the trace
      start to make it easier to read the traces.
      
      Note: not currently supported for --time. Report an error in this
      case.
      
      Before:
      
        % perf script
            swapper 0 [000] 245402.891216:    1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000] 245402.891223:    1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000] 245402.891227:    5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000] 245402.891231:   41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
            swapper 0 [000] 245402.891235:  355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
            swapper 0 [000] 245402.891239: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
      
      After:
      
        % perf script --reltime
      
            swapper 0 [000]     0.000000:    1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000]     0.000006:    1 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000]     0.000010:    5 cycles:ppp: ffffffffa0068814 native_write_msr+0x4 ([kernel.kallsyms])
            swapper 0 [000]     0.000014:   41 cycles:ppp: ffffffffa0068816 native_write_msr+0x6 ([kernel.kallsyms])
            swapper 0 [000]     0.000018:  355 cycles:ppp: ffffffffa000dd51 intel_bts_enable_local+0x21 ([kernel.kallsyms])
            swapper 0 [000]     0.000022: 3084 cycles:ppp: ffffffffa0a0150a end_repeat_nmi+0x48 ([kernel.kallsyms])
      
      Committer notes:
      
      Do not use 'time' as the name of a variable, as this breaks the build on
      older glibcs:
      
        cc1: warnings being treated as errors
        builtin-script.c: In function 'perf_sample__fprintf_start':
        builtin-script.c:691: warning: declaration of 'time' shadows a global declaration
        /usr/include/time.h:187: warning: shadowed declaration is here
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      LPU-Reference: 20190314225002.30108-8-andi@firstfloor.org
      Link: https://lkml.kernel.org/n/tip-bpahyi6pr9r399mvihu65fvc@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      90b10f47
  22. 21 2月, 2019 1 次提交
    • J
      perf script: Allow +- operator for type specific fields option · 6ef362fd
      Jiri Olsa 提交于
      Add support to add/remove fields for specific event types in -F option.
      It's now possible to use '+-' after event type, like:
      
        # cat > test.c
        #include <stdio.h>
      
        int main(void)
        {
           printf("Hello world\n");
           while(1) {}
        }
        ^D
        # gcc -g -o test test.c
        # perf probe -x test 'test.c:5'
        # perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
        ...
      
        # perf script -Ftrace:+period,-cpu
                  test  3859 396291.117343:      10275 cpu/cpu-cycles,period=10000/:      7f..
                  test  3859 396291.118234:      11041 cpu/cpu-cycles,period=10000/:  ffffff..
                  test  3859 396291.118234:          1              probe_test:main:
                  test  3859 396291.118248:       8668 cpu/cpu-cycles,period=10000/:  ffffff..
                  test  3859 396291.118263:      10139 cpu/cpu-cycles,period=10000/:  ffffff..
      
      Committer testing:
      
      Couldn't make the test above work, but tested it with:
      
        # perf probe -x hello main
        Added new event:
          probe_hello:main     (on main in /home/acme/c/hello)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_hello:main -aR sleep 1
      
        # perf record -e probe_hello:main ./hello
        hello, world
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
        # perf script
                 hello 21454 [002] 254116.874005: probe_hello:main: (401126)
        #
        # perf script -Ftrace:+period,-cpu
                 hello 21454 254116.874005:          1 probe_hello:main: (401126)
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6ef362fd
  23. 18 12月, 2018 1 次提交
    • A
      perf tools: Support 'srccode' output · dd2e18e9
      Andi Kleen 提交于
      When looking at PT or brstackinsn traces with 'perf script' it can be
      very useful to see the source code. This adds a simple facility to print
      them with 'perf script', if the information is available through dwarf
      
        % perf record ...
        % perf script -F insn,ip,sym,srccode
        ...
      
                  4004c6 main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004c6 main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004cd main
        5               for (i = 0; i < 10000000; i++)
                   4004b3 main
        6                       v++;
      
        % perf record -b ...
        % perf script -F insn,ip,sym,srccode,brstackinsn
      
        ...
               main+22:
                0000000000400543        insn: e8 ca ff ff ff            # PRED
        |18                     f1();
                f1:
                0000000000400512        insn: 55
        |10       {
                0000000000400513        insn: 48 89 e5
                0000000000400516        insn: b8 00 00 00 00
        |11             f2();
                000000000040051b        insn: e8 d6 ff ff ff            # PRED
                f2:
                00000000004004f6        insn: 55
        |5        {
                00000000004004f7        insn: 48 89 e5
                00000000004004fa        insn: 8b 05 2c 0b 20 00
        |6              c = a / b;
                0000000000400500        insn: 8b 0d 2a 0b 20 00
                0000000000400506        insn: 99
                0000000000400507        insn: f7 f9
                0000000000400509        insn: 89 05 29 0b 20 00
                000000000040050f        insn: 90
        |7        }
                0000000000400510        insn: 5d
                0000000000400511        insn: c3                        # PRED
                f1+14:
                0000000000400520        insn: b8 00 00 00 00
        |12             f2();
                0000000000400525        insn: e8 cc ff ff ff            # PRED
                f2:
                00000000004004f6        insn: 55
        |5        {
                00000000004004f7        insn: 48 89 e5
                00000000004004fa        insn: 8b 05 2c 0b 20 00
        |6              c = a / b;
      
      Not supported for callchains currently, would need some layout changes
      there.
      
      Committer notes:
      
      Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
      warning:
      
        In file included from util/srccode.c:19:0:
        /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
         #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
          ^~~~~~~
        cc1: all warnings being treated as errors
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd2e18e9
  24. 25 10月, 2018 3 次提交
    • A
      perf script: Implement --graph-function · 99f753f0
      Andi Kleen 提交于
      Add a ftrace style --graph-function argument to 'perf script' that
      allows to print itrace function calls only below a given function. This
      makes it easier to find the code of interest in a large trace.
      
      % perf record -e intel_pt//k -a sleep 1
      % perf script --graph-function group_sched_in --call-trace
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          calc_timer_values
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_cpu
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          arch_perf_update_userpage
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              __fentry__
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              using_native_sched_clock
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_stable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_enable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])          group_sched_in
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              event_sched_in.isra.107
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_event_set_state.part.71
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_time
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_pmu_disable
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_log_itrace_start
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_userpage
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          calc_timer_values
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_cpu
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          arch_perf_update_userpage
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              __fentry__
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              using_native_sched_clock
               swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_stable
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      99f753f0
    • A
      tools script: Add --call-trace and --call-ret-trace · d1b1552e
      Andi Kleen 提交于
      Add short cut options to print PT call trace and call-ret-trace, for
      calls and call and returns. Roughly corresponds to ftrace function
      tracer and function graph tracer.
      
      Just makes these common use cases nicer to use.
      
      % perf record -a -e intel_pt// sleep 1
      % perf script --call-trace
      	    perf   900 [000] 194167.205652203: ([kernel.kallsyms])          perf_pmu_enable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          event_filter_match
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
      
      % perf script --call-ret-trace
      	    perf   900 [000] 194167.205652203:   tr strt     ([unknown])        pt_config
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            pt_config
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            pt_event_add
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            perf_pmu_enable
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            perf_pmu_nop_void
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            event_sched_in.isra.107
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            perf_pmu_nop_int
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            group_sched_in
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            event_filter_match
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])            event_filter_match
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])            group_sched_in
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                __x86_indirect_thunk_rax
                  perf   900 [000] 194167.205652203:   return      ([kernel.kallsyms])                perf_pmu_nop_txn
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                event_sched_in.isra.107
                  perf   900 [000] 194167.205652203:   call        ([kernel.kallsyms])                    perf_event_set_state.part.71
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NLeo Yan <leo.yan@linaro.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Link: http://lkml.kernel.org/r/20180920180540.14039-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d1b1552e
    • A
      perf script: Add --insn-trace for instruction decoding · b585ebdb
      Andi Kleen 提交于
      Add a --insn-trace short hand option for decoding and disassembling
      instruction streams for intel_pt. This automatically pipes the output
      into the xed disassembler to generate disassembled instructions.  This
      just makes this use model much nicer to use.
      
      Before
      
        % perf record -e intel_pt// ...
        % perf script --itrace=i0ns --ns -F +insn,-event,-period | xed -F insn: -A -64
         swapper 0 [000] 17276.429606186: ffffffff81010486 pt_config ([kernel.kallsyms])    nopl  %eax, (%rax,%rax,1)
         swapper 0 [000] 17276.429606186: ffffffff8101048b pt_config ([kernel.kallsyms])    add $0x10, %rsp
         swapper 0 [000] 17276.429606186: ffffffff8101048f pt_config ([kernel.kallsyms])    popq  %rbx
         swapper 0 [000] 17276.429606186: ffffffff81010490 pt_config ([kernel.kallsyms])    popq  %rbp
         swapper 0 [000] 17276.429606186: ffffffff81010491 pt_config ([kernel.kallsyms])    popq  %r12
         swapper 0 [000] 17276.429606186: ffffffff81010493 pt_config ([kernel.kallsyms])    popq  %r13
         swapper 0 [000] 17276.429606186: ffffffff81010495 pt_config ([kernel.kallsyms])    popq  %r14
         swapper 0 [000] 17276.429606186: ffffffff81010497 pt_config ([kernel.kallsyms])    popq  %r15
         swapper 0 [000] 17276.429606186: ffffffff81010499 pt_config ([kernel.kallsyms])    retq
         swapper 0 [000] 17276.429606186: ffffffff8101063e pt_event_add ([kernel.kallsyms])         cmpl  $0x1, 0x1b0(%rbx)
         swapper 0 [000] 17276.429606186: ffffffff81010645 pt_event_add ([kernel.kallsyms])         mov $0xffffffea, %eax
         swapper 0 [000] 17276.429606186: ffffffff8101064a pt_event_add ([kernel.kallsyms])         mov $0x0, %edx
         swapper 0 [000] 17276.429606186: ffffffff8101064f pt_event_add ([kernel.kallsyms])         popq  %rbx
         swapper 0 [000] 17276.429606186: ffffffff81010650 pt_event_add ([kernel.kallsyms])         cmovnz %edx, %eax
         swapper 0 [000] 17276.429606186: ffffffff81010653 pt_event_add ([kernel.kallsyms])         jmp 0xffffffff81010635
         swapper 0 [000] 17276.429606186: ffffffff81010635 pt_event_add ([kernel.kallsyms])         retq
         swapper 0 [000] 17276.429606186: ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])       test %eax, %eax
      
      Now:
      
        % perf record -e intel_pt// ...
        % perf script --insn-trace --xed
        ... same output ...
      
      XED needs to be installed with:
      
        $ git clone https://github.com/intelxed/mbuild.git mbuild
        $ git clone https://github.com/intelxed/xed
        $ cd xed
        $ ./mfile.py
        $ ./mfile.py examples
        $ sudo ./mfile.py --prefix=/usr/local install
        $ sudo cp obj/examples/xed /usr/local/bin
        $ xed | head -3
        ERROR: required argument(s) were missing
        Copyright (C) 2017, Intel Corporation. All rights reserved.
        XED version: [v10.0-328-g7d62c8c49b7b]
        $
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20180920180540.14039-2-andi@firstfloor.org
      [ Fixed up whitespace damage, added the 'mfile.py examples + cp obj/examples/xed ... ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b585ebdb
  25. 17 4月, 2018 1 次提交
    • A
      perf script: Extend misc field decoding with switch out event type · bf30cc18
      Alexey Budankov 提交于
      Append 'p' sign to 'S' tag designating the type of context switch out event so
      'Sp' means preemption context switch. Documentation is extended to cover
      new presentation changes.
      
        $ perf script --show-switch-events -F +misc -I -i perf.data:
      
                hdparm 4073 [004] U  762.198265:     380194 cycles:ppp:      7faf727f5a23 strchr (/usr/lib64/ld-2.26.so)
                hdparm 4073 [004] K  762.198366:     441572 cycles:ppp:  ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux)
                hdparm 4073 [004] S  762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
               swapper    0 [004]    762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4073/4073
               swapper    0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 4073/4073
                hdparm 4073 [004]    762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
               swapper    0 [007] K  762.198514:    2303073 cycles:ppp:  ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux)
               swapper    0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 1134/1134
        kworker/u16:18 1134 [007]    762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    0/0
        kworker/u16:18 1134 [007] S  762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    0/0
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5fc65ce7-8ca5-53ae-8858-8ddd27290575@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bf30cc18
  26. 16 2月, 2018 1 次提交
  27. 17 1月, 2018 1 次提交
  28. 10 1月, 2018 1 次提交
  29. 08 1月, 2018 2 次提交
  30. 30 11月, 2017 1 次提交
    • A
      perf script: Allow computing 'perf stat' style metrics · 4bd1bef8
      Andi Kleen 提交于
      Add support for computing 'perf stat' style metrics in 'perf script'.
      
      When using leader sampling we can get metrics for each sampling period
      by computing formulas over the values of the different group members.
      
      This allows things like fine grained IPC tracking through sampling, much
      more fine grained than with 'perf stat'.
      
      The metric is still averaged over the sampling period, it is not just
      for the sampling point.
      
      This patch adds a new metric output field for 'perf script' that uses
      the existing 'perf stat' metrics infrastructure to compute any metrics
      supported by 'perf stat'.
      
      For example to sample IPC:
      
        $ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
        $ perf script -F metric,ip,sym,time,cpu,comm
        ...
         alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
         alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
         alsa-sink-ALC32 [000] 42815.856074:      7fd65937d6cc [unknown]
         alsa-sink-ALC32 [000] 42815.856074:    metric:    0.13  insn per cycle
                 swapper [000] 42815.857961:  ffffffff81655df0 __schedule
                 swapper [000] 42815.857961:  ffffffff81655df0 __schedule
                 swapper [000] 42815.857961:  ffffffff81655df0 __schedule
                 swapper [000] 42815.857961:    metric:    0.23  insn per cycle
         qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
         qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
         qemu-system-x86 [000] 42815.858130:  ffffffff8165ad0e _raw_spin_unlock_irqrestore
         qemu-system-x86 [000] 42815.858130:    metric:    0.46  insn per cycle
                   :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
                   :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
                   :4972 [000] 42815.858312:  ffffffffa080e5f2 vmx_vcpu_run
                   :4972 [000] 42815.858312:    metric:    0.45  insn per cycle
      
      TopDown:
      
      This requires disabling SMT if you have it enabled, because SMT would
      require sampling per core, which is not supported.
      
        $ perf record -e '{ref-cycles,topdown-fetch-bubbles,\
                           topdown-recovery-bubbles,\
                           topdown-slots-retired,topdown-total-slots,\
                           topdown-slots-issued}:S' -a sleep 1
        $ perf script --header -I -F cpu,ip,sym,event,metric,period
        ...
        [000]     121108               ref-cycles:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]     190350    topdown-fetch-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]       2055 topdown-recovery-bubbles:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]     148729    topdown-slots-retired:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]     144324      topdown-total-slots:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]     160852     topdown-slots-issued:  ffffffff8165222e copy_user_enhanced_fast_string
        [000]   metric:     33.0% frontend bound
        [000]   metric:      3.5% bad speculation
        [000]   metric:     25.8% retiring
        [000]   metric:     37.7% backend bound
        [000]     112112               ref-cycles:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]     357222    topdown-fetch-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]       3325 topdown-recovery-bubbles:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]     323553    topdown-slots-retired:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]     270507      topdown-total-slots:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]     341226     topdown-slots-issued:  ffffffff8165aec8 _raw_spin_lock_irqsave
        [000]   metric:     33.0% frontend bound
        [000]   metric:      2.9% bad speculation
        [000]   metric:     29.9% retiring
        [000]   metric:     34.2% backend bound
      ...
      
      v2:
      Use evsel->priv for new fields
      Port to new base line, support fp output.
      Handle stats in ->stats, not ->priv
      Minor cleanups
      
      Extra explanation about the use of the term 'averaging', from Andi in the
      thread in the Link: tag below:
      
      <quote Andi>
      The current samples contains the sum of event counts for a sampling period.
      
      EventA-1           EventA-2                EventA-3      EventA-4
      EventB-1     EventB-2                             EventC-3
      
                               gap with no events                overflow
      |-----------------------------------------------------------------|
      period-start                                             period-end
      ^                                                                 ^
      |                                                                 |
      previous sample                                      current sample
      
      So EventA = 4 and EventB = 3 at the sample point
      
      I generate a metric, let's say EventA / EventB. It applies to the whole period.
      
      But the metric is over a longer time which does not have the same behavior. For
      example the gap above doesn't have any events, while they are clustered at the
      beginning and end of the sample period.
      
      But we're summing everything together. The metric doesn't know that the gap is
      different than the busy period.
      
      That's what I'm trying to express with averaging.
      </quote>
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20171117214300.32746-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4bd1bef8