1. 23 Aug, 2019 12 commits
  2. 22 Aug, 2019 3 commits
  3. 20 Aug, 2019 12 commits
    • libperf: Fix arch include paths · b81d39c7
      Jiri Olsa committed
      Guenter Roeck reported a compilation problem when ARCH is
      specified:
      
        $ make ARCH=x86_64
        In file included from tools/include/asm/atomic.h:6:0,
                         from include/linux/atomic.h:5,
                         from tools/include/linux/refcount.h:41,
                         from cpumap.c:4:
        tools/include/asm/../../arch/x86/include/asm/atomic.h:11:10: fatal error: asm/cmpxchg.h: No such file or directory
      
      The problem is that we don't use SRCARCH (the sanitized version of
      ARCH), so we don't get the proper include path.
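
To illustrate what "sanitized" means here, the sketch below maps an ARCH value to the name of the arch source directory and builds an include path from it. The mapping table is a partial, assumed subset, and the names `srcarch` and `arch_include_dir` are hypothetical; this is not the Makefile logic itself.

```python
# Hypothetical helper: map a user-supplied ARCH to the arch/ source directory.
# The table is a partial, assumed subset; unknown values pass through unchanged.
_SRCARCH_MAP = {
    "x86_64": "x86",
    "i386": "x86",
    "sparc64": "sparc",
}


def srcarch(arch: str) -> str:
    """Return the sanitized arch name used for source directories."""
    return _SRCARCH_MAP.get(arch, arch)


def arch_include_dir(arch: str) -> str:
    """Build the include path from the sanitized name, not the raw ARCH."""
    return f"tools/arch/{srcarch(arch)}/include"


print(arch_include_dir("x86_64"))
```

With the raw value, the path would point at a directory named after `x86_64`, which matches the kind of failure shown in the error output above.
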
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 31435049 ("libperf: Make libperf.a part of the perf build")
      Link: http://lkml.kernel.org/r/20190820124624.GG24105@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf top: Show info message while collecting samples · 5c959b6d
      Arnaldo Carvalho de Melo committed
      Give a visual cue about what is happening while initially collecting
      the minimal set of samples to sort and display.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-xcui60p1v6ozijfam2o89ya8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ui browser: Allow specifying message to show when no samples are available to display · 2284cf80
      Arnaldo Carvalho de Melo committed
      The 'perf top' tool will use that to avoid having an initial blank screen
      while collecting the minimum number of samples to sort and display.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-89ciceg8cy4442he3t0jzo3f@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ui: Introduce non-interactive ui__info_window() function · 9b016119
      Arnaldo Carvalho de Melo committed
      Sometimes we just want to print a message in the center of the screen,
      like in 'perf top' while we wait for the minimum number of samples to be
      collected before sorting and showing them.
      
      Also expose __ui__info_window() as an optimization for cases where such
      a message is to be printed while holding the ui lock.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-uat0f89vfwl2w52kv9wzwd8a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ui: Make 'exit_msg' optional in ui__question_window() · 9e79ff77
      Arnaldo Carvalho de Melo committed
      We will not need it when refactoring this function to be
      non-interactive, so make it optional.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-pnx1dn17bsz7lqt9ty95nnjx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Support sample flags 'insn' and 'insnlen' · a4973d8f
      Leo Yan committed
      The synthetic branch and instruction samples do not set the
      instruction-related info, so the perf tool fails to display samples
      with the flags '-F,+insn,+insnlen'.
      
      The CoreSight trace decoder provides sufficient information to decide
      the instruction size based on the ISA type: A64/A32 instructions are
      32-bit size, but one exception is the T32 instruction size, which might
      be 32-bit or 16-bit.
      
      This patch handles these cases and reads the instruction values from
      the DSO file, so it can support the flags '-F,+insn,+insnlen'.
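
The size rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: the function name `insn_len` and the ISA labels passed as strings are hypothetical. The T32 test relies on the architectural rule that a first halfword whose bits [15:11] are 0b11101, 0b11110 or 0b11111 starts a 32-bit encoding.

```python
def insn_len(isa: str, first_bytes: bytes) -> int:
    """Return the instruction length in bytes for a given ISA.

    isa is a hypothetical label ("A64", "A32" or "T32"); first_bytes holds
    the leading bytes of the instruction in little-endian memory order.
    """
    if isa in ("A64", "A32"):
        return 4  # fixed 32-bit encodings
    if isa == "T32":
        # Bits [15:11] of the first halfword are the top five bits of
        # byte 1; 0b11101, 0b11110 and 0b11111 mark a 32-bit encoding.
        return 4 if (first_bytes[1] & 0xF8) >= 0xE8 else 2
    raise ValueError(f"unknown ISA: {isa}")


# Bytes taken from the 'After' output of this commit message (A64 code).
print(insn_len("A64", bytes.fromhex("2f020094")))
```
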
      
      Before:
      
        # perf script -F,insn,insnlen,ip,sym
                      0 [unknown] ilen: 0
           ffff97174044 _start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
           ffff97174938 _dl_start ilen: 0
      
        [...]
      
      After:
      
        # perf script -F,insn,insnlen,ip,sym
                      0 [unknown] ilen: 0
           ffff97174044 _start ilen: 4 insn: 2f 02 00 94
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
           ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
      
        [...]
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190815082854.18191-1-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Prefer DWARF callstacks to LBR ones when captured both · 10ccbc1c
      Alexey Budankov committed
      Display DWARF-based callchains when the perf.data file contains raw thread
      stack data as well as LBR callstack data.
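
One way to picture this preference is a check on the event's sample_type bits. The sketch below is an assumption about the shape of such a check, not the patch itself: the function name `pick_callchain_mode` and its string return values are hypothetical, while the bit positions follow the perf_event_open UAPI.

```python
# Bit values from the perf_event_open UAPI (enum perf_event_sample_format).
PERF_SAMPLE_BRANCH_STACK = 1 << 11
PERF_SAMPLE_REGS_USER = 1 << 12
PERF_SAMPLE_STACK_USER = 1 << 13


def pick_callchain_mode(sample_type: int) -> str:
    """Hypothetical chooser returning 'dwarf', 'lbr' or 'fp'."""
    has_raw_stack = bool(sample_type & PERF_SAMPLE_REGS_USER) and bool(
        sample_type & PERF_SAMPLE_STACK_USER
    )
    if has_raw_stack:
        # Raw user registers plus stack data allow DWARF unwinding,
        # which wins even when branch stack data is also present.
        return "dwarf"
    if sample_type & PERF_SAMPLE_BRANCH_STACK:
        return "lbr"
    return "fp"


both = PERF_SAMPLE_BRANCH_STACK | PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER
print(pick_callchain_mode(both))
```
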
      
      Committer testing:
      
      This changes the output from the branch stack based one, i.e. without
      this patch, for the same file as in the previous csets:
      
        # perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'cycles'
        # Event count (approx.): 13
        #
        # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
        # ........  .......  ....................  ...........................  .........................................  ..................
        #
             7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
             7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
             7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
             7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
             7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
             7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
             7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
             7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
             7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
             7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
             7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
             7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
             7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116
      
        #
        # (Tip: For memory address profiling, try: perf mem record / perf mem report)
        #
      
      To the one that shows call chains:
      
        # perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 10  of event 'cycles'
        # Event count (approx.): 3204047
        #
        # Children      Self  Command  Shared Object       Symbol
        # ........  ........  .......  ..................  .........................................
        #
            55.01%     0.00%  ls       [kernel.vmlinux]    [k] entry_SYSCALL_64_after_hwframe
                    |
                    ---entry_SYSCALL_64_after_hwframe
                       do_syscall_64
                       |
                        --16.01%--__x64_sys_execve
                                  __do_execve_file.isra.0
                                  search_binary_handler
                                  load_elf_binary
                                  elf_map
                                  vm_mmap_pgoff
                                  do_mmap
                                  mmap_region
                                  perf_event_mmap
                                  perf_iterate_sb
                                  perf_iterate_ctx
                                  perf_event_mmap_output
                                  perf_output_copy
                                  memcpy_erms
      
            55.01%    39.00%  ls       [kernel.vmlinux]    [k] do_syscall_64
                    |
                    |--39.00%--0xffffffffffffffff
                    |          _dl_map_object
                    |          open_verify.constprop.0
                    |          __lseek64 (inlined)
                    |          entry_SYSCALL_64_after_hwframe
                    |          do_syscall_64
                    |
                     --16.01%--do_syscall_64
                               __x64_sys_execve
                               __do_execve_file.isra.0
                               search_binary_handler
                               load_elf_binary
                               elf_map
                               vm_mmap_pgoff
                               do_mmap
                               mmap_region
                               perf_event_mmap
                               perf_iterate_sb
                               perf_iterate_ctx
                               perf_event_mmap_output
                               perf_output_copy
                               memcpy_erms
      
            42.95%    42.95%  ls       libpthread-2.29.so  [.] __pthread_initialize_minimal_internal
                    |
                    ---_init
                       __pthread_initialize_minimal_internal
      
            42.95%     0.00%  ls       libpthread-2.29.so  [.] _init
                    |
                    ---_init
                       __pthread_initialize_minimal_internal
      
        <SNIP>
      
        #
        # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
        #
        #
      
      The branch stack view can be explicitly selected using:
      
        # perf report -h branch-stack
      
         Usage: perf report [<options>]
      
            -b, --branch-stack    use branch records for per branch histogram filling
      
        #
      
      I.e. after this patch:
      
        # perf report -b --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'cycles'
        # Event count (approx.): 13
        #
        # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
        # ........  .......  ....................  ...........................  .........................................  ..................
        #
             7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
             7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
             7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
             7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
             7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
             7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
             7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
             7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
             7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
             7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
             7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
             7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
             7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116
      
        #
        # (Tip: Show current config key-value pairs: perf config --list)
        #
        #
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Dump LBR callstack data by -D jointly with thread stack · d2720c3d
      Alexey Budankov committed
      Make the 'perf report -D' command print the captured LBR callstack chain
      when it is collected together with raw thread stack data:
      
        2752673087247083 0x5d10 [0x548]: PERF_RECORD_SAMPLE(IP, 0x4002): 5841/5841: 0x40121f period: 1543862 addr: 0
        ... FP chain: nr:0
        ... branch callstack: nr:3
        .....  0: 00000000004011d0
        .....  1: 00007f393c388411
        .....  2: 0000000000401098
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x34e7
        .... BX    0x7fff5f6dd3c0
        .... CX    0xffffffff
        .... DX    0x34e6
        .... SI    0x7f393c5268d0
        .... DI    0x0
        .... BP    0x401260
        .... SP    0x7fff5f6dd3c0
        .... IP    0x40121f
        .... FLAGS 0x29f
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x7f393c526800
        .... R9    0x7f393c525da0
        .... R10   0xfffffffffffff70a
        .... R11   0x246
        .... R12   0x401070
        .... R13   0x7fff5f6ddcb0
        .... R14   0x0
        .... R15   0x0
        ... ustack: size 1024, offset 0x130
         . data_src: 0x5080021
         ... thread: stack_test:5841
         ...... dso: /root/abudanko/stacks/stack_test
      
      Committer testing:
      
        # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.042 MB perf.data (10 samples) ]
        #
      
      Before:
      
        # perf report -D |& grep PERF_RECORD_SAMPLE -A28 | tail -29
        67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
        ... FP chain: nr:0
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x7f441b2b1000
        .... BX    0x7f441b55b970
        .... CX    0x7fff6e2db218
        .... DX    0x7fff6e2db218
        .... SI    0x7fff6e2db208
        .... DI    0x1
        .... BP    0x1
        .... SP    0x7fff6e2db178
        .... IP    0x7f441b2b1e20
        .... FLAGS 0x20a
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x1
        .... R9    0x7f441b371c18
        .... R10   0x7f441b5a5f10
        .... R11   0x202
        .... R12   0x7fff6e2db208
        .... R13   0x7fff6e2db218
        .... R14   0x7f441b5a7150
        .... R15   0x0
        ... ustack: size 1024, offset 0x148
         . data_src: 0x5080021
         ... thread: ls:9721
         ...... dso: /usr/lib64/libpthread-2.29.so
      
        0xad00 [0x60]: event: 10
        #
      
      After:
      
        # perf report -D |& grep PERF_RECORD_SAMPLE -A31 | tail -32
        67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
        ... FP chain: nr:0
        ... branch callstack: nr:4
        .....  0: 00007f441b2b1e20
        .....  1: 00007f441b58af1a
        .....  2: 00007f441b58b0e1
        .....  3: 00007f441b57c145
        ... user regs: mask 0xff0fff ABI 64-bit
        .... AX    0x7f441b2b1000
        .... BX    0x7f441b55b970
        .... CX    0x7fff6e2db218
        .... DX    0x7fff6e2db218
        .... SI    0x7fff6e2db208
        .... DI    0x1
        .... BP    0x1
        .... SP    0x7fff6e2db178
        .... IP    0x7f441b2b1e20
        .... FLAGS 0x20a
        .... CS    0x33
        .... SS    0x2b
        .... R8    0x1
        .... R9    0x7f441b371c18
        .... R10   0x7f441b5a5f10
        .... R11   0x202
        .... R12   0x7fff6e2db208
        .... R13   0x7fff6e2db218
        .... R14   0x7f441b5a7150
        .... R15   0x0
        ... ustack: size 1024, offset 0x148
         . data_src: 0x5080021
         ... thread: ls:9721
         ...... dso: /usr/lib64/libpthread-2.29.so
        #
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/aa82e5dd-def2-0ca8-a064-db9e2e8ad076@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Enable LBR callstack capture jointly with thread stack · 25663496
      Alexey Budankov committed
      Allow '-j stack' to be used together with the '--call-graph dwarf'
      option so that thread stack data and the LBR call stack can be captured
      jointly:
      
        $ perf record -g --call-graph dwarf,1024 -j stack,u -- stack_test
      
      The collected LBR call stack can be used to augment the DWARF call stack
      calculated from the raw thread stack data, and to provide more
      comprehensive call stack information for cases where the collected SIZE
      is not enough to cover the complete thread stack.
      
      Such cases are typical for workloads that allocate large arrays of data
      on their threads' stacks, or where the SIZE to collect can't be made
      large enough due to the nature of the workload or the system
      configuration. This is where hardware-captured LBR call stacks can
      provide the missing stack frames. A description of a possible algorithm
      for consolidating DWARF and LBR call stacks follows.
      
      With this patch set, the perf report command UI currently ignores the
      collected LBR call stack data and still provides DWARF-based call stack
      information.
      
        ===========================================================================
      
        Overview:
      
         Legend:
      
         THS - thread stack
         CTX - thread register context
         SWS - software stack
         SSF - skipped stack frames
         PSS - Perf sample stack
      
         ip,sp,bp - HW registers values
         d        - allocated stack regions
         kip      - ip address in the kernel space
         K        - captured thread stack size
      
              THS
      
             -----
             |   |<-stack bottom
              ...
             |---|
             |ip4|
             |---|         PSS = SWS(THS(K))
             |   |
         --> |   |
         |   |d3 |                  user/
         |   |---|         user PSS kernel PSS
         |   |ip3|         ------   ------
         |   |---|         |SSF |   |SSF |
         |   |   |          ....     ....
         |   |   |         ------   ------
         |   |d2 |         | -1 |   | -1 |
             |---|   user  ------   ------
         K   |ip2|   CTX   |ip3 |   |ip3 |
             |---|         |----|   |----|
         |   |d1 |   ...   |ip2 | , |ip2 |
         |   |---|  |---|  |----|   |----|
         |   |ip1|  |bp0|  |ip1 |   |ip1 |
         |   |---|  |---|  |----|   |----|
         |   |   |  |ip0|->|ip0 |   |ip0 |<-user stack top
         |   |   |  |---|  ------   ------
         |   |   |<-|sp0|<-stack    |kip0|<-kernel stack bottom
         --> -----  -----   top     |----|
                                    |kip1|
                                    |----|
                                    |kip2|
                                    |----|
                                     ....
                                    |    |<-kernel stack top
                                    ------
      
        Algorithm details:
      
         Legend:
      
         HWS - hardware stack
         K-SWS - kernel software stack
      
      			 BRANCH
      			 TABLE
      
      		 HWS      ip   ip
      			  from to
      		 ------  -----------
      		 |ip7`|  |ip7`|    |
      		 |----|  |----|----|
      		 |ip6`|  |ip6`|    |
      	user PSS |----|  |----|----|
      		 |ip5`|  |ip5`|    |
      	------   |----|  |----|----|
      	| -1 |   |ip4`|  |ip4`|    |
      	------   |----|  |----|----|
      	|ip3 |~~~|ip3`|  |ip3`|    |
      	|----|   |----|  |----|----|
      	|ip2 |~~~|ip2`|  |ip2`|    |
      	|----| 	 |----|  |----|----|
      	|ip1 |~~~|ip1`|  |ip1`|ip0`|
      	|----| 	 |----|  -----------
      	|ip0 |~~~|ip0`|<---------'
      	------   ------
      
      	1. if (sym(ipj) == sym(ipj`)), j=0-3 ===> user PSS
      	2. ipj`                      , j=4-7 ===> user PSS
      
        Augmented PSS = A_SWS(SWS(THS(K)), HWS):
      
      	         user/
             user PSS  kernel PSS
      
      	------   ------
      	|ip7`|   |ip7`|<-user PSS bottom
      	|----|   |----|
      	|ip6`|   |ip6`|
      	|----|   |----|
          HWS	|ip5`|   |ip5`|
      	|----|   |----|
      	|ip4`|   |ip4`|
      	------   ------
      	|ip3 |   |ip3 |
      	|----|   |----|
          SWS |ip2 |   |ip2 |
      	|----|   |----|
      	|ip1 |   |ip1 |
      	|----|   |----|
      	|ip0 |   |ip0 |<-user PSS top
      	------   ------
      		 |kip0|<-kernel PSS bottom
      		 |----|
      		 |kip1|
      	   K-SWS |----|
      		 |kip2|
      		 |----|
      		 |kip3|<-kernel PSS top
      		 ------
      
                        APSS
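
The two rules in the diagrams above (matching symbols for the overlapping frames, then taking the remaining hardware frames) can be sketched as follows. This is a minimal illustration, not an implementation from the patch set: the function name `augment_stack`, the `sym` callback and the fallback of returning the software stack unchanged on a mismatch are all assumptions.

```python
from typing import Callable, List


def augment_stack(
    sws: List[int], hws: List[int], sym: Callable[[int], str]
) -> List[int]:
    """Merge a truncated software stack (SWS) with a hardware stack (HWS).

    Both lists are ordered from the stack top (innermost frame) outwards.
    """
    n = len(sws)
    # Rule 1: the overlapping frames must resolve to the same symbols.
    if len(hws) < n or any(sym(s) != sym(h) for s, h in zip(sws, hws)):
        return list(sws)  # assumed fallback: keep the software stack as-is
    # Rule 2: append the outer frames that only the hardware stack knows.
    return list(sws) + list(hws[n:])


# Toy symbolizer: addresses within the same 16-byte block share a symbol.
toy_sym = lambda ip: f"f{ip >> 4}"
sws = [0x10, 0x20, 0x30, 0x40]
hws = [0x11, 0x21, 0x31, 0x41, 0x50, 0x60, 0x70, 0x80]
print([hex(ip) for ip in augment_stack(sws, hws, toy_sym)])
```

The software frames are kept for the overlapping part because they come from the captured thread stack; the hardware stack only contributes the frames that the captured SIZE did not reach.
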
      
      Committer testing:
      
      Before:
      
        # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
        unknown branch filter stack, check man page
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -j, --branch-filter <branch filter mask>
                                  branch stack filter modes
        # perf record -g --call-graph dwarf,1024 -j u ls > /dev/null
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.054 MB perf.data (12 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY, sample_regs_user: 0xff0fff, sample_stack_user: 1024
         #
      
      After:
      
        # perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.044 MB perf.data (11 samples) ]
        [root@quaco ~]# perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK, sample_regs_user: 0xff0fff, sample_stack_user: 1024
        #
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/e9e00090-66fb-d2a4-c90f-1d12344f7788@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Add comment for 'idx' member in 'struct perf_sample_id' · 3c84e65a
      Adrian Hunter committed
      The 'idx' member was added as preparation for AUX area sampling. Add a
      comment to describe why.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/83ff264f-84c3-5372-8976-dd9293d20c6f@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers: Grab copy of linux/const.h, needed by linux/bits.h · aaa6ef8a
      Arnaldo Carvalho de Melo committed
      So that we can update the copy of linux/bits.h, which now uses macros
      defined in const.h that are not available in older systems.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-c2qfcbl58hxyfb5u5xivp7is@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: tools/include should come before tools/uapi/include · 146dc303
      Arnaldo Carvalho de Melo committed
      The next cset will grab const.h copies from the kernel to keep bits.h
      in sync, as it started to use linux/const.h, which in turn includes
      uapi/linux/const.h.
      
      So now we have a file with the same name in tools/include/ and
      tools/include/uapi/, and one includes the other, so we need to have
      tools/include/uapi/ after tools/include/ for this to work. Fix it.
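
Why the order matters can be shown with a small model of first-match include lookup. This is only a sketch: `resolve_include` and the `files` dictionary standing in for the filesystem are hypothetical, and the file contents are placeholders.

```python
from typing import Dict, List, Optional


def resolve_include(
    name: str, search_dirs: List[str], files: Dict[str, str]
) -> Optional[str]:
    """Return the first path under search_dirs that provides `name`."""
    for d in search_dirs:
        path = f"{d}/{name}"
        if path in files:
            return path
    return None


# Stand-in for the filesystem: path -> placeholder content.
files = {
    "tools/include/linux/const.h": "#include <uapi/linux/const.h>",
    "tools/include/uapi/linux/const.h": "/* macro definitions */",
}

good_order = ["tools/include", "tools/include/uapi"]
bad_order = ["tools/include/uapi", "tools/include"]

# With the fixed order the wrapper is found first; with the old order the
# uapi copy shadows it and the wrapper is never read.
print(resolve_include("linux/const.h", good_order, files))
print(resolve_include("linux/const.h", bad_order, files))
```
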
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-qzjqxa1wdrt51kwadyqawnuj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 16 Aug, 2019 5 commits
    • perf unwind: Remove unnecessary test · e2736219
      John Keeping committed
      If dwarf_callchain_users is false, then unwind__prepare_access() will
      not set unwind_libunwind_ops so the remaining test here is sufficient.
      Signed-off-by: John Keeping <john@metanate.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: john keeping <john@metanate.com>
      Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf unwind: Fix libunwind when tid != pid · e8ba2906
      John Keeping committed
      Commit e5adfc3e ("perf map: Synthesize maps only for thread group
      leader") changed the recording side so that we no longer get mmap events
      for threads other than the thread group leader (when synthesising these
      events for threads which exist before perf is started).
      
      When a file recorded after this change is loaded, the lack of mmap
      records means that unwinding is not set up for any other threads.
      
      This can be seen in a simple record/report scenario:
      
      	perf record --call-graph=dwarf -t $TID
      	perf report
      
      If $TID is a process ID then the report will show call graphs, but if
      $TID is a secondary thread the output is as if --call-graph=none was
      specified.
      
      Following the rationale in that commit, move the libunwind fields into
      struct map_groups and update the libunwind functions to take this
      instead of the struct thread.  This is only required for
      unwind__finish_access which must now be called from map_groups__delete
      and the others are changed for symmetry.
      
      Note that unwind__get_entries keeps the thread argument since it is
      required for symbol lookup and the libdw unwind provider uses the thread
      ID.
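
The effect of moving the unwind state from the thread to the shared map groups can be modeled in a few lines. The classes and functions below are hypothetical stand-ins, and the string "libunwind-local" is a placeholder label rather than a real handle.

```python
class MapGroups:
    """Stand-in for the address space state shared by a process's threads."""

    def __init__(self) -> None:
        # Starts unset, mirroring a zero-initialized allocation.
        self.unwind_ops = None


class Thread:
    def __init__(self, pid: int, tid: int, mg: MapGroups) -> None:
        self.pid = pid
        self.tid = tid
        self.mg = mg  # every thread of a process points at the same MapGroups


def prepare_access(mg: MapGroups, ops: str) -> None:
    # Set up unwinding once per address space; later calls are no-ops.
    if mg.unwind_ops is None:
        mg.unwind_ops = ops


def can_unwind(thread: Thread) -> bool:
    return thread.mg.unwind_ops is not None


mg = MapGroups()
leader = Thread(pid=100, tid=100, mg=mg)
worker = Thread(pid=100, tid=101, mg=mg)
prepare_access(leader.mg, "libunwind-local")  # driven by the leader's mmap events
print(can_unwind(worker))  # prints True: the secondary thread sees the same state
```

Had the state lived on each Thread object instead, the worker would report no unwinder, which corresponds to the missing call graphs described above.
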
      Signed-off-by: John Keeping <john@metanate.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: e5adfc3e ("perf map: Synthesize maps only for thread group leader")
      Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf map: Use zalloc for map_groups · ab6cd0e5
      John Keeping committed
      In the next commit we will add new fields to map_groups and we need
      these to be null if no value is assigned.  The simplest way to achieve
      this is to request zeroed memory from the allocator.
      Signed-off-by: John Keeping <john@metanate.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: john keeping <john@metanate.com>
      Link: http://lkml.kernel.org/r/20190815100146.28842-1-john@metanate.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ab6cd0e5
    • A
      perf report: Add --switch-on/--switch-off events · ef4b1a53
      Arnaldo Carvalho de Melo committed
      Since 'perf top' shares the histogram browser with 'perf report', the
      same explanation given in the previous cset applies.
      
      An additional example uses a pair of SDT events available for systemtap:
      
        # perf probe --exec=/usr/bin/stap '%*:*'
        Added new events:
          sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap)
          sdt_stap:benchmark   (on %* in /usr/bin/stap)
          sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap)
          sdt_stap:pass6__start (on %* in /usr/bin/stap)
          sdt_stap:pass6__end  (on %* in /usr/bin/stap)
          sdt_stap:pass5__start (on %* in /usr/bin/stap)
          sdt_stap:pass5__end  (on %* in /usr/bin/stap)
          sdt_stap:pass0__start (on %* in /usr/bin/stap)
          sdt_stap:pass0__end  (on %* in /usr/bin/stap)
          sdt_stap:pass1a__start (on %* in /usr/bin/stap)
          sdt_stap:pass1b__start (on %* in /usr/bin/stap)
          sdt_stap:pass1__end  (on %* in /usr/bin/stap)
          sdt_stap:pass2__start (on %* in /usr/bin/stap)
          sdt_stap:pass2__end  (on %* in /usr/bin/stap)
          sdt_stap:pass3__start (on %* in /usr/bin/stap)
          sdt_stap:pass3__end  (on %* in /usr/bin/stap)
          sdt_stap:pass4__start (on %* in /usr/bin/stap)
          sdt_stap:pass4__end  (on %* in /usr/bin/stap)
          sdt_stap:benchmark__start (on %* in /usr/bin/stap)
          sdt_stap:benchmark__end (on %* in /usr/bin/stap)
          sdt_stap:cache__get  (on %* in /usr/bin/stap)
          sdt_stap:cache__clean (on %* in /usr/bin/stap)
          sdt_stap:cache__add__module (on %* in /usr/bin/stap)
          sdt_stap:cache__add__source (on %* in /usr/bin/stap)
          sdt_stap:stap_system__complete (on %* in /usr/bin/stap)
          sdt_stap:stap_system__start (on %* in /usr/bin/stap)
          sdt_stap:stap_system__spawn (on %* in /usr/bin/stap)
          sdt_stap:stap_system__fork (on %* in /usr/bin/stap)
          sdt_stap:intern_string (on %* in /usr/bin/stap)
          sdt_stap:client__start (on %* in /usr/bin/stap)
          sdt_stap:client__end (on %* in /usr/bin/stap)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e sdt_stap:client__end -aR sleep 1
      
        #
      
      From these we use the two below while running systemtap's test suite:
      
        # perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null
        ^C[ perf record: Woken up 8 times to write data ]
        [ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ]
        Terminated
        # perf script | grep sdt_stap
                    stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544
                    stap 28979 [000] 19424.333083:   sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544
                    stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152
                    stap 29045 [006] 19424.963794:   sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152
        # perf script | grep cycles |  wc -l
        39634
        #
      
      Looking at the whole perf.data file:
      
        [root@quaco testsuite]# perf report | grep cycles:P -A25
        # Samples: 39K of event 'cycles:P'
        # Event count (approx.): 34044267368
        #
        # Overhead  Command  Shared Object         Symbol
        # ........  .......  ....................  ................................
        #
             3.50%  cc1      cc1                   [.] ht_lookup_with_hash
             3.04%  cc1      cc1                   [.] _cpp_lex_token
             2.11%  cc1      cc1                   [.] ggc_internal_alloc
             1.83%  cc1      cc1                   [.] cpp_get_token_with_location
             1.68%  cc1      libc-2.29.so          [.] _int_malloc
             1.41%  cc1      cc1                   [.] linemap_position_for_column
             1.25%  cc1      cc1                   [.] ggc_internal_cleared_alloc
             1.20%  cc1      cc1                   [.] c_lex_with_flags
             1.18%  cc1      cc1                   [.] get_combined_adhoc_loc
             1.05%  cc1      libc-2.29.so          [.] malloc
             1.01%  cc1      libc-2.29.so          [.] _int_free
             0.96%  stap     stap                  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >
             0.78%  stap     stap                  [.] lexer::scan
             0.74%  cc1      cc1                   [.] _cpp_lex_direct
             0.70%  cc1      cc1                   [.] pop_scope
             0.70%  cc1      cc1                   [.] c_parser_declspecs
             0.69%  stap     libc-2.29.so          [.] _int_malloc
             0.68%  cc1      cc1                   [.] htab_find_slot
             0.68%  cc1      [kernel.vmlinux]      [k] prepare_exit_to_usermode
             0.64%  cc1      [kernel.vmlinux]      [k] clear_page_erms
        [root@quaco testsuite]#
      
      And now only what happens in slices demarcated by those start/end SDT
      events:
      
        [root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100
        # Samples: 240  of event 'cycles:P'
        # Event count (approx.): 206491934
        #
        # Overhead  Command  Shared Object        Symbol
        # ........  .......  ...................  ................................................
        #
            38.99%  stap     stap                 [.] systemtap_session::register_library_aliases
            19.47%  stap     stap                 [.] match_key::operator<
            15.01%  stap     libc-2.29.so         [.] __memcmp_avx2_movbe
             5.19%  stap     libc-2.29.so         [.] _int_malloc
             2.50%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_insert_and_rebalance
             2.30%  stap     stap                 [.] match_node::build_no_more
             2.07%  stap     libc-2.29.so         [.] malloc
             1.66%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find
             1.66%  stap     stap                 [.] match_node::bind
             1.58%  stap     [kernel.vmlinux]     [k] prepare_exit_to_usermode
             1.17%  stap     [kernel.vmlinux]     [k] native_irq_return_iret
             0.87%  stap     stap                 [.] 0x0000000000032ec4
             0.77%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_increment
             0.47%  stap     stap                 [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&>
             0.47%  stap     [kernel.vmlinux]     [k] get_page_from_freelist
             0.47%  stap     [kernel.vmlinux]     [k] swapgs_restore_regs_and_return_to_usermode
             0.47%  stap     [kernel.vmlinux]     [k] do_user_addr_fault
             0.46%  stap     [kernel.vmlinux]     [k] __pagevec_lru_add_fn
             0.46%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> >
             0.42%  stap     libstdc++.so.6.0.26  [.] 0x00000000000c18fa
             0.40%  stap     [kernel.vmlinux]     [k] interrupt_entry
             0.40%  stap     [kernel.vmlinux]     [k] update_load_avg
             0.40%  stap     [kernel.vmlinux]     [k] __intel_pmu_disable_all
             0.40%  stap     [kernel.vmlinux]     [k] clear_page_erms
             0.39%  stap     [kernel.vmlinux]     [k] __mod_node_page_state
             0.39%  stap     [kernel.vmlinux]     [k] error_entry
             0.39%  stap     [kernel.vmlinux]     [k] sync_regs
             0.38%  stap     [kernel.vmlinux]     [k] __handle_mm_fault
             0.38%  stap     stap                 [.] derive_probes
      
        #
        # (Tip: System-wide collection from all CPUs: perf record -a)
        #
        [root@quaco testsuite]#
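
      The jump from under 1% to 38.99% for a single symbol follows from the
      percentages being computed only over samples inside the slices. A
      minimal sketch of that arithmetic, assuming samples are counted rather
      than weighted by period (a simplification) and using a hypothetical
      helper slice_overhead() with made-up sample data:

```python
from collections import Counter


def slice_overhead(samples, on, off):
    """Return {symbol: percent}, counting only samples between markers.

    `samples` is an iterable of (event_name, symbol) pairs in time order.
    The marker events themselves are not counted.
    """
    counts = Counter()
    inside = False
    for event, symbol in samples:
        if event == on:
            inside = True
        elif event == off:
            inside = False
        elif inside:
            counts[symbol] += 1
    total = sum(counts.values())
    if not total:
        return {}
    return {sym: 100.0 * n / total for sym, n in counts.items()}


# Made-up data: symbol names are borrowed from the output above.
samples = [
    ("cycles:P", "ht_lookup_with_hash"),
    ("sdt_stap:pass2__start", None),
    ("cycles:P", "register_library_aliases"),
    ("cycles:P", "register_library_aliases"),
    ("cycles:P", "match_key::operator<"),
    ("cycles:P", "__memcmp_avx2_movbe"),
    ("sdt_stap:pass2__end", None),
    ("cycles:P", "_cpp_lex_token"),
]
result = slice_overhead(samples, "sdt_stap:pass2__start", "sdt_stap:pass2__end")
```

      Samples outside the markers do not enter the total, so the cc1 symbols
      that dominate the whole-file report drop out entirely.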
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ef4b1a53
    • A
      perf top: Add --switch-on/--switch-off events · 2f53ae34
      Arnaldo Carvalho de Melo committed
      Just like 'perf trace' and 'perf script', should be useful for instance
      to only consider samples after the initialization phase of some
      workload.
      
      The man page has some examples and considerations about its current
      interface, that still doesn't handle the on/off events in a special way,
      behaving just like when multiple events are specified, i.e.:
      
      - In non-group mode (when the event list is not enclosed in {}) show
        a menu to allow choosing which event the user wants to see in the
        histograms browser
      
      - In group mode, be it using {} or asking for --group, show one column
        per event.
      
      Try for instance:
      
        # perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv
      
      Here probe:icmp_rcv was put in place, to hit when broadcast packets
      arrive, using:
      
        # perf probe icmp_rcv:59
      
      Replace it with a probe installed after an initialization phase is
      over or after some other point of interest (some garbage collection,
      etc.), and also use --switch-off, for instance, on a probe installed
      after said garbage collection is over.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f53ae34
  5. 15 Aug, 2019 8 commits
    • A
      perf trace: Add --switch-on/--switch-off events · 22ac4318
      Arnaldo Carvalho de Melo committed
      Just like with 'perf script':
      
        # perf trace -e sched:*,syscalls:*sleep* sleep 1
             0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005
             0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005
             0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346
             0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns]
             0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0
             0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns]
             0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120]
          1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005
          1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005
          1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0
          1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1
             0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns]
             0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120]
          1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004
          1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004
          1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0
          1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1
             0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns]
             0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120]
          1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000
          1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000
        #
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1
             0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0
             0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns]
             0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120]
          1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002
          1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002
          1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0
        #
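
      The four runs above are consistent with a small state machine. The
      following is a sketch inferred from that observed output, not a copy of
      perf's implementation; the class EventSwitch, its fields and the helper
      filter_events() are hypothetical names.

```python
class EventSwitch:
    """Sketch of on/off event switching; not perf's actual implementation."""

    def __init__(self, on=None, off=None, show_on_off=False):
        self.on = on
        self.off = off
        self.show_on_off = show_on_off
        # With an "on" event configured, everything is dropped until it fires.
        self.discarding = on is not None

    def discard(self, event):
        """Return True when `event` should be dropped."""
        if self.discarding:
            if self.on is None or event != self.on:
                return True
            self.discarding = False  # "on" marker seen: open the slice
            return not self.show_on_off
        if self.off is not None and event == self.off:
            self.discarding = True   # "off" marker seen: close the slice
            return not self.show_on_off
        return False


def filter_events(events, **kwargs):
    switch = EventSwitch(**kwargs)
    return [e for e in events if not switch.discard(e)]


# Event order taken from the unfiltered 'perf trace' run above.
EVENTS = [
    "sched:sched_waking",
    "sched:sched_wakeup",
    "sched:sched_process_exec",
    "sched:sched_stat_runtime",
    "syscalls:sys_enter_nanosleep",
    "sched:sched_stat_runtime",
    "sched:sched_switch",
    "sched:sched_waking",
    "sched:sched_wakeup",
    "syscalls:sys_exit_nanosleep",
    "sched:sched_process_exit",
]
ON = "syscalls:sys_enter_nanosleep"
OFF = "syscalls:sys_exit_nanosleep"
```

      Calling filter_events(EVENTS, on=ON, off=OFF) keeps only the four sched
      events between the two syscall markers; adding show_on_off=True keeps
      the markers as well.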
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      22ac4318
    • A
      perf evswitch: Add hint when not finding specified on/off events · 8b3c9ea7
      Arnaldo Carvalho de Melo committed
      If the user specifies an on or off switch event and it isn't in the
      perf.data file, provide a hint about how to see the events in the
      perf.data evlist:
      
        # perf script --switch-on=syscall:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
        ERROR: event_on event not found (syscall:sys_enter_nanosleep)
        HINT:  use 'perf evlist' to see the available event names
        #
        # perf evlist
        sched:sched_kthread_stop
        sched:sched_kthread_stop_ret
        sched:sched_waking
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_switch
        sched:sched_migrate_task
        sched:sched_process_free
        sched:sched_process_exit
        sched:sched_wait_task
        sched:sched_process_wait
        sched:sched_process_fork
        sched:sched_process_exec
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_blocked
        sched:sched_stat_runtime
        sched:sched_pi_setprio
        sched:sched_move_numa
        sched:sched_stick_numa
        sched:sched_swap_numa
        sched:sched_wake_idle_without_ipi
        syscalls:sys_enter_clock_nanosleep
        syscalls:sys_exit_clock_nanosleep
        syscalls:sys_enter_nanosleep
        syscalls:sys_exit_nanosleep
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
        #
        # perf script --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
             sleep 20919 [001] 109866.144411:  sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [ns]
             sleep 20919 [001] 109866.144412:        sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:        sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:        sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
        #
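
      The lookup-with-hint behaviour shown above can be sketched as follows.
      The function find_switch_event() and its return convention are
      hypothetical; only the two message lines are taken from the transcript.

```python
def find_switch_event(evlist, name, role):
    """Look up `name` in `evlist`.

    Returns (name, None) when found, or (None, message) with an error
    line followed by a hint line when it is missing.
    """
    if name in evlist:
        return name, None
    message = (
        f"ERROR: {role} event not found ({name})\n"
        "HINT:  use 'perf evlist' to see the available event names"
    )
    return None, message


evlist = [
    "sched:sched_switch",
    "syscalls:sys_enter_nanosleep",
    "syscalls:sys_exit_nanosleep",
]
# "syscall:" (missing the trailing "s") reproduces the typo from the transcript.
found, error = find_switch_event(evlist, "syscall:sys_enter_nanosleep", "event_on")
```

      Keeping the message construction in one helper is what lets the hint be
      added in a single place.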
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-iijjvdlyad973oskdq8gmi5w@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8b3c9ea7
    • A
      perf evswitch: Move enoent error message printing to separate function · c9a42699
      Arnaldo Carvalho de Melo committed
      This allows adding hints there, which will be done in a followup patch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-1kvrdi7weuz3hxycwvarcu6v@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c9a42699
    • A
      perf evswitch: Introduce init() method to set the on/off evsels from the command line · 124e02be
      Arnaldo Carvalho de Melo committed
      Another step in having all the boilerplate in just one place to then use
      in the other tools.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-snreb1wmwyjei3eefwotxp1l@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      124e02be
    • A
      perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing · add3a719
      Arnaldo Carvalho de Melo committed
      All tools will want those, so provide a convenient way to get them.
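
      The idea of defining the switch options once and reusing them in every
      tool can be illustrated with Python's argparse. This is an analogy, not
      the OPTS_EVSWITCH() macro itself: add_evswitch_options() and
      make_tool_parser() are hypothetical helpers, and the option spellings
      are the ones used in the transcripts of this series.

```python
import argparse


def add_evswitch_options(parser):
    """Attach the shared on/off switch options to any tool's parser."""
    group = parser.add_argument_group("event switching")
    group.add_argument("--switch-on", metavar="event",
                       help="consider events only after this event is seen")
    group.add_argument("--switch-off", metavar="event",
                       help="stop considering events after this event is seen")
    group.add_argument("--show-on-off", action="store_true",
                       help="also show the on/off marker events")
    return parser


def make_tool_parser(prog):
    # Each tool builds its own parser and pulls in the shared options.
    return add_evswitch_options(argparse.ArgumentParser(prog=prog))


report_args = make_tool_parser("report").parse_args([
    "--switch-on=sdt_stap:pass2__start",
    "--switch-off=sdt_stap:pass2__end",
])
trace_args = make_tool_parser("trace").parse_args(["--show-on-off"])
```

      Both parsers accept the same three options without either tool
      spelling them out again.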
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-v16pe3sbf3wjmn152u18f649@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      add3a719
    • A
      perf evswitch: Add the names of on/off events · 0b495b12
      Arnaldo Carvalho de Melo committed
      Store the names of the on/off events so that we can have macros for the
      OPT_ entries and also for finding those events in an evlist; this way
      other tools can use this very easily.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-q0og1xoqqi0w38ve5u0a43k2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b495b12
    • A
      perf evswitch: Move switch logic to use in other tools · 8829e56f
      Arnaldo Carvalho de Melo committed
      Now other tools that want switching can use an evswitch for that, just
      set it up and add it to the PERF_RECORD_SAMPLE processing function.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8829e56f
    • A
      perf evswitch: Move struct to a separate header to use in other tools · d2360442
      Arnaldo Carvalho de Melo committed
      Now that we see that the simple userspace-based "slicing" of events
      using delimiter events ("markers") works, let's move it to a separate
      header to make it available to other tools. The next step will be
      moving the switch on/off check done at the PERF_RECORD_SAMPLE
      processing function too.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-z0cyi9ifzlr37cedr9xztc1k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d2360442