1. 07 11月, 2019 40 次提交
    • J
      perf report: Sort by sampled cycles percent per block for stdio · 6f7164fa
      Jin Yao 提交于
      It would be useful to support sorting for all blocks by the sampled
      cycles percent per block. This is useful to concentrate on the globally
      hottest blocks.
      
      This patch implements a new option "--total-cycles" which sorts all
      blocks by 'Sampled Cycles%'. The 'Sampled Cycles%' is the percent:
      
       percent = block sampled cycles aggregation / total sampled cycles
      
      Note that, this patch only supports "--stdio" mode.
      
      For example,
      
        # perf record -b ./div
        # perf report --total-cycles --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 2M of event 'cycles'
        # Event count (approx.): 2753248
        #
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                             [Program Block Range]      Shared Object
        # ...............  ..............  ...........  ..........  ................................................  .................
        #
                   26.04%            2.8M        0.40%          18                            [div.c:42 -> div.c:39]                div
                   15.17%            1.2M        0.16%           7                [random_r.c:357 -> random_r.c:380]       libc-2.27.so
                    5.11%          402.0K        0.04%           2                            [div.c:27 -> div.c:28]                div
                    4.87%          381.6K        0.04%           2                    [random.c:288 -> random.c:291]       libc-2.27.so
                    4.53%          381.0K        0.04%           2                            [div.c:40 -> div.c:40]                div
                    3.85%          300.9K        0.02%           1                            [div.c:22 -> div.c:25]                div
                    3.08%          241.1K        0.02%           1                          [rand.c:26 -> rand.c:27]       libc-2.27.so
                    3.06%          240.0K        0.02%           1                    [random.c:291 -> random.c:291]       libc-2.27.so
                    2.78%          215.7K        0.02%           1                    [random.c:298 -> random.c:298]       libc-2.27.so
                    2.52%          198.3K        0.02%           1                    [random.c:293 -> random.c:293]       libc-2.27.so
                    2.36%          184.8K        0.02%           1                          [rand.c:28 -> rand.c:28]       libc-2.27.so
                    2.33%          180.5K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
                    2.28%          176.7K        0.02%           1                    [random.c:295 -> random.c:295]       libc-2.27.so
                    2.20%          168.8K        0.02%           1                        [rand@plt+0 -> rand@plt+0]                div
                    1.98%          158.2K        0.02%           1                [random_r.c:388 -> random_r.c:388]       libc-2.27.so
                    1.57%          123.3K        0.02%           1                            [div.c:42 -> div.c:44]                div
                    1.44%          116.0K        0.42%          19                [random_r.c:357 -> random_r.c:394]       libc-2.27.so
                    0.25%          182.5K        0.02%           1                [random_r.c:388 -> random_r.c:391]       libc-2.27.so
                    0.00%              48        1.07%          48        [x86_pmu_enable+284 -> x86_pmu_enable+298]  [kernel.kallsyms]
                    0.00%              74        1.64%          74             [vm_mmap_pgoff+0 -> vm_mmap_pgoff+92]  [kernel.kallsyms]
                    0.00%              73        1.62%          73                         [vm_mmap+0 -> vm_mmap+48]  [kernel.kallsyms]
                    0.00%              63        0.69%          31                       [up_write+0 -> up_write+34]  [kernel.kallsyms]
                    0.00%              13        0.29%          13      [setup_arg_pages+396 -> setup_arg_pages+413]  [kernel.kallsyms]
                    0.00%               3        0.07%           3      [setup_arg_pages+418 -> setup_arg_pages+450]  [kernel.kallsyms]
                    0.00%             616        6.84%         308   [security_mmap_file+0 -> security_mmap_file+72]  [kernel.kallsyms]
                    0.00%              23        0.51%          23  [security_mmap_file+77 -> security_mmap_file+87]  [kernel.kallsyms]
                    0.00%               4        0.02%           1                  [sched_clock+0 -> sched_clock+4]  [kernel.kallsyms]
                    0.00%               4        0.02%           1                 [sched_clock+9 -> sched_clock+12]  [kernel.kallsyms]
                    0.00%               1        0.02%           1                [rcu_nmi_exit+0 -> rcu_nmi_exit+9]  [kernel.kallsyms]
      
      Committer testing:
      
      This should provide material for hours of endless joy, both from looking
      for suspicious things in the implementation of this patch, such as the
      top one:
      
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
                    2.17%            1.7M        0.08%         607   [compiler.h:199 -> common.c:221]              [kernel.vmlinux]
      
      As well from things that look legit:
      
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                          [Program Block Range]     Shared Object
                    0.16%          123.0K        0.60%        4.7K   [nospec-branch.h:265 -> nospec-branch.h:278]  [kernel.vmlinux]
      
      :-)
      
      Very short system wide taken branches session:
      
        # perf record -h -b
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -b, --branch-any      sample any taken branches
      
        #
        # perf record -b
        ^C[ perf record: Woken up 595 times to write data ]
        [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]
      
        #
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
        #
        # perf report --total-cycles --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 6M of event 'cycles'
        # Event count (approx.): 6299936
        #
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
        # ...............  ..............  ...........  ..........  ......................................................................  ....................
        #
                    2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
                    1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
                    0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -> entry_64.S:662]      [kernel.vmlinux]
                    0.56%          541.8K        0.09%         672                                        [compiler.h:199 -> common.c:300]      [kernel.vmlinux]
                    0.39%          293.2K        0.01%         104                                    [list_debug.c:43 -> list_debug.c:61]      [kernel.vmlinux]
                    0.36%          278.6K        0.03%         272                                    [entry_64.S:1289 -> entry_64.S:1308]      [kernel.vmlinux]
                    0.30%          260.8K        0.07%         564                              [clear_page_64.S:47 -> clear_page_64.S:50]      [kernel.vmlinux]
                    0.28%          215.3K        0.05%         369                                            [traps.c:623 -> traps.c:628]      [kernel.vmlinux]
                    0.23%          178.1K        0.04%         278                                      [entry_64.S:271 -> entry_64.S:275]      [kernel.vmlinux]
                    0.20%          152.6K        0.09%         706                                      [paravirt.c:177 -> paravirt.c:179]      [kernel.vmlinux]
                    0.20%          155.8K        0.05%         373                                      [entry_64.S:153 -> entry_64.S:175]      [kernel.vmlinux]
                    0.18%          136.6K        0.03%         222                                                [msr.h:105 -> msr.h:166]      [kernel.vmlinux]
                    0.16%          123.0K        0.60%        4.7K                            [nospec-branch.h:265 -> nospec-branch.h:278]      [kernel.vmlinux]
                    0.16%          118.3K        0.01%          44                                      [entry_64.S:632 -> entry_64.S:657]      [kernel.vmlinux]
                    0.14%          104.5K        0.00%          28                                          [rwsem.c:1541 -> rwsem.c:1544]      [kernel.vmlinux]
                    0.13%           99.2K        0.01%          53                                      [spinlock.c:150 -> spinlock.c:152]      [kernel.vmlinux]
                    0.13%           95.5K        0.00%          35                                              [swap.c:456 -> swap.c:471]      [kernel.vmlinux]
                    0.12%           96.2K        0.05%         407                              [copy_user_64.S:175 -> copy_user_64.S:209]      [kernel.vmlinux]
                    0.11%           85.9K        0.00%          31                                        [swap.c:400 -> page-flags.h:188]      [kernel.vmlinux]
                    0.10%           73.0K        0.01%          52                                          [paravirt.h:763 -> list.h:131]      [kernel.vmlinux]
                    0.07%           56.2K        0.03%         214                                      [filemap.c:1524 -> filemap.c:1557]      [kernel.vmlinux]
                    0.07%           54.2K        0.02%         145                                        [memory.c:1032 -> memory.c:1049]      [kernel.vmlinux]
                    0.07%           50.3K        0.00%          39                                            [mmzone.c:49 -> mmzone.c:69]      [kernel.vmlinux]
                    0.06%           48.3K        0.01%          40                                   [paravirt.h:768 -> page_alloc.c:3304]      [kernel.vmlinux]
                    0.06%           46.7K        0.02%         155                                        [memory.c:1032 -> memory.c:1056]      [kernel.vmlinux]
                    0.06%           46.9K        0.01%         103                                              [swap.c:867 -> swap.c:902]      [kernel.vmlinux]
                    0.06%           47.8K        0.00%          34                                    [entry_64.S:1201 -> entry_64.S:1202]      [kernel.vmlinux]
      
       -----------------------------------------------------------
      
       v7:
       ---
       Use use_browser in report__browse_block_hists for supporting
       stdio and potential tui mode.
      
       v6:
       ---
       Create report__browse_block_hists in block-info.c (codes are
       moved from builtin-report.c). It's called from
       perf_evlist__tty_browse_hists.
      
       v5:
       ---
       1. Move all block functions to block-info.c
      
       2. Move the code of setting ms in block hist_entry to
          other patch.
      
       v4:
       ---
       1. Use new option '--total-cycles' to replace
          '-s total_cycles' in v3.
      
       2. Move block info collection out of block info
          printing.
      
       v3:
       ---
       1. Use common function block_info__process_sym to
          process the blocks per symbol.
      
       2. Remove the nasty hack for skipping calculation
          of column length
      
       3. Some minor cleanup
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-6-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6f7164fa
    • J
      perf hist: Support block formats with compare/sort/display · b65a7d37
      Jin Yao 提交于
      This patch provides helper routines to support new columns for block
      info output.
      
      The new columns are:
      
        Sampled Cycles%
        Sampled Cycles
        Avg Cycles%
        Avg Cycles
        [Program Block Range]
        Shared Object
      
       v5:
       ---
       1. Move more block related functions from builtin-report.c to
          block-info.c
      
       2. Set ms (map+sym) in block hist_entry. Because this info
          is needed for reporting the block range (i.e. source line)
      
      Committer notes:
      
      Remove unused set_fmt() function, some build were not completing with:
      
        util/block-info.c:396:20: error: unused function 'set_fmt' [-Werror,-Wunused-function]
        static inline void set_fmt(struct block_fmt *block_fmt,
                           ^
        1 error generated.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-5-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b65a7d37
    • J
      perf hist: Count the total cycles of all samples · 7841f40a
      Jin Yao 提交于
      We can get the per sample cycles by hist__account_cycles(). It's also
      useful to know the total cycles of all samples in order to get the
      cycles coverage for a single program block in further. For example:
      
        coverage = per block sampled cycles / total sampled cycles
      
      This patch creates a new argument 'total_cycles' in hist__account_cycles(),
      which will be added with the cycles of each sample.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-4-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7841f40a
    • J
      perf block: Cleanup and refactor block info functions · 60414418
      Jin Yao 提交于
      We have already implemented some block-info related functions.
      Now it's time to do some cleanup, refactoring and move the
      functions and structures to new block-info.h/block-info.c.
      
       v4:
       ---
       Move code for skipping column length calculation to patch:
       'perf diff: Don't use hack to skip column length calculation'
      
       v3:
       ---
       1. Rename the patch title
       2. Rename from block.h/block.c to block-info.h/block-info.c
       3. Move more common part to block-info, such as
          block_info__process_sym.
       4. Remove the nasty hack for skipping calculation of column
          length
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-3-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      60414418
    • J
      perf diff: Don't use hack to skip column length calculation · 0bdf181f
      Jin Yao 提交于
      Previously we use a nasty hack to skip the hists__calc_col_len for block
      since this function is not very suitable for block column length
      calculation.
      
      This patch removes the hack code and add a check at the entry of
      hists__calc_col_len to skip for block case.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-2-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0bdf181f
    • L
      perf tests: Fix out of bounds memory access · af8490eb
      Leo Yan 提交于
      The test case 'Read backward ring buffer' failed on 32-bit architectures
      which were found by LKFT perf testing.  The test failed on arm32 x15
      device, qemu_arm32, qemu_i386, and found intermittent failure on i386;
      the failure log is as below:
      
        50: Read backward ring buffer                  :
        --- start ---
        test child forked, pid 510
        Using CPUID GenuineIntel-6-9E-9
        mmap size 1052672B
        mmap size 8192B
        Finished reading overwrite ring buffer: rewind
        free(): invalid next size (fast)
        test child interrupted
        ---- end ----
        Read backward ring buffer: FAILED!
      
      The log hints there have issue for memory usage, thus free() reports
      error 'invalid next size' and directly exit for the case.  Finally, this
      issue is root caused as out of bounds memory access for the data array
      'evsel->id'.
      
      The backward ring buffer test invokes do_test() twice.  'evsel->id' is
      allocated at the first call with the flow:
      
        test__backward_ring_buffer()
          `-> do_test()
      	  `-> evlist__mmap()
      	        `-> evlist__mmap_ex()
      	              `-> perf_evsel__alloc_id()
      
      So 'evsel->id' is allocated with one item, and it will be used in
      function perf_evlist__id_add():
      
         evsel->id[0] = id
         evsel->ids   = 1
      
      At the second call for do_test(), it skips to initialize 'evsel->id'
      and reuses the array which is allocated in the first call.  But
      'evsel->ids' contains the stale value.  Thus:
      
         evsel->id[1] = id    -> out of bound access
         evsel->ids   = 2
      
      To fix this issue, we will use evlist__open() and evlist__close() pair
      functions to prepare and cleanup context for evlist; so 'evsel->id' and
      'evsel->ids' can be initialized properly when invoke do_test() and avoid
      the out of bounds memory access.
      
      Fixes: ee74701e ("perf tests: Add test to check backward ring buffer")
      Signed-off-by: NLeo Yan <leo.yan@linaro.org>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: stable@vger.kernel.org # v4.10+
      Link: http://lore.kernel.org/lkml/20191107020244.2427-1-leo.yan@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      af8490eb
    • J
      perf record: Add support for limit perf output file size · 6d575816
      Jiwei Sun 提交于
      The patch adds a new option to limit the output file size, then based on
      it, we can create a wrapper of the perf command that uses the option to
      avoid exhausting the disk space by the unconscious user.
      
      In order to make the perf.data parsable, we just limit the sample data
      size, since the perf.data consists of many headers and sample data and
      other data, the actual size of the recorded file will bigger than the
      setting value.
      
      Testing it:
      
        # ./perf record -a -g --max-size=10M
        Couldn't synthesize bpf events.
        [ perf record: perf size limit reached (10249 KB), stopping session ]
        [ perf record: Woken up 32 times to write data ]
        [ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]
      
        # ls -lh perf.data
        -rw------- 1 root root 11M Oct 22 14:32 perf.data
      
        # ./perf record -a -g --max-size=10K
        [ perf record: perf size limit reached (10 KB), stopping session ]
        Couldn't synthesize bpf events.
        [ perf record: Woken up 0 times to write data ]
        [ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]
      
        # ls -l perf.data
        -rw------- 1 root root 1626952 Oct 22 14:36 perf.data
      
      Committer notes:
      
      Fixed the build in multiple distros by using PRIu64 to print u64 struct
      members, fixing this:
      
        builtin-record.c: In function 'record__write':
        builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
             rec->bytes_written >> 10);
             ^
          CC       /tmp/build/pe
      Signed-off-by: NJiwei Sun <jiwei.sun@windriver.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Danter <richard.danter@windriver.com>
      Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6d575816
    • M
      perf probe: Skip overlapped location on searching variables · dee36a2a
      Masami Hiramatsu 提交于
      Since debuginfo__find_probes() callback function can be called with  the
      location which already passed, the callback function must filter out
      such overlapped locations.
      
      add_probe_trace_event() has already done it by commit 1a375ae7
      ("perf probe: Skip same probe address for a given line"), but
      add_available_vars() doesn't. Thus perf probe -v shows same address
      repeatedly as below:
      
        # perf probe -V vfs_read:18
        Available variables at vfs_read:18
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+226>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
      
      With this fix, perf probe -V shows it correctly:
      
        # perf probe -V vfs_read:18
        Available variables at vfs_read:18
                @<vfs_read+217>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
                @<vfs_read+226>
                        char*   buf
                        loff_t* pos
                        ssize_t ret
                        struct file*    file
      
      Fixes: cf6eb489 ("perf probe: Show accessible local variables")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241938927.32002.4026859017790562751.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dee36a2a
    • M
      perf probe: Fix to show calling lines of inlined functions · 86c0bf85
      Masami Hiramatsu 提交于
      Fix to show calling lines of inlined functions (where an inline function
      is called).
      
      die_walk_lines() filtered out the lines inside inlined functions based
      on the address. However this also filtered out the lines which call
      those inlined functions from the target function.
      
      To solve this issue, check the call_file and call_line attributes and do
      not filter out if it matches to the line information.
      
      Without this fix, perf probe -L doesn't show some lines correctly.
      (don't see the lines after 17)
      
        # perf probe -L vfs_read
        <vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
              0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
              1  {
              2         ssize_t ret;
      
              4         if (!(file->f_mode & FMODE_READ))
                                return -EBADF;
              6         if (!(file->f_mode & FMODE_CAN_READ))
                                return -EINVAL;
              8         if (unlikely(!access_ok(buf, count)))
                                return -EFAULT;
      
             11         ret = rw_verify_area(READ, file, pos, count);
             12         if (!ret) {
             13                 if (count > MAX_RW_COUNT)
                                        count =  MAX_RW_COUNT;
             15                 ret = __vfs_read(file, buf, count, pos);
             16                 if (ret > 0) {
                                        fsnotify_access(file);
                                        add_rchar(current, ret);
                                }
      
      With this fix:
      
        # perf probe -L vfs_read
        <vfs_read@/home/mhiramat/ksrc/linux/fs/read_write.c:0>
              0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
              1  {
              2         ssize_t ret;
      
              4         if (!(file->f_mode & FMODE_READ))
                                return -EBADF;
              6         if (!(file->f_mode & FMODE_CAN_READ))
                                return -EINVAL;
              8         if (unlikely(!access_ok(buf, count)))
                                return -EFAULT;
      
             11         ret = rw_verify_area(READ, file, pos, count);
             12         if (!ret) {
             13                 if (count > MAX_RW_COUNT)
                                        count =  MAX_RW_COUNT;
             15                 ret = __vfs_read(file, buf, count, pos);
             16                 if (ret > 0) {
             17                         fsnotify_access(file);
             18                         add_rchar(current, ret);
                                }
             20                 inc_syscr(current);
                        }
      
      Fixes: 4cc9cec6 ("perf probe: Introduce lines walker interface")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241937995.32002.17899884017011512577.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      86c0bf85
    • M
      perf probe: Filter out instances except for inlined subroutine and subprogram · da6cb952
      Masami Hiramatsu 提交于
      Filter out instances except for inlined_subroutine and subprogram DIE in
      die_walk_instances() and die_is_func_instance().
      
      This fixes an issue that perf probe sets some probes on calling address
      instead of a target function itself.
      
      When perf probe walks on instances of an abstruct origin (a kind of
      function prototype of inlined function), die_walk_instances() can also
      pass a GNU_call_site (a GNU extension for call site) to callback. Since
      it is not an inlined instance of target function, we have to filter out
      when searching a probe point.
      
      Without this patch, perf probe sets probes on call site address too.This
      can happen on some function which is marked "inlined", but has actual
      symbol. (I'm not sure why GCC mark it "inlined"):
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+2500017
        p:probe/vfs_read_1 _text+2499468
        p:probe/vfs_read_2 _text+2499563
        p:probe/vfs_read_3 _text+2498876
        p:probe/vfs_read_4 _text+2498512
        p:probe/vfs_read_5 _text+2498627
      
      With this patch:
      
      Slightly different results, similar tho:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+2498512
      
      Committer testing:
      
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
      
      Before:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+3131557
        p:probe/vfs_read_1 _text+3130975
        p:probe/vfs_read_2 _text+3131047
        p:probe/vfs_read_3 _text+3130380
        p:probe/vfs_read_4 _text+3130000
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
        #
      
      After:
      
        # perf probe -D vfs_read
        p:probe/vfs_read _text+3130000
        #
      
      Fixes: db0d2c64 ("perf probe: Search concrete out-of-line instances")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241937063.32002.11024544873990816590.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      da6cb952
    • M
      perf probe: Skip end-of-sequence and non statement lines · f4d99bdf
      Masami Hiramatsu 提交于
      Skip end-of-sequence and non-statement lines while walking through lines
      list.
      
      The "end-of-sequence" line information means:
      
       "the current address is that of the first byte after the
        end of a sequence of target machine instructions."
       (DWARF version 4 spec 6.2.2)
      
      This actually means out of scope and we can not probe on it.
      
      On the other hand, the statement lines (is_stmt) means:
      
       "the current instruction is a recommended breakpoint location.
        A recommended breakpoint location is intended to “represent”
        a line, a statement and/or a semantically distinct subpart
        of a statement."
      
       (DWARF version 4 spec 6.2.2)
      
      So, non-statement line info also should be skipped.
      
      These can reduce unneeded probe points and also avoid an error.
      
      E.g. without this patch:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new events:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_3 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_4 (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask_4 -aR sleep 1
      
        #
      
      This puts 5 probes on one line, but acutally it's not inlined function.
      This is because there are many non statement instructions at the
      function prologue.
      
      With this patch:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new event:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
      
        #
      
      Now perf-probe skips unneeded addresses.
      
      Committer testing:
      
      Slightly different results, but similar:
      
      Before:
      
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
        #
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new events:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:1)
          probe:clear_tasks_mm_cpumask_2 (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask_2 -aR sleep 1
      
        #
      
      After:
      
        # perf probe -a "clear_tasks_mm_cpumask:1"
        Added new event:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask:1)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
      
        # perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
        #
      
      Fixes: 4cc9cec6 ("perf probe: Introduce lines walker interface")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157241936090.32002.12156347518596111660.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f4d99bdf
    • M
      perf probe: Return a better scope DIE if there is no best scope · c701636a
      Masami Hiramatsu 提交于
      Make find_best_scope() returns innermost DIE at given address if there
      is no best matched scope DIE. Since Gcc sometimes generates intuitively
      strange line info which is out of inlined function address range, we
      need this fixup.
      
      Without this, sometimes perf probe failed to probe on a line inside an
      inlined function:
      
        # perf probe -D ksys_open:3
        Failed to find scope of probe point.
          Error: Failed to add events.
      
      With this fix, 'perf probe' can probe it:
      
        # perf probe -D ksys_open:3
        p:probe/ksys_open _text+25707308
        p:probe/ksys_open_1 _text+25710596
        p:probe/ksys_open_2 _text+25711114
        p:probe/ksys_open_3 _text+25711343
        p:probe/ksys_open_4 _text+25714058
        p:probe/ksys_open_5 _text+2819653
        p:probe/ksys_open_6 _text+2819701
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157291300887.19771.14936015360963292236.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c701636a
    • I
      perf annotate: Fix heap overflow · 5c65b1c0
      Ian Rogers 提交于
      Fix expand_tabs that copies the source lines '\0' and then appends
      another '\0' at a potentially out of bounds address.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20191026035644.217548-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5c65b1c0
    • A
      perf machine: Add kernel_dso() method · 93730f85
      Arnaldo Carvalho de Melo 提交于
      To reduce boilerplate in some places.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-9s1bgoxxhlnu037e1nqx0tw3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      93730f85
    • A
      perf symbols: Remove needless checks for map->groups->machine · b0c76fc4
      Arnaldo Carvalho de Melo 提交于
      Its sufficient to check if map->groups is NULL before using it to get
      ->machine value.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-utiepyiv8b1tf8f79ok9d6j8@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b0c76fc4
    • I
      perf parse: Add a deep delete for parse event terms · 1dc92556
      Ian Rogers 提交于
      Add a parse_events_term deep delete function so that owned strings and
      arrays are freed.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-10-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1dc92556
    • I
      perf parse: If pmu configuration fails free terms · 38f2c422
      Ian Rogers 提交于
      Avoid a memory leak when the configuration fails.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-9-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      38f2c422
    • I
      perf parse: Before yyabort-ing free components · cabbf268
      Ian Rogers 提交于
      Yyabort doesn't destruct inputs and so this must be done manually before
      using yyabort.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-8-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cabbf268
    • I
      perf parse: Add destructors for parse event terms · f2a8ecd8
      Ian Rogers 提交于
      If parsing fails then destructors are ran to clean the up the stack.
      Rename the head union member to make the term and evlist use cases more
      distinct, this simplifies matching the correct destructor.
      
      Committer notes:
      
      Jiri: "Nice did not know about this.. looks like it's been in bison for some time, right?"
      
      Ian:  "Looks like it wasn't in Bison 1 but in Bison 2, we're at Bison 3 and
             Bison 2 is > 14 years old:
             https://web.archive.org/web/20050924004158/http://www.gnu.org/software/bison/manual/html_mono/bison.html#Destructor-Decl"
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-7-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f2a8ecd8
    • I
      perf parse: Ensure config and str in terms are unique · b6645a72
      Ian Rogers 提交于
      Make it easier to release memory associated with parse event terms by
      duplicating the string for the config name and ensuring the val string
      is a duplicate.
      
      Currently the parser may memory leak terms and this is addressed in a
      later patch.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-6-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b6645a72
    • I
      perf parse: Add parse events handle error · 448d732c
      Ian Rogers 提交于
      Parse event error handling may overwrite one error string with another
      creating memory leaks. Introduce a helper routine that warns about
      multiple error messages as well as avoiding the memory leak.
      
      A reproduction of this problem can be seen with:
      
        perf stat -e c/c/
      
      After this change this produces:
      WARNING: multiple event parsing errors
      event syntax error: 'c/c/'
                             \___ unknown term
      
      valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore
      Run 'perf list' for a list of valid events
      
       Usage: perf stat [<options>] [<command>]
      
          -e, --event <event>   event selector. use 'perf list' to list available events
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191030223448.12930-2-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      448d732c
    • A
      perf inject: Make --strip keep evsels · ef5502a1
      Adrian Hunter 提交于
      create_gcov (refer to the autofdo example in tools/perf/Documentation/intel-pt.txt)
      now needs the evsels to read the perf.data file. So don't strip them.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20191105100057.21465-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ef5502a1
    • J
      perf tools: Fix cross compile for ARM64 · 71f69907
      John Garry 提交于
      Currently when cross compiling perf tool for ARM64 on my x86 machine I
      get this error:
      
        arch/arm64/util/sym-handling.c:9:10: fatal error: gelf.h: No such file or directory
         #include <gelf.h>
      
      For the build, libelf is reported off:
      
        Auto-detecting system features:
        ...
        ...                        libelf: [ OFF ]
      
      Indeed, test-libelf is not built successfully:
      
        more ./build/feature/test-libelf.make.output
        test-libelf.c:2:10: fatal error: libelf.h: No such file or directory
         #include <libelf.h>
                ^~~~~~~~~~
        compilation terminated.
      
      I have no such problems natively compiling on ARM64, and I did not
      previously have this issue for cross compiling. Fix by relocating the
      gelf.h include.
      Signed-off-by: NJohn Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/1573045254-39833-1-git-send-email-john.garry@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      71f69907
    • J
      perf stat: Add --per-node agregation support · 86895b48
      Jiri Olsa 提交于
      Adding new --per-node option to aggregate counts per NUMA
      nodes for system-wide mode measurements.
      
      You can specify --per-node in live mode:
      
        # perf stat  -a -I 1000 -e cycles --per-node
        #           time node   cpus             counts unit events
             1.000542550 N0       20          6,202,097      cycles
             1.000542550 N1       20            639,559      cycles
             2.002040063 N0       20          7,412,495      cycles
             2.002040063 N1       20          2,185,577      cycles
             3.003451699 N0       20          6,508,917      cycles
             3.003451699 N1       20            765,607      cycles
        ...
      
      Or in the record/report stat session:
      
        # perf stat record -a -I 1000 -e cycles
        #           time             counts unit events
             1.000536937         10,008,468      cycles
             2.002090152          9,578,539      cycles
             3.003625233          7,647,869      cycles
             4.005135036          7,032,086      cycles
        ^C     4.340902364          3,923,893      cycles
      
        # perf stat report --per-node
        #           time node   cpus             counts unit events
             1.000536937 N0       20          9,355,086      cycles
             1.000536937 N1       20            653,382      cycles
             2.002090152 N0       20          7,712,838      cycles
             2.002090152 N1       20          1,865,701      cycles
             3.003625233 N0       20          6,604,441      cycles
             3.003625233 N1       20          1,043,428      cycles
             4.005135036 N0       20          6,350,522      cycles
             4.005135036 N1       20            681,564      cycles
             4.340902364 N0       20          3,403,188      cycles
             4.340902364 N1       20            520,705      cycles
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20190904073415.723-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      86895b48
    • J
      perf env: Add perf_env__numa_node() · 389799a7
      Jiri Olsa 提交于
      To speed up cpu to node lookup, add perf_env__numa_node(), that creates
      cpu array on the first lookup, that holds numa nodes for each stored
      cpu.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20190904073415.723-3-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      389799a7
    • H
      perf vendor events intel: Update all the Intel JSON metrics from TMAM 3.6. · 61ec07f5
      Haiyan Song 提交于
      New Metrics:
      
      - DSB_Switches: fraction of cycles CPU was stalled due to switches from DSB to MITE pipeline [all]
      - L2_Evictions_{Silent|NonSilent}_PKI: L2 {silent|non silent} ecivtions rate per Kilo instruction [SKX+]
      - IpFarBranch - Instructions per Far Branch
      
      Other Enhancements & fixes:
      
      - KBLR/CFL & CLX move to separate columns (no column sharing via if #model)
      - Re-organized/renamed Metric Group
      Signed-off-by: NHaiyan Song <haiyanx.song@intel.com>
      Reviewed-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Link: http://lore.kernel.org/lkml/20191030082308.10919-1-haiyanx.song@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      61ec07f5
    • H
      perf vendor events intel: Update CascadelakeX events to v1.05 · 7fcf1b89
      Haiyan Song 提交于
      Update CascadelakeX events to v1.05.
      
      Other changes:
      
       remove duplicated and without description events.
      Signed-off-by: NHaiyan Song <haiyanx.song@intel.com>
      Reviewed-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Link: http://lore.kernel.org/lkml/20191030082308.10919-1-haiyanx.song@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7fcf1b89
    • I
      perf tools: Splice events onto evlist even on error · 8e8714c3
      Ian Rogers 提交于
      If event parsing fails the event list is leaked, instead splice the list
      onto the out result and let the caller cleanup.
      
      An example input for parse_events found by libFuzzer that reproduces
      this memory leak is 'm{'.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: clang-built-linux@googlegroups.com
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20191025180827.191916-5-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8e8714c3
    • J
      libsubcmd: Use -O0 with DEBUG=1 · 22bd8f1b
      James Clark 提交于
      When a 'make DEBUG=1' build is done, the command parser is still built
      with -O6 and is hard to step through, fix it making it use -O0 in that
      case.
      Signed-off-by: NJames Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: nd <nd@arm.com>
      Link: http://lore.kernel.org/lkml/20191028113340.4282-1-james.clark@arm.com
      [ split from a larger patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      22bd8f1b
    • J
      libsubcmd: Move EXTRA_FLAGS to the end to allow overriding existing flags · d894967f
      James Clark 提交于
      Move EXTRA_WARNINGS and EXTRA_FLAGS to the end of the compilation line,
      otherwise they cannot be used to override the default values.
      Signed-off-by: NJames Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: nd <nd@arm.com>
      Link: http://lore.kernel.org/lkml/20191028113340.4282-1-james.clark@arm.com
      [ split from a larger patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d894967f
    • A
      perf map_groups: Introduce for_each_entry() and for_each_entry_safe() iterators · 50481461
      Arnaldo Carvalho de Melo 提交于
      To reduce boilerplate, providing a more compact form to iterate over the
      maps in a map_group.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-gc3go6fmdn30twusg91t2q56@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      50481461
    • A
      perf maps: Add for_each_entry()/_safe() iterators · 8efc4f05
      Arnaldo Carvalho de Melo 提交于
      To reduce boilerplate, provide a more compact form using an idiom
      present in other trees of data structures.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8efc4f05
    • A
      perf map: Allow map__next() to receive a NULL arg · 20419d3a
      Arnaldo Carvalho de Melo 提交于
      Just like free(), return NULL in that case, will simplify the
      for_each_entry_safe() iterators.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-pbde2ucn49khnrebclys9pny@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      20419d3a
    • A
      perf map: Check if the map still has some refcounts on exit · ee2555b6
      Arnaldo Carvalho de Melo 提交于
      We were checking just if it was still on some rb tree, but that is not
      the only way that this map can still have references, map->refcnt is
      there exactly for this, use it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-hany65tbeavsax7n3xvwl9pc@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ee2555b6
    • A
      perf dso: Add dso__data_write_cache_addr() · b86a9d91
      Adrian Hunter 提交于
      Add functions to write into the dso file data cache, but not change the
      file itself.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20191025130000.13032-4-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b86a9d91
    • A
      perf dso: Refactor dso_cache__read() · 366df726
      Adrian Hunter 提交于
      Refactor dso_cache__read() to separate populating the cache from copying
      data from it.  This is preparation for adding a cache "write" that will
      update the data in the cache.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20191025130000.13032-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      366df726
    • A
      perf auxtrace: Add auxtrace_cache__remove() · fd62c109
      Adrian Hunter 提交于
      Add auxtrace_cache__remove(). Intel PT uses an auxtrace_cache to store
      the results of code-walking, so that the same block of instructions does
      not have to be decoded repeatedly. However, when there are text poke
      events, the associated cache entries need to be removed.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20191025130000.13032-6-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fd62c109
    • M
      perf probe: Fix to show ranges of variables in functions without entry_pc · af04dd2f
      Masami Hiramatsu 提交于
      Fix to show ranges of variables (--range and --vars option) in functions
      which DIE has only ranges but no entry_pc attribute.
      
      Without this fix:
      
        # perf probe --range -V clear_tasks_mm_cpumask
        Available variables at clear_tasks_mm_cpumask
        	@<clear_tasks_mm_cpumask+0>
        		(No matched variables)
      
      With this fix:
      
        # perf probe --range -V clear_tasks_mm_cpumask
        Available variables at clear_tasks_mm_cpumask
      	@<clear_tasks_mm_cpumask+0>
      		[VAL]	int	cpu	@<clear_tasks_mm_cpumask+[0-35,317-317,2052-2059]>
      
      Committer testing:
      
      Before:
      
        [root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask
        Available variables at clear_tasks_mm_cpumask
                @<clear_tasks_mm_cpumask+0>
                        (No matched variables)
        [root@quaco ~]#
      
      After:
      
        [root@quaco ~]# perf probe --range -V clear_tasks_mm_cpumask
        Available variables at clear_tasks_mm_cpumask
                @<clear_tasks_mm_cpumask+0>
                        [VAL]   int     cpu     @<clear_tasks_mm_cpumask+[0-23,23-105,105-106,106-106,1843-1850,1850-1862]>
        [root@quaco ~]#
      
      Using it:
      
        [root@quaco ~]# perf probe clear_tasks_mm_cpumask cpu
        Added new event:
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask with cpu)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:clear_tasks_mm_cpumask -aR sleep 1
      
        [root@quaco ~]# perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c with cpu)
        [root@quaco ~]#
        [root@quaco ~]# perf trace -e probe:*cpumask
        ^C[root@quaco ~]#
      
      Fixes: 349e8d26 ("perf probe: Add --range option to show a variable's location range")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157199323018.8075.8179744380479673672.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      af04dd2f
    • M
      perf probe: Fix to show inlined function callsite without entry_pc · 18e21eb6
      Masami Hiramatsu 提交于
      Fix 'perf probe --line' option to show inlined function callsite lines
      even if the function DIE has only ranges.
      
      Without this:
      
        # perf probe -L amd_put_event_constraints
        ...
            2  {
            3         if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw))
                              __amd_put_nb_event_constraints(cpuc, event);
            5  }
      
      With this patch:
      
        # perf probe -L amd_put_event_constraints
        ...
            2  {
            3         if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw))
            4                 __amd_put_nb_event_constraints(cpuc, event);
            5  }
      
      Committer testing:
      
      Before:
      
        [root@quaco ~]# perf probe -L amd_put_event_constraints
        <amd_put_event_constraints@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/arch/x86/events/amd/core.c:0>
              0  static void amd_put_event_constraints(struct cpu_hw_events *cpuc,
                                                      struct perf_event *event)
              2  {
              3         if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw))
                                __amd_put_nb_event_constraints(cpuc, event);
              5  }
      
                 PMU_FORMAT_ATTR(event, "config:0-7,32-35");
                 PMU_FORMAT_ATTR(umask, "config:8-15"   );
      
        [root@quaco ~]#
      
      After:
      
        [root@quaco ~]# perf probe -L amd_put_event_constraints
        <amd_put_event_constraints@/usr/src/debug/kernel-5.2.fc30/linux-5.2.18-200.fc30.x86_64/arch/x86/events/amd/core.c:0>
              0  static void amd_put_event_constraints(struct cpu_hw_events *cpuc,
                                                      struct perf_event *event)
              2  {
              3         if (amd_has_nb(cpuc) && amd_is_nb_event(&event->hw))
              4                 __amd_put_nb_event_constraints(cpuc, event);
              5  }
      
                 PMU_FORMAT_ATTR(event, "config:0-7,32-35");
                 PMU_FORMAT_ATTR(umask, "config:8-15"   );
      
        [root@quaco ~]# perf probe amd_put_event_constraints:4
        Added new event:
          probe:amd_put_event_constraints (on amd_put_event_constraints:4)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:amd_put_event_constraints -aR sleep 1
      
        [root@quaco ~]#
      
        [root@quaco ~]# perf probe -l
          probe:amd_put_event_constraints (on amd_put_event_constraints:4@arch/x86/events/amd/core.c)
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
        [root@quaco ~]#
      
      Using it:
      
        [root@quaco ~]# perf trace -e probe:*
        ^C[root@quaco ~]#
      
      Ok, Intel system here... :-)
      
      Fixes: 4cc9cec6 ("perf probe: Introduce lines walker interface")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157199322107.8075.12659099000567865708.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      18e21eb6
    • M
      perf probe: Fix to list probe event with correct line number · 3895534d
      Masami Hiramatsu 提交于
      Since debuginfo__find_probe_point() uses dwarf_entrypc() for finding the
      entry address of the function on which a probe is, it will fail when the
      function DIE has only ranges attribute.
      
      To fix this issue, use die_entrypc() instead of dwarf_entrypc().
      
      Without this fix, perf probe -l shows incorrect offset:
      
        # perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579263632@work/linux/linux/kernel/cpu.c)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask+18446744071579263752@work/linux/linux/kernel/cpu.c)
      
      With this:
      
        # perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@work/linux/linux/kernel/cpu.c)
          probe:clear_tasks_mm_cpumask_1 (on clear_tasks_mm_cpumask:21@work/linux/linux/kernel/cpu.c)
      
      Committer testing:
      
      Before:
      
        [root@quaco ~]# perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask+18446744071579765152@kernel/cpu.c)
        [root@quaco ~]#
      
      After:
      
        [root@quaco ~]# perf probe -l
          probe:clear_tasks_mm_cpumask (on clear_tasks_mm_cpumask@kernel/cpu.c)
        [root@quaco ~]#
      
      Fixes: 1d46ea2a ("perf probe: Fix listing incorrect line number with inline function")
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/157199321227.8075.14655572419136993015.stgit@devnote2Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3895534d