1. 06 May, 2020 4 commits
  2. 18 Apr, 2020 1 commit
    • perf callchain: Stitch LBR call stack · ff165628
      Committed by Kan Liang
      In LBR call stack mode, the depth of the reconstructed LBR call stack
      is limited to the number of LBR registers.
      
        For example, on skylake, the depth of reconstructed LBR call stack is
        always <= 32.
      
        # To display the perf.data header info, please use
        # --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 6K of event 'cycles'
        # Event count (approx.): 6487119731
        #
        # Children      Self  Command          Shared Object       Symbol
        # ........  ........  ...............  ..................
        # ................................
      
          99.97%    99.97%  tchain_edit      tchain_edit        [.] f43
                  |
                   --99.64%--f11
                             f12
                             f13
                             f14
                             f15
                             f16
                             f17
                             f18
                             f19
                             f20
                             f21
                             f22
                             f23
                             f24
                             f25
                             f26
                             f27
                             f28
                             f29
                             f30
                             f31
                             f32
                             f33
                             f34
                             f35
                             f36
                             f37
                             f38
                             f39
                             f40
                             f41
                             f42
                             f43
      
      For a call stack which is deeper than the LBR limit, the HW overwrites
      the LBR register that holds the oldest branch. Only partial call stacks
      can be reconstructed.
      
      However, the overwritten LBRs may still be retrieved from the previous
      sample. At that moment, the HW hadn't overwritten the LBR registers yet.
      Perf tools can stitch those overwritten LBRs onto the current call stack
      to get a more complete call stack.
      
      To determine if LBRs can be stitched, perf tools need to compare current
      sample with previous sample.
      
      - They should have identical LBR records (Same from, to and flags
        values, and the same physical index of LBR registers).
      
      - The searching starts from the base-of-stack of current sample.
      
      Once perf decides to stitch the previous LBRs, the corresponding LBR
      cursor nodes are copied to 'lists'. 'lists' tracks the LBR cursor nodes
      which are going to be stitched.
      
      When the stitching is over, the nodes are not freed immediately.
      They are moved to 'free_lists', so the next stitching may reuse the
      space. Both 'lists' and 'free_lists' are freed when all samples have
      been processed.
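
The matching step described above can be sketched roughly as follows. This is a simplified illustration, not the perf implementation: `struct lbr_entry` and `lbr_stitch_count()` are hypothetical names, the arrays are assumed to be ordered newest branch first, and the physical LBR index check mentioned in the message is left out, so the base of the stack is located by value only (which can be ambiguous for recursive code).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one LBR record (not perf's struct branch_entry). */
struct lbr_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Two records are identical when from, to and flags all match. */
static int lbr_entry_equal(const struct lbr_entry *a, const struct lbr_entry *b)
{
	return a->from == b->from && a->to == b->to && a->flags == b->flags;
}

/*
 * Both arrays are ordered newest branch first, so the base of the stack is
 * the last element. Returns how many entries of prev are older than the
 * entry that matches cur's base of stack, i.e. how many could be stitched
 * onto cur. Returns 0 when no match is found.
 * Assumption: the physical LBR index comparison is omitted here.
 */
size_t lbr_stitch_count(const struct lbr_entry *prev, size_t prev_nr,
			const struct lbr_entry *cur, size_t cur_nr)
{
	size_t i;

	if (prev_nr == 0 || cur_nr == 0)
		return 0;

	/* The search starts from the base of stack of the current sample. */
	for (i = 0; i < prev_nr; i++) {
		if (lbr_entry_equal(&prev[i], &cur[cur_nr - 1]))
			return prev_nr - 1 - i;
	}
	return 0;
}
```

With a four-entry limit, a previous sample holding branches 4, 3, 2, 1 and a current sample holding 6, 5, 4, 3 share branch 3 as the current base of stack, so the two older entries (2 and 1) are candidates for stitching.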
      
      Committer notes:
      
      Fix the intel-pt.c initialization of the union with 'struct
      branch_flags', that breaks the build with its unnamed union on older gcc
      versions.
      
      Uninline thread__free_stitch_list(), as it grew big and started dragging
      includes to thread.h, so move it to thread.c where what it needs in
      terms of headers are already there.
      
      This fixes the build in several systems such as debian:experimental when
      cross building to the MIPS32 architecture, i.e. in the other cases what
      was needed was being included by sheer luck.
      
        In file included from builtin-sched.c:11:
        util/thread.h: In function 'thread__free_stitch_list':
        util/thread.h:169:3: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
          169 |   free(pos);
              |   ^~~~
        util/thread.h:169:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
        util/thread.h:19:1: note: include '<stdlib.h>' or provide a declaration of 'free'
           18 | #include "callchain.h"
          +++ |+#include <stdlib.h>
           19 |
        util/thread.h:174:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
          174 |   free(pos);
              |   ^~~~
        util/thread.h:174:3: note: include '<stdlib.h>' or provide a declaration of 'free'
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200319202517.23423-13-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 16 Apr, 2020 2 commits
    • perf intel-pt: Add support for synthesizing callchains for regular events · 2855c05c
      Committed by Adrian Hunter
      Currently, callchains can be synthesized only for synthesized events.
      Support also synthesizing callchains for regular events.
      
      Example:
      
       # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
       Linux
       [ perf record: Woken up 3 times to write data ]
       [ perf record: Captured and wrote 0.532 MB perf.data ]
       # perf script --itrace=Ge | head -20
       uname  4864 2419025.358181:      10000     cycles:
              ffffffffbba56965 apparmor_bprm_committing_creds+0x35 ([kernel.kallsyms])
              ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
              ffffffffbba07422 security_bprm_committing_creds+0x22 ([kernel.kallsyms])
              ffffffffbb89805d install_exec_creds+0xd ([kernel.kallsyms])
              ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
      
       uname  4864 2419025.358185:      10000     cycles:
              ffffffffbba56db0 apparmor_bprm_committed_creds+0x20 ([kernel.kallsyms])
              ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
              ffffffffbba07452 security_bprm_committed_creds+0x22 ([kernel.kallsyms])
              ffffffffbb89809a install_exec_creds+0x4a ([kernel.kallsyms])
              ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
      
       uname  4864 2419025.358189:      10000     cycles:
              ffffffffbb86fdf6 vma_adjust_trans_huge+0x6 ([kernel.kallsyms])
              ffffffffbb821660 __vma_adjust+0x160 ([kernel.kallsyms])
              ffffffffbb897be7 shift_arg_pages+0x97 ([kernel.kallsyms])
              ffffffffbb897ed9 setup_arg_pages+0x1e9 ([kernel.kallsyms])
              ffffffffbb90d9f2 load_elf_binary+0x3f2 ([kernel.kallsyms])
      
      Committer testing:
      
        # perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.233 MB perf.data ]
        #
      
      Then, before this patch:
      
        # perf script --itrace=Ge | head -20
           uname 28642 168664.856384: 10000 cycles: ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
           uname 28642 168664.856388: 10000 cycles: ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
           uname 28642 168664.856392: 10000 cycles: ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
           uname 28642 168664.856396: 10000 cycles: ffffffff982fd4ec __mod_memcg_state+0x1c ([kernel.kallsyms])
           uname 28642 168664.856400: 10000 cycles: ffffffff9829fddd do_mmap+0xfd ([kernel.kallsyms])
           uname 28642 168664.856404: 10000 cycles: ffffffff9829c879 __vma_adjust+0x479 ([kernel.kallsyms])
           uname 28642 168664.856408: 10000 cycles: ffffffff98238e94 __perf_addr_filters_adjust+0x34 ([kernel.kallsyms])
           uname 28642 168664.856412: 10000 cycles: ffffffff98a38e0b down_write+0x1b ([kernel.kallsyms])
           uname 28642 168664.856416: 10000 cycles: ffffffff983006a0 memcg_kmem_get_cache+0x0 ([kernel.kallsyms])
           uname 28642 168664.856421: 10000 cycles: ffffffff98396eaf load_elf_binary+0x92f ([kernel.kallsyms])
           uname 28642 168664.856425: 10000 cycles: ffffffff982e0222 kfree+0x62 ([kernel.kallsyms])
           uname 28642 168664.856428: 10000 cycles: ffffffff9846dfd4 file_has_perm+0x54 ([kernel.kallsyms])
           uname 28642 168664.856433: 10000 cycles: ffffffff98288911 vma_interval_tree_insert+0x51 ([kernel.kallsyms])
           uname 28642 168664.856437: 10000 cycles: ffffffff9823e577 perf_event_mmap_output+0x27 ([kernel.kallsyms])
           uname 28642 168664.856441: 10000 cycles: ffffffff98a26fa0 xas_load+0x40 ([kernel.kallsyms])
           uname 28642 168664.856445: 10000 cycles: ffffffff98004f30 arch_setup_additional_pages+0x0 ([kernel.kallsyms])
           uname 28642 168664.856448: 10000 cycles: ffffffff98a297c0 copy_user_generic_unrolled+0xa0 ([kernel.kallsyms])
           uname 28642 168664.856452: 10000 cycles: ffffffff9853a87a strnlen_user+0x10a ([kernel.kallsyms])
           uname 28642 168664.856456: 10000 cycles: ffffffff986638a7 randomize_page+0x27 ([kernel.kallsyms])
           uname 28642 168664.856460: 10000 cycles: ffffffff98a3b645 _raw_spin_lock+0x5 ([kernel.kallsyms])
      
        #
      
      And after:
      
        # perf script --itrace=Ge | head -20
        uname 28642 168664.856384:      10000     cycles:
        	ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
        	ffffffff9831fe87 install_exec_creds+0x17 ([kernel.kallsyms])
        	ffffffff983968d9 load_elf_binary+0x359 ([kernel.kallsyms])
        	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
        	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
      
        uname 28642 168664.856388:      10000     cycles:
        	ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
        	ffffffff9831fa83 setup_arg_pages+0x123 ([kernel.kallsyms])
        	ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
        	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
        	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
      
        uname 28642 168664.856392:      10000     cycles:
        	ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
        	ffffffff9831f889 shift_arg_pages+0xa9 ([kernel.kallsyms])
        	ffffffff9831fb4f setup_arg_pages+0x1ef ([kernel.kallsyms])
        	ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
        	ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
        #
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20200401101613.6201-12-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Implement ->evsel_is_auxtrace() callback · 6b52bb07
      Committed by Adrian Hunter
      Implement ->evsel_is_auxtrace() callback.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20200401101613.6201-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 10 Mar, 2020 1 commit
    • perf tools: Add hw_idx in struct branch_stack · 42bbabed
      Committed by Kan Liang
      The low level index of raw branch records for the most recent branch can
      be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
      branch_sample_type. Extend struct branch_stack to support it.
      
      However, if PERF_SAMPLE_BRANCH_HW_INDEX is not applied, the kernel
      outputs only nr and entries[]. The pointer to entries[] could then be
      wrong, since the output format differs from the new struct
      branch_stack. Add a variable no_hw_idx in struct perf_sample to
      indicate whether hw_idx is output. Add get_branch_entry() to return
      the corresponding pointer to entries[0].
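
The layout difference can be illustrated with a small sketch. It assumes the branch stack is a run of 64-bit words (nr, then an optional hw_idx, then three words per entry); `struct sample_sketch`, `struct branch_entry_sketch` and `sample_branch_entry()` are hypothetical names for illustration, not the perf definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one branch record: three 64-bit words. */
struct branch_entry_sketch {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Hypothetical view of a sample: raw words plus the no_hw_idx marker. */
struct sample_sketch {
	const uint64_t *raw_branch_stack; /* nr, [hw_idx], entries... */
	bool no_hw_idx;                   /* true when hw_idx was not output */
};

/*
 * Return entry i, skipping one header word (nr) when hw_idx is absent and
 * two header words (nr, hw_idx) when it is present.
 */
struct branch_entry_sketch sample_branch_entry(const struct sample_sketch *s,
					       uint64_t i)
{
	const uint64_t *first = s->raw_branch_stack + (s->no_hw_idx ? 1 : 2);
	struct branch_entry_sketch e;

	/* memcpy avoids reading the word buffer through a struct pointer. */
	memcpy(&e, first + 3 * i, sizeof(e));
	return e;
}
```

Reading both formats through one accessor keeps callers from indexing entries[] at the wrong offset when hw_idx is missing.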
      
      To make the dummy branch sample consistent with the new branch sample,
      add hw_idx in struct dummy_branch_stack for cs-etm and intel-pt.
      
      Apply the new struct branch_stack for synthetic events as well.
      
      Extend test case sample-parsing to support new struct branch_stack.
      
      Committer notes:
      
      Renamed get_branch_entries() to perf_sample__branch_entries() to have
      proper namespacing and pave the way for this to be moved to libperf,
      eventually.
      
      Add 'static' to that inline as it is in a header.
      
      Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
      on arm64.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
      Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 26 Nov, 2019 1 commit
    • perf maps: Merge 'struct maps' with 'struct map_groups' · 79b6bb73
      Committed by Arnaldo Carvalho de Melo
      And pick the shortest name: 'struct maps'.
      
      The split existed because we used to have two groups of maps, one for
      functions and one for variables, but that only complicated things,
      sometimes we needed to figure out what was at some address and then had
      to first try it on the functions group and if that failed, fall back to
      the variables one.
      
      That split is long gone, so for quite a while we have had only one
      struct maps per struct map_groups. Simplify things by combining those
      structs.
      
      First patch is the minimum needed to merge both, follow up patches will
      rename 'thread->mg' to 'thread->maps', etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 22 Nov, 2019 1 commit
  7. 25 Sep, 2019 2 commits
  8. 20 Sep, 2019 1 commit
  9. 01 Sep, 2019 1 commit
  10. 30 Aug, 2019 1 commit
  11. 29 Aug, 2019 2 commits
  12. 14 Aug, 2019 1 commit
  13. 30 Jul, 2019 3 commits
    • libperf: Move perf_event_attr field from perf's evsel to libperf's perf_evsel · 1fc632ce
      Committed by Jiri Olsa
      Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.
      
      Committer notes:
      
      Fixed up these:
      
       tools/perf/arch/arm/util/auxtrace.c
       tools/perf/arch/arm/util/cs-etm.c
       tools/perf/arch/arm64/util/arm-spe.c
       tools/perf/arch/s390/util/auxtrace.c
       tools/perf/util/cs-etm.c
      
      Also
      
        cc1: warnings being treated as errors
        tests/sample-parsing.c: In function 'do_test':
        tests/sample-parsing.c:162: error: missing initializer
        tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
      
         	struct evsel evsel = {
         		.needs_swap = false,
        -		.core.attr = {
        -			.sample_type = sample_type,
        -			.read_format = read_format,
        +		.core = {
        +			. attr = {
        +				.sample_type = sample_type,
        +				.read_format = read_format,
        +			},
      
        [perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
        gcc (GCC) 4.4.7
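
The fix-up above replaces the dotted designator `.core.attr` with nested designated initializers. A minimal sketch of the nested form follows, using hypothetical cut-down structs (`evsel_sketch`, `core_sketch`, `attr_sketch`) rather than the real perf types; members that are not named, such as `cpus`, are zero-initialized.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cut-down versions of the structs named in the message. */
struct attr_sketch {
	uint64_t sample_type;
	uint64_t read_format;
};

struct core_sketch {
	struct attr_sketch attr;
	void *cpus;
};

struct evsel_sketch {
	struct core_sketch core;
	bool needs_swap;
};

/* Build an evsel using the nested initializer form from the fix-up. */
struct evsel_sketch make_evsel(uint64_t sample_type, uint64_t read_format)
{
	struct evsel_sketch evsel = {
		.needs_swap = false,
		.core = {
			.attr = {
				.sample_type = sample_type,
				.read_format = read_format,
			},
		},
	};

	return evsel;
}
```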
      
      Also we don't need to include perf_event.h in
      tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
      perf_event_attr' is enough. And this even fixes the build in some
      systems where things are used somewhere down the include path from
      perf_event.h without defining __always_inline.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evlist: Rename struct perf_evlist to struct evlist · 63503dba
      Committed by Jiri Olsa
      Rename struct perf_evlist to struct evlist, so we don't have a name
      clash when we add struct perf_evlist in libperf.
      
      Committer notes:
      
      Added fixes to build on arm64, from Jiri and from me
      (tools/perf/util/cs-etm.c)
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Rename struct perf_evsel to struct evsel · 32dcd021
      Committed by Jiri Olsa
      Rename struct perf_evsel to struct evsel, so we don't have a name clash
      when we add struct perf_evsel in libperf.
      
      Committer notes:
      
      Added fixes for arm64, provided by Jiri.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 09 Jul, 2019 2 commits
    • perf intel-pt: Fix potential NULL pointer dereference found by the smatch tool · 323fd749
      Committed by Leo Yan
      Based on the following report from Smatch, fix the potential NULL
      pointer dereference check.
      
        tools/perf/util/intel-pt.c:3200
        intel_pt_process_auxtrace_info() error: we previously assumed
        'session->itrace_synth_opts' could be null (see line 3196)
      
        tools/perf/util/intel-pt.c:3206
        intel_pt_process_auxtrace_info() warn: variable dereferenced before
        check 'session->itrace_synth_opts' (see line 3200)
      
        tools/perf/util/intel-pt.c
        3196         if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
        3197                 pt->synth_opts = *session->itrace_synth_opts;
        3198         } else {
        3199                 itrace_synth_opts__set_default(&pt->synth_opts,
        3200                                 session->itrace_synth_opts->default_no_sample);
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^
        3201                 if (!session->itrace_synth_opts->default_no_sample &&
        3202                     !session->itrace_synth_opts->inject) {
        3203                         pt->synth_opts.branches = false;
        3204                         pt->synth_opts.callchain = true;
        3205                 }
        3206                 if (session->itrace_synth_opts)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
        3207                         pt->synth_opts.thread_stack =
        3208                                 session->itrace_synth_opts->thread_stack;
        3209         }
      
      'session->itrace_synth_opts' cannot be a NULL pointer in
      intel_pt_process_auxtrace_info(), thus this patch removes the NULL test
      for 'session->itrace_synth_opts'.
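
The resulting shape of the logic, with the NULL tests dropped, might look like the sketch below. The struct and both functions are hypothetical stand-ins: `set_default_sketch()` in particular only approximates what itrace_synth_opts__set_default() could do and is an assumption, as is the requirement that the caller always passes a non-NULL `src`.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical cut-down option set; not perf's struct itrace_synth_opts. */
struct synth_opts_sketch {
	bool set;
	bool default_no_sample;
	bool inject;
	bool branches;
	bool callchain;
	bool thread_stack;
};

/* Assumed stand-in for itrace_synth_opts__set_default(). */
static void set_default_sketch(struct synth_opts_sketch *opts, bool no_sample)
{
	memset(opts, 0, sizeof(*opts));
	opts->default_no_sample = no_sample;
	opts->branches = !no_sample;
}

/*
 * src must not be NULL; the caller guarantees this, so neither branch
 * tests the pointer before dereferencing it.
 */
void choose_synth_opts(struct synth_opts_sketch *dst,
		       const struct synth_opts_sketch *src)
{
	if (src->set) {
		*dst = *src;
		return;
	}

	set_default_sketch(dst, src->default_no_sample);
	if (!src->default_no_sample && !src->inject) {
		dst->branches = false;
		dst->callchain = true;
	}
	dst->thread_stack = src->thread_stack;
}
```

Dropping the checks on both paths removes the inconsistency Smatch reported, where the pointer was tested in one place and dereferenced unconditionally in another.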
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190708143937.7722-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools lib: Adopt zalloc()/zfree() from tools/perf · 7f7c536f
      Committed by Arnaldo Carvalho de Melo
      Eroding a bit more the tools/perf/util/util.h hodgepodge header.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 25 Jun, 2019 1 commit
  16. 18 Jun, 2019 9 commits
  17. 11 Jun, 2019 3 commits
  18. 05 Jun, 2019 2 commits
  19. 29 May, 2019 2 commits