1. 02 Apr, 2022 1 commit
    • I
      perf evlist: Rename cpus to user_requested_cpus · 0df6ade7
      Ian Rogers committed
      evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
      of all evsels.
      
      For non-task targets, cpus is set to be cpus requested from the command
      line, defaulting to all online cpus if no cpus are specified.
      
      For an uncore event, all_cpus may be just CPU 0 or every online CPU.
      
      This causes all_cpus to have fewer values than the cpus variable which
      is confusing given the 'all' in the name.
      
      To try to make the behavior clearer, rename cpus to user_requested_cpus
      and add comments on the two struct variables.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0df6ade7
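The relationship between the two maps can be illustrated with a minimal sketch. It assumes CPU maps are plain sorted arrays of CPU numbers; `demo_cpus_union()` is a hypothetical helper for illustration, not a perf function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: merge two sorted, duplicate-free CPU arrays into
 * 'out' and return the number of entries written. 'out' must have room
 * for na + nb entries.
 */
static size_t demo_cpus_union(const int *a, size_t na,
                              const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(out != NULL);
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            out[n++] = a[i++];
        } else if (a[i] > b[j]) {
            out[n++] = b[j++];
        } else {
            out[n++] = a[i]; /* present in both maps: keep one copy */
            i++;
            j++;
        }
    }
    while (i < na)
        out[n++] = a[i++];
    while (j < nb)
        out[n++] = b[j++];
    return n;
}
```

If the user requests CPUs 0-3 but the only evsel is an uncore event opened on CPU 0, the union over the evsel maps is just {0}, so the "all" map ends up smaller than the user-requested one, which is the confusion the rename addresses.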
  2. 19 Mar, 2022 1 commit
  3. 11 Feb, 2022 1 commit
  4. 18 Jan, 2022 2 commits
    • A
      perf evlist: No need to setup affinities when disabling events for pid targets · 0d3d2376
      Arnaldo Carvalho de Melo committed
      When the target is a pid, not started by 'perf stat' we need to disable
      the events, and in that case there is no need to setup affinities as we
      use a dummy CPU map, with just one entry set to -1.
      
      So stop doing it to avoid this needless call to sched_getaffinity():
      
        # strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
        <SNIP>
        sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
         > /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
         > /var/home/acme/bin/perf(affinity__setup+0x6a) [0x532a2a]
         > /var/home/acme/bin/perf(__evlist__disable.constprop.0+0x27) [0x4b9827]
         > /var/home/acme/bin/perf(cmd_stat+0x29b5) [0x431725]
         > /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
         > /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
         > /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
         > /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
        <SNIP>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220117160931.1191712-5-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0d3d2376
    • A
      perf evlist: No need to setup affinities when enabling events for pid targets · f350ee95
      Arnaldo Carvalho de Melo committed
      When the target is a pid, not started by 'perf stat' we need to enable
      the events, and in that case there is no need to setup affinities as we
      use a dummy CPU map, with just one entry set to -1.
      
      So stop doing it to avoid this needless call to sched_getaffinity():
      
        # strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
        <SNIP>
        sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
         > /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
         > /var/home/acme/bin/perf(affinity__setup+0x6a) [0x5329ca]
         > /var/home/acme/bin/perf(__evlist__enable.constprop.0+0x23) [0x4b9693]
         > /var/home/acme/bin/perf(enable_counters+0x14d) [0x42de5d]
         > /var/home/acme/bin/perf(cmd_stat+0x2358) [0x4310c8]
         > /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
         > /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
         > /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
         > /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
        <SNIP>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220117160931.1191712-4-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f350ee95
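The check described in the two commits above can be sketched as follows. This is only an illustration: it assumes a CPU map is an array of CPU numbers, and `demo_cpu_map_is_dummy()` and `demo_needs_affinity()` are hypothetical names rather than the functions perf actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: a "dummy" map has exactly one entry, set to -1 (any CPU). */
static bool demo_cpu_map_is_dummy(const int *cpus, size_t nr)
{
    assert(cpus != NULL);
    return nr == 1 && cpus[0] == -1;
}

/*
 * Hypothetical: affinity setup (and the sched_getaffinity() call behind
 * it) only pays off when there are real CPUs to migrate between.
 */
static bool demo_needs_affinity(const int *cpus, size_t nr)
{
    return !demo_cpu_map_is_dummy(cpus, nr);
}
```

With a pid target the map is `{ -1 }`, so `demo_needs_affinity()` returns false and the enable or disable loop can run without touching the affinity mask.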
  5. 16 Jan, 2022 1 commit
  6. 13 Jan, 2022 2 commits
    • I
      perf cpumap: Give CPUs their own type · 6d18804b
      Ian Rogers committed
      A common problem is confusing CPU map indices with the CPU. Wrapping
      the CPU in a struct avoids this. The approach is similar to
      atomic_t.
      
      Committer notes:
      
      To make it build with BUILD_BPF_SKEL=1 these files needed the
      conversions to 'struct perf_cpu' usage:
      
        tools/perf/util/bpf_counter.c
        tools/perf/util/bpf_counter_cgroup.c
        tools/perf/util/bpf_ftrace.c
      
      Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
      cpu_map__build_map to cpu function".
      
      Additionally these needed to be fixed for the ARM builds to complete:
      
        tools/perf/arch/arm/util/cs-etm.c
        tools/perf/arch/arm64/util/pmu.c
      Suggested-by: John Garry <john.garry@huawei.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6d18804b
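A minimal sketch of the wrapping idea, assuming a map that is a simple array of CPU numbers. The names `struct demo_cpu`, `struct demo_cpu_map`, `demo_map_cpu()` and `demo_map_idx()` are hypothetical stand-ins; only 'struct perf_cpu' itself is named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: a CPU number is no longer interchangeable with an int. */
struct demo_cpu {
    int cpu;
};

/* Hypothetical map: position in 'cpus' is the index, the value is the CPU. */
struct demo_cpu_map {
    int nr;
    const int *cpus;
};

/* Index -> CPU. Returns { -1 } when the index is out of range. */
static struct demo_cpu demo_map_cpu(const struct demo_cpu_map *map, int idx)
{
    struct demo_cpu result = { .cpu = -1 };

    assert(map != NULL);
    if (idx >= 0 && idx < map->nr)
        result.cpu = map->cpus[idx];
    return result;
}

/* CPU -> index. Returns -1 when the CPU is not in the map. */
static int demo_map_idx(const struct demo_cpu_map *map, struct demo_cpu cpu)
{
    assert(map != NULL);
    for (int i = 0; i < map->nr; i++) {
        if (map->cpus[i] == cpu.cpu)
            return i;
    }
    return -1;
}
```

Because `demo_map_idx()` takes a `struct demo_cpu`, passing a bare index such as `demo_map_idx(&map, 2)` fails to compile, so the index/CPU mix-up is caught by the type checker instead of at run time.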
    • I
      perf evlist: Refactor evlist__for_each_cpu() · 472832d2
      Ian Rogers committed
      Previously evlist__for_each_cpu() needed to iterate over the evlist in
      an inner loop and call "skip" routines. Refactor this so that the
      iterator is smarter and the next function can update both the current CPU
      and evsel.
      
      By using a cpu map index, fix apparent off-by-1 in __run_perf_stat's
      call to perf_evsel__close_cpu().
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-35-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      472832d2
  7. 12 Aug, 2021 2 commits
    • J
      perf tools: Enable on a list of CPUs for hybrid · 1d3351e6
      Jin Yao committed
      The 'perf record' and 'perf stat' commands have supported the option
      '-C/--cpus' to count or collect only on the list of CPUs provided. This
      option needs to be supported for hybrid as well.
      
      For hybrid support, it needs to check that the CPUs in the list are available
      on the hybrid PMU. For example, on AlderLake, cpu0-7 are 'cpu_core' and cpu8-11
      are 'cpu_atom'.
      
      Before:
      
        # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
      
         Performance counter stats for 'CPU(s) 11':
      
           <not supported>      cpu_core/cycles/
      
               1.006179431 seconds time elapsed
      
      The 'perf stat' command silently returned "<not supported>" without any
      helpful information. It should error out, pointing out that cpu11
      was not 'cpu_core'.
      
      After:
      
        # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
        WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
        failed to use cpu list 11
      
      We also need to support the events without pmu prefix specified.
      
        # perf stat -e cycles -C11 -- sleep 1
        WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
      
         Performance counter stats for 'CPU(s) 11':
      
                 1,067,373      cpu_atom/cycles/
      
               1.005544738 seconds time elapsed
      
      The perf tool creates two cycles events automatically, cpu_core/cycles/ and
      cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
      for cpu_core/cycles/ and only counts cpu_atom/cycles/.
      
      If some of the cpus are 'cpu_core' and some are 'cpu_atom', for example,
      
        # perf stat -e cycles -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 1,914,704      cpu_core/cycles/
                 2,036,983      cpu_atom/cycles/
      
               1.005815641 seconds time elapsed
      
      It now automatically selects cpu0 for cpu_core/cycles/ and cpu11 for
      cpu_atom/cycles/, and prints some warnings.
      
      Some more complex examples,
      
        # perf stat -e cycles,instructions -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
        WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 2,780,387      cpu_core/cycles/
                 1,583,432      cpu_atom/cycles/
                 3,957,277      cpu_core/instructions/
                 1,167,089      cpu_atom/instructions/
      
               1.006005124 seconds time elapsed
      
        # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 3,290,301      cpu_core/cycles/
                 1,953,073      cpu_atom/cycles/
                 1,407,869      cpu_atom/instructions/
      
               1.006260912 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210723063433.7318-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1d3351e6
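The per-PMU selection shown in the examples above amounts to intersecting the requested CPU list with the CPUs a hybrid PMU covers. A minimal sketch, assuming each PMU covers one contiguous range as in the AlderLake example (0-7 and 8-11); `demo_filter_pmu_cpus()` is a hypothetical helper, not a perf function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical: copy into 'out' the requested CPUs that fall inside the
 * PMU's range [first, last] and return how many matched. 'out' must have
 * room for 'nr' entries. A return of 0 means none of the requested CPUs
 * belong to this PMU.
 */
static size_t demo_filter_pmu_cpus(const int *requested, size_t nr,
                                   int first, int last, int *out)
{
    size_t matched = 0;

    assert(requested != NULL && out != NULL);
    for (size_t i = 0; i < nr; i++) {
        if (requested[i] >= first && requested[i] <= last)
            out[matched++] = requested[i];
    }
    return matched;
}
```

For `-C0,11` this yields {0} for the 0-7 range and {11} for the 8-11 range, matching the "use 0 in 'cpu_core'" and "use 11 in 'cpu_atom'" warnings. For `-C11` the 0-7 range yields nothing, which corresponds to the case where the cpu_core event cannot be counted.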
    • J
      perf tools: Create hybrid flag in target · b726e363
      Jin Yao committed
      The user may count or collect only on a cpu list via '-C/--cpus' option.
      
      Previously cpus for an evsel were retrieved from PMU's sysfs. But if the
      target cpu list is defined, the retrieved cpus are not kept and the
      target cpu list is used instead.
      
      But for hybrid system, we can't directly use target cpu list. The cpu
      list may not be available on hybrid pmu (e.g. cpu_core or cpu_atom).  So
      we should not set the 'has_user_cpus' flag for hybrid system.
      
      The difficulty is that we can't call perf_pmu__has_hybrid() in evlist.c
      to check for a hybrid system, otherwise 'perf test python' would fail
      (undefined symbol for perf_pmu__has_hybrid). If we add pmu.c to
      python-ext-sources, too many symbol dependencies are hard to resolve.
      
      We use an alternative method by using a new 'hybrid' flag in target
      for hybrid system checking.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210723063433.7318-3-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b726e363
  8. 10 Jul, 2021 4 commits
  9. 01 Jun, 2021 1 commit
  10. 22 May, 2021 1 commit
  11. 29 Apr, 2021 2 commits
    • J
      perf record: Create two hybrid 'cycles' events by default · b53a0755
      Jin Yao committed
      When evlist is empty, for example no '-e' specified in perf record,
      one default 'cycles' event is added to evlist.
      
      While on hybrid platform, it needs to create two default 'cycles'
      events. One is for cpu_core, the other is for cpu_atom.
      
      This patch actually calls evsel__new_cycles() two times to create
      two 'cycles' events.
      
        # ./perf record -vv -a -- sleep 1
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
        sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
        sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
        sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
        sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
        sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
        sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
        sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
        sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
        sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
        sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
        sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
        sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
        sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
        sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
        sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
        sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
        sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
        ------------------------------------------------------------
      
      We have to create evlist-hybrid.c, otherwise 'perf test python' would
      fail because of the symbol dependency.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b53a0755
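The two `config` values in the dump above differ only in their upper 32 bits. One reading, stated here as an assumption rather than something the commit message spells out, is that the upper half carries the hybrid PMU's type and the lower half the hardware event id (0 for cycles). The sketch below uses hypothetical names (`DEMO_PMU_TYPE_SHIFT`, `demo_hybrid_config()`), and the type values 4 and 8 are simply read off the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: PMU type lives in bits 32 and up of attr.config. */
#define DEMO_PMU_TYPE_SHIFT 32

/* Hypothetical: combine a PMU type with a 32-bit hardware event id. */
static uint64_t demo_hybrid_config(uint32_t pmu_type, uint64_t hw_id)
{
    assert(hw_id <= UINT32_MAX); /* the id must fit in the lower half */
    return ((uint64_t)pmu_type << DEMO_PMU_TYPE_SHIFT) | hw_id;
}
```

Under that assumption, `demo_hybrid_config(4, 0)` gives 0x400000000 and `demo_hybrid_config(8, 0)` gives 0x800000000, the two values seen for the events opened on cpus 0-15 and cpus 16-23 respectively.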
    • S
      perf stat: Introduce bpf_counter_ops->disable() · 5508c9da
      Song Liu committed
      Introduce bpf_counter_ops->disable(), which is used to stop counting the
      event.
      
      Committer notes:
      
      Added a dummy bpf_counter__disable() to the python binding to avoid
      having 'perf test python' failing.
      
      bpf_counter isn't supported in the python binding.
      Signed-off-by: Song Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: kernel-team@fb.com
      Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5508c9da
  12. 16 Apr, 2021 1 commit
  13. 24 Mar, 2021 1 commit
  14. 15 Mar, 2021 1 commit
    • A
      perf evlist: Change the COMM when preparing the workload · a7672d1d
      Arnaldo Carvalho de Melo committed
      It was reported that --exclude-perf wasn't working, as tracepoints were
      appearing in 'perf script' output as having the 'perf' COMM, that is
      just the window in evlist__prepare_workload() after the fork() and
      before the execvp() call for workloads specified in the command line.
      
      Example:
      
        # perf record -e kmem:kmalloc --filter 'bytes_alloc<650 && bytes_alloc>620' --exclude-perf -e kmem:kfree --exclude-perf -aR sleep 30
      
      Then:
      
        # perf script
                perf 15905 [009] 1498.356094: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
                perf 15905 [009] 1498.356116: kmem:kfree: call_site=free_bprm+0x8f ptr=(nil)
                perf 15905 [009] 1498.356116: kmem:kfree: call_site=do_execveat_common+0x19d ptr=0xffff9cf750421c00
                perf 15905 [009] 1498.356138: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
                perf 15905 [009] 1498.356148: kmem:kfree: call_site=free_bprm+0x8f ptr=(nil)
                perf 15905 [009] 1498.356148: kmem:kfree: call_site=do_execveat_common+0x19d ptr=0xffff9cf750421c00
                perf 15905 [009] 1498.356168: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
                perf 15905 [009] 1498.356176: kmem:kfree: call_site=free_bprm+0x8f ptr=(nil)
        <SNIP>
                perf 15905 [009] 1498.356348: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
                perf 15905 [014] 1498.356386: kmem:kfree: call_site=security_compute_sid.part.0+0x3b2 ptr=(nil)
                perf 15905 [014] 1498.356423: kmem:kfree: call_site=load_elf_binary+0x207 ptr=0xffff9cf5b2a34220
                perf 15905 [014] 1498.356694: kmem:kfree: call_site=__free_slab+0xb5 ptr=0xffff9cf6d0b3b000
               sleep 15905 [014] 1498.356739: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
      
      Use prctl() to show that that is just the preparation of the workload:
      
        # perf script
           perf-exec 19036 [009] 2199.357582: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
           perf-exec 19036 [009] 2199.357604: kmem:kfree: call_site=free_bprm+0x8f ptr=(nil)
           perf-exec 19036 [009] 2199.357604: kmem:kfree: call_site=do_execveat_common+0x19d ptr=0xffff9cf786459800
           perf-exec 19036 [009] 2199.357630: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
        <SNIP>
           perf-exec 19036 [000] 2199.358277: kmem:kfree: call_site=__free_slab+0xb5 ptr=0xffff9cf786fb9c00
           perf-exec 19036 [000] 2199.358278: kmem:kfree: call_site=__free_slab+0xb5 ptr=0xffff9cf786458200
           perf-exec 19036 [000] 2199.358279: kmem:kfree: call_site=__free_slab+0xb5 ptr=0xffff9cf786458600
               sleep 19036 [000] 2199.358316: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
               sleep 19036 [000] 2199.358323: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=(nil)
               sleep 19036 [000] 2199.358330: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=0xffff9cf58be2d000
               sleep 19036 [000] 2199.358337: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=0xffff9cf58be2d000
               sleep 19036 [000] 2199.358339: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=0xffff9cf58be2d000
               sleep 19036 [000] 2199.358341: kmem:kfree: call_site=perf_event_mmap+0x279 ptr=0xffff9cf58be2d000
      
      Reporter: zhanweiw <wingfancy@hotmail.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212213
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7672d1d
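The commit message only says that prctl() is used. A plausible sketch, assuming the PR_SET_NAME operation (Linux only), is shown below. `demo_set_comm()` and `demo_comm_is()` are hypothetical helpers. In the scenario described, the rename would happen in the child between fork() and execvp().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/prctl.h>

/* Hypothetical: rename the calling thread. Names longer than 15 bytes are truncated by the kernel. */
static int demo_set_comm(const char *name)
{
    assert(name != NULL);
    return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Hypothetical: check the calling thread's current COMM. */
static bool demo_comm_is(const char *expected)
{
    char comm[16] = { 0 }; /* PR_GET_NAME needs a buffer of at least 16 bytes */

    if (prctl(PR_GET_NAME, (unsigned long)comm, 0, 0, 0) != 0)
        return false;
    return strcmp(comm, expected) == 0;
}
```

After `demo_set_comm("perf-exec")`, events generated before the exec are attributed to 'perf-exec' rather than 'perf', which is what the second 'perf script' listing shows. The exec then replaces the COMM with the workload's name ('sleep' in the example).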
  15. 07 Mar, 2021 1 commit
    • N
      perf stat: Fix use-after-free when -r option is used · 513068f2
      Namhyung Kim committed
      I got a segfault when using -r option with event groups.  The option
      makes it run the workload multiple times and it will reuse the evlist
      and evsel for each run.
      
      While most of resources are allocated and freed properly, the id hash
      in the evlist was not and it resulted in the bug.  You can see it with
      the address sanitizer like below:
      
        $ perf stat -r 100 -e '{cycles,instructions}' true
        =================================================================
        ==693052==ERROR: AddressSanitizer: heap-use-after-free on
            address 0x6080000003d0 at pc 0x558c57732835 bp 0x7fff1526adb0 sp 0x7fff1526ada8
        WRITE of size 8 at 0x6080000003d0 thread T0
          #0 0x558c57732834 in hlist_add_head /home/namhyung/project/linux/tools/include/linux/list.h:644
          #1 0x558c57732834 in perf_evlist__id_hash /home/namhyung/project/linux/tools/lib/perf/evlist.c:237
          #2 0x558c57732834 in perf_evlist__id_add /home/namhyung/project/linux/tools/lib/perf/evlist.c:244
          #3 0x558c57732834 in perf_evlist__id_add_fd /home/namhyung/project/linux/tools/lib/perf/evlist.c:285
          #4 0x558c5747733e in store_evsel_ids util/evsel.c:2765
          #5 0x558c5747733e in evsel__store_ids util/evsel.c:2782
          #6 0x558c5730b717 in __run_perf_stat /home/namhyung/project/linux/tools/perf/builtin-stat.c:895
          #7 0x558c5730b717 in run_perf_stat /home/namhyung/project/linux/tools/perf/builtin-stat.c:1014
          #8 0x558c5730b717 in cmd_stat /home/namhyung/project/linux/tools/perf/builtin-stat.c:2446
          #9 0x558c57427c24 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
          #10 0x558c572b1a48 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
          #11 0x558c572b1a48 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
          #12 0x558c572b1a48 in main /home/namhyung/project/linux/tools/perf/perf.c:539
          #13 0x7fcadb9f7d09 in __libc_start_main ../csu/libc-start.c:308
          #14 0x558c572b60f9 in _start (/home/namhyung/project/linux/tools/perf/perf+0x45d0f9)
      
      Actually the nodes in the hash table are struct perf_stream_id and
      they were freed in the previous run.  Fix it by resetting the hash.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20210225035148.778569-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      513068f2
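The failure mode, bucket heads that still point at nodes freed in the previous run, can be reproduced with a small chained hash. The sketch below is a simplified stand-in with hypothetical names (`struct demo_id_hash`, `demo_id_add()`, `demo_id_find()`, `demo_id_hash_reset()`). In this sketch the table owns its nodes, which differs from perf, where the nodes are freed elsewhere and only the heads need resetting.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_HASH_SIZE 8

struct demo_id_node {
    struct demo_id_node *next;
    unsigned long id;
};

struct demo_id_hash {
    struct demo_id_node *heads[DEMO_HASH_SIZE];
};

/* Insert at the head of the bucket; returns 0 on success, -1 on allocation failure. */
static int demo_id_add(struct demo_id_hash *hash, unsigned long id)
{
    struct demo_id_node *node = malloc(sizeof(*node));

    assert(hash != NULL);
    if (node == NULL)
        return -1;
    node->id = id;
    node->next = hash->heads[id % DEMO_HASH_SIZE];
    hash->heads[id % DEMO_HASH_SIZE] = node;
    return 0;
}

static struct demo_id_node *demo_id_find(const struct demo_id_hash *hash,
                                         unsigned long id)
{
    struct demo_id_node *node = hash->heads[id % DEMO_HASH_SIZE];

    while (node != NULL && node->id != id)
        node = node->next;
    return node;
}

/*
 * Free every node and, crucially, clear the bucket heads. Skipping the
 * "heads[i] = NULL" step would leave dangling pointers, and the next
 * demo_id_add() would link a new node to freed memory.
 */
static void demo_id_hash_reset(struct demo_id_hash *hash)
{
    for (int i = 0; i < DEMO_HASH_SIZE; i++) {
        struct demo_id_node *node = hash->heads[i];

        while (node != NULL) {
            struct demo_id_node *next = node->next;

            free(node);
            node = next;
        }
        hash->heads[i] = NULL;
    }
}
```

Calling `demo_id_hash_reset()` between repeated runs keeps each run starting from an empty table, which mirrors the "resetting the hash" fix described in the commit.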
  16. 19 Feb, 2021 1 commit
    • Y
      perf record: Fix continue profiling after draining the buffer · e16c2ce7
      Yang Jihong committed
      Commit da231338 ("perf record: Use an eventfd to wakeup when
      done") uses eventfd() to solve a rare race between the setting and
      checking of 'done', and adds done_fd to pollfd.  When draining the buffer,
      revents of done_fd is 0 and the evlist__filter_pollfd() function returns a
      non-zero value.  As a result, perf record does not stop profiling.
      
      The following simple scenarios can trigger this condition:
      
        # sleep 10 &
        # perf record -p $!
      
      After the sleep process exits, perf record should stop profiling and exit.
      However, perf record keeps running.
      
      If pollfd revents contains only POLLERR or POLLHUP, perf record
      treats the buffer as draining and needs to stop profiling.  Use
      fdarray_flag__nonfilterable() to mark the done eventfd as a nonfilterable
      object, so that evlist__filter_pollfd() does not filter and check the done
      eventfd.
      
      Fixes: da231338 ("perf record: Use an eventfd to wakeup when done")
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: zhangjinhao2@huawei.com
      Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e16c2ce7
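The effect of the nonfilterable flag can be sketched with a simplified filter over pollfd entries. `struct demo_fd` and `demo_filter_pollfd()` are hypothetical and much smaller than the real fdarray code. The return value plays the role of "how many descriptors are still worth polling".

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry: a pollfd plus the "never filter me" flag. */
struct demo_fd {
    struct pollfd pfd;
    bool nonfilterable;
};

/*
 * Retire entries whose revents intersect 'mask' (e.g. POLLERR | POLLHUP)
 * by setting fd to -1, and return the number of filterable entries that
 * remain. Nonfilterable entries are neither retired nor counted.
 */
static int demo_filter_pollfd(struct demo_fd *fds, int nr, short mask)
{
    int remaining = 0;

    assert(fds != NULL);
    for (int i = 0; i < nr; i++) {
        if (fds[i].nonfilterable)
            continue;
        if (fds[i].pfd.revents & mask) {
            fds[i].pfd.fd = -1; /* poll() ignores negative fds */
            continue;
        }
        if (fds[i].pfd.fd >= 0)
            remaining++;
    }
    return remaining;
}
```

With one event descriptor reporting POLLHUP and a done eventfd whose revents is 0, the filter returns 1 if the eventfd is treated like any other entry, so the record loop keeps going. Flagging the eventfd as nonfilterable makes the result 0 and lets the loop finish.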
  17. 04 Feb, 2021 1 commit
    • K
      perf stat: Add Topdown metrics events as default events · 42641d6f
      Kan Liang committed
      The Topdown Microarchitecture Analysis (TMA) Method is a structured
      analysis methodology to identify critical performance bottlenecks in
      out-of-order processors. From the Ice Lake and later platforms, the
      Topdown information can be retrieved from the dedicated "metrics"
      register, which isn't impacted by other events. Also, the Topdown
      metrics support both per thread/process and per core measuring.  Adding
      Topdown metrics events as default events can enrich the default
      measuring information, and would not cost any extra multiplexing.
      
      Introduce arch_evlist__add_default_attrs() to allow architecture
      specific default events. Add the Topdown metrics events in the X86
      specific arch_evlist__add_default_attrs(). Other architectures can add
      their own default events later separately.
      
      With the patch:
      
       $ perf stat sleep 1
      
       Performance counter stats for 'sleep 1':
      
                 0.82 msec task-clock:u              #    0.001 CPUs utilized
                    0      context-switches:u        #    0.000 K/sec
                    0      cpu-migrations:u          #    0.000 K/sec
                   61      page-faults:u             #    0.074 M/sec
              319,941      cycles:u                  #    0.388 GHz
              242,802      instructions:u            #    0.76  insn per cycle
               54,380      branches:u                #   66.028 M/sec
                4,043      branch-misses:u           #    7.43% of all branches
            1,585,555      slots:u                   # 1925.189 M/sec
              238,941      topdown-retiring:u        #     15.0% retiring
              410,378      topdown-bad-spec:u        #     25.8% bad speculation
              634,222      topdown-fe-bound:u        #     39.9% frontend bound
              304,675      topdown-be-bound:u        #     19.2% backend bound
      
             1.001791625 seconds time elapsed
      
             0.000000000 seconds user
             0.001572000 seconds sys
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 21 Jan, 2021 4 commits
    • J
      perf tools: Add 'ping' control command · 47fddcb4
      Jiri Olsa committed
      Add a control 'ping' command to detect if perf is up and its control
      interface is operational.
      
      It will be used in the following daemon patches to synchronize with
      the record session: when the control interface is up and running, we
      know that perf record is monitoring and ready to receive signals.
      
      Example session:
      
        terminal 1:
      
          # mkfifo control ack
          # perf record --control=fifo:control,ack
      
        terminal 2:
      
          # echo ping > control
          # cat ack
          ack
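The same handshake can be driven from a script. The following is a minimal sketch, assuming the `control` and `ack` FIFOs from the session above already exist and that `perf record --control=fifo:control,ack` is running. The helper names `send_control_command` and `perf_is_ready` are hypothetical, not part of perf.

```python
def send_control_command(control_path, ack_path, command):
    """Write one command line to the control file and return the reply line.

    Note: with real FIFOs both open() calls can block until the other
    side (perf record) has the FIFO open, and readline() waits for a reply.
    """
    with open(control_path, "w") as control:
        control.write(command + "\n")
    with open(ack_path) as ack:
        return ack.readline().strip()


def perf_is_ready(control_path="control", ack_path="ack"):
    """Return True if perf answers 'ping' with 'ack', as in the session above."""
    return send_control_command(control_path, ack_path, "ping") == "ack"
```

A caller would invoke `perf_is_ready()` from the directory holding the two FIFOs. The sketch has no timeout, so it waits indefinitely if perf is not running.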
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf tools: Add 'stop' control command · f186cd61
      Jiri Olsa committed
      Add a control 'stop' command to stop perf record.
      
      When it is received, perf will set the 'done' variable to 1 to stop its
      mmap ring buffer reading loop.
      
      Example session:
      
        terminal 1:
          # mkfifo control ack
          # perf record --control=fifo:control,ack
      
        terminal 2:
          # echo stop > control
      
        terminal 1:
          [ perf record: Woken up 7 times to write data ]
          [ perf record: Captured and wrote 3.214 MB perf.data (38280 samples) ]
          #
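The effect of the 'done' variable can be pictured with a toy model. This is only a sketch of the idea described above, not perf's code: `record_loop`, `batches` and `commands` are hypothetical names, each batch stands in for one pass over the mmap ring buffer, and each command stands in for whatever arrived on the control file during that pass.

```python
def record_loop(batches, commands):
    """Toy model: keep reading sample batches until 'stop' sets done."""
    done = False
    captured = []
    for batch, command in zip(batches, commands):
        captured.extend(batch)   # stands in for reading the ring buffer
        if command == "stop":    # stands in for the control-file handler
            done = True
        if done:                 # the loop exits once done is set
            break
    return captured


# Three passes are available, but 'stop' arrives during the second one.
samples = record_loop([[1, 2], [3], [4, 5]], [None, "stop", None])
print(samples)
```

In this model the samples read before and during the pass that sees 'stop' are kept, and later batches are never read.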
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf tools: Add 'evlist' control command · 142544a9
      Jiri Olsa committed
      Add a new 'evlist' control command to display all the evlist events.
      When it is received, perf will scan the current evlist and print it to
      the perf record terminal.
      
      The interface string for the control file is:
      
        evlist [-v|-g|-F]
      
      The syntax follows the perf evlist command:
        -F  Show just the sample frequency used for each event.
        -v  Show all fields.
        -g  Show event group information.
      
      Example session:
      
        terminal 1:
          # mkfifo control ack
          # perf record --control=fifo:control,ack -e '{cycles,instructions}'
      
        terminal 2:
          # echo evlist > control
      
        terminal 1:
          cycles
          instructions
          dummy:HG
      
        terminal 2:
          # echo 'evlist -v' > control
      
        terminal 1:
          cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type:            \
          IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1,    \
          sample_id_all: 1, exclude_guest: 1
          instructions: size: 120, config: 0x1, { sample_period, sample_freq }: 4000,      \
          sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, freq: 1,    \
          sample_id_all: 1, exclude_guest: 1
          dummy:HG: type: 1, size: 120, config: 0x9, { sample_period, sample_freq }: 4000, \
          sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, mmap: 1,    \
          comm: 1, freq: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, \
           bpf_event: 1
      
        terminal 2:
          # echo 'evlist -g' > control
      
        terminal 1:
          {cycles,instructions}
          dummy:HG
      
        terminal 2:
          # echo 'evlist -F' > control
      
        terminal 1:
          cycles: sample_freq=4000
          instructions: sample_freq=4000
          dummy:HG: sample_freq=4000
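Output in the 'evlist -F' form is easy to post-process. The following is a minimal sketch, assuming each line has the shape `<event name>: sample_freq=<N>` as in the session above; `parse_evlist_freq` is a hypothetical helper. Because event names may themselves contain colons (for example `dummy:HG`), the line is split on the last `": "` rather than the first.

```python
def parse_evlist_freq(lines):
    """Map event name -> sample frequency from 'evlist -F' style lines."""
    freqs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last ': ' so names such as 'dummy:HG' stay intact.
        name, _, setting = line.rpartition(": ")
        key, _, value = setting.partition("=")
        if name and key == "sample_freq":
            freqs[name] = int(value)
    return freqs


output = [
    "cycles: sample_freq=4000",
    "instructions: sample_freq=4000",
    "dummy:HG: sample_freq=4000",
]
freqs = parse_evlist_freq(output)
print(freqs)
```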
      
      This new evlist command is handy to get real event names when
      wildcards are used.
      
      Add the evsel_fprintf.c object to the python/perf.so build, because
      it is now a dependency of evlist.c.
      
      Add a PYTHON_PERF define for the python/perf.so compilation, so that
      only evsel__fprintf is compiled in from the evsel_fprintf.c
      object.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf tools: Allow to enable/disable events via control file · 991ae4eb
      Jiri Olsa committed
      Add new control commands to enable/disable a specific event.
      The interface strings for the control file are:
      
        'enable <EVENT NAME>'
        'disable <EVENT NAME>'
      
      When the command is received, perf scans the current evlist for
      <EVENT NAME> and, if the event is found, enables or disables it.
      
      Example session:
      
        terminal 1:
          # mkfifo control ack perf.pipe
          # perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:*' -o - > perf.pipe
      
        terminal 2:
          # cat perf.pipe | perf --no-pager script -i -
      
        terminal 1:
          Events disabled
      
        NOTE: The above message is shown only after the read side of the
        pipe ('>') is started on 'terminal 2'. The bash on 'terminal 1' does
        not execute perf before that, hence the delayed perf record message.
      
        terminal 3:
          # echo 'enable sched:sched_process_fork' > control
      
        terminal 1:
          event sched:sched_process_fork enabled
      
        terminal 2:
          bash 33349 [034] 149587.674295: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34056
          bash 33349 [034] 149588.239521: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34057
      
        terminal 3:
          # echo 'enable sched:sched_wakeup_new' > control
      
        terminal 1:
          event sched:sched_wakeup_new enabled
      
        terminal 2:
          bash 33349 [034] 149632.228023: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34059
          bash 33349 [034] 149632.228050:   sched:sched_wakeup_new: bash:34059 [120] success=1 CPU:036
          bash 33349 [034] 149633.950005: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34060
          bash 33349 [034] 149633.950030:   sched:sched_wakeup_new: bash:34060 [120] success=1 CPU:036
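The lookup described above can be modelled in a few lines. This is a sketch of the behaviour only, not perf's implementation: `handle_event_command` is a hypothetical name, the evlist is reduced to a dict of event name to enabled flag, and the "not found" and "unknown command" strings are invented for illustration. Only the `event <name> enabled` wording is taken from the session above.

```python
def handle_event_command(evlist, command):
    """Apply 'enable <EVENT NAME>' or 'disable <EVENT NAME>' to a toy evlist.

    evlist maps event name -> bool (True means enabled) and is updated
    in place. The returned string is the message to print.
    """
    action, _, name = command.partition(" ")
    if action not in ("enable", "disable") or not name:
        return "unknown command"              # hypothetical wording
    if name not in evlist:
        return f"event {name} not found"      # hypothetical wording
    evlist[name] = action == "enable"
    return f"event {name} {action}d"


evlist = {"sched:sched_process_fork": False, "sched:sched_wakeup_new": False}
message = handle_event_command(evlist, "enable sched:sched_process_fork")
print(message)
```

Keeping the state in a plain dict makes it obvious that only the named event changes while the rest of the evlist stays as it was.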
      
      Committer testing:
      
      If I use 'sched:*' and then enable all events, I can't get 'perf record'
      to react to further commands, so I tested it with:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
        Events enabled
        Events disabled
      
      And then it works as expected, so we need to fix this pre-existing
      problem.
      
      Another issue: we need to check if an event is already enabled or
      disabled and change the message to be clearer, i.e.:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
      
      If we receive a 'disable' command, then it should say:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
        Events already disabled
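The suggested message change amounts to comparing the requested state with the current one before reporting. The following is a minimal sketch of that check, assuming a single global enabled flag. `toggle_all` is a hypothetical helper, and the "Events already enabled" string is extrapolated from the "Events already disabled" example above.

```python
def toggle_all(enabled, command):
    """Return (new_state, message) for a global 'enable' or 'disable'.

    enabled is the current state; command is 'enable' or 'disable'.
    """
    want = command == "enable"
    state = "enabled" if want else "disabled"
    if enabled == want:
        # Nothing changes, so say so instead of repeating the old message.
        return enabled, f"Events already {state}"
    return want, f"Events {state}"


# perf was started with -D -1, so events begin disabled.
state, message = toggle_all(False, "disable")
print(message)
```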
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  19. 18 Dec, 2020 2 commits
  20. 01 Dec, 2020 10 commits