1. 13 Jan, 2022 6 commits
    • I
      perf evlist: Refactor evlist__for_each_cpu() · 472832d2
      Ian Rogers committed
      Previously evlist__for_each_cpu() needed to iterate over the evlist in
      an inner loop and call "skip" routines. Refactor this so that the
      iterator is smarter and the next function can update both the current CPU
      and evsel.
      
      By using a cpu map index, fix apparent off-by-1 in __run_perf_stat's
      call to perf_evsel__close_cpu().
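
The iterator idea described above can be sketched as follows. This is only an illustration under simplified assumptions: `struct toy_event`, `struct toy_itr`, `toy_itr_init()` and `toy_itr_next()` are hypothetical stand-ins, not the perf API. The point is that the next function advances both the CPU map index and the event, so the caller needs neither an inner loop nor a "skip" helper.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an evsel: which CPU map indexes it is open on. */
struct toy_event {
	const bool *on_cpu_idx; /* on_cpu_idx[i] is true if open on CPU map index i */
};

/* Hypothetical iterator state: tracks the CPU map index and the event together. */
struct toy_itr {
	const struct toy_event *events;
	size_t nr_events;
	size_t nr_cpu_idx;
	size_t cpu_idx;   /* current CPU map index */
	size_t event_idx; /* current event */
};

/* Position the iterator just before the first (cpu_idx, event) pair. */
void toy_itr_init(struct toy_itr *it, const struct toy_event *events,
		  size_t nr_events, size_t nr_cpu_idx)
{
	it->events = events;
	it->nr_events = nr_events;
	it->nr_cpu_idx = nr_cpu_idx;
	it->cpu_idx = 0;
	it->event_idx = (size_t)-1; /* wraps to 0 on the first increment */
}

/*
 * Advance to the next event that is open on the current CPU map index,
 * moving on to the next index once the event list is exhausted.
 * Returns false when every (cpu_idx, event) pair has been visited.
 */
bool toy_itr_next(struct toy_itr *it)
{
	while (it->cpu_idx < it->nr_cpu_idx) {
		for (it->event_idx++; it->event_idx < it->nr_events; it->event_idx++) {
			if (it->events[it->event_idx].on_cpu_idx[it->cpu_idx])
				return true;
		}
		it->cpu_idx++;
		it->event_idx = (size_t)-1;
	}
	return false;
}
```

A caller would loop with `while (toy_itr_next(&it))` and use `it.cpu_idx` (an index into the CPU map, not a CPU number) together with `it.event_idx`.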
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-35-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf cpumap: Rename cpu_map__get_X_aggr_by_cpu functions · 973aeb3c
      Ian Rogers committed
      The functions don't use a cpu_map so reduce them to being like
      constructors of aggr_cpu_id.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-20-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf cpumap: Refactor cpu_map__build_map() · 5f50e15c
      Ian Rogers committed
      Turn it into a cpu_aggr_map__new(). Pass helper functions. Refactor
      builtin-stat calls to manually pass function pointers. Try to reduce
      some copy-paste code.
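
As a rough sketch of the "constructor plus helper function" shape described above, assuming a heavily simplified map: `struct toy_aggr_map`, `toy_aggr_map__new()` and `toy_get_core_id()` are hypothetical names for illustration and do not mirror the real `cpu_aggr_map__new()` signature. The constructor takes a function pointer that maps a CPU to an aggregation id and collects the distinct ids.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified aggregation map: the distinct ids seen so far. */
struct toy_aggr_map {
	size_t nr;
	int ids[]; /* flexible array member, sized at allocation time */
};

/* Helper type: maps a CPU number to an aggregation id (core, socket, ...). */
typedef int (*toy_aggr_get_id_t)(int cpu, void *data);

/*
 * Build a map of unique aggregation ids for the given CPUs.
 * The caller picks the aggregation mode by passing a different helper.
 * Returns NULL on allocation failure; the caller frees the result.
 */
struct toy_aggr_map *toy_aggr_map__new(const int *cpus, size_t nr_cpus,
				       toy_aggr_get_id_t get_id, void *data)
{
	struct toy_aggr_map *map;
	size_t i, j;

	map = malloc(sizeof(*map) + nr_cpus * sizeof(map->ids[0]));
	if (!map)
		return NULL;

	map->nr = 0;
	for (i = 0; i < nr_cpus; i++) {
		int id = get_id(cpus[i], data);

		/* Only record ids that have not been seen yet. */
		for (j = 0; j < map->nr; j++) {
			if (map->ids[j] == id)
				break;
		}
		if (j == map->nr)
			map->ids[map->nr++] = id;
	}
	return map;
}

/* Example helper: assumes two CPUs per core, purely for illustration. */
int toy_get_core_id(int cpu, void *data)
{
	(void)data;
	return cpu / 2;
}
```

Passing the helper explicitly is what lets one constructor serve every aggregation mode instead of repeating near-identical build functions.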
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-19-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf cpumap: Rename empty functions · 51b826fa
      Ian Rogers committed
      Remove cpu_map from name as a cpu_map isn't used. Pass a const pointer
      rather than by value to avoid unnecessary copying.
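
A minimal sketch of the by-pointer "is empty" check, assuming a simplified id struct in which -1 marks an unset field. `struct toy_aggr_id` and both functions are hypothetical stand-ins rather than the actual perf definitions.

```c
#include <stdbool.h>

/* Hypothetical, simplified aggregation id; -1 means "not set". */
struct toy_aggr_id {
	int node;
	int socket;
	int die;
	int core;
};

/* Constructor-like helper: returns an id with every field unset. */
struct toy_aggr_id toy_aggr_id__empty(void)
{
	struct toy_aggr_id id = { .node = -1, .socket = -1, .die = -1, .core = -1 };

	return id;
}

/* Takes a const pointer, so the struct is not copied on every call. */
bool toy_aggr_id__is_empty(const struct toy_aggr_id *id)
{
	return id->node == -1 && id->socket == -1 &&
	       id->die == -1 && id->core == -1;
}
```

The const qualifier also documents that the check does not modify the id.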
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-15-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf cpumap: Switch cpu_map__build_map() to cpu function · eff54c24
      Ian Rogers committed
      Avoid error prone cpu_map + idx variant. Remove now unused functions.
      
      Committer notes:
      
      Remove by now unused perf_env__get_cpu().
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf stat: Switch to cpu version of cpu_map__get() · 88031a0d
      Ian Rogers committed
      Avoid possible bugs where the wrong index is passed with the cpu_map.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 08 Dec, 2021 1 commit
    • J
      perf stat: Support --cputype option for hybrid events · e69dc842
      Jin Yao committed
      A previous patch added support for syntax that enables an event
      on a specified PMU, such as:
      
      cpu_core/<event>/
      cpu_atom/<event>/
      
      However, this syntax is cumbersome to apply to a set of events
      or to a group. In the following example, we have to explicitly
      assign the PMU prefix to each event.
      
        # ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1
      
         Performance counter stats for 'sleep 1':
      
                 1,158,545      cpu_core/cycles/
                 1,003,113      cpu_core/instructions/
      
               1.002428712 seconds time elapsed
      
      A much easier way is:
      
        # ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1
      
         Performance counter stats for 'sleep 1':
      
                 1,101,071      cpu_core/cycles/
                   939,892      cpu_core/instructions/
      
               1.002363142 seconds time elapsed
      
      For this example, the '--cputype' enables the events from specified
      pmu (cpu_core).
      
      If '--cputype' conflicts with pmu prefix, '--cputype' is ignored.
      
        # ./perf stat --cputype core -e cycles,cpu_atom/instructions/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                21,003,407      cpu_core/cycles/
                   367,886      cpu_atom/instructions/
      
               1.002203520 seconds time elapsed
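
The precedence rule described above (an explicit PMU prefix wins over '--cputype') can be sketched as below. `toy_pick_pmu()` is a hypothetical helper written for illustration, not a perf function, and the "cpu_" + cputype naming is an assumption extrapolated from the `cpu_core` example in this message.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Choose the PMU for an event.
 *   event_pmu: PMU prefix written on the event itself, or NULL if none.
 *   cputype:   value passed to --cputype (e.g. "core"), or NULL if unset.
 * Writes the PMU name into buf and returns buf, or returns NULL when
 * neither source names a PMU.
 */
const char *toy_pick_pmu(char *buf, size_t len,
			 const char *event_pmu, const char *cputype)
{
	if (event_pmu)
		snprintf(buf, len, "%s", event_pmu);   /* explicit prefix wins */
	else if (cputype)
		snprintf(buf, len, "cpu_%s", cputype); /* fall back to --cputype */
	else
		return NULL;
	return buf;
}
```

With this rule, `cycles` under `--cputype core` resolves to `cpu_core`, while `cpu_atom/instructions/` keeps `cpu_atom`, matching the output shown above.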
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210909062215.10278-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 08 Nov, 2021 2 commits
  4. 27 Sep, 2021 1 commit
    • L
      perf iostat: Use system-wide mode if the target cpu_list is unspecified · e4fe5d73
      Like Xu committed
      An iostat use case like "perf iostat 0000:16,0000:97 -- ls" should be
      implemented to work in system-wide mode to ensure that the output from
      print_header() is consistent with the user documentation perf-iostat.txt,
      rather than incorrectly assuming that the kernel does not support it:
      
       Error:
       The sys_perf_event_open() syscall returned with 22 (Invalid argument) \
       for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
       /bin/dmesg | grep -i perf may provide additional information.
      
      This error is easily fixed by assigning system-wide mode by default
      for IOSTAT_RUN only when the target cpu_list is unspecified.
      
      Fixes: f07952b1 ("perf stat: Basic support for iostat in perf")
      Signed-off-by: Like Xu <likexu@tencent.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 31 Aug, 2021 1 commit
  6. 12 Aug, 2021 1 commit
    • J
      perf tools: Enable on a list of CPUs for hybrid · 1d3351e6
      Jin Yao committed
      The 'perf record' and 'perf stat' commands have supported the option
      '-C/--cpus' to count or collect only on the list of CPUs provided. This
      option needs to be supported for hybrid as well.
      
      For hybrid support, perf needs to check that the CPUs in the list
      are available on the hybrid PMU. For example, on Alder Lake cpu0-7
      are 'cpu_core' and cpu8-11 are 'cpu_atom'.
      
      Before:
      
        # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
      
         Performance counter stats for 'CPU(s) 11':
      
           <not supported>      cpu_core/cycles/
      
               1.006179431 seconds time elapsed
      
      The 'perf stat' command silently returned "<not supported>" without any
      helpful information. It should error out, pointing out that cpu11
      is not a 'cpu_core' CPU.
      
      After:
      
        # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
        WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
        failed to use cpu list 11
      
      We also need to support events specified without a PMU prefix.
      
        # perf stat -e cycles -C11 -- sleep 1
        WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
      
         Performance counter stats for 'CPU(s) 11':
      
                 1,067,373      cpu_atom/cycles/
      
               1.005544738 seconds time elapsed
      
      The perf tool creates two cycles events automatically, cpu_core/cycles/ and
      cpu_atom/cycles/. It determines that cpu11 is not 'cpu_core', shows a warning
      for cpu_core/cycles/, and counts only cpu_atom/cycles/.
      
      If some of the CPUs are 'cpu_core' and others are 'cpu_atom', for example,
      
        # perf stat -e cycles -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 1,914,704      cpu_core/cycles/
                 2,036,983      cpu_atom/cycles/
      
               1.005815641 seconds time elapsed
      
      It now automatically selects cpu0 for cpu_core/cycles/ and cpu11 for
      cpu_atom/cycles/, and prints some warnings.
      
      Some more complex examples,
      
        # perf stat -e cycles,instructions -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
        WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 2,780,387      cpu_core/cycles/
                 1,583,432      cpu_atom/cycles/
                 3,957,277      cpu_core/instructions/
                 1,167,089      cpu_atom/instructions/
      
               1.006005124 seconds time elapsed
      
        # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
        WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
        WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.
      
         Performance counter stats for 'CPU(s) 0,11':
      
                 3,290,301      cpu_core/cycles/
                 1,953,073      cpu_atom/cycles/
                 1,407,869      cpu_atom/instructions/
      
               1.006260912 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210723063433.7318-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 02 Aug, 2021 1 commit
  8. 14 Jul, 2021 1 commit
    • J
      perf stat: Merge uncore events by default for hybrid platform · e0a7ef2a
      Jin Yao committed
      On a hybrid platform, by default 'perf stat' aggregates and reports the
      event counts per PMU. For example,
      
        # perf stat -e cycles -a true
      
         Performance counter stats for 'system wide':
      
                 1,400,445      cpu_core/cycles/
                   680,881      cpu_atom/cycles/
      
               0.001770773 seconds time elapsed
      
      But for uncore events that's not a suitable method. Uncore has nothing
      to do with hybrid. So for uncore events, we aggregate event counts from
      all PMUs and report the counts without PMUs.
      
      Before:
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
      
         Performance counter stats for 'system wide':
      
                     2,058      uncore_arb_0/event=0x81,umask=0x1/
                     2,028      uncore_arb_1/event=0x81,umask=0x1/
                         0      uncore_arb_0/event=0x84,umask=0x1/
                         0      uncore_arb_1/event=0x84,umask=0x1/
      
               0.000614498 seconds time elapsed
      
      After:
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
      
         Performance counter stats for 'system wide':
      
                     3,996      arb/event=0x81,umask=0x1/
                         0      arb/event=0x84,umask=0x1/
      
               0.000630046 seconds time elapsed
      
      Of course, we also keep the '--no-merge' working for uncore events.
      
        # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true
      
         Performance counter stats for 'system wide':
      
                     1,952      uncore_arb_0/event=0x81,umask=0x1/
                     1,921      uncore_arb_1/event=0x81,umask=0x1/
                         0      uncore_arb_0/event=0x84,umask=0x1/
                         0      uncore_arb_1/event=0x84,umask=0x1/
      
               0.000575536 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210707055652.962-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 10 Jul, 2021 2 commits
    • K
      perf stat: Add Topdown metrics L2 events as default events · 5f148e7c
      Kan Liang committed
      The Topdown Microarchitecture Analysis (TMA) Method is a structured
      analysis methodology to identify critical performance bottlenecks in
      out-of-order processors.
      
      The Topdown metrics L1 event was added as default in 42641d6f
      ("perf stat: Add Topdown metrics events as default events")
      
      From the Sapphire Rapids server and later platforms, the same dedicated
      "metrics" register is extended to support both L1 and L2 events.
      
      Add both L1 and L2 Topdown metrics events as default to enrich the
      default measuring information if the new measurement register is
      available.
      
      On legacy systems there is no change to avoid extra multiplexing.
      
      The topdown_level indicates the max metrics level for the top-down
      statistics. Set it to 2 to display all L1 and L2 Topdown metrics events.
      
      With the patch:
      
        $ perf stat sleep 1
      
        Performance counter stats for 'sleep 1':
      
                 0.59 msec task-clock             #   0.001 CPUs utilized
                    1      context-switches       #   1.687 K/sec
                    0      cpu-migrations         #   0.000 /sec
                   76      page-faults            # 128.198 K/sec
            1,405,318      cycles                 #   2.371 GHz
            1,471,136      instructions           #   1.05  insn per cycle
              310,132      branches               # 523.136 M/sec
               10,435      branch-misses          #   3.36% of all branches
            8,431,908      slots                  #  14.223 G/sec
            1,554,116      topdown-retiring       #    18.4% retiring
            1,289,585      topdown-bad-spec       #    15.2% bad speculation
            2,810,636      topdown-fe-bound       #    33.2% frontend bound
            2,810,636      topdown-be-bound       #    33.2% backend bound
              231,464      topdown-heavy-ops      #     2.7% heavy operations   #  15.6% light operations
            1,223,453      topdown-br-mispredict  #    14.5% branch mispredict  #   0.8% machine clears
            1,884,779      topdown-fetch-lat      #    22.3% fetch latency      #  10.9% fetch bandwidth
            1,454,917      topdown-mem-bound      #    17.2% memory bound       #  16.0% Core bound
      
          1.001179699 seconds time elapsed
      
          0.000000000 seconds user
          0.001238000 seconds sys
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/1625760169-18396-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      libperf: Move 'leader' from tools/perf to perf_evsel::leader · fba7c866
      Jiri Olsa committed
      Move evsel::leader to perf_evsel::leader, so we can move the group
      interface to libperf.
      
      Also add several evsel helpers to ease up the transition:
      
        struct evsel *evsel__leader(struct evsel *evsel);
        - get leader evsel
      
        bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
        - true if evsel has leader as leader
      
        bool evsel__is_leader(struct evsel *evsel);
        - true if evsel is its own leader
      
        void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
        - set leader for evsel
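
The semantics of the four helpers listed above can be sketched with a simplified struct. `struct toy_evsel` holds only a leader pointer and the `toy_` functions are hypothetical illustrations of the described behaviour, not the libperf implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event selector: only the group leader link. */
struct toy_evsel {
	struct toy_evsel *leader;
};

/* Get the leader of evsel. */
struct toy_evsel *toy_evsel__leader(struct toy_evsel *evsel)
{
	return evsel->leader;
}

/* True if evsel has 'leader' as its leader. */
bool toy_evsel__has_leader(struct toy_evsel *evsel, struct toy_evsel *leader)
{
	return evsel->leader == leader;
}

/* True if evsel is its own leader. */
bool toy_evsel__is_leader(struct toy_evsel *evsel)
{
	return toy_evsel__leader(evsel) == evsel;
}

/* Set the leader for evsel. */
void toy_evsel__set_leader(struct toy_evsel *evsel, struct toy_evsel *leader)
{
	evsel->leader = leader;
}
```

Routing every access through helpers like these means callers do not need to know which struct actually stores the leader pointer, which is what eases the move into libperf.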
      
      Committer notes:
      
      Fix this when building with 'make BUILD_BPF_SKEL=1'
      
        tools/perf/util/bpf_counter.c
      
        -       if (evsel->leader->core.nr_members > 1) {
        +       if (evsel->core.leader->nr_members > 1) {
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 22 May, 2021 1 commit
  11. 29 Apr, 2021 4 commits
    • J
      perf stat: Warn group events from different hybrid PMU · 660e533e
      Jin Yao committed
      If a group has events from different hybrid PMUs,
      show a warning:
      
      "WARNING: events in group from different hybrid PMUs!"
      
      This is to remind the user not to put the core event and atom
      event into one group.
      
      Next, just disable grouping.
      
        # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
        WARNING: events in group from different hybrid PMUs!
        WARNING: grouped events cpus do not match, disabling group:
          anon group { cpu_core/cycles/, cpu_atom/cycles/ }
      
         Performance counter stats for 'system wide':
      
                 5,438,125      cpu_core/cycles/
                 3,914,586      cpu_atom/cycles/
      
               1.004250966 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf stat: Add default hybrid events · ac2dc29e
      Jin Yao committed
      Previously if '-e' is not specified in perf stat, some software events
      and hardware events are added to evlist by default.
      
      Before:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,044.40 msec cpu-clock                 #   23.946 CPUs utilized
                        99      context-switches          #    4.117 /sec
                        24      cpu-migrations            #    0.998 /sec
                         3      page-faults               #    0.125 /sec
                 7,000,244      cycles                    #    0.000 GHz
                 2,955,024      instructions              #    0.42  insn per cycle
                   608,941      branches                  #   25.326 K/sec
                    31,991      branch-misses             #    5.25% of all branches
      
               1.004106859 seconds time elapsed
      
      Among the events, cycles, instructions, branches and branch-misses
      are hardware events.
      
      On a hybrid platform, two hardware events are created for each
      hardware event.
      
      cpu_core/cycles/,
      cpu_atom/cycles/,
      cpu_core/instructions/,
      cpu_atom/instructions/,
      cpu_core/branches/,
      cpu_atom/branches/,
      cpu_core/branch-misses/,
      cpu_atom/branch-misses/
      
      These events would be added to evlist on hybrid platform.
      
      Since parse_events() already supports creating two hardware events
      for one event on a hybrid platform, we just use parse_events(evlist,
      "cycles,instructions,branches,branch-misses") to create the default
      events and add them to evlist.
      
      After:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,043.99 msec cpu-clock                 #   23.991 CPUs utilized
                       139      context-switches          #    5.781 /sec
                        25      cpu-migrations            #    1.040 /sec
                         6      page-faults               #    0.250 /sec
                10,381,751      cpu_core/cycles/          #  431.782 K/sec
                 1,264,216      cpu_atom/cycles/          #   52.579 K/sec
                 3,406,958      cpu_core/instructions/    #  141.697 K/sec
                   414,588      cpu_atom/instructions/    #   17.243 K/sec
                   705,149      cpu_core/branches/        #   29.327 K/sec
                    82,358      cpu_atom/branches/        #    3.425 K/sec
                    40,821      cpu_core/branch-misses/   #    1.698 K/sec
                     9,086      cpu_atom/branch-misses/   #  377.891 /sec
      
               1.002228863 seconds time elapsed
      
      We can see two events are created for one hardware event.
      
      One TODO: the shadow stats look a bit different; for now they are just
      'M/sec'.
      
      The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
      need to be improved in future if we want to get the original shadow
      stats.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf stat: Uniquify hybrid event name · 12279429
      Jin Yao committed
      It would be useful to let the user know which PMU an event belongs to.
      perf-stat supports the '--no-merge' option, which prints the PMU
      name after the event name, such as:
      
      "cycles [cpu_core]"
      
      Now this option is enabled by default on hybrid platforms, but the
      format changes to:
      
      "cpu_core/cycles/"
      
      If the user configures the name, we still use the user-specified name.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • S
      perf stat: Introduce config stat.bpf-counter-events · 112cb561
      Song Liu committed
      Currently, to use BPF to aggregate perf event counters, the user uses
      --bpf-counters option. Enable "use bpf by default" events with a config
      option, stat.bpf-counter-events. Events with name in the option will use
      BPF.
      
      This also enables mixing BPF events and regular events in the same session.
      For example:
      
         perf config stat.bpf-counter-events=instructions
         perf stat -e instructions,cs
      
      The second command will use BPF for "instructions" but not "cs".
      Signed-off-by: Song Liu <song@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 20 Apr, 2021 1 commit
  13. 24 Mar, 2021 4 commits
    • J
      perf stat: Align CSV output for summary mode · 0bdad978
      Jin Yao committed
      The 'perf stat' subcommand supports the request for a summary of the
      interval counter readings.  But the summary lines break the CSV output
      so it's hard for scripts to parse the result.
      
      Before:
      
        # perf stat -x, -I1000 --interval-count 1 --summary
             1.001323097,8013.48,msec,cpu-clock,8013483384,100.00,8.013,CPUs utilized
             1.001323097,270,,context-switches,8013513297,100.00,0.034,K/sec
             1.001323097,13,,cpu-migrations,8013530032,100.00,0.002,K/sec
             1.001323097,184,,page-faults,8013546992,100.00,0.023,K/sec
             1.001323097,20574191,,cycles,8013551506,100.00,0.003,GHz
             1.001323097,10562267,,instructions,8013564958,100.00,0.51,insn per cycle
             1.001323097,2019244,,branches,8013575673,100.00,0.252,M/sec
             1.001323097,106152,,branch-misses,8013585776,100.00,5.26,of all branches
        8013.48,msec,cpu-clock,8013483384,100.00,7.984,CPUs utilized
        270,,context-switches,8013513297,100.00,0.034,K/sec
        13,,cpu-migrations,8013530032,100.00,0.002,K/sec
        184,,page-faults,8013546992,100.00,0.023,K/sec
        20574191,,cycles,8013551506,100.00,0.003,GHz
        10562267,,instructions,8013564958,100.00,0.51,insn per cycle
        2019244,,branches,8013575673,100.00,0.252,M/sec
        106152,,branch-misses,8013585776,100.00,5.26,of all branches
      
      The summary line loses the timestamp column, which breaks the CSV
      output.
      
      We add a column at the original 'timestamp' position and it just says
      'summary' for the summary line.
      
      After:
      
        # perf stat -x, -I1000 --interval-count 1 --summary
             1.001196053,8012.72,msec,cpu-clock,8012722903,100.00,8.013,CPUs utilized
             1.001196053,218,,context-switches,8012753271,100.00,0.027,K/sec
             1.001196053,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
             1.001196053,0,,page-faults,8012786257,100.00,0.000,K/sec
             1.001196053,15004518,,cycles,8012790637,100.00,0.002,GHz
             1.001196053,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
             1.001196053,1590259,,branches,8012814766,100.00,0.198,M/sec
             1.001196053,82601,,branch-misses,8012824365,100.00,5.19,of all branches
                 summary,8012.72,msec,cpu-clock,8012722903,100.00,7.986,CPUs utilized
                 summary,218,,context-switches,8012753271,100.00,0.027,K/sec
                 summary,9,,cpu-migrations,8012769767,100.00,0.001,K/sec
                 summary,0,,page-faults,8012786257,100.00,0.000,K/sec
                 summary,15004518,,cycles,8012790637,100.00,0.002,GHz
                 summary,7954691,,instructions,8012804027,100.00,0.53,insn per cycle
                 summary,1590259,,branches,8012814766,100.00,0.198,M/sec
                 summary,82601,,branch-misses,8012824365,100.00,5.19,of all branches
      
      Now it's easy for scripts to analyse the summary lines.
      
      To avoid breaking existing scripts, the previous CSV format (without
      the 'summary' column) remains available through a new
      '--no-csv-summary' option.
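      As a rough illustration of why the extra column helps, a script can
      now split interval rows from summary rows by looking only at the first
      field. The sketch below uses rows copied from the "After" output; the
      function name and the dictionary keys are hypothetical.

```python
import csv
import io

# Four rows copied from the "After" example in this commit message.
SAMPLE = """\
     1.001196053,8012.72,msec,cpu-clock,8012722903,100.00,8.013,CPUs utilized
     1.001196053,218,,context-switches,8012753271,100.00,0.027,K/sec
         summary,8012.72,msec,cpu-clock,8012722903,100.00,7.986,CPUs utilized
         summary,218,,context-switches,8012753271,100.00,0.027,K/sec
"""


def split_rows(text):
    """Return (interval_rows, summary_rows) parsed from 'perf stat -x,' output."""
    intervals, summary = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        first = row[0].strip()
        record = {"value": row[1], "unit": row[2], "event": row[3]}
        if first == "summary":
            summary.append(record)
        else:
            record["timestamp"] = float(first)
            intervals.append(record)
    return intervals, summary


intervals, summary = split_rows(SAMPLE)
print(len(intervals), "interval rows,", len(summary), "summary rows")
```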
      
        # perf stat -x, -I1000 --interval-count 1 --summary --no-csv-summary
             1.001213261,8012.67,msec,cpu-clock,8012672327,100.00,8.013,CPUs utilized
             1.001213261,197,,context-switches,8012703742,100.00,24.586,/sec
             1.001213261,9,,cpu-migrations,8012720902,100.00,1.123,/sec
             1.001213261,644,,page-faults,8012738266,100.00,80.373,/sec
             1.001213261,18350698,,cycles,8012744109,100.00,0.002,GHz
             1.001213261,12745021,,instructions,8012759001,100.00,0.69,insn per cycle
             1.001213261,2458033,,branches,8012770864,100.00,306.768,K/sec
             1.001213261,102107,,branch-misses,8012781751,100.00,4.15,of all branches
        8012.67,msec,cpu-clock,8012672327,100.00,7.985,CPUs utilized
        197,,context-switches,8012703742,100.00,24.586,/sec
        9,,cpu-migrations,8012720902,100.00,1.123,/sec
        644,,page-faults,8012738266,100.00,80.373,/sec
        18350698,,cycles,8012744109,100.00,0.002,GHz
        12745021,,instructions,8012759001,100.00,0.69,insn per cycle
        2458033,,branches,8012770864,100.00,306.768,K/sec
        102107,,branch-misses,8012781751,100.00,4.15,of all branches
      
      This option can be enabled in perf config by setting the variable
      'stat.no-csv-summary'.
      
        # perf config stat.no-csv-summary=true
      
        # perf config -l
        stat.no-csv-summary=true
      
        # perf stat -x, -I1000 --interval-count 1 --summary
             1.001330198,8013.28,msec,cpu-clock,8013279201,100.00,8.013,CPUs utilized
             1.001330198,205,,context-switches,8013308394,100.00,25.583,/sec
             1.001330198,10,,cpu-migrations,8013324681,100.00,1.248,/sec
             1.001330198,0,,page-faults,8013340926,100.00,0.000,/sec
             1.001330198,8027742,,cycles,8013344503,100.00,0.001,GHz
             1.001330198,2871717,,instructions,8013356501,100.00,0.36,insn per cycle
             1.001330198,553564,,branches,8013366204,100.00,69.081,K/sec
             1.001330198,54021,,branch-misses,8013375952,100.00,9.76,of all branches
        8013.28,msec,cpu-clock,8013279201,100.00,7.985,CPUs utilized
        205,,context-switches,8013308394,100.00,25.583,/sec
        10,,cpu-migrations,8013324681,100.00,1.248,/sec
        0,,page-faults,8013340926,100.00,0.000,/sec
        8027742,,cycles,8013344503,100.00,0.001,GHz
        2871717,,instructions,8013356501,100.00,0.36,insn per cycle
        553564,,branches,8013366204,100.00,69.081,K/sec
        54021,,branch-misses,8013375952,100.00,9.76,of all branches
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210319070156.20394-1-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0bdad978
    • S
      perf stat: Measure 't0' and 'ref_time' after enable_counters() · 435b46ef
      Song Liu authored
      Take measurements of 't0' and 'ref_time' after enable_counters(), so
      that they only measure the time consumed when the counters are enabled.
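      The effect of moving the timestamp can be shown with a toy model. This
      is a sketch using a fake clock, not perf code; FakeClock, run and the
      setup_cost parameter are hypothetical names, and the first advance()
      call merely stands in for the time spent in enable_counters().

```python
class FakeClock:
    """Deterministic clock so the example does not depend on real timing."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


def run(clock, setup_cost, workload, stamp_after_enable):
    """Return the elapsed time reported for a workload."""
    if not stamp_after_enable:
        t0 = clock.now          # old ordering: stamp before enabling
    clock.advance(setup_cost)   # stands in for enable_counters()
    if stamp_after_enable:
        t0 = clock.now          # new ordering: stamp once counters are on
    clock.advance(workload)
    return clock.now - t0


print(run(FakeClock(), 0.25, 1.0, stamp_after_enable=False))  # includes setup time
print(run(FakeClock(), 0.25, 1.0, stamp_after_enable=True))   # workload time only
```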
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Acked-by: Andi Kleen <andi@firstfloor.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20210316211837.910506-3-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      435b46ef
    • S
      perf stat: Introduce 'bperf' to share hardware PMCs with BPF · 7fac83aa
      Song Liu authored
      The perf tool uses performance monitoring counters (PMCs) to monitor
      system performance. The PMCs are limited hardware resources. For
      example, Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
      
      Modern data center systems use these PMCs in many different ways: system
      level monitoring, (maybe nested) container level monitoring, per process
      monitoring, profiling (in sample mode), etc. In some cases, there are
      more active perf_events than available hardware PMCs. To allow all
      perf_events to have a chance to run, it is necessary to do expensive
      time multiplexing of events.
      
      On the other hand, many monitoring tools count the common metrics
      (cycles, instructions). It is a waste to have multiple tools create
      multiple perf_events of "cycles" and occupy multiple PMCs.
      
      bperf tries to reduce such waste by allowing multiple perf_events of
      "cycles" or "instructions" (at different scopes) to share PMUs. Instead
      of having each perf-stat session read its own perf_events, bperf uses
      BPF programs to read the perf_events and aggregate readings to BPF maps.
      Then, the perf-stat session(s) read the values from these BPF maps.
      
      Please refer to the comment before the definition of bperf_ops for the
      description of bperf architecture.
      
      bperf is off by default. To enable it, pass the --bpf-counters option
      to perf-stat. bperf uses a BPF hashmap to share information about BPF
      programs and maps used by bperf. This map is pinned to bpffs. The
      default path is /sys/fs/bpf/perf_attr_map. The user can change the
      path with the --bpf-attr-map option.
      
      Committer testing:
      
        # dmesg|grep "Performance Events" -A5
        [    0.225277] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
        [    0.225280] ... version:                0
        [    0.225280] ... bit width:              48
        [    0.225281] ... generic registers:      6
        [    0.225281] ... value mask:             0000ffffffffffff
        [    0.225281] ... max period:             00007fffffffffff
        #
        #  for a in $(seq 6) ; do perf stat -a -e cycles,instructions sleep 100000 & done
        [1] 2436231
        [2] 2436232
        [3] 2436233
        [4] 2436234
        [5] 2436235
        [6] 2436236
        # perf stat -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
               310,326,987      cycles                                                        (41.87%)
               236,143,290      instructions              #    0.76  insn per cycle           (41.87%)
      
               0.100800885 seconds time elapsed
      
        #
      
      We can see that the counters were enabled for this workload 41.87% of
      the time.
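      The percentage in parentheses reflects time multiplexing: when an event
      is scheduled on a PMC only part of the time, the reported count is
      extrapolated from the raw count. A minimal sketch of that scaling,
      assuming the usual raw * enabled / running rule; the function name and
      the sample numbers are illustrative, not taken from this session.

```python
def scale_count(raw, time_enabled, time_running):
    """Extrapolate a multiplexed count and report how long it really ran.

    Returns (scaled_count, percent_running). The formula is the commonly
    documented raw * enabled / running extrapolation; treat it as an
    assumption about how the figures above were produced.
    """
    if time_running == 0:
        return 0, 0.0  # the event never got onto a counter
    scaled = round(raw * time_enabled / time_running)
    percent = 100.0 * time_running / time_enabled
    return scaled, percent


# An event that ran for half of its enabled time has its count doubled.
print(scale_count(1000, 200, 100))
```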
      
      Now with --bpf-counters:
      
        #  for a in $(seq 32) ; do perf stat --bpf-counters -a -e cycles,instructions sleep 100000 & done
        [1] 2436514
        [2] 2436515
        [3] 2436516
        [4] 2436517
        [5] 2436518
        [6] 2436519
        [7] 2436520
        [8] 2436521
        [9] 2436522
        [10] 2436523
        [11] 2436524
        [12] 2436525
        [13] 2436526
        [14] 2436527
        [15] 2436528
        [16] 2436529
        [17] 2436530
        [18] 2436531
        [19] 2436532
        [20] 2436533
        [21] 2436534
        [22] 2436535
        [23] 2436536
        [24] 2436537
        [25] 2436538
        [26] 2436539
        [27] 2436540
        [28] 2436541
        [29] 2436542
        [30] 2436543
        [31] 2436544
        [32] 2436545
        #
        # ls -la /sys/fs/bpf/perf_attr_map
        -rw-------. 1 root root 0 Mar 23 14:53 /sys/fs/bpf/perf_attr_map
        # bpftool map | grep bperf | wc -l
        64
        #
      
        # bpftool map | tail
        1265: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1266: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1267: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 996
        	pids perf(2436545)
        1268: percpu_array  name accum_readings  flags 0x0
        	key 4B  value 24B  max_entries 1  memlock 4096B
        1269: hash  name filter  flags 0x0
        	key 4B  value 4B  max_entries 1  memlock 4096B
        1270: array  name bperf_fo.bss  flags 0x400
        	key 4B  value 8B  max_entries 1  memlock 4096B
        	btf_id 997
        	pids perf(2436541)
        1285: array  name pid_iter.rodata  flags 0x480
        	key 4B  value 4B  max_entries 1  memlock 4096B
        	btf_id 1017  frozen
        	pids bpftool(2437504)
        1286: array  flags 0x0
        	key 4B  value 32B  max_entries 1  memlock 4096B
        #
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00
        80 fd 2a d1 4d 00 00 00
        value (CPU 22):
        7e d5 64 4d 00 00 00 00  a4 8a 2e ee 4d 00 00 00
        a4 8a 2e ee 4d 00 00 00
        value (CPU 23):
        a7 78 3e 06 01 00 00 00  b2 34 94 f6 4d 00 00 00
        b2 34 94 f6 4d 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        c6 8b d9 ca 00 00 00 00  20 c6 fc 83 4e 00 00 00
        20 c6 fc 83 4e 00 00 00
        value (CPU 22):
        9c b4 d2 4d 00 00 00 00  3e 0c df 89 4e 00 00 00
        3e 0c df 89 4e 00 00 00
        value (CPU 23):
        18 43 66 06 01 00 00 00  5b 69 ed 83 4e 00 00 00
        5b 69 ed 83 4e 00 00 00
        Found 1 element
        # bpftool map dump id 1268 | tail
        value (CPU 21):
        f2 6e db ca 00 00 00 00  92 67 4c ba 4e 00 00 00
        92 67 4c ba 4e 00 00 00
        value (CPU 22):
        dc 8e e1 4d 00 00 00 00  d9 32 7a c5 4e 00 00 00
        d9 32 7a c5 4e 00 00 00
        value (CPU 23):
        bd 2b 73 06 01 00 00 00  7c 73 87 bf 4e 00 00 00
        7c 73 87 bf 4e 00 00 00
        Found 1 element
        #
      
        # perf stat --bpf-counters -a -e cycles,instructions sleep 0.1
      
         Performance counter stats for 'system wide':
      
             119,410,122      cycles
             152,105,479      instructions              #    1.27  insn per cycle
      
             0.101395093 seconds time elapsed
      
        #
      
      See? We had the counters enabled all the time.
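      The hex dumps from 'bpftool map dump' above can be decoded to check
      this. The sketch below assumes each 24-byte per-CPU value is three
      little-endian u64 fields in the order counter, enabled, running (the
      layout of the kernel's struct bpf_perf_event_value); that mapping is
      an assumption, and decode_reading is a hypothetical helper.

```python
import struct


def decode_reading(hex_bytes):
    """Decode one 24-byte per-CPU value into counter/enabled/running."""
    raw = bytes.fromhex(hex_bytes)  # spaces between byte pairs are skipped
    if len(raw) != 24:
        raise ValueError("expected 24 bytes, got %d" % len(raw))
    counter, enabled, running = struct.unpack("<QQQ", raw)
    return {"counter": counter, "enabled": enabled, "running": running}


# First "value (CPU 21)" entry from the dump of map id 1268.
cpu21 = decode_reading(
    "8f f3 bc ca 00 00 00 00  80 fd 2a d1 4d 00 00 00 "
    "80 fd 2a d1 4d 00 00 00"
)
print(cpu21)
print("never multiplexed:", cpu21["enabled"] == cpu21["running"])
```

      Under that assumed layout, the second and third fields are equal in
      every dumped entry, which matches the absence of a percentage in the
      --bpf-counters output.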
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20210316211837.910506-2-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7fac83aa
    • I
      perf tools: Fix various typos in comments · 4d39c89f
      Ingo Molnar authored
      Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
      accumulated over the years.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210321113734.GA248990@gmail.com
      Link: http://lore.kernel.org/lkml/20210323160915.GA61903@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4d39c89f
  14. 09 Feb, 2021 1 commit
    • K
      perf stat: Support L2 Topdown events · 63e39aa6
      Kan Liang authored
      The TMA method level 2 metrics are supported starting with the Intel
      Sapphire Rapids server, which exposes four L2 Topdown metrics events to
      user space. There are eight L2 events in total. The other four L2 Topdown
      metrics events are calculated from the corresponding L1 and the exposed
      L2 events.
      
      Now, --topdown prints the complete top-down metrics that are supported
      by the CPU. For the Intel Sapphire Rapids server, there are 4 L1 events
      and 8 L2 events displayed in one line.
      
      Add a new option, --td-level, to display the top-down statistics that
      are equal to or lower than the input level.
      
      The L2 event is marked only when both its L1 parent event and itself
      cross the threshold.
      
      Here is an example:
      
        $ perf stat --topdown --td-level=2 --no-metric-only sleep 1
        Topdown accuracy may decrease when measuring long periods.
        Please print the result regularly, e.g. -I1000
      
        Performance counter stats for 'sleep 1':
      
           16,734,390   slots
            2,100,001   topdown-retiring       # 12.6% retiring
            2,034,376   topdown-bad-spec       # 12.3% bad speculation
            4,003,128   topdown-fe-bound       # 24.1% frontend bound
              328,125   topdown-heavy-ops      #  2.0% heavy operations    #  10.6% light operations
            1,968,751   topdown-br-mispredict  # 11.9% branch mispredict   #  0.4% machine clears
            2,953,127   topdown-fetch-lat      # 17.8% fetch latency       #  6.3% fetch bandwidth
            5,906,255   topdown-mem-bound      # 35.6% memory bound        #  15.4% core bound
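      The second percentage on each L2 line is consistent with subtracting
      the exposed L2 metric from its L1 parent. The sketch below checks that
      against the percentages printed above. The subtraction rule is inferred
      from this output rather than quoted from the patch, and the backend
      bound figure (not printed in the example) is inferred as the remainder
      to 100%.

```python
# L1 percentages copied from the example output.
L1 = {"retiring": 12.6, "bad speculation": 12.3, "frontend bound": 24.1}
# Backend bound is not shown in the example; infer it as the remainder.
L1["backend bound"] = round(100.0 - sum(L1.values()), 1)

# The four L2 metrics exposed directly, copied from the example output.
EXPOSED_L2 = {
    "heavy operations": 2.0,
    "branch mispredict": 11.9,
    "fetch latency": 17.8,
    "memory bound": 35.6,
}

# (derived L2 metric, L1 parent, exposed L2 sibling)
PAIRS = [
    ("light operations", "retiring", "heavy operations"),
    ("machine clears", "bad speculation", "branch mispredict"),
    ("fetch bandwidth", "frontend bound", "fetch latency"),
    ("core bound", "backend bound", "memory bound"),
]


def derive_l2(l1, exposed):
    """Compute the four non-exposed L2 metrics as parent minus sibling."""
    return {name: round(l1[parent] - exposed[sibling], 1)
            for name, parent, sibling in PAIRS}


print(derive_l2(L1, EXPOSED_L2))
```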
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/1612296553-21962-9-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63e39aa6
  15. 04 Feb, 2021 1 commit
    • K
      perf stat: Add Topdown metrics events as default events · 42641d6f
      Kan Liang authored
      The Topdown Microarchitecture Analysis (TMA) Method is a structured
      analysis methodology to identify critical performance bottlenecks in
      out-of-order processors. From the Ice Lake and later platforms, the
      Topdown information can be retrieved from the dedicated "metrics"
      register, which isn't impacted by other events. Also, the Topdown
      metrics support both per thread/process and per core measuring.  Adding
      Topdown metrics events as default events can enrich the default
      measuring information, and would not cost any extra multiplexing.
      
      Introduce arch_evlist__add_default_attrs() to allow architecture
      specific default events. Add the Topdown metrics events in the X86
      specific arch_evlist__add_default_attrs(). Other architectures can add
      their own default events later separately.
      
      With the patch:
      
       $ perf stat sleep 1
      
       Performance counter stats for 'sleep 1':
      
                 0.82 msec task-clock:u              #    0.001 CPUs utilized
                    0      context-switches:u        #    0.000 K/sec
                    0      cpu-migrations:u          #    0.000 K/sec
                   61      page-faults:u             #    0.074 M/sec
              319,941      cycles:u                  #    0.388 GHz
              242,802      instructions:u            #    0.76  insn per cycle
               54,380      branches:u                #   66.028 M/sec
                4,043      branch-misses:u           #    7.43% of all branches
            1,585,555      slots:u                   # 1925.189 M/sec
              238,941      topdown-retiring:u        #     15.0% retiring
              410,378      topdown-bad-spec:u        #     25.8% bad speculation
              634,222      topdown-fe-bound:u        #     39.9% frontend bound
              304,675      topdown-be-bound:u        #     19.2% backend bound
      
             1.001791625 seconds time elapsed
      
             0.000000000 seconds user
             0.001572000 seconds sys
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      42641d6f
  16. 21 Jan, 2021 5 commits
    • J
      perf tools: Add 'ping' control command · 47fddcb4
      Jiri Olsa authored
      Add a control 'ping' command to detect if perf is up and its control
      interface is operational.
      
      It will be used in the following daemon patches to synchronize with the
      record session: when the control interface is up and running, we know
      that perf record is monitoring and ready to receive signals.
      
      Example session:
      
        terminal 1:
      
          # mkfifo control ack
          # perf record --control=fifo:control,ack
      
        terminal 2:
      
          # echo ping > control
          # cat ack
          ack
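      The terminal 2 steps can also be scripted. The following is a minimal
      sketch, assuming the two FIFOs from the example already exist and that
      the reply is the single line "ack" shown above; the function names are
      hypothetical. Note that opening a FIFO blocks until the other side
      opens it too, so this hangs if perf is not running.

```python
def send_control_command(control_path, ack_path, command):
    """Write one command to the control FIFO and return the first ack line."""
    with open(control_path, "w") as control:
        control.write(command + "\n")
    with open(ack_path, "r") as ack:
        return ack.readline().strip()


def perf_is_ready(control_path="control", ack_path="ack"):
    """Return True if the control interface answers 'ping' with 'ack'."""
    return send_control_command(control_path, ack_path, "ping") == "ack"


# Usage (requires a running 'perf record --control=fifo:control,ack'):
#     if perf_is_ready():
#         print("perf record is up")
```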
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      47fddcb4
    • J
      perf tools: Add 'stop' control command · f186cd61
      Jiri Olsa authored
      Add a control 'stop' command to stop perf record.
      
      When it is received, perf will set the 'done' variable to 1 to stop its
      mmap ring buffer reading loop.
      
      Example session:
      
        terminal 1:
          # mkfifo control ack
          # perf record --control=fifo:control,ack
      
        terminal 2:
          # echo stop > control
      
        terminal 1:
          [ perf record: Woken up 7 times to write data ]
          [ perf record: Captured and wrote 3.214 MB perf.data (38280 samples) ]
          #
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f186cd61
    • J
      perf tools: Add 'evlist' control command · 142544a9
      Jiri Olsa authored
      Add a new 'evlist' control command to display all the evlist events.
      When it is received, perf will scan the current evlist and print it to
      the perf record terminal.
      
      The interface string for control file is:
      
        evlist [-v|-g|-F]
      
      The syntax follows perf evlist command:
        -F  Show just the sample frequency used for each event.
        -v  Show all fields.
        -g  Show event group information.
      
      Example session:
      
        terminal 1:
          # mkfifo control ack
          # perf record --control=fifo:control,ack -e '{cycles,instructions}'
      
        terminal 2:
          # echo evlist > control
      
        terminal 1:
          cycles
          instructions
          dummy:HG
      
        terminal 2:
          # echo 'evlist -v' > control
      
        terminal 1:
          cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type:            \
          IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1,    \
          sample_id_all: 1, exclude_guest: 1
          instructions: size: 120, config: 0x1, { sample_period, sample_freq }: 4000,      \
          sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, freq: 1,    \
          sample_id_all: 1, exclude_guest: 1
          dummy:HG: type: 1, size: 120, config: 0x9, { sample_period, sample_freq }: 4000, \
          sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, inherit: 1, mmap: 1,    \
          comm: 1, freq: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, \
           bpf_event: 1
      
        terminal 2:
          # echo 'evlist -g' > control
      
        terminal 1:
          {cycles,instructions}
          dummy:HG
      
        terminal 2:
          # echo 'evlist -F' > control
      
        terminal 1:
          cycles: sample_freq=4000
          instructions: sample_freq=4000
          dummy:HG: sample_freq=4000
      
      This new evlist command is handy to get real event names when
      wildcards are used.
      
      Add the evsel_fprintf.c object to the python/perf.so build, because
      it's now a dependency of evlist.c.
      
      Add a PYTHON_PERF define for the python/perf.so compilation, so we
      can use it to compile in only evsel__fprintf from the evsel_fprintf.c
      object.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      142544a9
    • J
      perf tools: Allow to enable/disable events via control file · 991ae4eb
      Jiri Olsa authored
      Add new control commands to enable/disable a specific event.
      The interface strings for the control file are:
      
        'enable <EVENT NAME>'
        'disable <EVENT NAME>'
      
      When the command is received, perf scans the current evlist
      for <EVENT NAME> and, if found, enables/disables it.
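      A script driving the control file only needs to format these strings.
      The helper below is a hypothetical sketch; treating a missing event
      name as "all events" is an assumption based on the "Events enabled" and
      "Events disabled" messages in the committer testing of this commit.

```python
def control_line(action, event_name=None):
    """Build one line for the control file.

    action      -- "enable" or "disable"
    event_name  -- e.g. "sched:sched_process_fork"; None is assumed to
                   address all events
    """
    if action not in ("enable", "disable"):
        raise ValueError("unsupported action: %r" % (action,))
    line = action if event_name is None else "%s %s" % (action, event_name)
    return line + "\n"


# Same command as typed on terminal 3 in the example session.
print(control_line("enable", "sched:sched_process_fork"), end="")
```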
      
      Example session:
      
        terminal 1:
          # mkfifo control ack perf.pipe
          # perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:*' -o - > perf.pipe
      
        terminal 2:
          # cat perf.pipe | perf --no-pager script -i -
      
        terminal 1:
          Events disabled
      
        NOTE: The above message will show only after the read side of the
        pipe ('>') is started on 'terminal 2'. The bash in 'terminal 1' does
        not execute perf before that, hence the delayed perf record message.
      
        terminal 3:
          # echo 'enable sched:sched_process_fork' > control
      
        terminal 1:
          event sched:sched_process_fork enabled
      
        terminal 2:
          bash 33349 [034] 149587.674295: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34056
          bash 33349 [034] 149588.239521: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34057
      
        terminal 3:
          # echo 'enable sched:sched_wakeup_new' > control
      
        terminal 1:
          event sched:sched_wakeup_new enabled
      
        terminal 2:
          bash 33349 [034] 149632.228023: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34059
          bash 33349 [034] 149632.228050:   sched:sched_wakeup_new: bash:34059 [120] success=1 CPU:036
          bash 33349 [034] 149633.950005: sched:sched_process_fork: comm=bash pid=33349 child_comm=bash child_pid=34060
          bash 33349 [034] 149633.950030:   sched:sched_wakeup_new: bash:34060 [120] success=1 CPU:036
      
      Committer testing:
      
      If I use 'sched:*' and then enable all events, I can't get 'perf record'
      to react to further commands, so I tested it with:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
        Events enabled
        Events disabled
      
      And then it works as expected, so we need to fix this pre-existing
      problem.
      
      Another issue: we need to check if an event is already enabled or
      disabled and change the message to be clearer, i.e.:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
      
      If we receive a 'disable' command, then it should say:
      
        [root@five ~]# perf record --control=fifo:control,ack -D -1 --no-buffering -e 'sched:sched_process_*' -o - > perf.pipe
        Events disabled
        Events already disabled
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Budankov <abudankov@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20201226232038.390883-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      991ae4eb
    • S
      perf stat: Enable counting events for BPF programs · fa853c4b
      Song Liu authored
      Introduce 'perf stat -b' option, which counts events for BPF programs, like:
      
        [root@localhost ~]# ~/perf stat -e ref-cycles,cycles -b 254 -I 1000
           1.487903822            115,200      ref-cycles
           1.487903822             86,012      cycles
           2.489147029             80,560      ref-cycles
           2.489147029             73,784      cycles
           3.490341825             60,720      ref-cycles
           3.490341825             37,797      cycles
           4.491540887             37,120      ref-cycles
           4.491540887             31,963      cycles
      
      The example above counts 'cycles' and 'ref-cycles' of the BPF program
      with id 254.  This is similar to the bpftool-prog-profile command, but
      more flexible.
      
      'perf stat -b' creates per-cpu perf_event and loads fentry/fexit BPF
      programs (monitor-progs) to the target BPF program (target-prog). The
      monitor-progs read perf_event before and after the target-prog, and
      aggregate the difference in a BPF map. Then the user space reads data
      from these maps.
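      The before/after scheme can be modelled in a few lines. This is a
      plain-Python simulation of the idea, not BPF code: the class and method
      names are hypothetical, and the lists stand in for per-CPU BPF maps.

```python
class MonitorSim:
    """Toy model of fentry/fexit monitor programs around a target program."""

    def __init__(self, nr_cpus):
        self.start = [0] * nr_cpus   # counter value seen at entry, per CPU
        self.accum = [0] * nr_cpus   # accumulated differences, per CPU

    def fentry(self, cpu, counter_value):
        """Runs just before the target program: remember the counter."""
        self.start[cpu] = counter_value

    def fexit(self, cpu, counter_value):
        """Runs just after the target program: accumulate the difference."""
        self.accum[cpu] += counter_value - self.start[cpu]

    def read_total(self):
        """What user space would do: sum the per-CPU accumulated values."""
        return sum(self.accum)


sim = MonitorSim(nr_cpus=2)
sim.fentry(0, 100); sim.fexit(0, 150)   # target ran on CPU 0: 50 events
sim.fentry(1, 10);  sim.fexit(1, 40)    # target ran on CPU 1: 30 events
sim.fentry(0, 500); sim.fexit(0, 520)   # ran again on CPU 0: 20 events
print(sim.read_total())
```

      Events that occur on a CPU while the target program is not running
      (between 150 and 500 on CPU 0 here) never enter the total.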
      
      A new 'struct bpf_counter' is introduced to provide a common interface
      that uses BPF programs/maps to count perf events.
      
      Committer notes:
      
      Removed all but bpf_counter.h includes from evsel.h, not needed at all.
      
      Also, BPF map lookups for PERCPU_ARRAYs need the value receive buffer
      passed to the kernel to have libbpf_num_possible_cpus() entries, not
      evsel__nr_cpus(evsel), as the former uses
      /sys/devices/system/cpu/possible while the latter uses
      /sys/devices/system/cpu/online, which may be less than the 'possible'
      number, making the BPF map lookup overwrite memory and cause
      hard-to-debug memory corruption.
      
      We need to continue using evsel__nr_cpus(evsel) when accessing the
      perf_counts array though, so as not to overwrite another area of memory :-)
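      To make the size mismatch concrete, the sketch below counts CPUs in the
      list format used by those two sysfs files (for example "0-7" or
      "0-3,6"). count_cpus is a hypothetical helper and the sample strings
      are made up; the point is only that a buffer sized from the 'online'
      list can be smaller than what a per-CPU lookup writes.

```python
def count_cpus(cpu_list):
    """Count CPUs in a sysfs-style list such as "0-3,6"."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single CPU like "6" has no upper bound; treat it as "6-6".
        total += int(high or low) - int(low) + 1
    return total


possible = "0-7"     # hypothetical /sys/devices/system/cpu/possible
online = "0-3,6"     # hypothetical /sys/devices/system/cpu/online

entries_written = count_cpus(possible)   # what a per-CPU lookup fills in
entries_if_online = count_cpus(online)   # too small if used for the buffer
print(entries_written, entries_if_online)
```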
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://lore.kernel.org/lkml/20210120163031.GU12699@kernel.org/
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-team@fb.com
      Link: http://lore.kernel.org/lkml/20201229214214.3413833-4-songliubraving@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fa853c4b
  17. 24 Dec, 2020 7 commits