1. 23 Jun, 2020 17 commits
    • A
      perf evlist: Fix the class prefix for 'struct evlist' sample_type methods · b3c2cc2b
      Arnaldo Carvalho de Melo committed
      To differentiate from libperf's 'struct perf_evlist' methods.

      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b3c2cc2b
    • A
      perf evlist: Fix the class prefix for 'struct evlist' strerror methods · d1f249ec
      Arnaldo Carvalho de Melo committed
      To differentiate from libperf's 'struct perf_evlist' methods.

      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d1f249ec
    • A
      perf evlist: Fix the class prefix for 'struct evlist' 'add' evsel methods · e251abee
      Arnaldo Carvalho de Melo committed
      To differentiate from libperf's 'struct perf_evlist' methods.

      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e251abee
    • J
      perf pmu: Improve CPU core PMU HW event list ordering · ce0dc7d2
      John Garry committed
      For perf list, the CPU core PMU HW event ordering is such that not all
      events may be listed adjacently - consider this example:
      
        $ tools/perf/perf list
      
        List of pre-defined events (to be used in -e):
      
          duration_time                                      [Tool event]
      
          branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
          branch-misses OR cpu/branch-misses/                [Kernel PMU event]
          bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
          cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
          cache-references OR cpu/cache-references/          [Kernel PMU event]
          cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
          cstate_core/c3-residency/                          [Kernel PMU event]
          cstate_core/c6-residency/                          [Kernel PMU event]
          cstate_core/c7-residency/                          [Kernel PMU event]
          cstate_pkg/c2-residency/                           [Kernel PMU event]
          cstate_pkg/c3-residency/                           [Kernel PMU event]
          cstate_pkg/c6-residency/                           [Kernel PMU event]
          cstate_pkg/c7-residency/                           [Kernel PMU event]
          cycles-ct OR cpu/cycles-ct/                        [Kernel PMU event]
          cycles-t OR cpu/cycles-t/                          [Kernel PMU event]
          el-abort OR cpu/el-abort/                          [Kernel PMU event]
          el-capacity OR cpu/el-capacity/                    [Kernel PMU event]
      
      Notice in the above example how the cstate_core PMU events are mixed in
      the middle of the CPU core events.
      
      For my arm64 platform, all the uncore events get mixed in, making the list
      very disorganised:
      
          page-faults OR faults                              [Software event]
          task-clock                                         [Software event]
          duration_time                                      [Tool event]
          L1-dcache-load-misses                              [Hardware cache event]
          L1-dcache-loads                                    [Hardware cache event]
          L1-icache-load-misses                              [Hardware cache event]
          L1-icache-loads                                    [Hardware cache event]
          branch-load-misses                                 [Hardware cache event]
          branch-loads                                       [Hardware cache event]
          dTLB-load-misses                                   [Hardware cache event]
          dTLB-loads                                         [Hardware cache event]
          iTLB-load-misses                                   [Hardware cache event]
          iTLB-loads                                         [Hardware cache event]
          br_mis_pred OR armv8_pmuv3_0/br_mis_pred/          [Kernel PMU event]
          br_mis_pred_retired OR armv8_pmuv3_0/br_mis_pred_retired/ [Kernel PMU event]
          br_pred OR armv8_pmuv3_0/br_pred/                  [Kernel PMU event]
          br_retired OR armv8_pmuv3_0/br_retired/            [Kernel PMU event]
          br_return_retired OR armv8_pmuv3_0/br_return_retired/ [Kernel PMU event]
          bus_access OR armv8_pmuv3_0/bus_access/            [Kernel PMU event]
          bus_cycles OR armv8_pmuv3_0/bus_cycles/            [Kernel PMU event]
          cid_write_retired OR armv8_pmuv3_0/cid_write_retired/ [Kernel PMU event]
          cpu_cycles OR armv8_pmuv3_0/cpu_cycles/            [Kernel PMU event]
          dtlb_walk OR armv8_pmuv3_0/dtlb_walk/              [Kernel PMU event]
          exc_return OR armv8_pmuv3_0/exc_return/            [Kernel PMU event]
          exc_taken OR armv8_pmuv3_0/exc_taken/              [Kernel PMU event]
          hisi_sccl1_ddrc0/act_cmd/                          [Kernel PMU event]
          hisi_sccl1_ddrc0/flux_rcmd/                        [Kernel PMU event]
          hisi_sccl1_ddrc0/flux_rd/                          [Kernel PMU event]
          hisi_sccl1_ddrc0/flux_wcmd/                        [Kernel PMU event]
          hisi_sccl1_ddrc0/flux_wr/                          [Kernel PMU event]
          hisi_sccl1_ddrc0/pre_cmd/                          [Kernel PMU event]
          hisi_sccl1_ddrc0/rnk_chg/                          [Kernel PMU event]
      
      ...
      
          hisi_sccl7_l3c21/wr_hit_cpipe/                     [Kernel PMU event]
          hisi_sccl7_l3c21/wr_hit_spipe/                     [Kernel PMU event]
          hisi_sccl7_l3c21/wr_spipe/                         [Kernel PMU event]
          inst_retired OR armv8_pmuv3_0/inst_retired/        [Kernel PMU event]
          inst_spec OR armv8_pmuv3_0/inst_spec/              [Kernel PMU event]
          itlb_walk OR armv8_pmuv3_0/itlb_walk/              [Kernel PMU event]
          l1d_cache OR armv8_pmuv3_0/l1d_cache/              [Kernel PMU event]
          l1d_cache_refill OR armv8_pmuv3_0/l1d_cache_refill/ [Kernel PMU event]
          l1d_cache_wb OR armv8_pmuv3_0/l1d_cache_wb/        [Kernel PMU event]
          l1d_tlb OR armv8_pmuv3_0/l1d_tlb/                  [Kernel PMU event]
          l1d_tlb_refill OR armv8_pmuv3_0/l1d_tlb_refill/    [Kernel PMU event]
      
      So the events are listed alphabetically. However, CPU core event listing is
      special since commit dc098b35 ("perf list: List kernel supplied event
      aliases"), in that the alias and full event are shown (in that order).
      As such, the core events may become sparse.

      Improve this by grouping the CPU core events and ensuring that they are
      listed first for kernel PMU events. For the first example above, this
      now looks like:
      
          duration_time                                      [Tool event]
          branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
          branch-misses OR cpu/branch-misses/                [Kernel PMU event]
          bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
          cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
          cache-references OR cpu/cache-references/          [Kernel PMU event]
          cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
          cycles-ct OR cpu/cycles-ct/                        [Kernel PMU event]
          cycles-t OR cpu/cycles-t/                          [Kernel PMU event]
          el-abort OR cpu/el-abort/                          [Kernel PMU event]
          el-capacity OR cpu/el-capacity/                    [Kernel PMU event]
          el-commit OR cpu/el-commit/                        [Kernel PMU event]
          el-conflict OR cpu/el-conflict/                    [Kernel PMU event]
          el-start OR cpu/el-start/                          [Kernel PMU event]
          instructions OR cpu/instructions/                  [Kernel PMU event]
          mem-loads OR cpu/mem-loads/                        [Kernel PMU event]
          mem-stores OR cpu/mem-stores/                      [Kernel PMU event]
          ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
          topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
          topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
          topdown-slots-issued OR cpu/topdown-slots-issued/  [Kernel PMU event]
          topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
          topdown-total-slots OR cpu/topdown-total-slots/    [Kernel PMU event]
          tx-abort OR cpu/tx-abort/                          [Kernel PMU event]
          tx-capacity OR cpu/tx-capacity/                    [Kernel PMU event]
          tx-commit OR cpu/tx-commit/                        [Kernel PMU event]
          tx-conflict OR cpu/tx-conflict/                    [Kernel PMU event]
          tx-start OR cpu/tx-start/                          [Kernel PMU event]
          cstate_core/c3-residency/                          [Kernel PMU event]
          cstate_core/c6-residency/                          [Kernel PMU event]
          cstate_core/c7-residency/                          [Kernel PMU event]
          cstate_pkg/c2-residency/                           [Kernel PMU event]
          cstate_pkg/c3-residency/                           [Kernel PMU event]
          cstate_pkg/c6-residency/                           [Kernel PMU event]
          cstate_pkg/c7-residency/                           [Kernel PMU event]

      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1592384514-119954-3-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ce0dc7d2
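
The grouping described in this commit can be pictured as a two-level sort: core PMU events first, then alphabetical order within each group. The following is a minimal sketch of that idea only, not the code from the patch; `struct event_entry`, `cmp_event_entry()` and `sort_event_entries()` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one line of `perf list` output. */
struct event_entry {
	const char *name; /* e.g. "cpu-cycles" or "cstate_core/c3-residency/" */
	bool is_core;     /* true when the event belongs to the CPU core PMU */
};

/* qsort() comparator: core PMU events first, then alphabetical by name. */
static int cmp_event_entry(const void *a, const void *b)
{
	const struct event_entry *ea = a;
	const struct event_entry *eb = b;

	if (ea->is_core != eb->is_core)
		return ea->is_core ? -1 : 1;

	return strcmp(ea->name, eb->name);
}

/* Sort an array of entries in place using the ordering above. */
static void sort_event_entries(struct event_entry *entries, size_t n)
{
	qsort(entries, n, sizeof(*entries), cmp_event_entry);
}
```

With this ordering, an uncore entry such as `cstate_core/c3-residency/` can no longer land between `cpu-cycles` and `cycles-ct`, because the `is_core` flag is compared before the names.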
    • J
      perf pmu: List kernel supplied event aliases for arm64 · c1b4745b
      John Garry committed
      In commit dc098b35 ("perf list: List kernel supplied event aliases"),
      the aliases for events are supplied in addition to the CPU event in perf list.

      This relies on the name of the core PMU being "cpu", which is not the case
      for arm64, so arm64 has always missed this. Use the generic is_pmu_core()
      helper, which takes account of arm64, to make this feature work for arm64
      (and possibly other archs).
      
      Sample, before:
      
        armv8_pmuv3_0/br_mis_pred/          [Kernel PMU event]
      
      after:
      
        br_mis_pred OR armv8_pmuv3_0/br_mis_pred/          [Kernel PMU event]

      Signed-off-by: John Garry <john.garry@huawei.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxarm@huawei.com
      Link: http://lore.kernel.org/lkml/1592384514-119954-2-git-send-email-john.garry@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c1b4745b
    • I
      perf expr: Add < and > operators · ff1a12f9
      Ian Rogers committed
      These are broadly useful, and required to handle TMA metrics. For example
      encoding Ports_Utilization from:
      
        https://download.01.org/perfmon/TMA_Metrics.csv
      
      requires '<'.
      
        {
          "BriefDescription": "This metric estimates fraction of cycles the CPU performance was potentially limited due to Core computation issues (non divider-related).  Two distinct categories can be attributed into this metric: (1) heavy data-dependency among contiguous instructions would manifest in this metric - such cases are often referred to as low Instruction Level Parallelism (ILP). (2) Contention on some hardware execution unit other than Divider. For example; when there are too many multiply operations.",
          "MetricExpr": "( ( cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ + cpu@EXE_ACTIVITY.1_PORTS_UTIL@ + ( cpu@EXE_ACTIVITY.2_PORTS_UTIL@ * ( ( ( cpu@UOPS_RETIRED.RETIRE_SLOTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) ) / ( ( 4.000000 ) + 1.000000 ) ) ) ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) if ( cpu@ARITH.DIVIDER_ACTIVE\\,cmask\\=1@ < cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ ) else ( ( cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ + cpu@EXE_ACTIVITY.1_PORTS_UTIL@ + ( cpu@EXE_ACTIVITY.2_PORTS_UTIL@ * ( ( ( cpu@UOPS_RETIRED.RETIRE_SLOTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) ) / ( ( 4.000000 ) + 1.000000 ) ) ) ) - cpu@EXE_ACTIVITY.EXE_BOUND_0_PORTS@ ) / ( cpu@CPU_CLK_UNHALTED.THREAD@ ) )",
          "MetricGroup": "Topdown_Group_Ports_Utilization",
          "MetricName": "Topdown_Metric_Ports_Utilization"
        },

      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200610235823.52557-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ff1a12f9
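
The MetricExpr in this commit has the shape `A if ( X < Y ) else B`. As a rough sketch of how such an expression could be evaluated over doubles, assuming a comparison yields 1.0 or 0.0 and any non-zero condition selects the first branch (the commit message does not spell this out), consider the hypothetical helpers below. They are not the perf expression parser.

```c
/* Comparison operators as double-valued functions: 1.0 for true, 0.0 for false. */
static double expr_lt(double a, double b)
{
	return a < b ? 1.0 : 0.0;
}

static double expr_gt(double a, double b)
{
	return a > b ? 1.0 : 0.0;
}

/* "then_val if cond else else_val": a non-zero cond selects then_val. */
static double expr_if_else(double then_val, double cond, double else_val)
{
	return cond != 0.0 ? then_val : else_val;
}
```

For instance, `expr_if_else(a, expr_lt(x, y), b)` mirrors `a if ( x < y ) else b` in the metric syntax.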
    • I
      perf expr: Add d_ratio operation · 3e21a28a
      Ian Rogers committed
      d_ratio avoids division by 0 yielding infinity, such as when a counter
      doesn't get scheduled. An example usage is:
      
        {
            "BriefDescription": "DCache L1 misses",
            "MetricExpr": "d_ratio(MEM_LOAD_RETIRED.L1_MISS, MEM_LOAD_RETIRED.L1_HIT + MEM_LOAD_RETIRED.L1_MISS + MEM_LOAD_RETIRED.FB_HIT)",
            "MetricGroup": "DCache;DCache_L1",
            "MetricName": "DCache_L1_Miss",
            "ScaleUnit": "100%",
        }

      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200610235823.52557-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3e21a28a
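
The behaviour described for d_ratio can be sketched as a guarded division. This is an illustration under one assumption: that a zero denominator maps to 0.0. The commit message only says infinity is avoided, so the exact fallback value here is a guess.

```c
/* Guarded division: returns 0.0 instead of infinity or NaN when the
 * denominator is zero, e.g. when a counter was never scheduled.
 * The 0.0 fallback is an assumption made for this sketch. */
static double d_ratio(double numerator, double denominator)
{
	if (denominator == 0.0)
		return 0.0;

	return numerator / denominator;
}
```

In the DCache_L1_Miss example, the denominator is the sum of three counters, so if none of them were counted the metric would evaluate to 0 rather than a non-finite value.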
    • J
      perf tools: Add test_generic_metric function · 6d432c4c
      Jiri Olsa committed
      Add test_generic_metric, which prepares and runs a given metric over the
      data from a struct runtime_stat object.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-12-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6d432c4c
    • J
      perf tools: Release metric_events rblist · 9afe5658
      Jiri Olsa committed
      We don't release the metric_events rblist; add the missing delete hook and
      call the release before leaving cmd_stat.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-11-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9afe5658
    • J
      perf tools: Factor out prepare_metric function · 2cfaa853
      Jiri Olsa committed
      Factor out the prepare_metric function so it can be used in the test
      interface coming in the following changes.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-10-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2cfaa853
    • J
      perf tools: Add metricgroup__parse_groups_test function · f78ac00a
      Jiri Olsa committed
      Add the metricgroup__parse_groups_test function. It will be used as the
      test interface to metric parsing in the following changes.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-9-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f78ac00a
    • J
      perf tools: Add map to parse_groups() function · 1381396b
      Jiri Olsa committed
      For testing purposes we need to pass our own map of events from
      parse_groups() through metricgroup__add_metric.

      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-8-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1381396b
    • J
      perf tools: Add fake_pmu to parse_group() function · 68173bda
      Jiri Olsa committed
      Allow passing fake_pmu to the parse_groups function so it can be used in
      the parse_events call.

      It will be passed by the upcoming metricgroup__parse_groups_test
      function.

      Committer notes:

      Made it a 'struct perf_pmu' pointer, in line with the changes at the
      start of this patchkit to avoid statics deep down in library code.

      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      68173bda
    • J
      perf parse: Factor out parse_groups() function · 8b4468a2
      Jiri Olsa committed
      Factor out the parse_groups function; it will be used for the new test
      interface coming in the following changes.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8b4468a2
    • A
      perf pmu: Add a perf_pmu__fake object to use with __parse_events() · e46fc8d9
      Arnaldo Carvalho de Melo committed
      When wanting to use the support in __parse_events() for fake pmus, just
      pass it.

      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e46fc8d9
    • A
      perf parse: Provide a way to pass a fake_pmu to parse_events() · 3bf91aa5
      Arnaldo Carvalho de Melo committed
      This is an alternative patch to what Jiri sent that, instead of changing
      all callers of parse_events() to allow passing a fake_pmu, provides
      another function specifically for that.

      From Jiri's patch:

      This way it's possible to parse events from PMUs which are not present
      in the system. It's available only for testing purposes coming in the
      following changes, so all the current users set the fake_pmu argument to
      false.

      Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-3-jolsa@kernel.org
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3bf91aa5
    • J
      perf tools: Add fake pmu support · 387ad33f
      Jiri Olsa committed
      Add a way to create a pmu event without the actual PMU being in place.

      This way we can test metrics defined for any processor.

      The interface is to define fake_pmu in struct parse_events_state data.
      It will be used only in tests, via a special interface function added in
      the following changes.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200602214741.1218986-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      387ad33f
  2. 10 Jun, 2020 1 commit
  3. 09 Jun, 2020 5 commits
    • I
      perf parse-events: Fix an old style declaration · ffaecd7d
      Ian Rogers committed
      Fixes: a26e4716 (perf tools: Move ALLOC_LIST into a function)
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200609053610.206588-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ffaecd7d
    • I
      perf parse-events: Fix an incompatible pointer · c2412fae
      Ian Rogers committed
      Arrays decay to pointers and don't need their address taken.

      Fixes: 8255718f (perf pmu: Expand PMU events by prefix match)
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200609053610.206588-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c2412fae
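
The class of warning this commit addresses can be shown with a small, generic C sketch; it is not taken from the patch. Passing an array name to a `char *` parameter works because the array decays to a pointer to its first element, whereas `&name` has type `char (*)[16]` and triggers an incompatible-pointer diagnostic. `copy_name()` and `fill_name()` are hypothetical helpers.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst (at most dst_len - 1 bytes), always NUL-terminating.
 * Returns the length of the string now stored in dst. */
static size_t copy_name(char *dst, size_t dst_len, const char *src)
{
	if (dst_len == 0)
		return 0;

	strncpy(dst, src, dst_len - 1);
	dst[dst_len - 1] = '\0';
	return strlen(dst);
}

static size_t fill_name(void)
{
	char name[16];

	/* Correct: `name` decays to `char *`.
	 * Writing `&name` here would pass a `char (*)[16]` instead,
	 * which is the incompatible pointer type the compiler warns about. */
	return copy_name(name, sizeof(name), "cpu");
}
```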
    • S
      perf bpf: Fix bpf prologue generation · d38c692f
      Sumanth Korikkar committed
      Issue:

      bpf_probe_read() is no longer available on architectures that have
      overlapping address spaces. Hence bpf prologue generation fails.
      
      Fix:
      
      Use bpf_probe_read_kernel for kernel member access. For user attribute
      access in kprobes, use bpf_probe_read_user.
      
      Other:
      
      @user attribute was introduced in commit 1e032f7c ("perf-probe: Add
      user memory access attribute support")
      
      Test:
      
      1. ulimit -l 128 ; ./perf record -e tests/bpf_sched_setscheduler.c
      2. cat tests/bpf_sched_setscheduler.c
      
      static void (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
              (void *) 6;
      static int (*bpf_probe_read_user)(void *dst, __u32 size,
                                        const void *unsafe_ptr) = (void *) 112;
      static int (*bpf_probe_read_kernel)(void *dst, __u32 size,
              const void *unsafe_ptr) = (void *) 113;
      
      SEC("func=do_sched_setscheduler  pid policy param->sched_priority@user")
      int bpf_func__setscheduler(void *ctx, int err, pid_t pid, int policy,
                                 int param)
      {
              char fmt[] = "prio: %ld";
              bpf_trace_printk(fmt, sizeof(fmt), param);
              return 1;
      }
      
      char _license[] SEC("license") = "GPL";
      int _version SEC("version") = LINUX_VERSION_CODE;
      
      3. ./perf script
         sched 305669 [000] 1614458.838675: perf_bpf_probe:func: (2904e508)
         pid=261614 policy=2 sched_priority=1
      
      4. cat /sys/kernel/debug/tracing/trace
         <...>-309956 [006] .... 1616098.093957: 0: prio: 1
      
      Committer testing:
      
      I had to add some missing headers in the bpf_sched_setscheduler.c test
      proggie, then instead of using record+script I used 'perf trace' to
      drive everything in one go:
      
        # cat bpf_sched_setscheduler.c
        #include <linux/types.h>
        #include <bpf.h>
      
        static void (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) = (void *) 6;
        static int (*bpf_probe_read_user)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) 112;
        static int (*bpf_probe_read_kernel)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) 113;
      
        SEC("func=do_sched_setscheduler  pid policy param->sched_priority@user")
        int bpf_func__setscheduler(void *ctx, int err, pid_t pid, int policy, int param)
        {
                char fmt[] = "prio: %ld";
                bpf_trace_printk(fmt, sizeof(fmt), param);
                return 1;
        }
      
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        #
        #
        # perf trace -e bpf_sched_setscheduler.c chrt -f 42 sleep 1
           0.000 chrt/80125 perf_bpf_probe:func(__probe_ip: -1676607808, policy: 1, sched_priority: 42)
        #
      
      And even with backtraces :-)
      
        # perf trace -e bpf_sched_setscheduler.c/max-stack=8/ chrt -f 42 sleep 1
             0.000 chrt/79805 perf_bpf_probe:func(__probe_ip: -1676607808, policy: 1, sched_priority: 42)
                                               do_sched_setscheduler ([kernel.kallsyms])
                                               __x64_sys_sched_setscheduler ([kernel.kallsyms])
                                               do_syscall_64 ([kernel.kallsyms])
                                               entry_SYSCALL_64 ([kernel.kallsyms])
                                               __GI___sched_setscheduler (/usr/lib64/libc-2.30.so)
        #

      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: bpf@vger.kernel.org
      LPU-Reference: 20200609081019.60234-3-sumanthk@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d38c692f
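
The fix described in this commit amounts to picking a read helper per argument: the user variant when the argument carries the `@user` attribute, the kernel variant otherwise. A minimal sketch of that selection follows, using the helper IDs 112 and 113 that appear in the test program quoted in the commit message. The enum and `pick_probe_read_helper()` are hypothetical names for illustration and are not the prologue generator's actual code.

```c
#include <stdbool.h>

/* Helper IDs as they appear in the test program quoted in the commit message. */
enum probe_read_helper {
	PROBE_READ_USER_ID   = 112, /* bpf_probe_read_user */
	PROBE_READ_KERNEL_ID = 113, /* bpf_probe_read_kernel */
};

/* Hypothetical selector: is_user mirrors the "@user" attribute
 * on a probe argument such as param->sched_priority@user. */
static enum probe_read_helper pick_probe_read_helper(bool is_user)
{
	return is_user ? PROBE_READ_USER_ID : PROBE_READ_KERNEL_ID;
}
```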
    • S
      perf probe: Fix user attribute access in kprobes · 9256c303
      Sumanth Korikkar committed
      Issue:
      
        # perf probe -a 'do_sched_setscheduler pid policy param->sched_priority@user'
      
      did not work before.
      
      Fix:
      
      Make:
      
        # perf probe -a 'do_sched_setscheduler pid policy param->sched_priority@user'
      
      output equivalent to ftrace:
      
        # echo 'p:probe/do_sched_setscheduler _text+517384 pid=%r2:s32 policy=%r3:s32 sched_priority=+u0(%r4):s32' > /sys/kernel/debug/tracing/kprobe_events
      
      Other:
      
      1. Right now, __match_glob() does not handle [u]<offset>. For now, use
        *u]<offset>.
      
      2. @user attribute was introduced in commit 1e032f7c ("perf-probe:
         Add user memory access attribute support")
      
      Test:
      1. perf probe -a 'do_sched_setscheduler  pid policy
         param->sched_priority@user'
      
      2. ./perf script
         sched 305669 [000] 1614458.838675: perf_bpf_probe:func: (2904e508)
         pid=261614 policy=2 sched_priority=1
      
      3. cat /sys/kernel/debug/tracing/trace
         <...>-309956 [006] .... 1616098.093957: 0: prio: 1
      
      Committer testing:
      
      Before:
      
        # perf probe -a 'do_sched_setscheduler pid policy param->sched_priority@user'
        param(type:sched_param) has no member sched_priority@user.
          Error: Failed to add events.
        # pahole sched_param
        struct sched_param {
        	int                        sched_priority;       /*     0     4 */
      
        	/* size: 4, cachelines: 1, members: 1 */
        	/* last cacheline: 4 bytes */
        };
        #
      
      After:
      
        # perf probe -a 'do_sched_setscheduler pid policy param->sched_priority@user'
        Added new event:
          probe:do_sched_setscheduler (on do_sched_setscheduler with pid policy sched_priority=param->sched_priority)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:do_sched_setscheduler -aR sleep 1
      
        # cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/do_sched_setscheduler _text+1113792 pid=%di:s32 policy=%si:s32 sched_priority=+u0(%dx):s32
        #
      
      Fixes: 1e032f7c ("perf-probe: Add user memory access attribute support")
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: bpf@vger.kernel.org
      LPU-Reference: 20200609081019.60234-2-sumanthk@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9256c303
    • H
      perf stat: Fix NULL pointer dereference · c0c652fc
      Hongbo Yao committed
      If config->aggr_map is NULL and config->aggr_get_id is not NULL,
      the function print_aggr() will still call aggr_update_shadow(),
      which can result in accessing an invalid pointer.
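
The commit text does not show the changed condition, so the following Python sketch only illustrates the failure mode under an assumption: that the old early-return check fired only when both pointers were NULL, and that the fix returns early when either one is NULL. The function names are hypothetical and None stands in for NULL.

```python
def old_guard_returns_early(aggr_map, aggr_get_id):
    # Assumed pre-fix check: bail out only when BOTH are missing.
    return aggr_map is None and aggr_get_id is None


def fixed_guard_returns_early(aggr_map, aggr_get_id):
    # Assumed post-fix check: bail out when EITHER is missing.
    return aggr_map is None or aggr_get_id is None


if __name__ == "__main__":
    # The case from the commit message: map is NULL, callback is set.
    case = (None, len)
    print("old guard returns early:", old_guard_returns_early(*case))
    print("fixed guard returns early:", fixed_guard_returns_early(*case))
```

Under that assumption, the old check lets execution continue with a NULL map, which is the situation the commit message describes.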
      
      Fixes: 088519f3 ("perf stat: Move the display functions to stat-display.c")
      Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wei Li <liwei391@huawei.com>
      Link: https://lore.kernel.org/lkml/20200608163625.GC3073@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c0c652fc
  4. 02 Jun, 2020 2 commits
  5. 01 Jun, 2020 3 commits
    • T
      perf arm-spe: Support synthetic events · a54ca194
      Tan Xiaojun committed
      After commit ffd3d18c ("perf tools: Add ARM Statistical
      Profiling Extensions (SPE) support") was merged, perf can output raw
      data with the "--dump-raw-trace" option.  However, it lacks support
      for synthetic events, so it cannot output any statistical info.
      
      This patch improves "perf report" support for ARM SPE by adding four
      types of synthetic events:
      
        First level cache synthetic events, including L1 data cache access
        and miss events;
        Last level cache synthetic events, including last level cache
        access and miss events;
        TLB synthetic events, including TLB access and miss events;
        Remote access events, which account for load/store operations
        that access another socket.
      
      Example usage:
      
        $ perf record -c 1024 -e arm_spe_0/branch_filter=1,ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ dd if=/dev/zero of=/dev/null count=10000
        $ perf report --stdio
      
        # Samples: 59  of event 'l1d-miss'
        # Event count (approx.): 59
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ..................................
        #
            23.73%    23.73%  dd       [kernel.kallsyms]  [k] perf_iterate_ctx.constprop.135
            20.34%    20.34%  dd       [kernel.kallsyms]  [k] filemap_map_pages
             5.08%     5.08%  dd       [kernel.kallsyms]  [k] perf_event_mmap
             5.08%     5.08%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
             5.08%     5.08%  dd       [kernel.kallsyms]  [k] unmap_page_range
             3.39%     3.39%  dd       [kernel.kallsyms]  [k] PageHuge
             3.39%     3.39%  dd       [kernel.kallsyms]  [k] release_pages
             3.39%     3.39%  dd       ld-2.28.so         [.] 0x0000000000008b5c
             1.69%     1.69%  dd       [kernel.kallsyms]  [k] __alloc_fd
             [...]
      
        # Samples: 3K of event 'l1d-access'
        # Event count (approx.): 3980
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ......................................
        #
            26.98%    26.98%  dd       [kernel.kallsyms]  [k] ret_to_user
            10.53%    10.53%  dd       [kernel.kallsyms]  [k] fsnotify
             7.51%     7.51%  dd       [kernel.kallsyms]  [k] new_sync_read
             4.57%     4.57%  dd       [kernel.kallsyms]  [k] vfs_read
             4.35%     4.35%  dd       [kernel.kallsyms]  [k] vfs_write
             3.69%     3.69%  dd       [kernel.kallsyms]  [k] __fget_light
             3.69%     3.69%  dd       [kernel.kallsyms]  [k] rw_verify_area
             3.44%     3.44%  dd       [kernel.kallsyms]  [k] security_file_permission
             2.76%     2.76%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
             2.44%     2.44%  dd       [kernel.kallsyms]  [k] ksys_write
             2.24%     2.24%  dd       [kernel.kallsyms]  [k] iov_iter_zero
             2.19%     2.19%  dd       [kernel.kallsyms]  [k] read_iter_zero
             1.81%     1.81%  dd       dd                 [.] 0x0000000000002960
             1.78%     1.78%  dd       dd                 [.] 0x0000000000002980
             [...]
      
        # Samples: 35  of event 'llc-miss'
        # Event count (approx.): 35
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ...........................
        #
            34.29%    34.29%  dd       [kernel.kallsyms]  [k] filemap_map_pages
             8.57%     8.57%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
             8.57%     8.57%  dd       [kernel.kallsyms]  [k] unmap_page_range
             5.71%     5.71%  dd       [kernel.kallsyms]  [k] PageHuge
             5.71%     5.71%  dd       [kernel.kallsyms]  [k] release_pages
             5.71%     5.71%  dd       ld-2.28.so         [.] 0x0000000000008b5c
             2.86%     2.86%  dd       [kernel.kallsyms]  [k] __queue_work
             2.86%     2.86%  dd       [kernel.kallsyms]  [k] __radix_tree_lookup
             2.86%     2.86%  dd       [kernel.kallsyms]  [k] copy_page
             [...]
      
        # Samples: 2  of event 'llc-access'
        # Event count (approx.): 2
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  .............
        #
            50.00%    50.00%  dd       [kernel.kallsyms]  [k] copy_page
            50.00%    50.00%  dd       libc-2.28.so       [.] _dl_addr
      
        # Samples: 48  of event 'tlb-miss'
        # Event count (approx.): 48
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ..................................
        #
            20.83%    20.83%  dd       [kernel.kallsyms]  [k] perf_iterate_ctx.constprop.135
            12.50%    12.50%  dd       [kernel.kallsyms]  [k] __arch_clear_user
            10.42%    10.42%  dd       [kernel.kallsyms]  [k] clear_page
             4.17%     4.17%  dd       [kernel.kallsyms]  [k] copy_page
             4.17%     4.17%  dd       [kernel.kallsyms]  [k] filemap_map_pages
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] __alloc_fd
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] __mod_memcg_state.part.70
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] __queue_work
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] __rcu_read_unlock
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] d_path
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] destroy_inode
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] do_dentry_open
             [...]
      
        # Samples: 9K of event 'tlb-access'
        # Event count (approx.): 9573
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ......................................
        #
            25.79%    25.79%  dd       [kernel.kallsyms]  [k] __arch_clear_user
            11.22%    11.22%  dd       [kernel.kallsyms]  [k] ret_to_user
             8.56%     8.56%  dd       [kernel.kallsyms]  [k] fsnotify
             4.06%     4.06%  dd       [kernel.kallsyms]  [k] new_sync_read
             3.67%     3.67%  dd       [kernel.kallsyms]  [k] el0_svc_common.constprop.2
             3.04%     3.04%  dd       [kernel.kallsyms]  [k] __fsnotify_parent
             2.90%     2.90%  dd       [kernel.kallsyms]  [k] vfs_write
             2.82%     2.82%  dd       [kernel.kallsyms]  [k] vfs_read
             2.52%     2.52%  dd       libc-2.28.so       [.] write
             2.26%     2.26%  dd       [kernel.kallsyms]  [k] security_file_permission
             2.08%     2.08%  dd       [kernel.kallsyms]  [k] ksys_write
             1.96%     1.96%  dd       [kernel.kallsyms]  [k] rw_verify_area
             1.95%     1.95%  dd       [kernel.kallsyms]  [k] read_iter_zero
             [...]
      
        # Samples: 9  of event 'branch-miss'
        # Event count (approx.): 9
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  .........................
        #
            22.22%    22.22%  dd       libc-2.28.so       [.] _dl_addr
            11.11%    11.11%  dd       [kernel.kallsyms]  [k] __arch_clear_user
            11.11%    11.11%  dd       [kernel.kallsyms]  [k] __arch_copy_from_user
            11.11%    11.11%  dd       [kernel.kallsyms]  [k] __dentry_kill
            11.11%    11.11%  dd       [kernel.kallsyms]  [k] __efistub_memcpy
            11.11%    11.11%  dd       ld-2.28.so         [.] 0x0000000000012b7c
            11.11%    11.11%  dd       libc-2.28.so       [.] 0x000000000002a980
            11.11%    11.11%  dd       libc-2.28.so       [.] 0x0000000000083340
      
        # Samples: 29  of event 'remote-access'
        # Event count (approx.): 29
        #
        # Children      Self  Command  Shared Object      Symbol
        # ........  ........  .......  .................  ...........................
        #
            41.38%    41.38%  dd       [kernel.kallsyms]  [k] filemap_map_pages
            10.34%    10.34%  dd       [kernel.kallsyms]  [k] unlock_page_memcg
            10.34%    10.34%  dd       [kernel.kallsyms]  [k] unmap_page_range
             6.90%     6.90%  dd       [kernel.kallsyms]  [k] release_pages
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] PageHuge
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] __queue_work
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_add_file_rmap
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_counter_try_charge
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] page_remove_rmap
             3.45%     3.45%  dd       [kernel.kallsyms]  [k] xas_start
             3.45%     3.45%  dd       ld-2.28.so         [.] 0x0000000000002a1c
             3.45%     3.45%  dd       ld-2.28.so         [.] 0x0000000000008b5c
             3.45%     3.45%  dd       ld-2.28.so         [.] 0x00000000000093cc
      Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200530122442.490-4-leo.yan@linaro.org
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a54ca194
    • T
      perf auxtrace: Add four itrace options · 9f74d770
      Tan Xiaojun committed
      This patch adds four options for synthesizing events, described
      below:
      
       'f': synthesize first level cache events
       'm': synthesize last level cache events
       't': synthesize TLB events
       'a': synthesize remote access events
      
      These four options will be used by ARM SPE as their first consumer.
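
As a rough illustration of the letter-to-event mapping listed above, here is a minimal Python sketch. It handles only the four letters named in this commit; the real --itrace option accepts other letters as well, and the function name parse_synth_letters() is hypothetical.

```python
# Letters and descriptions taken from the list in this commit message.
SYNTH_OPTIONS = {
    "f": "first level cache events",
    "m": "last level cache events",
    "t": "TLB events",
    "a": "remote access events",
}


def parse_synth_letters(opts):
    """Return the event classes selected by a string of option letters."""
    selected = []
    for letter in opts:
        if letter not in SYNTH_OPTIONS:
            # This sketch knows only the four letters added by the patch.
            raise ValueError(f"unsupported option letter: {letter!r}")
        if SYNTH_OPTIONS[letter] not in selected:
            selected.append(SYNTH_OPTIONS[letter])
    return selected


if __name__ == "__main__":
    for name in parse_synth_letters("ft"):
        print("synthesize", name)
```
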
      Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
      Tested-by: James Clark <james.clark@arm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200530122442.490-3-leo.yan@linaro.org
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9f74d770
    • T
      perf tools: Move arm-spe-pkt-decoder.h/c to the new dir · 4db25f66
      Tan Xiaojun committed
      Create a new arm-spe-decoder directory for subsequent extensions and
      move arm-spe-pkt-decoder.h/c to this directory. No code changes.
      Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
      Tested-by: James Clark <james.clark@arm.com>
      Tested-by: Qi Liu <liuqi115@hisilicon.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Grant <al.grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200530122442.490-2-leo.yan@linaro.org
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4db25f66
  6. 30 May, 2020 4 commits
    • S
      perf tools: Add optional support for libpfm4 · 70943490
      Stephane Eranian committed
      This patch links perf with the libpfm4 library if it is available and
      LIBPFM4 is passed to the build. The libpfm4 library contains hardware
      event tables for all processors supported by perf_events. It is a helper
      library that helps convert from a symbolic event name to the event
      encoding required by the underlying kernel interface. This library is
      open-source and available from: http://perfmon2.sf.net.
      
      With this patch, it is possible to specify full hardware events by name.
      Hardware filters are also supported. Events must be specified via the
      --pfm-events and not -e option. Both options are active at the same time
      and it is possible to mix and match:
      
        $ perf stat --pfm-events inst_retired:any_p:c=1:i -e cycles ....
      
      One needs to explicitly ask for its inclusion by using the LIBPFM4 make
      command line option, i.e. it is opt-in rather than opt-out of feature
      detection and build support.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Igor Lubashev <ilubashe@akamai.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiwei Sun <jiwei.sun@windriver.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: http://lore.kernel.org/lkml/20200505182943.218248-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      70943490
    • N
      perf jit: Fix inaccurate DWARF line table · 1e4bd2ae
      Nick Gasson committed
      Fix an issue where addresses in the DWARF line table are offset by -0x40
      (GEN_ELF_TEXT_OFFSET). This can be seen with `objdump -S` on the ELF
      files after perf inject.
      
      Committer notes:
      
      Ian added this in his Acked-by reply:
      
       ---
      Without too much knowledge this looks good to me. The original code came
      from oprofile's jit support:
      
        https://sourceforge.net/p/oprofile/oprofile/ci/master/tree/opjitconv/debug_line.c#l325
       ---
      Signed-off-by: Nick Gasson <nick.gasson@arm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200528051916.6722-1-nick.gasson@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1e4bd2ae
    • A
      perf trace: Use zalloc() to make sure all fields are zeroed in the syscalltbl constructor · a9e8c1f8
      Arnaldo Carvalho de Melo committed
      In the past this wasn't needed as the libaudit based code would use just
      one field, and the alternative constructor would fill in all the fields,
      but now that even when using the libaudit based method we need the other
      fields, switch to zalloc() to make sure the other fields are zeroed at
      instantiation time.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a9e8c1f8
    • A
      perf trace: Remove union from syscalltbl, all the fields are needed · db6b8cc8
      Arnaldo Carvalho de Melo committed
      When we moved to a syscalltbl generated from the kernel syscall tables
      (arch/..../syscall*.tbl) the idea was to either use it, when the
      generator is available (e.g. tools/perf/arch/x86/entry/syscalls/syscalltbl.sh),
      or fall back to the previous audit-libs based way of mapping syscall
      ids to strings and the other way around.
      
      At first we needed either just the audit_detect_machine() return
      value, to then use it for the str->id/id->str lookups, or the other
      fields, for the syscall table generator method that is now used by
      default on the most well developed arches.
      
      The problem is that then the libaudit code fell into disrepair, and
      architectures where it is the method used are not working.
      
      Now, with NO_SYSCALL_TABLE=1 being possible to pass on the make command
      line we can automate the testing of that method even on x86-64, arm64,
      etc.
      
      And doing it I noted that we actually use fields in both entries in the
      union, oops, so ditch the union, as we need all those fields at the same
      time.
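
Why a union breaks once both groups of fields are needed at the same time can be seen in a small Python ctypes sketch. The field names audit_machine and max_id are used here purely for illustration and the two classes are hypothetical stand-ins, not the actual syscalltbl layout.

```python
import ctypes


class TblUnion(ctypes.Union):
    # Both members share the same storage.
    _fields_ = [("audit_machine", ctypes.c_int), ("max_id", ctypes.c_int)]


class TblStruct(ctypes.Structure):
    # Each member has its own storage.
    _fields_ = [("audit_machine", ctypes.c_int), ("max_id", ctypes.c_int)]


def set_both(tbl, machine, max_id):
    """Write both fields, then read them back."""
    tbl.audit_machine = machine
    tbl.max_id = max_id
    return tbl.audit_machine, tbl.max_id


if __name__ == "__main__":
    print("union :", set_both(TblUnion(), 62, 400))   # second write clobbers the first
    print("struct:", set_both(TblStruct(), 62, 400))  # both values survive
```

In the union, the second store overwrites the first, so code that reads both fields sees only the last value written.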
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      db6b8cc8
  7. 28 May, 2020 8 commits
    • A
      perf record: Respect --no-switch-events · 16b4b4e1
      Adrian Hunter committed
      Context switch events are added automatically by Intel PT and Coresight.
      
      Make it possible to suppress them. That is useful for tracing the
      scheduler without the disturbance that the switch event processing
      creates.
      
      Example:
      
        Prerequisites:
      
          $ which perf
          ~/bin/perf
          $ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
          $ sudo chmod +r /proc/kcore
      
        Before:
      
          $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.938 MB perf.data ]
          $ perf script -D | grep PERF_RECORD_SWITCH | wc -l
          572
      
        After:
      
          $ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
          Warning:
          Intel Processor Trace decoding will not be possible except for kernel tracing!
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.838 MB perf.data ]
          $ perf script -D | grep PERF_RECORD_SWITCH | wc -l
          0
      
          $ sudo chmod go-r /proc/kcore
          $ sudo setcap -r ~/bin/perf
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      16b4b4e1
    • A
      perf evlist: Disable 'immediate' events last · 87cf8360
      Adrian Hunter committed
      Events marked as 'immediate' are started before other events to ensure
      that there is context at the start of the main tracing events. The same
      is true at the end of tracing, so disable 'immediate' events after other
      events.
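
The ordering rule described above can be sketched in Python as two stable sorts. The dict-based event representation and the function names are hypothetical, for illustration only; the perf code itself operates on its own event list structures.

```python
def enable_order(events):
    """'immediate' events first, so context exists when tracing starts."""
    # sorted() is stable: events keep their relative order within each group.
    return sorted(events, key=lambda e: not e["immediate"])


def disable_order(events):
    """'immediate' events last, so context is kept until tracing ends."""
    return sorted(events, key=lambda e: e["immediate"])


if __name__ == "__main__":
    events = [
        {"name": "main-trace", "immediate": False},
        {"name": "context", "immediate": True},
    ]
    print("enable :", [e["name"] for e in enable_order(events)])
    print("disable:", [e["name"] for e in disable_order(events)])
```
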
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20200512121922.8997-11-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      87cf8360
    • A
      perf kcore_copy: Fix module map when there are no modules loaded · 61f82e3f
      Adrian Hunter committed
      In the absence of any modules, no "modules" map is created, but there
      are other executable pages to map, due to eBPF JIT, kprobe or ftrace.
      Map them by recognizing that the first "module" symbol is not
      necessarily from a module, and adjust the map accordingly.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20200512121922.8997-10-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61f82e3f
    • N
      perf jvmti: Fix demangling Java symbols · 0bdf3181
      Nick Gasson committed
      For a Java method signature like:
      
          Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V
      
      The demangler produces:
      
          void class java.lang.AbstractStringBuilder.appendChars(class java.lang., shorttring., int, int)
      
      The arguments should be (java.lang.String, int, int) but the demangler
      interprets the "S" in String as the type code for "short". Correct this
      and two other minor things:
      
      - There is no "bool" type in Java, should be "boolean".
      
      - The demangler prepends "class" to every Java class name. This is not
        standard Java syntax and it wastes a lot of horizontal space if the
        signature is long. Remove this as there isn't any ambiguity between
        class names and primitives.
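
To make the intended output concrete, here is a minimal Python sketch of demangling such a signature. It is not the perf implementation: it assumes the input is always a class descriptor, a method name, then a parenthesized argument list and a return type, and the function names are hypothetical. A class reference is consumed as a whole up to its ';', so letters inside a class name (such as the "S" in String) are never read as primitive type codes.

```python
# JVM primitive type codes; 'Z' is boolean, as the commit notes.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def parse_type(sig, pos):
    """Parse one type descriptor at sig[pos]; return (name, next position)."""
    dims = 0
    while sig[pos] == "[":          # each '[' adds one array dimension
        dims += 1
        pos += 1
    if sig[pos] == "L":             # class reference: L<path>;
        end = sig.index(";", pos)
        name = sig[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                           # single-letter primitive code
        name = PRIMITIVES[sig[pos]]
        pos += 1
    return name + "[]" * dims, pos


def demangle(sig):
    """Turn 'Lpkg/Cls;method(args)ret' into 'ret pkg.Cls.method(args)'."""
    cls, pos = parse_type(sig, 0)
    open_paren = sig.index("(", pos)
    method = sig[pos:open_paren]
    pos = open_paren + 1
    args = []
    while sig[pos] != ")":
        arg, pos = parse_type(sig, pos)
        args.append(arg)
    ret, _ = parse_type(sig, pos + 1)
    # The expected test output in this commit joins '<init>' without a dot.
    sep = "" if method.startswith("<") else "."
    return f"{ret} {cls}{sep}{method}({', '.join(args)})"


if __name__ == "__main__":
    print(demangle(
        "Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V"))
```
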
      
      Committer notes:
      
      This was split from a larger patch that also added a java demangler
      'perf test' entry, that, before this patch shows the error being fixed
      by it:
      
        $ perf test java
        65: Demangle Java                                         : FAILED!
        $ perf test -v java
        Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
        65: Demangle Java                                         :
        --- start ---
        test child forked, pid 307264
        FAILED: Ljava/lang/StringLatin1;equals([B[B)Z: bool class java.lang.StringLatin1.equals(byte[], byte[]) != boolean java.lang.StringLatin1.equals(byte[], byte[])
        FAILED: Ljava/util/zip/ZipUtils;CENSIZ([BI)J: long class java.util.zip.ZipUtils.CENSIZ(byte[], int) != long java.util.zip.ZipUtils.CENSIZ(byte[], int)
        FAILED: Ljava/util/regex/Pattern$BmpCharProperty;match(Ljava/util/regex/Matcher;ILjava/lang/CharSequence;)Z: bool class java.util.regex.Pattern$BmpCharProperty.match(class java.util.regex.Matcher., int, class java.lang., charhar, shortequence) != boolean java.util.regex.Pattern$BmpCharProperty.match(java.util.regex.Matcher, int, java.lang.CharSequence)
        FAILED: Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V: void class java.lang.AbstractStringBuilder.appendChars(class java.lang., shorttring., int, int) != void java.lang.AbstractStringBuilder.appendChars(java.lang.String, int, int)
        FAILED: Ljava/lang/Object;<init>()V: void class java.lang.Object<init>() != void java.lang.Object<init>()
        test child finished with -1
        ---- end ----
        Demangle Java: FAILED!
        $
      
      After applying this patch:
      
        $ perf test  java
        65: Demangle Java                                         : Ok
        $
      Signed-off-by: Nick Gasson <nick.gasson@arm.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200427061520.24905-4-nick.gasson@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0bdf3181
    • A
      perf symbols: Fix debuginfo search for Ubuntu · 85afd355
      Adrian Hunter committed
      Reportedly, from 19.10 Ubuntu has begun mixing up the location of some
      debug symbol files, putting files expected to be in
      /usr/lib/debug/usr/lib into /usr/lib/debug/lib instead. Fix by adding
      another dso_binary_type.
      
      Example on Ubuntu 20.04
      
        Before:
      
          $ perf record -e intel_pt//u uname
          Linux
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.030 MB perf.data ]
          $ perf script --call-trace | head -5
                 uname 14003 [005] 15321.764958566:  cbr: 42 freq: 4219 MHz (156%)
                 uname 14003 [005] 15321.764958566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          7f1e71cc4100
                 uname 14003 [005] 15321.764961566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )              7f1e71cc4df0
                 uname 14003 [005] 15321.764961900: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )              7f1e71cc4e18
                 uname 14003 [005] 15321.764963233: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )              7f1e71cc5128
      
        After:
      
          $ perf script --call-trace | head -5
                 uname 14003 [005] 15321.764958566:  cbr: 42 freq: 4219 MHz (156%)
                 uname 14003 [005] 15321.764958566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )      _start
                 uname 14003 [005] 15321.764961566: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start
                 uname 14003 [005] 15321.764961900: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start
                 uname 14003 [005] 15321.764963233: (/usr/lib/x86_64-linux-gnu/ld-2.31.so              )          _dl_start
      Reported-by: Travis Downs <travis.downs@gmail.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200526155207.9172-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      85afd355
    • J
      perf parse: Add 'struct parse_events_state' pointer to scanner · 1244a327
      Jiri Olsa committed
      We need to pass more data to the scanner, so let's start by having it
      take a pointer to a 'struct parse_events_state' object instead of just
      the start token.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200524224219.234847-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1244a327
    • J
      perf stat: Do not pass avg to generic_metric · 5f09ca5a
      Jiri Olsa committed
      There's no need to pass the given evsel's count to metric data, because
      it will be pushed again within the following metric_events loop.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200524224219.234847-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5f09ca5a
    • I
      perf metricgroup: Remove unnecessary ',' from events · e2ce1059
      Ian Rogers committed
      Remove unnecessary commas from events before they are parsed. This
      avoids ',' being echoed by parse-events.l.
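
The commit message does not spell out which commas count as unnecessary. The Python sketch below assumes they are the ones that produce empty fields (leading, trailing, or doubled separators); the function name is hypothetical and this is not the metricgroup code.

```python
def strip_redundant_commas(events):
    """Drop empty fields from a comma-separated event string."""
    # Commas between non-empty parts are kept, including those inside
    # a PMU term list such as cpu/event=0x3c,umask=0x1/.
    return ",".join(part for part in events.split(",") if part)


if __name__ == "__main__":
    print(strip_redundant_commas(",cycles,,instructions,"))
```
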
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e2ce1059