Unverified commit 1ec4733f, authored by openeuler-ci-bot, committed by Gitee

!268 [OLK-5.10]perf arm64 metricgroup support and some bugfix

Merge Pull Request from: @liujie-248683921 
 
This series contains support to get basic metricgroups working for
arm64 CPUs.

Initial support is added for HiSilicon hip08 platform.

Some sample usage on Huawei D06 board:

$ ./perf list metric

List of pre-defined events (to be used in -e):

Metrics:

bp_misp_flush
[BP misp flush L3 topdown metric]
branch_mispredicts
[Branch mispredicts L2 topdown metric]
core_bound
[Core bound L2 topdown metric]
divider
[Divider L3 topdown metric]
exe_ports_util
[EXE ports util L3 topdown metric]
fetch_bandwidth_bound
[Fetch bandwidth bound L2 topdown metric]
fetch_latency_bound
[Fetch latency bound L2 topdown metric]
fsu_stall
[FSU stall L3 topdown metric]
idle_by_icache_miss

$ sudo ./perf stat -v -M core_bound sleep 1
Using CPUID 0x00000000480fd010
metric expr (exe_stall_cycle - (mem_stall_anyload + armv8_pmuv3_0@event=0x7005@)) / cpu_cycles for core_bound
found event cpu_cycles
found event armv8_pmuv3_0/event=0x7005/
found event exe_stall_cycle
found event mem_stall_anyload
adding {cpu_cycles -> armv8_pmuv3_0/event=0x7001/
mem_stall_anyload -> armv8_pmuv3_0/event=0x7004/
Control descriptor is not initialized
cpu_cycles: 989433 385050 385050
armv8_pmuv3_0/event=0x7005/: 19207 385050 385050
exe_stall_cycle: 900825 385050 385050
mem_stall_anyload: 253516 385050 385050

Performance counter stats for 'sleep':

989,433 cpu_cycles # 0.63 core_bound
19,207 armv8_pmuv3_0/event=0x7005/
900,825 exe_stall_cycle
253,516 mem_stall_anyload

   0.000805809 seconds time elapsed

   0.000875000 seconds user
   0.000000000 seconds sys
perf stat --topdown is not supported, as this requires the CPU PMU to
expose (alias) events for the TopDown L1 metrics from sysfs, which arm
does not do. To get that to work, we probably need to make perf use the
pmu-events cpumap to learn about those alias events.

Metric reuse support is added for pmu-events parse metric testcase.
This had been broken on power9 recently:
https://lore.kernel.org/lkml/20210324015418.GC8931@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com/

Differences to v2:
- Add TB and RB tags (Thanks!)
- Rename metricgroup__find_metric() from metricgroup_find_metric()
- Change resolve_metric_simple() to rescan after any insert

Differences to v1:
- Add pmu_events_map__find() as arm64-specific function
- Fix metric reuse for pmu-events parse metric testcase

John Garry (6):
  perf metricgroup: Make find_metric() public with name change
  perf test: Handle metric reuse in pmu-events parsing test
  perf pmu: Add pmu_events_map__find()
  perf vendor events arm64: Add Hisi hip08 L1 metrics
  perf vendor events arm64: Add Hisi hip08 L2 metrics
  perf vendor events arm64: Add Hisi hip08 L3 metrics

tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/pmu.c | 25 ++
.../arch/arm64/hisilicon/hip08/metrics.json | 233 ++++++++++++++++++
tools/perf/tests/pmu-events.c | 83 ++++++-
tools/perf/util/metricgroup.c | 12 +-
tools/perf/util/metricgroup.h | 3 +-
tools/perf/util/pmu.c | 5 +
tools/perf/util/pmu.h | 1 +
tools/perf/util/s390-sample-raw.c | 4 +-
9 files changed, 356 insertions(+), 11 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/pmu.c
create mode 100644 tools/perf/pmu-events/arch/arm64/hisilicon/hip08/metrics.json

Reference: https://patchwork.kernel.org/project/linux-arm-kernel/cover/1617791570-165223-1-git-send-email-john.garry@huawei.com/

Bugfix: perf vendor events arm64: Fix incorrect metrics and improve readability

First, fix the incorrect hip08 metrics, then add some core events to the
JSON file. Finally, replace event codes with event names to improve
readability.

Changes in v2:
- Adjust the commit message of the 1st patch.
- Fix a tab in the 3rd patch.

Shang XiaoJing (3):
  perf vendor events arm64: Fix incorrect Hisi hip08 L3 metrics
  perf vendor events arm64: Add HiSilicon hip08 core events
  perf vendor events arm64: Use event name instead of event code
 .../arm64/hisilicon/hip08/core-imp-def.json   | 132 ++++++++++++++++++
 .../arch/arm64/hisilicon/hip08/metrics.json   |  48 +++----
 2 files changed, 156 insertions(+), 24 deletions(-)

Reference: https://lore.kernel.org/all/20221021105035.10000-1-shangxiaojing@huawei.com/
 
 
Link:https://gitee.com/openeuler/kernel/pulls/268 
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com> 
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> 
perf-y += header.o
perf-y += perf_regs.o
perf-y += tsc.o
perf-y += pmu.o
perf-y += kvm-stat.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
......
// SPDX-License-Identifier: GPL-2.0
#include "../../util/cpumap.h"
#include "../../util/pmu.h"
struct pmu_events_map *pmu_events_map__find(void)
{
struct perf_pmu *pmu = NULL;
while ((pmu = perf_pmu__scan(pmu))) {
if (!is_pmu_core(pmu->name))
continue;
/*
* The cpumap should cover all CPUs. Otherwise, some CPUs may
* not support some events or have different event IDs.
*/
if (pmu->cpus->nr != cpu__max_cpu())
return NULL;
return perf_pmu__find_map(pmu);
}
return NULL;
}
[
{
"MetricExpr": "FETCH_BUBBLE / (4 * CPU_CYCLES)",
"PublicDescription": "Frontend bound L1 topdown metric",
"BriefDescription": "Frontend bound L1 topdown metric",
"MetricGroup": "TopDownL1",
"MetricName": "frontend_bound"
},
{
"MetricExpr": "(INST_SPEC - INST_RETIRED) / (4 * CPU_CYCLES)",
"PublicDescription": "Bad Speculation L1 topdown metric",
"BriefDescription": "Bad Speculation L1 topdown metric",
"MetricGroup": "TopDownL1",
"MetricName": "bad_speculation"
},
{
"MetricExpr": "INST_RETIRED / (CPU_CYCLES * 4)",
"PublicDescription": "Retiring L1 topdown metric",
"BriefDescription": "Retiring L1 topdown metric",
"MetricGroup": "TopDownL1",
"MetricName": "retiring"
},
{
"MetricExpr": "1 - (frontend_bound + bad_speculation + retiring)",
"PublicDescription": "Backend Bound L1 topdown metric",
"BriefDescription": "Backend Bound L1 topdown metric",
"MetricGroup": "TopDownL1",
"MetricName": "backend_bound"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x201d@ / CPU_CYCLES",
"PublicDescription": "Fetch latency bound L2 topdown metric",
"BriefDescription": "Fetch latency bound L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "fetch_latency_bound"
},
{
"MetricExpr": "frontend_bound - fetch_latency_bound",
"PublicDescription": "Fetch bandwidth bound L2 topdown metric",
"BriefDescription": "Fetch bandwidth bound L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "fetch_bandwidth_bound"
},
{
"MetricExpr": "(bad_speculation * BR_MIS_PRED) / (BR_MIS_PRED + armv8_pmuv3_0@event\\=0x2013@)",
"PublicDescription": "Branch mispredicts L2 topdown metric",
"BriefDescription": "Branch mispredicts L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "branch_mispredicts"
},
{
"MetricExpr": "bad_speculation - branch_mispredicts",
"PublicDescription": "Machine clears L2 topdown metric",
"BriefDescription": "Machine clears L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "machine_clears"
},
{
"MetricExpr": "(EXE_STALL_CYCLE - (MEM_STALL_ANYLOAD + armv8_pmuv3_0@event\\=0x7005@)) / CPU_CYCLES",
"PublicDescription": "Core bound L2 topdown metric",
"BriefDescription": "Core bound L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "core_bound"
},
{
"MetricExpr": "(MEM_STALL_ANYLOAD + armv8_pmuv3_0@event\\=0x7005@) / CPU_CYCLES",
"PublicDescription": "Memory bound L2 topdown metric",
"BriefDescription": "Memory bound L2 topdown metric",
"MetricGroup": "TopDownL2",
"MetricName": "memory_bound"
},
{
"MetricExpr": "(((L2I_TLB - L2I_TLB_REFILL) * 15) + (L2I_TLB_REFILL * 100)) / CPU_CYCLES",
"PublicDescription": "Idle by itlb miss L3 topdown metric",
"BriefDescription": "Idle by itlb miss L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "idle_by_itlb_miss"
},
{
"MetricExpr": "(((L2I_CACHE - L2I_CACHE_REFILL) * 15) + (L2I_CACHE_REFILL * 100)) / CPU_CYCLES",
"PublicDescription": "Idle by icache miss L3 topdown metric",
"BriefDescription": "Idle by icache miss L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "idle_by_icache_miss"
},
{
"MetricExpr": "(BR_MIS_PRED * 5) / CPU_CYCLES",
"PublicDescription": "BP misp flush L3 topdown metric",
"BriefDescription": "BP misp flush L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "bp_misp_flush"
},
{
"MetricExpr": "(armv8_pmuv3_0@event\\=0x2013@ * 5) / CPU_CYCLES",
"PublicDescription": "OOO flush L3 topdown metric",
"BriefDescription": "OOO flush L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "ooo_flush"
},
{
"MetricExpr": "(armv8_pmuv3_0@event\\=0x1001@ * 5) / CPU_CYCLES",
"PublicDescription": "Static predictor flush L3 topdown metric",
"BriefDescription": "Static predictor flush L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "sp_flush"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x1010@ / BR_MIS_PRED",
"PublicDescription": "Indirect branch L3 topdown metric",
"BriefDescription": "Indirect branch L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "indirect_branch"
},
{
"MetricExpr": "(armv8_pmuv3_0@event\\=0x1013@ + armv8_pmuv3_0@event\\=0x1016@) / BR_MIS_PRED",
"PublicDescription": "Push branch L3 topdown metric",
"BriefDescription": "Push branch L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "push_branch"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x100d@ / BR_MIS_PRED",
"PublicDescription": "Pop branch L3 topdown metric",
"BriefDescription": "Pop branch L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "pop_branch"
},
{
"MetricExpr": "(BR_MIS_PRED - armv8_pmuv3_0@event\\=0x1010@ - armv8_pmuv3_0@event\\=0x1013@ - armv8_pmuv3_0@event\\=0x1016@ - armv8_pmuv3_0@event\\=0x100d@) / BR_MIS_PRED",
"PublicDescription": "Other branch L3 topdown metric",
"BriefDescription": "Other branch L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "other_branch"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x2012@ / armv8_pmuv3_0@event\\=0x2013@",
"PublicDescription": "Nuke flush L3 topdown metric",
"BriefDescription": "Nuke flush L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "nuke_flush"
},
{
"MetricExpr": "1 - nuke_flush",
"PublicDescription": "Other flush L3 topdown metric",
"BriefDescription": "Other flush L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "other_flush"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x2010@ / CPU_CYCLES",
"PublicDescription": "Sync stall L3 topdown metric",
"BriefDescription": "Sync stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "sync_stall"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x2004@ / CPU_CYCLES",
"PublicDescription": "Rob stall L3 topdown metric",
"BriefDescription": "Rob stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "rob_stall"
},
{
"MetricExpr": "(armv8_pmuv3_0@event\\=0x2006@ + armv8_pmuv3_0@event\\=0x2007@ + armv8_pmuv3_0@event\\=0x2008@) / CPU_CYCLES",
"PublicDescription": "Ptag stall L3 topdown metric",
"BriefDescription": "Ptag stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "ptag_stall"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x201e@ / CPU_CYCLES",
"PublicDescription": "SaveOpQ stall L3 topdown metric",
"BriefDescription": "SaveOpQ stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "saveopq_stall"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x2005@ / CPU_CYCLES",
"PublicDescription": "PC buffer stall L3 topdown metric",
"BriefDescription": "PC buffer stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "pc_buffer_stall"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x7002@ / CPU_CYCLES",
"PublicDescription": "Divider L3 topdown metric",
"BriefDescription": "Divider L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "divider"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x7003@ / CPU_CYCLES",
"PublicDescription": "FSU stall L3 topdown metric",
"BriefDescription": "FSU stall L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "fsu_stall"
},
{
"MetricExpr": "core_bound - divider - fsu_stall",
"PublicDescription": "EXE ports util L3 topdown metric",
"BriefDescription": "EXE ports util L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "exe_ports_util"
},
{
"MetricExpr": "(MEM_STALL_ANYLOAD - MEM_STALL_L1MISS) / CPU_CYCLES",
"PublicDescription": "L1 bound L3 topdown metric",
"BriefDescription": "L1 bound L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "l1_bound"
},
{
"MetricExpr": "(MEM_STALL_L1MISS - MEM_STALL_L2MISS) / CPU_CYCLES",
"PublicDescription": "L2 bound L3 topdown metric",
"BriefDescription": "L2 bound L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "l2_bound"
},
{
"MetricExpr": "MEM_STALL_L2MISS / CPU_CYCLES",
"PublicDescription": "Mem bound L3 topdown metric",
"BriefDescription": "Mem bound L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "mem_bound"
},
{
"MetricExpr": "armv8_pmuv3_0@event\\=0x7005@ / CPU_CYCLES",
"PublicDescription": "Store bound L3 topdown metric",
"BriefDescription": "Store bound L3 topdown metric",
"MetricGroup": "TopDownL3",
"MetricName": "store_bound"
},
]
......@@ -12,6 +12,7 @@
#include "util/evlist.h"
#include "util/expr.h"
#include "util/parse-events.h"
#include "metricgroup.h"
struct perf_pmu_test_event {
struct pmu_event event;
......@@ -457,9 +458,74 @@ static void expr_failure(const char *msg,
pr_debug("On expression %s\n", pe->metric_expr);
}
struct metric {
struct list_head list;
struct metric_ref metric_ref;
};
static int resolve_metric_simple(struct expr_parse_ctx *pctx,
struct list_head *compound_list,
struct pmu_events_map *map,
const char *metric_name)
{
struct hashmap_entry *cur, *cur_tmp;
struct metric *metric, *tmp;
size_t bkt;
bool all;
int rc;
do {
all = true;
hashmap__for_each_entry_safe((&pctx->ids), cur, cur_tmp, bkt) {
struct metric_ref *ref;
struct pmu_event *pe;
pe = metricgroup__find_metric(cur->key, map);
if (!pe)
continue;
if (!strcmp(metric_name, (char *)cur->key)) {
pr_warning("Recursion detected for metric %s\n", metric_name);
rc = -1;
goto out_err;
}
all = false;
/* The metric key itself needs to go out.. */
expr__del_id(pctx, cur->key);
metric = malloc(sizeof(*metric));
if (!metric) {
rc = -ENOMEM;
goto out_err;
}
ref = &metric->metric_ref;
ref->metric_name = pe->metric_name;
ref->metric_expr = pe->metric_expr;
list_add_tail(&metric->list, compound_list);
rc = expr__find_other(pe->metric_expr, NULL, pctx, 0);
if (rc)
goto out_err;
break; /* The hashmap has been modified, so restart */
}
} while (!all);
return 0;
out_err:
list_for_each_entry_safe(metric, tmp, compound_list, list)
free(metric);
return rc;
}
static int test_parsing(void)
{
-struct pmu_events_map *cpus_map = perf_pmu__find_map(NULL);
+struct pmu_events_map *cpus_map = pmu_events_map__find();
struct pmu_events_map *map;
struct pmu_event *pe;
int i, j, k;
......@@ -474,7 +540,9 @@ static int test_parsing(void)
break;
j = 0;
for (;;) {
struct metric *metric, *tmp;
struct hashmap_entry *cur;
LIST_HEAD(compound_list);
size_t bkt;
pe = &map->table[j++];
......@@ -490,6 +558,13 @@ static int test_parsing(void)
continue;
}
if (resolve_metric_simple(&ctx, &compound_list, map,
pe->metric_name)) {
expr_failure("Could not resolve metrics", map, pe);
ret++;
goto exit; /* Don't tolerate errors due to severity */
}
/*
* Add all ids with a made up value. The value may
* trigger divide by zero when subtracted and so try to
......@@ -505,6 +580,11 @@ static int test_parsing(void)
ret++;
}
list_for_each_entry_safe(metric, tmp, &compound_list, list) {
expr__add_ref(&ctx, &metric->metric_ref);
free(metric);
}
if (expr__parse(&result, &ctx, pe->metric_expr, 0)) {
expr_failure("Parse failed", map, pe);
ret++;
......@@ -513,6 +593,7 @@ static int test_parsing(void)
}
}
/* TODO: fail when not ok */
exit:
return ret == 0 ? TEST_OK : TEST_SKIP;
}
......
......@@ -494,7 +494,7 @@ static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
void metricgroup__print(bool metrics, bool metricgroups, char *filter,
bool raw, bool details)
{
-struct pmu_events_map *map = perf_pmu__find_map(NULL);
+struct pmu_events_map *map = pmu_events_map__find();
struct pmu_event *pe;
int i;
struct rblist groups;
......@@ -803,7 +803,8 @@ static int __add_metric(struct list_head *metric_list,
(match_metric(__pe->metric_group, __metric) || \
match_metric(__pe->metric_name, __metric)))
-static struct pmu_event *find_metric(const char *metric, struct pmu_events_map *map)
+struct pmu_event *metricgroup__find_metric(const char *metric,
+struct pmu_events_map *map)
{
struct pmu_event *pe;
int i;
......@@ -888,7 +889,7 @@ static int __resolve_metric(struct metric *m,
struct expr_id *parent;
struct pmu_event *pe;
-pe = find_metric(cur->key, map);
+pe = metricgroup__find_metric(cur->key, map);
if (!pe)
continue;
......@@ -1117,7 +1118,7 @@ int metricgroup__parse_groups(const struct option *opt,
struct rblist *metric_events)
{
struct evlist *perf_evlist = *(struct evlist **)opt->value;
-struct pmu_events_map *map = perf_pmu__find_map(NULL);
+struct pmu_events_map *map = pmu_events_map__find();
if (!map)
return 0;
......@@ -1139,7 +1140,7 @@ int metricgroup__parse_groups_test(struct evlist *evlist,
bool metricgroup__has_metric(const char *metric)
{
-struct pmu_events_map *map = perf_pmu__find_map(NULL);
+struct pmu_events_map *map = pmu_events_map__find();
struct pmu_event *pe;
int i;
......
......@@ -44,7 +44,8 @@ int metricgroup__parse_groups(const struct option *opt,
bool metric_no_group,
bool metric_no_merge,
struct rblist *metric_events);
struct pmu_event *metricgroup__find_metric(const char *metric,
struct pmu_events_map *map);
int metricgroup__parse_groups_test(struct evlist *evlist,
struct pmu_events_map *map,
const char *str,
......
......@@ -701,6 +701,11 @@ struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
return map;
}
struct pmu_events_map *__weak pmu_events_map__find(void)
{
return perf_pmu__find_map(NULL);
}
bool pmu_uncore_alias_match(const char *pmu_name, const char *name)
{
char *tmp = NULL, *tok, *str;
......
......@@ -113,6 +113,7 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
struct pmu_events_map *map);
struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
struct pmu_events_map *pmu_events_map__find(void);
bool pmu_uncore_alias_match(const char *pmu_name, const char *name);
void perf_pmu_free_alias(struct perf_pmu_alias *alias);
......
......@@ -160,11 +160,9 @@ static void s390_cpumcfdg_dump(struct perf_sample *sample)
const char *color = PERF_COLOR_BLUE;
struct cf_ctrset_entry *cep, ce;
struct pmu_events_map *map;
-struct perf_pmu pmu;
u64 *p;
-memset(&pmu, 0, sizeof(pmu));
-map = perf_pmu__find_map(&pmu);
+map = pmu_events_map__find();
while (offset < len) {
cep = (struct cf_ctrset_entry *)(buf + offset);
......