1. 30 Jul, 2020 3 commits
  2. 17 Jul, 2020 1 commit
  3. 23 Jun, 2020 5 commits
  4. 28 May, 2020 10 commits
    • I
      perf metricgroup: Remove unnecessary ',' from events · e2ce1059
      Ian Rogers committed
      Remove unnecessary commas from events before they are parsed. This
      avoids ',' being echoed by parse-events.l.
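As a rough illustration of the cleanup described above, the sketch below compacts an event string in place. It assumes that "unnecessary" means a leading, trailing, or doubled ','; the helper name strip_extra_commas is hypothetical and this is not the code from the patch, which may reach the same result differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the function from the patch): compact an event
 * list in place so it has no leading, trailing, or doubled ','.
 * Commas next to '{' or '}' are not treated specially in this sketch.
 */
static void strip_extra_commas(char *events)
{
	char *dst = events;
	const char *src;

	assert(events != NULL);
	for (src = events; *src; src++) {
		if (*src == ',') {
			/* Skip a comma at the start or right after another comma. */
			if (dst == events || dst[-1] == ',')
				continue;
		}
		*dst++ = *src;
	}
	/* Drop a trailing comma that the loop copied. */
	if (dst != events && dst[-1] == ',')
		dst--;
	*dst = '\0';
}
```

Working in place keeps the sketch allocation-free, which is safe here because the output is never longer than the input.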
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Add options to not group or merge · 05530a79
      Ian Rogers committed
      Add --metric-no-group that causes all events within metrics to not be
      grouped. This can allow the event to get more time when multiplexed, but
      may also lower accuracy.
      Add --metric-no-merge option. By default events in different metrics may
      be shared if the group of events for one metric is the same or larger
      than that of the second. Sharing may increase or lower accuracy and so
      is now configurable.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Remove duped metric group events · 2440689d
      Ian Rogers committed
      A metric group contains multiple metrics. These metrics may use the same
      events. If metrics use separate events then it leads to more
      multiplexing and overall metric counts fail to sum to 100%.
      
      Modify how metrics are associated with events so that if the events in
      an earlier group satisfy the current metric, the same events are used.
      A record of used events is kept and at the end of processing unnecessary
      events are eliminated.
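The reuse rule above comes down to a containment check: a metric can borrow an earlier group when every event it needs is already in that group. The sketch below shows only that check, under the assumption that events are compared by name; the function name events_covered is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: true if every event name in 'need' also appears in 'have'. */
static bool events_covered(const char *const *have, size_t nr_have,
			   const char *const *need, size_t nr_need)
{
	size_t i, j;

	assert(have != NULL && need != NULL);
	for (i = 0; i < nr_need; i++) {
		for (j = 0; j < nr_have; j++)
			if (!strcmp(need[i], have[j]))
				break;
		if (j == nr_have)
			return false;	/* need[i] is missing from the earlier group */
	}
	return true;
}
```

With the event names from the TopDownL1 output in this message, the five-event Backend_Bound group covers the two events Retiring needs, while the IPC events are not covered and keep their own group.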
      
      Before:
      
        $ perf stat -a -M TopDownL1 sleep 1
      
         Performance counter stats for 'system wide':
      
             920,211,343   uops_issued.any             #      0.5 Backend_Bound   (16.56%)
           1,977,733,128   idq_uops_not_delivered.core                            (16.56%)
              51,668,510   int_misc.recovery_cycles                               (16.56%)
             732,305,692   uops_retired.retire_slots                              (16.56%)
           1,497,621,849   cycles                                                 (16.56%)
             721,098,274   uops_issued.any             #      0.1 Bad_Speculation (16.79%)
           1,332,681,791   cycles                                                 (16.79%)
             552,475,482   uops_retired.retire_slots                              (16.79%)
              47,708,340   int_misc.recovery_cycles                               (16.79%)
           1,383,713,292   cycles
                                                       #      0.4 Frontend_Bound  (16.76%)
           2,013,757,701   idq_uops_not_delivered.core                            (16.76%)
           1,373,363,790   cycles
                                                       #      0.1 Retiring        (33.54%)
             577,302,589   uops_retired.retire_slots                              (33.54%)
             392,766,987   inst_retired.any            #      0.3 IPC             (50.24%)
           1,351,873,350   cpu_clk_unhalted.thread                                (50.24%)
           1,332,510,318   cycles
                                                       # 5330041272.0 SLOTS       (49.90%)
      
             1.006336145 seconds time elapsed
      
      After:
      
        $ perf stat -a -M TopDownL1 sleep 1
      
         Performance counter stats for 'system wide':
      
             765,949,145   uops_issued.any             #      0.1 Bad_Speculation
                                                       #      0.5 Backend_Bound   (50.09%)
           1,883,830,591   idq_uops_not_delivered.core #      0.3 Frontend_Bound  (50.09%)
              48,237,080   int_misc.recovery_cycles                               (50.09%)
             581,798,385   uops_retired.retire_slots   #      0.1 Retiring        (50.09%)
           1,361,628,527   cycles
                                                       # 5446514108.0 SLOTS       (50.09%)
             391,415,714   inst_retired.any            #      0.3 IPC             (49.91%)
           1,336,486,781   cpu_clk_unhalted.thread                                (49.91%)
      
             1.005469298 seconds time elapsed
      
      Note: Bad_Speculation + Backend_Bound + Frontend_Bound + Retiring = 100%
      after, whereas before it is 110%. After, there are 2 groups, whereas
      before there are 6. After, the cycles event appears once; before, it
      appeared 5 times.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Order event groups by size · 6bf2102b
      Ian Rogers committed
      When adding event groups to the group list, insert them in size order.
      This performs an insertion sort on the group list. By placing the
      largest groups at the front of the group list it is possible to see if a
      larger group contains the same events as a later group. This can make
      the later group redundant - it can reuse the events from the large
      group.  A later patch will add this sharing.
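A minimal sketch of the ordered insert described above, assuming a plain singly linked list rather than the list type perf actually uses. The struct and function names are hypothetical; the point is only that inserting each new group before the first smaller one keeps the list sorted largest first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a metric's event group; not perf's struct. */
struct group {
	size_t nr_events;	/* number of events in this group */
	struct group *next;
};

/*
 * Insert 'g' so the list stays ordered from largest to smallest group.
 * Calling this for every new group amounts to an insertion sort.
 */
static void group_list_insert(struct group **head, struct group *g)
{
	struct group **pos = head;

	assert(head != NULL && g != NULL);
	/* Walk past every group that is at least as large as the new one. */
	while (*pos && (*pos)->nr_events >= g->nr_events)
		pos = &(*pos)->next;
	g->next = *pos;
	*pos = g;
}
```

Using '>=' in the walk keeps groups of equal size in arrival order, a choice made for this sketch only.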
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Delay events string creation · 7f9eca51
      Ian Rogers committed
      Currently event groups are placed into groups_list at the same time as
      the events string containing the events is built. Separate these two
      operations and build the groups_list first, then the event string from
      the groups_list. This adds an ability to reorder the groups_list that
      will be used in a later patch.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Use early return in add_metric · 90810399
      Ian Rogers committed
      Use early return in metricgroup__add_metric and try to make the intent
      of the returns clearer.
      Suggested-by: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Always place duration_time last · 4e21c13a
      Ian Rogers committed
      If a metric contains the duration_time event then the event is placed
      outside of the metric's group of events. Rather than split the group,
      make it so the duration_time is immediately after the group.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520182011.32236-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Free metric_events on error · a159e2fe
      Ian Rogers committed
      Avoid a simple memory leak.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: kp singh <kpsingh@chromium.org>
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200508053629.210324-10-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf metricgroup: Make 'evlist_used' variable a bitmap instead of array of bools · 45db55f2
      Ian Rogers committed
      Use a bitmap rather than an array of bools.
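To show what the switch buys, here is a self-contained sketch of a one-bit-per-event "used" map. The perf tree has its own bitmap helpers; the names below (used_alloc, used_set, used_test) are hypothetical stand-ins written for this example, not that API.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helpers, not the in-tree bitmap API. */
static unsigned long *used_alloc(size_t nbits)
{
	size_t words = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;

	/* calloc zeroes the words, so every event starts as unused. */
	return calloc(words ? words : 1, sizeof(unsigned long));
}

static void used_set(unsigned long *map, size_t bit)
{
	assert(map != NULL);
	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static bool used_test(const unsigned long *map, size_t bit)
{
	assert(map != NULL);
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

Each event costs one bit here instead of one bool, and the map is released with a single free().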
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200520072814.128267-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf expr: Migrate expr ids table to a hashmap · ded80bda
      Ian Rogers committed
      Use a hashmap between a char* string and a double* value. While bpf's
      hashmap entries are size_t in size, we can't guarantee sizeof(size_t) >=
      sizeof(double). Avoid a memory allocation when gathering ids by making
      0.0 a special value encoded as NULL.
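The value encoding described above can be sketched without any hashmap at all: the map stores a double pointer, and a NULL pointer is read back as 0.0. The helper names value_encode and value_decode are hypothetical and only illustrate the convention; they are not functions from perf or libbpf.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical encoder: store 'val' behind '*out'. A value of 0.0 is
 * encoded as NULL, so recording an id without a value allocates nothing.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int value_encode(double val, double **out)
{
	assert(out != NULL);
	*out = NULL;
	if (val == 0.0)
		return 0;
	*out = malloc(sizeof(**out));
	if (!*out)
		return -1;
	**out = val;
	return 0;
}

/* Hypothetical decoder: NULL reads back as 0.0. */
static double value_decode(const double *p)
{
	return p ? *p : 0.0;
}
```

Returning a status code keeps an allocation failure distinguishable from a legitimately encoded 0.0.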
      
      Original map suggestion by Andi Kleen:
      
        https://lore.kernel.org/lkml/20200224210308.GQ160988@tassilo.jf.intel.com/
      
      and seconded by Jiri Olsa:
      
        https://lore.kernel.org/lkml/20200423112915.GH1136647@krava/
      
      Committer notes:
      
      There are fixes that need to land upstream before we can use libbpf's
      headers, for now use our copy unconditionally, since the data structures
      at this point are exactly the same, no problem.
      
      When the fixes for libbpf's hashmap land upstream, we can fix this up.
      
      Testing it:
      
      Building with LIBBPF=1, i.e. the default:
      
        $ perf -vv | grep -i bpf
                           bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
        $ nm ~/bin/perf | grep -i libbpf_ | wc -l
        39
        $ nm ~/bin/perf | grep -i hashmap_ | wc -l
        17
        $
      
      Explicitly building without LIBBPF:
      
        $ perf -vv | grep -i bpf
                           bpf: [ OFF ]  # HAVE_LIBBPF_SUPPORT
        $
        $ nm ~/bin/perf | grep -i libbpf_ | wc -l
        0
        $ nm ~/bin/perf | grep -i hashmap_ | wc -l
        9
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: kp singh <kpsingh@chromium.org>
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20200515221732.44078-8-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 30 Apr, 2020 1 commit
    • K
      perf metricgroups: Enhance JSON/metric infrastructure to handle "?" · 1e1a873d
      Kajol Jain committed
      This patch enhances the current metric infrastructure to handle "?" in
      the metric expression. The "?" can be used for parameters whose value is
      not known when the metric events are created and which can be replaced
      with the proper value later, at runtime. It also adds the flexibility to
      create multiple events out of a single metric event added in the JSON
      file.
      
      The patch adds the function 'arch_get_runtimeparam', an arch-specific
      function that returns the number of metric events that need to be
      created. By default it returns 1.
      
      This infrastructure is needed for hv_24x7 socket/chip level events.
      "hv_24x7" chip level events need the specific chip-id for which the data
      is requested. The function 'arch_get_runtimeparam' is implemented in
      header.c and extracts the number of sockets from the sysfs file
      "sockets" under "/sys/devices/hv_24x7/interface/".
      
      With this patch we create as many metric events as defined by
      runtime_param.
      
      For that, a loop is added to the function 'metricgroup__add_metric',
      which creates multiple events at runtime depending on the return value
      of 'arch_get_runtimeparam' and merges those events into 'group_list'.
      
      To achieve that, the parameter value is passed to the
      `expr__find_other` function and the "?" present in the metric
      expression is replaced with this value.
      
      The JSON file contains a single metric event, out of which multiple
      events are created.
      
      To show which data count belongs to which parameter value, the
      parameter value is also printed in the generic_metric function.
      
      For example,
      
        command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
          1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
          1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
          2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
          2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
      
      So, here _0 and _1 after PowerBUS_Frequency specify parameter value.
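The "?" substitution can be pictured as a plain string expansion. The sketch below is not the parser change from the patch; it assumes the parameter is a small non-negative integer and uses a hypothetical helper name, expand_runtime_param, to produce one expanded copy of an expression per parameter value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of 'expr' in which
 * every '?' is replaced by the decimal value of 'param'. Caller frees.
 */
static char *expand_runtime_param(const char *expr, int param)
{
	char num[16];
	size_t num_len, count = 0;
	const char *s;
	char *out, *d;

	assert(expr != NULL);
	num_len = (size_t)snprintf(num, sizeof(num), "%d", param);
	for (s = expr; *s; s++)
		if (*s == '?')
			count++;
	/* Each one-character '?' grows to num_len characters. */
	out = malloc(strlen(expr) - count + count * num_len + 1);
	if (!out)
		return NULL;
	for (s = expr, d = out; *s; s++) {
		if (*s == '?') {
			memcpy(d, num, num_len);
			d += num_len;
		} else {
			*d++ = *s;
		}
	}
	*d = '\0';
	return out;
}
```

Calling it in a loop from 0 up to the count returned by the arch hook would give one expression per chip, matching the _0 and _1 suffixes in the output above; the input string used for that is an assumption shaped after that output.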
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-5-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 16 Apr, 2020 1 commit
    • K
      perf metricgroup: Split the metricgroup__add_metric function · 47352aba
      Kajol Jain committed
      This patch refactors the metricgroup__add_metric function, moving part
      of it into the function metricgroup__add_metric_param. No logic change.
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lore.kernel.org/lkml/20200401203340.31402-4-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 24 Mar, 2020 1 commit
    • K
      perf metricgroup: Fix printing event names of metric group with multiple... · 58fc90fd
      Kajol Jain committed
      perf metricgroup: Fix printing event names of metric group with multiple events in case of overlapping events
      
      Commit f01642e4 ("perf metricgroup: Support multiple events for
      metricgroup") introduced support for multiple events in a metric group.
      But with the current upstream, metric event names are not printed
      properly when running multiple metric groups with overlapping
      events.
      
      With the current upstream version, the issue with overlapping metric
      events is that the comparison logic always starts from the beginning
      of the event list. So events that already matched some metric group
      also take part in the comparison. Because of that, when there are
      overlapping events, we end up matching the current metric group's
      events with already matched ones.
      
      For example, on a Skylake machine there are the metrics CoreIPC and
      Instructions. Both of them need the 'inst_retired.any' event value. As
      the events in Instructions are a subset of the events in CoreIPC, they
      end up pointing to the same 'inst_retired.any' value.
      
      On the Skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
           1,254,992,790      inst_retired.any          # 1254992790.0
                                                          Instructions
                                                        #      1.3 CoreIPC
             977,172,805      cycles
           1,254,992,756      inst_retired.any
      
             1.000802596 seconds time elapsed
      
      command:# sudo ./perf stat -M UPI,IPC sleep 1
      
         Performance counter stats for 'sleep 1':
                 948,650      uops_retired.retire_slots
                 866,182      inst_retired.any          #      0.7 IPC
                 866,182      inst_retired.any
               1,175,671      cpu_clk_unhalted.thread
      
      The patch fixes the issue by adding a new bool pointer 'evlist_used' to
      keep track of events that already matched some group, by setting the
      corresponding entry to true. So all used events in the list are skipped
      when the comparison logic starts. The patch also changes the comparison
      logic: on a match miss, the whole match is discarded and matching starts
      again with the first event id in the metric event.
      
      With this patch:
      
      On the Skylake platform:
      
      command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1
      
       Performance counter stats for 'CPU(s) 0':
      
               3,348,415      inst_retired.any          #      0.3 CoreIPC
              11,779,026      cycles
               3,348,381      inst_retired.any          # 3348381.0
                                                          Instructions
      
             1.001649056 seconds time elapsed
      
      command:# ./perf stat -M UPI,IPC sleep 1
      
       Performance counter stats for 'sleep 1':
      
               1,023,148      uops_retired.retire_slots #      1.1 UPI
                 924,976      inst_retired.any
                 924,976      inst_retired.any          #      0.6 IPC
               1,489,414      cpu_clk_unhalted.thread
      
             1.003064672 seconds time elapsed
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 11 Mar, 2020 2 commits
  9. 11 Dec, 2019 1 commit
    • K
      perf metricgroup: Fix printing event names of metric group with multiple events · eb573e74
      Kajol Jain committed
      Commit f01642e4 ("perf metricgroup: Support multiple events for
      metricgroup") introduced support for multiple events in a metric group.
      But with the current upstream, metric event names are not printed
      properly.
      
      On the power9 platform:
      
      command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
           1.000208486
           2.000368863
           2.001400558
      
      Similarly on the Skylake platform:
      
      command:./perf stat --metric-only -M Power -I 1000
           1.000579994
           2.002189493
      
      With the current upstream version, the issue is in the event name
      comparison logic in find_evsel_group(). The current logic compares the
      events belonging to a metric group with the events in perf_evlist. Since
      the break statement is missing in the loop used for this comparison, the
      loop continues to execute even after finding a match, and ends up
      discarding the matches.
      
      When a single metric event belongs to the metric group, it works fine,
      because with a single event the loop reaches the end of perf_evlist once
      it has compared all events.
      
      Example for a single metric event on the power9 platform:
      
      command:# ./perf stat --metric-only  -M branches_per_inst -I 1000 sleep 1
           1.000094653                  0.2
           1.001337059                  0.0
      
      This patch fixes the issue by adding a condition that breaks out of the
      loop in find_evsel_group() once all events belonging to that metric
      event have been matched.
      
      With this patch:
      On the power9 platform:
      
      command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
      result:#
                  time  derat_4k_miss_rate_percent  derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent  derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
           1.000135672                         0.0                  0.3              1.0                         0.0                   0.2                    0.0                    0.0
           2.000380617                         0.0                  0.0              0.0                         0.0                   0.0                    0.0                    0.0
      
      Similarly on the Skylake platform:
      
      command:# ./perf stat --metric-only -M Power -I 1000
      result:#
                  time    Turbo_Utilization    C3_Core_Residency  C6_Core_Residency  C7_Core_Residency    C2_Pkg_Residency  C3_Pkg_Residency     C6_Pkg_Residency   C7_Pkg_Residency
           1.000563580                  0.3                  0.0                2.6               44.2                21.9               0.0                  0.0               0.0
           2.002235027                  0.4                  0.0                2.7               43.0                20.7               0.0                  0.0               0.0
      
      Committer testing:
      
        Before:
      
        [root@seventh ~]# perf stat --metric-only -M Power -I 1000
        #           time
             1.000383223
             2.001168182
             3.001968545
             4.002741200
             5.003442022
        ^C     5.777687244
      
        [root@seventh ~]#
      
        After the patch:
      
        [root@seventh ~]# perf stat --metric-only -M Power -I 1000
        #           time    Turbo_Utilization    C3_Core_Residency    C6_Core_Residency    C7_Core_Residency     C2_Pkg_Residency     C3_Pkg_Residency     C6_Pkg_Residency     C7_Pkg_Residency
             1.000406577                  0.4                  0.1                  1.4                 97.0                  0.0                  0.0                  0.0                  0.0
             2.001481572                  0.3                  0.0                  0.6                 97.9                  0.0                  0.0                  0.0                  0.0
             3.002332585                  0.2                  0.0                  1.0                 97.5                  0.0                  0.0                  0.0                  0.0
             4.003196624                  0.2                  0.0                  0.3                 98.6                  0.0                  0.0                  0.0                  0.0
             5.004063851                  0.3                  0.0                  0.7                 97.7                  0.0                  0.0                  0.0                  0.0
        ^C     5.471260276                  0.2                  0.0                  0.5                 49.3                  0.0                  0.0                  0.0                  0.0
      
        [root@seventh ~]#
        [root@seventh ~]# dmesg | grep -i skylake
        [    0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
        [root@seventh ~]#
      
      Fixes: f01642e4 ("perf metricgroup: Support multiple events for metricgroup")
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb573e74
  10. 19 Nov, 2019 1 commit
    • I
      perf parse: Report initial event parsing error · a910e466
      Ian Rogers committed
      Record the first event parsing error and report it. This implements feedback
      from Jiri Olsa:
      
        https://lkml.org/lkml/2019/10/28/680
      
      An example error is:
      
        $ tools/perf/perf stat -e c/c/
        WARNING: multiple event parsing errors
        event syntax error: 'c/c/'
                               \___ unknown term
      
        valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore
      
      Initial error:
      
        event syntax error: 'c/c/'
                            \___ Cannot find PMU `c'. Missing kernel support?
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
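
      A small Python sketch of the "keep the first error" idea, assuming a
      hypothetical error record. The class and field names are illustrative and
      are not perf's data structures, and the report layout is simplified from
      the example output above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParseError:
    """Hypothetical error record; field names are illustrative, not perf's."""
    first: Optional[str] = None  # the initial error, never overwritten
    last: Optional[str] = None   # the most recent error
    count: int = 0

    def record(self, message: str) -> None:
        if self.count == 0:
            self.first = message
        self.last = message
        self.count += 1

    def report(self) -> List[str]:
        if self.count == 0:
            return []
        if self.count == 1:
            return [self.last]
        return ["WARNING: multiple event parsing errors", self.last,
                "Initial error:", self.first]


err = ParseError()
err.record("Cannot find PMU `c'. Missing kernel support?")
err.record("unknown term")
print("\n".join(err.report()))
```

      The later error is still shown, but the initial one, which is often the
      more useful diagnostic, is no longer lost.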
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a910e466
  11. 01 Sep, 2019 2 commits
    • J
      perf metricgroup: Support multiple events for metricgroup · f01642e4
      Jin Yao committed
      Some uncore metrics don't work as expected. For example, on
      cascadelakex:
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 1841092      unc_m_pmm_rpq_inserts
                 3680816      unc_m_pmm_wpq_inserts
      
             1.001775055 seconds time elapsed
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
               860649746      unc_m_pmm_rpq_occupancy.all
                 1840557      unc_m_pmm_rpq_inserts
             12790627455      unc_m_clockticks
      
             1.001773348 seconds time elapsed
      
      No metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' or 'UNC_M_PMM_READ_LATENCY' are
      reported.
      
      The issue is that an alias expanding to multiple events, as is typical for
      uncore events, is not supported (see the comments in
      find_evsel_group()).
      
      For UNC_M_PMM_BANDWIDTH.TOTAL in the example above, the expanded event group
      is '{unc_m_pmm_rpq_inserts,unc_m_pmm_wpq_inserts}:W', but the actual
      events passed to find_evsel_group() are:
      
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_rpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
        unc_m_pmm_wpq_inserts
      
      This multiple-events case is not handled well.
      
      This patch introduces a new field, 'metric_leader', in struct evsel. The
      first event is treated as the metric leader. The remaining events with the
      same name point to the first event via their metric_leader field in
      struct evsel.
      
      This design allows the counts of all identical events to be added to the
      first event in the group (the metric_leader).
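
      The leader scheme can be modelled in a few lines of Python. This is only a
      sketch under stated assumptions: the Evsel class is a stand-in for struct
      evsel, the helper names are hypothetical, the counts are made up, and
      having the leader point to itself is a modelling choice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)
class Evsel:
    """Stand-in for struct evsel; only the fields needed for the sketch."""
    name: str
    count: int = 0
    metric_leader: Optional["Evsel"] = None


def assign_metric_leaders(evsels: List[Evsel]) -> List[Evsel]:
    """Point every event at the first event with the same name."""
    first_by_name: Dict[str, Evsel] = {}
    for ev in evsels:
        ev.metric_leader = first_by_name.setdefault(ev.name, ev)
    return list(first_by_name.values())


def aggregate_to_leaders(evsels: List[Evsel]) -> Dict[str, int]:
    """Sum the counts of identical events into their leader's total."""
    totals: Dict[str, int] = {}
    for ev in evsels:
        leader = ev.metric_leader
        totals[leader.name] = totals.get(leader.name, 0) + ev.count
    return totals


# Six copies of each event, as in the list above; the counts are invented.
evsels = ([Evsel("unc_m_pmm_rpq_inserts", 10) for _ in range(6)] +
          [Evsel("unc_m_pmm_wpq_inserts", 20) for _ in range(6)])
leaders = assign_metric_leaders(evsels)
print([ev.name for ev in leaders])
print(aggregate_to_leaders(evsels))
```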
      
      With this patch,
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 1842108      unc_m_pmm_rpq_inserts     #    337.2 MB/sec  UNC_M_PMM_BANDWIDTH.TOTAL
                 3682209      unc_m_pmm_wpq_inserts
      
             1.001819706 seconds time elapsed
      
        root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
               861970685      unc_m_pmm_rpq_occupancy.all #    219.4 ns  UNC_M_PMM_READ_LATENCY
                 1842772      unc_m_pmm_rpq_inserts
             12790196356      unc_m_clockticks
      
             1.001749103 seconds time elapsed
      
      Now we can see the correct metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' and
      'UNC_M_PMM_READ_LATENCY'.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190828055932.8269-5-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f01642e4
    • J
      perf metricgroup: Scale the metric result · 287f2649
      Jin Yao committed
      Some metrics define the scale unit, such as
      
          {
              "BriefDescription": "Intel Optane DC persistent memory read latency (ns). Derived from unc_m_pmm_rpq_occupancy.all",
              "Counter": "0,1,2,3",
              "EventCode": "0xE0",
              "EventName": "UNC_M_PMM_READ_LATENCY",
              "MetricExpr": "UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS",
              "MetricName": "UNC_M_PMM_READ_LATENCY",
              "PerPkg": "1",
              "ScaleUnit": "6000000000ns",
              "UMask": "0x1",
              "Unit": "iMC"
          },
      
      For the above example, the ratio should be:
      
      ratio = (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS) * 6000000000
      
      However, the current code does not scale the ratio (the "* 6000000000" factor is missing).
      
      With this patch, the ratio is scaled and the unit (ns) is printed.
      
      For example,
        #    219.4 ns  UNC_M_PMM_READ_LATENCY
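
      As a worked check of the formula above, the following Python sketch splits
      a ScaleUnit string into factor and unit and applies it. The helper names
      are hypothetical and the parsing is a simplification, not perf's parser.
      The raw counts are taken from the UNC_M_PMM_READ_LATENCY sample output in
      the companion commit f01642e4 on this page.

```python
import re


def parse_scale_unit(scale_unit):
    """Split a string such as '6000000000ns' into (6000000000.0, 'ns')."""
    m = re.match(r"\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\s*(.*)", scale_unit)
    if m is None:
        raise ValueError(f"no numeric scale in {scale_unit!r}")
    return float(m.group(1)), m.group(2)


def scaled_metric(occupancy, inserts, clockticks, scale_unit):
    """Evaluate occupancy / inserts / clockticks and apply the scale factor."""
    scale, unit = parse_scale_unit(scale_unit)
    return occupancy / inserts / clockticks * scale, unit


value, unit = scaled_metric(861970685, 1842772, 12790196356, "6000000000ns")
print(f"{value:.1f} {unit}")  # 219.4 ns
```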
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20190828055932.8269-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      287f2649
  12. 30 Aug, 2019 1 commit
  13. 23 Aug, 2019 1 commit
  14. 30 Jul, 2019 2 commits
  15. 09 Jul, 2019 2 commits
  16. 03 Jul, 2019 2 commits
  17. 02 Jul, 2019 1 commit
    • A
      perf stat: Fix group lookup for metric group · 2f87f33f
      Andi Kleen committed
      The metric group code tries to find a group it added earlier in the
      evlist. Fix the lookup to handle partially overlapping groups
      correctly: when a substring match fails and we reset the match, we have
      to compare the first element again.
      
      I also renamed the find_evsel function to find_evsel_group to make its
      purpose clearer.
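
      The reset rule can be illustrated with a short Python model. This is a
      sketch, not the find_evsel_group() implementation: the function name and
      the recheck_first switch are hypothetical, the event list follows the
      order of the perf stat output in this commit, and the order of ids inside
      each metric is an assumption.

```python
def find_run(evlist, ids, recheck_first=True):
    """Return the index where the contiguous run `ids` starts, or None."""
    matched = 0
    for pos, name in enumerate(evlist):
        if name == ids[matched]:
            matched += 1
        elif recheck_first and name == ids[0]:
            matched = 1  # the element that broke the match may start a new one
        else:
            matched = 0
        if matched == len(ids):
            return pos - matched + 1
    return None


evlist = ["uops_retired.retire_slots", "inst_retired.any",
          "inst_retired.any", "cpu_clk_unhalted.thread"]
ipc_ids = ["inst_retired.any", "cpu_clk_unhalted.thread"]

print(find_run(evlist, ipc_ids, recheck_first=False))  # None
print(find_run(evlist, ipc_ids, recheck_first=True))   # 2
```

      A single re-check like this is not a general substring search: ids with
      internal repeats would need backtracking, which the sketch does not
      attempt.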
      
      With the earlier changes this fixes:
      
      Before:
      
        % perf stat -M UPI,IPC sleep 1
        ...
               1,032,922      uops_retired.retire_slots #      1.1 UPI
               1,896,096      inst_retired.any
               1,896,096      inst_retired.any
               1,177,254      cpu_clk_unhalted.thread
      
      After:
      
        % perf stat -M UPI,IPC sleep 1
        ...
               1,013,193      uops_retired.retire_slots #      1.1 UPI
                 932,033      inst_retired.any
                 932,033      inst_retired.any          #      0.9 IPC
               1,091,245      cpu_clk_unhalted.thread
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Fixes: b18f3e36 ("perf stat: Support JSON metrics in perf stat")
      Link: http://lkml.kernel.org/r/20190624193711.35241-4-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f87f33f
  18. 26 Jun, 2019 2 commits
  19. 05 Jun, 2019 1 commit