1. 10 Oct, 2017 1 commit
    • perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU · 66ec1191
      Mark Rutland committed
      Currently, perf record is broken on arm/arm64 systems when the PMU is
      specified explicitly as part of the event, e.g.
      
      $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true
      
      In such cases, perf record fails to open events unless
      perf_event_paranoid is set to -1, even if the PMU in question supports
      mode exclusion. Further, even when perf_event_paranoid is toggled, no
      samples are recorded.
      
      This is an unintended side effect of commit:
      
        e3ba76de ("perf tools: Force uncore events to system wide monitoring")
      
      ... which assumes that if a PMU has an associated cpu_map, it is an
      uncore PMU, and forces events for such PMUs to be system-wide.
      
      This is not true for arm/arm64 systems, which can have heterogeneous
      CPUs. To account for this, multiple CPU PMUs are exposed, each with a
      "cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM
      PMUs do not have a "cpumask" file, and only have a "cpus" file. For the
      gory details as to why, see commit:
      
       7e3fcffe ("perf pmu: Support alternative sysfs cpumask")
      
      Given all of this, we can instead identify uncore PMUs by explicitly
      checking for a "cpumask" file, and restore arm/arm64 PMU support back to
      a working state. This patch does so, adding a new perf_pmu::is_uncore
      field, and splitting the existing cpumask parsing so that it can be
      reused.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: 4.12+ <stable@vger.kernel.org>
      Fixes: e3ba76de ("perf tools: Force uncore events to system wide monitoring")
      Link: http://lkml.kernel.org/r/1507315102-5942-1-git-send-email-mark.rutland@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
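The detection rule described in this commit (a PMU counts as uncore only if it exposes a "cpumask" file, while ARM CPU PMUs expose only "cpus") can be sketched roughly as follows. This is an illustrative Python sketch, not the perf tool's C code: the function names are hypothetical, and the sysfs root is passed as a parameter so the logic can be exercised against a temporary directory.

```python
import os
import tempfile

# Usual sysfs location of PMU devices on Linux.
SYSFS_PMU_ROOT = "/sys/bus/event_source/devices"


def pmu_is_uncore(pmu_name, root=SYSFS_PMU_ROOT):
    """Return True if the PMU directory contains a 'cpumask' file.

    Hypothetical helper mirroring the idea behind perf_pmu::is_uncore.
    """
    return os.path.isfile(os.path.join(root, pmu_name, "cpumask"))


def pmu_cpu_list_file(pmu_name, root=SYSFS_PMU_ROOT):
    """Return the path of the file listing the PMU's CPUs, or None.

    'cpumask' is tried first, then the alternative 'cpus' file.
    """
    for name in ("cpumask", "cpus"):
        path = os.path.join(root, pmu_name, name)
        if os.path.isfile(path):
            return path
    return None


def make_fake_sysfs():
    """Build a throwaway tree that mimics two PMUs (demo data only)."""
    root = tempfile.mkdtemp()
    for pmu, fname, content in (
        ("uncore_imc_0", "cpumask", "0\n"),
        ("armv8_cortex_a53", "cpus", "0-3\n"),
    ):
        os.makedirs(os.path.join(root, pmu))
        with open(os.path.join(root, pmu, fname), "w") as f:
            f.write(content)
    return root
```

With the fake tree, `pmu_is_uncore("armv8_cortex_a53", root)` is false even though the PMU has a CPU list, which is the distinction the fix relies on.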
  2. 02 Sep, 2017 1 commit
    • perf stat: Only auto-merge events that are PMU aliases · 63ce8449
      Arnaldo Carvalho de Melo committed
      Peter reported that when he explicitly asked for multiple events with
      the same name on the command line they got coalesced into just one line,
      i.e.:
      
         # perf stat -e cycles -e cycles -e cycles usleep 1
      
         Performance counter stats for 'usleep 1':
      
               3,269,652      cycles
      
             0.000884123 seconds time elapsed
      
        #
      
      And while there is the --no-merge option to disable that auto-merging,
      this is a blunt change in behaviour for such an explicit request, so change
      the code so that this auto-merging is done only when handling the multi-PMU
      aliases with the same name that introduced this coalescing,
      restoring the previous behaviour for the explicit case:
      
        # perf stat -e cycles -e cycles -e cycles usleep 1
      
         Performance counter stats for 'usleep 1':
      
               1,472,837      cycles
               1,472,837      cycles
               1,472,837      cycles
      
             0.001764870 seconds time elapsed
      
        #
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 430daf2d ("perf stat: Collapse identically named events")
      Link: http://lkml.kernel.org/r/20170831184122.GK4831@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
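The behaviour this commit restores can be illustrated with a small sketch: rows are merged only when the event came from a PMU alias, and explicitly repeated events keep one row each. The tuple layout, the function name, and the choice to sum merged counts are assumptions made for illustration; the real logic lives in perf stat's C code.

```python
def merge_counts(events, no_merge=False):
    """Collapse same-named events, but only if they are PMU aliases.

    events: list of (name, count, is_pmu_alias) tuples in command-line order.
    Returns a list of (name, count) rows in print order.
    """
    rows = []
    first_row = {}  # alias name -> index of its merged row
    for name, count, is_alias in events:
        mergeable = is_alias and not no_merge
        if mergeable and name in first_row:
            idx = first_row[name]
            # Assumption: merged alias counts are summed into one row.
            rows[idx] = (name, rows[idx][1] + count)
            continue
        if mergeable:
            first_row[name] = len(rows)
        rows.append((name, count))
    return rows
```

Passing three explicit `cycles` events with `is_pmu_alias=False` yields three rows, matching the second transcript in the commit message.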
  3. 18 Aug, 2017 3 commits
  4. 19 Jul, 2017 1 commit
  5. 25 Apr, 2017 2 commits
  6. 20 Apr, 2017 5 commits
  7. 23 Mar, 2017 3 commits
    • perf list: Move extra details printing to new option · bf874fcf
      Andi Kleen committed
      Move the printing of perf expressions and internal events to a new
      clearer --details flag, instead of lumping it together with other debug
      options in --debug. This makes it clearer to use.
      
      Before
      
        perf list --debug
        ...
        unc_m_power_critical_throttle_cycles
               [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
                uncore_imc_2/event=0x86/  MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
      
      After
      
        perf list --details
        ...
        unc_m_power_critical_throttle_cycles
               [Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
                uncore_imc_2/event=0x86/  MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Add support for MetricName JSON attribute · 96284814
      Andi Kleen committed
      Add support for a new JSON event attribute to name MetricExpr for better
      output in perf stat.
      
      If the event has no MetricName it uses the normal event name instead to
      describe the metric.
      
      Before
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
                 time unc_p_freq_max_os_cycles
           1.000149775     15.7
           2.000344807     19.3
           3.000502544     16.7
           4.000640656      6.6
           5.000779955      9.9
      
      After
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
                 time freq_max_os_cycles %
           1.000149775     15.7
           2.000344807     19.3
           3.000502544     16.7
           4.000640656      6.6
           5.000779955      9.9
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Output JSON MetricExpr metric · 37932c18
      Andi Kleen committed
      Add generic infrastructure to perf stat to output ratios for
      "MetricExpr" entries in the event lists. Many events are more useful as
      ratios than in raw form, typically some count in relation to total
      ticks.
      
      Transfer the MetricExpr information from the alias to the evsel.
      
      We mark the events that need to be collected for MetricExpr, and also
      link the events using them with a pointer. The code is careful to always
      prefer the right event in the same group to minimize multiplexing
      errors. At the moment only a single relation is supported.
      
      Then add a rblist to the stat shadow code that remembers stats based on
      the cpu and context.
      
      Then finally update and retrieve and print these values similarly to the
      existing hardcoded perf metrics. We use the simple expression parser
      added earlier to evaluate the expression.
      
      Normally we just output the result without further commentary, but for
      --metric-only this would lead to empty columns. So for this case use the
      original event as description.
      
      There is no attempt to automatically add the MetricExpr event if it is
      missing, because the user tool doesn't have enough information to
      reliably construct a group that is guaranteed to schedule. Instead we
      suggest the missing event to the user and leave the choice to them.
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}'
             1.000147889        800,085,181      unc_p_clockticks
             1.000147889         93,126,241      unc_p_freq_max_os_cycles  #     11.6
             2.000448381        800,218,217      unc_p_clockticks
             2.000448381        142,516,095      unc_p_freq_max_os_cycles  #     17.8
             3.000639852        800,243,057      unc_p_clockticks
             3.000639852        162,292,689      unc_p_freq_max_os_cycles  #     20.3
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
        #    time         freq_max_os_cycles %
             1.000127077      0.9
             2.000301436      0.7
             3.000456379      0.0
      
      v2: Change from DivideBy to MetricExpr
      v3: Use expr__ prefix.  Support more than one other event.
      v4: Update description
      v5: Only print warning message once for multiple PMUs.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
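The "#" column in the first transcript of this commit is consistent with a simple percentage of one event over the clock ticks. The sketch below reproduces that arithmetic on the counts printed in the transcript. The exact MetricExpr attached to unc_p_freq_max_os_cycles is not shown in this log, so the ratio form is an assumption, and the function name is hypothetical.

```python
def metric_ratio(numerator, denominator):
    """Return numerator / denominator as a percentage, or None if undefined."""
    if denominator == 0:
        return None
    return numerator / denominator * 100.0


# (unc_p_clockticks, unc_p_freq_max_os_cycles) pairs from the transcript above.
SAMPLES = [
    (800_085_181, 93_126_241),
    (800_218_217, 142_516_095),
    (800_243_057, 162_292_689),
]

# One rounded percentage per interval, in transcript order.
PERCENTAGES = [round(metric_ratio(event, ticks), 1) for ticks, event in SAMPLES]
```

Rounded to one decimal place, the three intervals give 11.6, 17.8 and 20.3, which agrees with the values printed after the "#" in the transcript.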
  8. 22 Mar, 2017 2 commits
  9. 04 Mar, 2017 1 commit
    • perf tools: Force uncore events to system wide monitoring · e3ba76de
      Jiri Olsa committed
      Make system wide (-a) the default option if no target was specified and
      one of the following conditions is met:
      
        - there's no workload specified (current behaviour)
        - there is workload specified but all requested
          events are system wide ones
      
      Mixed events core/uncore with workload:
      
        $ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1
      
         Performance counter stats for 'sleep 1':
      
           <not supported>      uncore_cbox_0/clockticks/
                   980,489      cycles
      
               1.000897406 seconds time elapsed
      
      Uncore event with workload:
      
        $ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1
      
         Performance counter stats for 'system wide':
      
        281,473,897,192,670      uncore_cbox_0/clockticks/
      
               1.000833784 seconds time elapsed
      
      Committer note:
      
      When testing I realized the default case for !root, i.e. no events
      passed via -e, was broken by v2 of this patch. I reported it, and after
      a patch provided by Jiri it is back working:
      
        [acme@jouet linux]$ perf stat usleep 1
      
         Performance counter stats for 'usleep 1':
      
               0.401335      task-clock:u (msec)     #   0.297 CPUs utilized
                      0      context-switches:u      #   0.000 K/sec
                      0      cpu-migrations:u        #   0.000 K/sec
                     48      page-faults:u           #   0.120 M/sec
                458,146      cycles:u                #   1.142 GHz
                245,113      instructions:u          #   0.54  insn per cycle
                 47,991      branches:u              # 119.578 M/sec
                  4,022      branch-misses:u         #   8.38% of all branches
      
            0.001350029 seconds time elapsed
      
        [acme@jouet linux]$
      Suggested-and-Tested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170227094818.GA12764@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
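The rule stated at the top of this commit message (default to system-wide when no target is given and either there is no workload or every requested event is system-wide) can be written as a small predicate. This is only a sketch of the decision as described; the function and parameter names are hypothetical and do not correspond to identifiers in the perf sources.

```python
def default_to_system_wide(target_specified, has_workload, events_system_wide):
    """Decide whether -a should be implied.

    events_system_wide: one bool per requested event, True for events
    that can only be monitored system-wide (e.g. uncore events).
    """
    if target_specified:
        # An explicit target (-p, -t, -C, -a, ...) is always respected.
        return False
    if not has_workload:
        # No workload: system-wide was already the behaviour.
        return True
    # Workload given: switch only if every requested event is system-wide.
    return bool(events_system_wide) and all(events_system_wide)
```

For the two transcripts above, the mixed uncore/cycles request with `sleep 1` gives `False` (the run stays per-task, so the uncore event is not supported), while the uncore-only request gives `True`.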
  10. 18 Feb, 2017 2 commits
  11. 15 Feb, 2017 1 commit
  12. 08 Feb, 2017 5 commits
  13. 24 Oct, 2016 1 commit
  14. 04 Oct, 2016 4 commits
  15. 29 Sep, 2016 3 commits
  16. 14 Sep, 2016 1 commit
  17. 16 Jul, 2016 1 commit
    • perf tools: Enable overwrite settings · 626a6b78
      Wang Nan committed
      This patch allows the following config terms and option:
      
      Globally set events to overwrite:
      
        # perf record --overwrite ...
      
      Set specific events to overwrite or no-overwrite:
      
        # perf record --event cycles/overwrite/ ...
        # perf record --event cycles/no-overwrite/ ...
      
      Add missing config terms and update the config term array size because
      the longest string length has changed.
      
      For overwritable events, it automatically selects attr.write_backward
      since perf requires it to be backward for reading.
      
      Test result:
      
        # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
        # perf evlist -v
        syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 14 Jul, 2016 3 commits
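How the global option and the per-event config terms might combine can be sketched as below. The commit message states that overwritable events get attr.write_backward set. That a per-event term takes precedence over the global --overwrite option is an assumption here, and the function name and returned dictionary are purely illustrative.

```python
def resolve_overwrite(global_overwrite, term=None):
    """Resolve the effective overwrite setting for one event.

    global_overwrite: True if --overwrite was passed on the command line.
    term: 'overwrite', 'no-overwrite', or None when the event has no term.
    """
    if term == "overwrite":
        overwrite = True
    elif term == "no-overwrite":
        overwrite = False
    elif term is None:
        overwrite = global_overwrite
    else:
        raise ValueError("unknown config term: %r" % (term,))
    # Overwritable events are read backward, so write_backward follows.
    return {"overwrite": overwrite, "write_backward": overwrite}
```

Under these assumptions, `resolve_overwrite(True)` sets write_backward, as in the `perf evlist -v` output above, while `resolve_overwrite(True, "no-overwrite")` leaves it unset for that one event.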