1. 13 Sep, 2017 3 commits
  2. 11 Aug, 2017 1 commit
  3. 21 Jun, 2017 1 commit
    • perf stat: Add support to measure SMI cost · daefd0bc
      Kan Liang authored
      Implement a new --smi-cost mode in perf stat to measure SMI cost.
      
      During the measurement, /sys/devices/cpu/freeze_on_smi will be set.
      
      The measurement can be done with one counter (unhalted core cycles), and
      two free running MSR counters (IA32_APERF and SMI_COUNT).
      
      In practice, the percentage of SMI core cycles is more useful than the
      absolute value, so the output is the percentage of SMI core cycles and
      the SMI count (SMI#). metric_only is set by default.
      
      SMI cycles% = (aperf - unhalted core cycles) / aperf
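
As a rough illustration of the formula above (not code from the patch), the percentage can be computed from the two counter deltas. The function name and the sample values are hypothetical; the sketch only assumes that both counters were read over the same interval.

```python
def smi_cycles_percent(aperf: int, unhalted_core_cycles: int) -> float:
    """Return SMI cycles as a percentage of aperf, per the formula above."""
    if aperf <= 0:
        raise ValueError("aperf must be positive")
    return 100.0 * (aperf - unhalted_core_cycles) / aperf


# Hypothetical counter deltas, chosen only for illustration.
print(f"{smi_cycles_percent(1_000_000, 999_000):.1f}%")  # prints 0.1%
```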
      
      Here is an example output.
      
       Performance counter stats for 'sudo echo ':
      
      SMI cycles%          SMI#
          0.1%              1
      
             0.010858678 seconds time elapsed
      
      Users who want the actual values can additionally pass
      --no-metric-only.
      
      Signed-off-by: Kan Liang <Kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 23 Mar, 2017 2 commits
    • perf pmu: Add support for MetricName JSON attribute · 96284814
      Andi Kleen authored
      Add support for a new JSON event attribute to name MetricExpr for better
      output in perf stat.
      
      If the event has no MetricName it uses the normal event name instead to
      describe the metric.
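
The fallback described above can be sketched as follows. This is an illustration rather than the perf implementation; the helper name is hypothetical and the MetricExpr string in the sample entry is a placeholder, not copied from a real event list.

```python
def metric_label(event: dict) -> str:
    """Use MetricName when present, otherwise fall back to the event name."""
    return event.get("MetricName") or event["EventName"]


# Hypothetical JSON event entries, for illustration only.
named = {
    "EventName": "unc_p_freq_max_os_cycles",
    "MetricExpr": "<placeholder expression>",
    "MetricName": "freq_max_os_cycles %",
}
unnamed = {
    "EventName": "unc_p_freq_max_os_cycles",
    "MetricExpr": "<placeholder expression>",
}

print(metric_label(named))    # freq_max_os_cycles %
print(metric_label(unnamed))  # unc_p_freq_max_os_cycles
```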
      
      Before
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
                 time unc_p_freq_max_os_cycles
           1.000149775     15.7
           2.000344807     19.3
           3.000502544     16.7
           4.000640656      6.6
           5.000779955      9.9
      
      After
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
                 time freq_max_os_cycles %
           1.000149775     15.7
           2.000344807     19.3
           3.000502544     16.7
           4.000640656      6.6
           5.000779955      9.9
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170320201711.14142-13-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Output JSON MetricExpr metric · 37932c18
      Andi Kleen authored
      Add generic infrastructure to perf stat to output ratios for
      "MetricExpr" entries in the event lists. Many events are more useful as
      ratios than in raw form, typically some count in relation to total
      ticks.
      
      Transfer the MetricExpr information from the alias to the evsel.
      
      We mark the events that need to be collected for MetricExpr, and also
      link the events using them with a pointer. The code is careful to always
      prefer the right event in the same group to minimize multiplexing
      errors. At the moment only a single relation is supported.
      
      Then add an rblist to the stat shadow code that remembers stats based
      on the cpu and context.
      
      Finally, update, retrieve, and print these values similarly to the
      existing hardcoded perf metrics. We use the simple expression parser
      added earlier to evaluate the expression.
      
      Normally we just output the result without further commentary, but for
      --metric-only this would lead to empty columns. So for this case use the
      original event as description.
      
      There is no attempt to automatically add the MetricExpr event if it is
      missing, because the user tool doesn't have enough information to
      reliably construct a group that is guaranteed to schedule. Instead we
      suggest the missing event to the user and leave adding it to them.
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}'
             1.000147889        800,085,181      unc_p_clockticks
             1.000147889         93,126,241      unc_p_freq_max_os_cycles  #     11.6
             2.000448381        800,218,217      unc_p_clockticks
             2.000448381        142,516,095      unc_p_freq_max_os_cycles  #     17.8
             3.000639852        800,243,057      unc_p_clockticks
             3.000639852        162,292,689      unc_p_freq_max_os_cycles  #     20.3
      
        % perf stat -a -I 1000 -e '{unc_p_clockticks,unc_p_freq_max_os_cycles}' --metric-only
        #    time         freq_max_os_cycles %
             1.000127077      0.9
             2.000301436      0.7
             3.000456379      0.0
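
As a worked check of the first sample output above, the values printed after `#` match the second event's count expressed as a percentage of unc_p_clockticks. The sketch below assumes that this is what the expression for this event computes; the helper name is hypothetical and the counts are copied from the sample.

```python
def ratio_percent(count: int, total: int) -> float:
    """Return count as a percentage of total (0.0 if total is zero)."""
    return 100.0 * count / total if total else 0.0


# (unc_p_clockticks, unc_p_freq_max_os_cycles) pairs from the sample output.
samples = [
    (800_085_181, 93_126_241),
    (800_218_217, 142_516_095),
    (800_243_057, 162_292_689),
]

for clockticks, freq_max_os_cycles in samples:
    print(f"{ratio_percent(freq_max_os_cycles, clockticks):.1f}")
# prints 11.6, 17.8 and 20.3, one per line
```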
      
      v2: Change from DivideBy to MetricExpr
      v3: Use expr__ prefix.  Support more than one other event.
      v4: Update description
      v5: Only print warning message once for multiple PMUs.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170320201711.14142-11-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 07 Jun, 2016 1 commit
  6. 17 May, 2016 2 commits
  7. 23 Mar, 2016 1 commit
  8. 03 Mar, 2016 3 commits
    • perf stat: Check for frontend stalled for metrics · fb4605ba
      Andi Kleen authored
      Add an extra check for frontend stalled in the metrics.  This avoids an
      extra column for the --metric-only case when the CPU does not support
      frontend stalled.
      
      v2: Add separate init function
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1456858672-21594-8-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Support metrics in --per-core/socket mode · 44d49a60
      Andi Kleen authored
      Enable metrics printing in --per-core / --per-socket mode. We need to
      save the shadow metrics in a unique place. Always use the first CPU in
      the aggregation. Then use the same CPU to retrieve the shadow value
      later.
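
The idea of keying shadow values by the first CPU of each aggregation group can be sketched as below. This is a simplified model under stated assumptions, not the perf code: the function names, the topology, and the counts are all hypothetical, and "first" is taken to mean the lowest-numbered CPU in the group.

```python
from collections import defaultdict
from typing import Dict


def first_cpu_per_group(cpu_to_group: Dict[int, str]) -> Dict[str, int]:
    """Map each aggregation group (e.g. a core) to its lowest-numbered CPU."""
    first: Dict[str, int] = {}
    for cpu in sorted(cpu_to_group):
        first.setdefault(cpu_to_group[cpu], cpu)
    return first


def store_shadow(counts: Dict[int, int],
                 cpu_to_group: Dict[int, str]) -> Dict[int, int]:
    """Accumulate per-CPU counts into the slot of each group's first CPU."""
    slot = first_cpu_per_group(cpu_to_group)
    shadow: Dict[int, int] = defaultdict(int)
    for cpu, value in counts.items():
        shadow[slot[cpu_to_group[cpu]]] += value
    return dict(shadow)


# Hypothetical topology: two cores with two hardware threads each.
topology = {0: "S0-C0", 1: "S0-C1", 2: "S0-C0", 3: "S0-C1"}
cycles = {0: 10, 1: 1, 2: 20, 3: 2}

shadow = store_shadow(cycles, topology)
slot = first_cpu_per_group(topology)
# Retrieval uses the same first CPU that was used for storing.
print(shadow[slot["S0-C0"]], shadow[slot["S0-C1"]])  # prints 30 3
```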
      
      Example output:
      
        % perf stat --per-core -a ./BC1s
      
         Performance counter stats for 'system wide':
      
        S0-C0 2   2966.020381 task-clock (msec) #   2.004 CPUs utilized  (100.00%)
        S0-C0 2            49 context-switches  #   0.017 K/sec          (100.00%)
        S0-C0 2             4 cpu-migrations    #   0.001 K/sec          (100.00%)
        S0-C0 2           467 page-faults       #   0.157 K/sec
        S0-C0 2 4,599,061,773 cycles            #   1.551 GHz            (100.00%)
        S0-C0 2 9,755,886,883 instructions      #   2.12  insn per cycle (100.00%)
        S0-C0 2 1,906,272,125 branches          # 642.704 M/sec          (100.00%)
        S0-C0 2    81,180,867 branch-misses     #   4.26% of all branches
        S0-C1 2   2965.995373 task-clock (msec) #   2.003 CPUs utilized  (100.00%)
        S0-C1 2            62 context-switches  #   0.021 K/sec          (100.00%)
        S0-C1 2             8 cpu-migrations    #   0.003 K/sec          (100.00%)
        S0-C1 2           281 page-faults       #   0.095 K/sec
        S0-C1 2     6,347,290 cycles            #   0.002 GHz            (100.00%)
        S0-C1 2     4,654,156 instructions      #   0.73  insn per cycle (100.00%)
        S0-C1 2       947,121 branches          #   0.319 M/sec          (100.00%)
        S0-C1 2        37,322 branch-misses     #   3.94% of all branches
      
               1.480409747 seconds time elapsed
      
      v2: Rebase to older patches
      v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri)
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1456785386-19481-4-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Implement CSV metrics output · 92a61f64
      Andi Kleen authored
      Add CSV output support for metrics. With the new output callbacks this
      is relatively straightforward: it only requires creating new callbacks.
      
      This makes it easy to plot metrics from CSV files.
      
      The new line callback needs to know the number of fields so that it
      can skip them correctly.
      
      Example output before:
      
        % perf stat -x, true
        0.200687,,task-clock,200687,100.00
        0,,context-switches,200687,100.00
        0,,cpu-migrations,200687,100.00
        40,,page-faults,200687,100.00
        730871,,cycles,203601,100.00
        551056,,stalled-cycles-frontend,203601,100.00
        <not supported>,,stalled-cycles-backend,0,100.00
        385523,,instructions,203601,100.00
        78028,,branches,203601,100.00
        3946,,branch-misses,203601,100.00
      
      After:
      
        % perf stat -x, true
        .502457,,task-clock,502457,100.00,0.485,CPUs utilized
        0,,context-switches,502457,100.00,0.000,K/sec
        0,,cpu-migrations,502457,100.00,0.000,K/sec
        45,,page-faults,502457,100.00,0.090,M/sec
        644692,,cycles,509102,100.00,1.283,GHz
        423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
        <not supported>,,stalled-cycles-backend,0,100.00,,,,
        492701,,instructions,509102,100.00,0.76,insn per cycle
        ,,,,,0.86,stalled cycles per insn
        97767,,branches,509102,100.00,194.578,M/sec
        4788,,branch-misses,509102,100.00,4.90,of all branches
      
      or, in a more readable form:
      
        $ perf stat  -x, -o x.csv true
        $ column -s, -t x.csv
        0.490635        task-clock              490635 100.00 0.489   CPUs utilized
        0               context-switches        490635 100.00 0.000   K/sec
        0               cpu-migrations          490635 100.00 0.000   K/sec
        45              page-faults             490635 100.00 0.092   M/sec
        629080          cycles                  497698 100.00 1.282   GHz
        409498          stalled-cycles-frontend 497698 100.00 65.09   frontend cycles idle
        <not supported> stalled-cycles-backend  0      100.00
        491424          instructions            497698 100.00 0.78    insn per cycle
                                                              0.83    stalled cycles per insn
        97278           branches                497698 100.00 198.270 M/sec
        4569            branch-misses           497698 100.00 4.70    of all branches
      
      Two new fields are added: metric value and metric name.
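
A minimal sketch of reading the two new fields back from such a file is shown below. The column positions (event name in the third field, metric value and metric name in the sixth and seventh) are inferred from the sample output above rather than from a documented format, and the function name is hypothetical. Continuation rows with an empty event name are attributed to the preceding event.

```python
import csv
import io

# A few rows copied from the "After" sample output above.
SAMPLE = """\
644692,,cycles,509102,100.00,1.283,GHz
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
"""


def parse_metrics(text: str):
    """Return (event, metric value, metric name) tuples from perf stat -x, output."""
    metrics = []
    last_event = None
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 7:
            continue  # row without the metric fields
        event = row[2] or last_event  # empty event: continuation of previous row
        last_event = event
        value, name = row[5], row[6]
        if value and name:  # skip rows with empty metrics, e.g. unsupported events
            metrics.append((event, float(value), name))
    return metrics


for event, value, name in parse_metrics(SAMPLE):
    print(event, value, name)
```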
      
      v2: Split out function argument changes
      v3: Reenable metrics for real.
      v4: Fix wrong hunk from refactoring.
      v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
      Print empty metrics for not counted.
      v6: Avoid outputting metric on empty format.
      v7: Print metric at the end
      v8: Remove extra run, ena fields
      v9: Avoid extra new line for unsupported counters
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 17 Feb, 2016 1 commit
    • perf stat: Abstract stat metrics printing · 140aeadc
      Andi Kleen authored
      Abstract the printing of shadow metrics. Instead of every metric calling
      fprintf directly and taking care of indentation, use two callbacks: one
      to print metrics and another to start a new line.
      
      This will allow adding metrics to CSV mode and also using them for other
      purposes.
      
      The computation of padding is now done in the central callback, instead
      of every metric doing it manually.  This makes it easier to add new
      metrics.
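
The two-callback design can be sketched as follows. This is a simplified model, not the perf code: the class names, the padding width, and the number of skipped CSV fields are assumptions made for illustration. The metric code calls only `print_metric` and `new_line`, and each output mode decides how to lay the result out.

```python
import io


class TextOutput:
    """Human-readable mode: padding is handled centrally in new_line."""
    INDENT = 40  # hypothetical column at which metrics start

    def __init__(self, stream):
        self.stream = stream

    def print_metric(self, fmt, name, value):
        self.stream.write(" # " + (fmt % value) + " " + name)

    def new_line(self):
        self.stream.write("\n" + " " * self.INDENT)


class CsvOutput:
    """CSV mode: new_line skips the leading fields of a continuation row."""
    SKIP_FIELDS = 4  # assumed separators emitted before the metric fields

    def __init__(self, stream, sep=","):
        self.stream = stream
        self.sep = sep

    def print_metric(self, fmt, name, value):
        self.stream.write(self.sep + (fmt % value) + self.sep + name)

    def new_line(self):
        self.stream.write("\n" + self.sep * self.SKIP_FIELDS)


def print_insn_metrics(out, instructions, cycles, stalled):
    """Metric code knows nothing about layout; it only uses the callbacks."""
    out.print_metric("%.2f", "insn per cycle", instructions / cycles)
    out.new_line()
    out.print_metric("%.2f", "stalled cycles per insn", stalled / instructions)


buf = io.StringIO()
buf.write("492701,,instructions,509102,100.00")
print_insn_metrics(CsvOutput(buf), 492701, 644692, 423470)
print(buf.getvalue())
```

With the sample counts used here, the CSV variant yields a second row that starts with five separators, mirroring the continuation row in the CSV example of the later commit.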
      
      v2: Refactor functions, printout now does more. Move
      shadow printing. Improve fallback callbacks. Don't
      use void * callback data.
      v3: Remove unnecessary hunk. Add typedef for new_line
      v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
      mode yet.  Move printout change to separate patch.
      v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
      Fix indentation in different aggregation modes.
      v6: Delay newline handling
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 05 Nov, 2015 1 commit
  11. 28 Jul, 2015 1 commit
  12. 08 Jun, 2015 1 commit