1. 31 8月, 2018 24 次提交
  2. 25 7月, 2018 2 次提交
    • J
      perf stat: Get rid of extra clock display function · 0aa802a7
      Jiri Olsa 提交于
      There's no reason to have separate function to display clock events.
      It's only purpose was to convert the nanosecond value into microseconds.
      We do that now in generic code, if the unit and scale values are
      properly set, which this patch do for clock events.
      
      The output differs in the unit field being displayed in its columns
      rather than having it added as a suffix of the event name. Plus the
      value is rounded into 2 decimal numbers as for any other event.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
             3001.123137      cpu-clock (msec)          #    1.000 CPUs utilized
             3001.133250      task-clock (msec)         #    1.000 CPUs utilized
      
             3.001159813 seconds time elapsed
      
      Now:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
                3,001.05 msec cpu-clock                 #    1.000 CPUs utilized
                3,001.05 msec task-clock                #    1.000 CPUs utilized
      
             3.001077794 seconds time elapsed
      
      There's a small difference in csv output, as we now output the unit
      field, which was empty before. It's in the proper spot, so there's no
      compatibility issue.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3001.065177,,cpu-clock,3001064187,100.00,1.000,CPUs utilized
        3001.077085,,task-clock,3001077085,100.00,1.000,CPUs utilized
      
        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3000.80,msec,cpu-clock,3000799026,100.00,1.000,CPUs utilized
        3000.80,msec,task-clock,3000799550,100.00,1.000,CPUs utilized
      
      Add perf_evsel__is_clock to replace nsec_counter.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180720110036.32251-2-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0aa802a7
    • T
      perf stat: Add transaction flag (-T) support for s390 · 742d92ff
      Thomas Richter 提交于
      The 'perf stat' command line flag -T to display transaction counters is
      currently supported for x86 only.
      
      Add support for s390. It is based on the metrics flag -M transaction
      using the architecture dependent JSON files. This requires a metric
      named "transaction" in the JSON files for the platform.
      
      Introduce a new function metricgroup__has_metric() to check for the
      existence of a metric_name transaction.
      
      As suggested by Andi Kleen, this is the new approach to support
      transactions counters. Other architectures will follow.
      
      Output before:
      
        [root@p23lp27 perf]# ./perf stat -T -- sleep 1
        Cannot set up transaction events
        [root@p23lp27 perf]#
      
      Output after:
      
        [root@s35lp76 perf]# ./perf stat -T -- ~/mytesttx 1 >/tmp/111
      
         Performance counter stats for '/root/mytesttx 1':
      
                         1      tx_c_tend           #     13.0 transaction
                         1      tx_nc_tend
                        11      tx_nc_tabort
                         0      tx_c_tabort_special
                         0      tx_c_tabort_no_special
      
               0.001070109 seconds time elapsed
      
        [root@s35lp76 perf]#
      Suggested-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: NHendrik Brueckner <brueckner@linux.ibm.com>
      Acked-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180626071701.58190-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      742d92ff
  3. 11 7月, 2018 1 次提交
    • J
      perf stat: Fix --interval_clear option · c818cc06
      Jiri Olsa 提交于
      Currently we display extra header line, like:
      
        # perf stat -I 1000 -a --interval-clear
        #           time             counts unit events
               insn per cycle branch-misses of all branches
             2.964917103        3855.349912      cpu-clock (msec)          #    3.855 CPUs utilized
             2.964917103             23,993      context-switches          #    0.006 M/sec
             2.964917103              1,301      cpu-migrations            #    0.329 K/sec
             ...
      
      Fixing the condition and getting proper:
      
        # perf stat -I 1000 -a --interval-clear
        #           time             counts unit events
             2.359048938        1432.492228      cpu-clock (msec)          #    1.432 CPUs utilized
             2.359048938              7,613      context-switches          #    0.002 M/sec
             2.359048938                419      cpu-migrations            #    0.133 K/sec
             ...
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 9660e08e ("perf stat: Add --interval-clear option")
      Link: http://lkml.kernel.org/r/20180702134202.17745-2-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c818cc06
  4. 08 6月, 2018 5 次提交
  5. 06 6月, 2018 1 次提交
    • J
      perf stat: Display user and system time · 0ce2da14
      Jiri Olsa 提交于
      Adding the support to read rusage data once the workload is finished and
      display the system/user time values:
      
        $ perf stat --null perf bench sched pipe
        ...
      
         Performance counter stats for 'perf bench sched pipe':
      
             5.342599256 seconds time elapsed
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      It works only in non -r mode and only for workload target.
      
      So as of now, for workload targets, we display 3 types of timings. The
      time we meassure in perf stat from enable to disable+period:
      
             5.342599256 seconds time elapsed
      
      The time spent in user and system lands, displayed only for workload
      session/target:
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      Those times are the very same displayed by 'time' tool.  They are
      returned by wait4 call via the getrusage struct interface.
      
      Committer notes:
      
      Had to rename some variables to avoid this on older systems such as
      centos:6:
      
        builtin-stat.c: In function 'print_footer':
        builtin-stat.c:1831: warning: declaration of 'stime' shadows a global declaration
        /usr/include/time.h:297: warning: shadowed declaration is here
      
      Committer testing:
      
        # perf stat --null time perf bench sched pipe
        # Running 'sched/pipe' benchmark:
        # Executed 1000000 pipe operations between two processes
      
             Total time: 5.526 [sec]
      
               5.526534 usecs/op
                 180945 ops/sec
        1.00user 6.25system 0:05.52elapsed 131%CPU (0avgtext+0avgdata 8056maxresident)k
        0inputs+0outputs (0major+606minor)pagefaults 0swaps
      
         Performance counter stats for 'time perf bench sched pipe':
      
               5.530978744 seconds time elapsed
      
               1.004037000 seconds user
               6.259937000 seconds sys
      
        #
      Suggested-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180605121313.31337-1-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0ce2da14
  6. 26 4月, 2018 3 次提交
  7. 25 4月, 2018 2 次提交
    • K
      perf stat: Fix duplicate PMU name for interval print · 80ee8c58
      Kan Liang 提交于
      PMU name is printed repeatedly for interval print, for example:
      
        perf stat --no-merge -e 'unc_m_clockticks' -a -I 1000
        #           time             counts unit events
           1.001053069        243,702,144      unc_m_clockticks [uncore_imc_4]
           1.001053069        244,268,304      unc_m_clockticks [uncore_imc_2]
           1.001053069        244,427,386      unc_m_clockticks [uncore_imc_0]
           1.001053069        244,583,760      unc_m_clockticks [uncore_imc_5]
           1.001053069        244,738,971      unc_m_clockticks [uncore_imc_3]
           1.001053069        244,880,309      unc_m_clockticks [uncore_imc_1]
           2.002024821        240,818,200      unc_m_clockticks [uncore_imc_4] [uncore_imc_4]
           2.002024821        240,767,812      unc_m_clockticks [uncore_imc_2] [uncore_imc_2]
           2.002024821        240,764,215      unc_m_clockticks [uncore_imc_0] [uncore_imc_0]
           2.002024821        240,759,504      unc_m_clockticks [uncore_imc_5] [uncore_imc_5]
           2.002024821        240,755,992      unc_m_clockticks [uncore_imc_3] [uncore_imc_3]
           2.002024821        240,750,403      unc_m_clockticks [uncore_imc_1] [uncore_imc_1]
      
      For each print, the PMU name is unconditionally appended to the
      counter->name.
      
      Need to check the counter->name first. If the PMU name is already
      appended, do nothing.
      
      Committer notes:
      
      Add and use perf_evsel->uniquified_name bool instead of doing the more
      expensive strstr(event->name, pmu->name).
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 8c5421c0 ("perf pmu: Display pmu name when printing unmerged events in stat")
      Link: http://lkml.kernel.org/r/1524594014-79243-5-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      80ee8c58
    • K
      perf stat: Print out hint for mixed PMU group error · 30060eae
      Kan Liang 提交于
      Perf doesn't support mixed events from different PMUs (except software
      event) in a group. For this case, only "<not counted>" or "<not
      supported>" are printed out. There is no hint which guides users to fix
      the issue.
      
      Checking the PMU type of events to determine if they are from the same
      PMU. There may be false alarm for the checking. E.g. the core PMU has
      different PMU type. But it should not happen often.
      
      The false alarm can also be tolerated, because:
      
      - It only happens on error path.
      - It just provides a possible solution for the issue.
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1524594014-79243-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      30060eae
  8. 12 4月, 2018 1 次提交
  9. 17 3月, 2018 1 次提交
    • T
      perf stat: Fix core dump when flag T is used · fca32340
      Thomas Richter 提交于
      Executing command 'perf stat -T -- ls' dumps core on x86 and s390.
      
      Here is the call back chain (done on x86):
      
       # gdb ./perf
       ....
       (gdb) r stat -T -- ls
      ...
      Program received signal SIGSEGV, Segmentation fault.
      0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
      (gdb) where
       #0  0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
       #1  0x00007ffff56ae484 in asprintf () from /lib64/libc.so.6
       #2  0x00000000004f1982 in __parse_events_add_pmu (parse_state=0x7fffffffd580,
          list=0xbfb970, name=0xbf3ef0 "cpu",
          head_config=0xbfb930, auto_merge_stats=false) at util/parse-events.c:1233
       #3  0x00000000004f1c8e in parse_events_add_pmu (parse_state=0x7fffffffd580,
          list=0xbfb970, name=0xbf3ef0 "cpu",
          head_config=0xbfb930) at util/parse-events.c:1288
       #4  0x0000000000537ce3 in parse_events_parse (_parse_state=0x7fffffffd580,
          scanner=0xbf4210) at util/parse-events.y:234
       #5  0x00000000004f2c7a in parse_events__scanner (str=0x6b66c0
          "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}",
          parse_state=0x7fffffffd580, start_token=258) at util/parse-events.c:1673
       #6  0x00000000004f2e23 in parse_events (evlist=0xbe9990, str=0x6b66c0
          "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}", err=0x0)
          at util/parse-events.c:1713
       #7  0x000000000044e137 in add_default_attributes () at builtin-stat.c:2281
       #8  0x000000000044f7b5 in cmd_stat (argc=1, argv=0x7fffffffe3b0) at
          builtin-stat.c:2828
       #9  0x00000000004c8b0f in run_builtin (p=0xab01a0 <commands+288>, argc=4,
          argv=0x7fffffffe3b0) at perf.c:297
       #10 0x00000000004c8d7c in handle_internal_command (argc=4,
          argv=0x7fffffffe3b0) at perf.c:349
       #11 0x00000000004c8ece in run_argv (argcp=0x7fffffffe20c,
         argv=0x7fffffffe200) at perf.c:393
       #12 0x00000000004c929c in main (argc=4, argv=0x7fffffffe3b0) at perf.c:537
      (gdb)
      
      It turns out that a NULL pointer is referenced. Here are the
      function calls:
      
        ...
        cmd_stat()
        +---> add_default_attributes()
      	+---> parse_events(evsel_list, transaction_attrs, NULL);
      	             3rd parameter set to NULL
      
      Function parse_events(xx, xx, struct parse_events_error *err) dives
      into a bison generated scanner and creates
      parser state information for it first:
      
         struct parse_events_state parse_state = {
                      .list   = LIST_HEAD_INIT(parse_state.list),
                      .idx    = evlist->nr_entries,
                      .error  = err,   <--- NULL POINTER !!!
                      .evlist = evlist,
              };
      
      Now various functions inside the bison scanner are called to end up in
      __parse_events_add_pmu(struct parse_events_state *parse_state, ..) with
      first parameter being a pointer to above structure definition.
      
      Now the PMU event name is not found (because being executed in a VM) and
      this function tries to create an error message with
      
         asprintf(&parse_state->error.str, ....)
      
      which references a NULL pointer and dumps core.
      
      Fix this by providing a pointer to the necessary error information
      instead of NULL. Technically only the else part is needed to avoid the
      core dump, just lets be safe...
      Signed-off-by: NThomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180308145735.64717-1-tmricht@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fca32340