1. 31 Aug 2018, 4 commits
  2. 25 Jul 2018, 2 commits
    • perf stat: Get rid of extra clock display function · 0aa802a7
      Jiri Olsa authored
      There's no reason to have a separate function to display clock events.
      Its only purpose was to convert the nanosecond value into microseconds.
      We now do that in generic code, provided the unit and scale values are
      properly set, which this patch does for clock events.
      
      The output differs in that the unit field is now displayed in its own
      column rather than appended as a suffix of the event name. Also, the
      value is rounded to 2 decimal places, as for any other event.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
             3001.123137      cpu-clock (msec)          #    1.000 CPUs utilized
             3001.133250      task-clock (msec)         #    1.000 CPUs utilized
      
             3.001159813 seconds time elapsed
      
      Now:
      
        # perf stat  -e cpu-clock,task-clock -C 0 sleep 3
      
         Performance counter stats for 'CPU(s) 0':
      
                3,001.05 msec cpu-clock                 #    1.000 CPUs utilized
                3,001.05 msec task-clock                #    1.000 CPUs utilized
      
             3.001077794 seconds time elapsed
      
      There's a small difference in the CSV output, as we now fill in the unit
      field, which was empty before. It's in the proper spot, so there's no
      compatibility issue.
      
      Before:
      
        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3001.065177,,cpu-clock,3001064187,100.00,1.000,CPUs utilized
        3001.077085,,task-clock,3001077085,100.00,1.000,CPUs utilized
      
      Now:

        # perf stat  -e cpu-clock,task-clock -C 0 -x, sleep 3
        3000.80,msec,cpu-clock,3000799026,100.00,1.000,CPUs utilized
        3000.80,msec,task-clock,3000799550,100.00,1.000,CPUs utilized
      
      Add perf_evsel__is_clock to replace nsec_counter.
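
      A minimal sketch of that helper, matching on the two software clock
      events (assuming the usual perf_evsel__match() convenience macro):

        static inline bool perf_evsel__is_clock(struct perf_evsel *evsel)
        {
                /* true for the software cpu-clock and task-clock events */
                return perf_evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK) ||
                       perf_evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK);
        }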
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180720110036.32251-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add transaction flag (-T) support for s390 · 742d92ff
      Thomas Richter authored
      The 'perf stat' command line flag -T to display transaction counters is
      currently supported for x86 only.
      
      Add support for s390. It is based on the metrics flag (-M transaction),
      using the architecture-dependent JSON files. This requires a metric
      named "transaction" in the JSON files for the platform.
      
      Introduce a new function, metricgroup__has_metric(), to check for the
      existence of a metric named "transaction".
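
      A hedged sketch of such a lookup over the architecture's pmu-events
      table (the exact table-walk and matching details may differ upstream):

        bool metricgroup__has_metric(const char *metric)
        {
                struct pmu_events_map *map = perf_pmu__find_map(NULL);
                struct pmu_event *pe;
                int i;

                if (!map)
                        return false;

                for (i = 0; ; i++) {
                        pe = &map->table[i];
                        /* the table ends with an all-NULL entry */
                        if (!pe->name && !pe->metric_group && !pe->metric_name)
                                break;
                        if (pe->metric_name && !strcmp(pe->metric_name, metric))
                                return true;
                }
                return false;
        }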
      
      As suggested by Andi Kleen, this is the new approach for supporting
      transaction counters. Other architectures will follow.
      
      Output before:
      
        [root@p23lp27 perf]# ./perf stat -T -- sleep 1
        Cannot set up transaction events
        [root@p23lp27 perf]#
      
      Output after:
      
        [root@s35lp76 perf]# ./perf stat -T -- ~/mytesttx 1 >/tmp/111
      
         Performance counter stats for '/root/mytesttx 1':
      
                         1      tx_c_tend           #     13.0 transaction
                         1      tx_nc_tend
                        11      tx_nc_tabort
                         0      tx_c_tabort_special
                         0      tx_c_tabort_no_special
      
               0.001070109 seconds time elapsed
      
        [root@s35lp76 perf]#
      Suggested-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180626071701.58190-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 11 Jul 2018, 1 commit
    • perf stat: Fix --interval-clear option · c818cc06
      Jiri Olsa authored
      Currently we display an extra header line, like:
      
        # perf stat -I 1000 -a --interval-clear
        #           time             counts unit events
               insn per cycle branch-misses of all branches
             2.964917103        3855.349912      cpu-clock (msec)          #    3.855 CPUs utilized
             2.964917103             23,993      context-switches          #    0.006 M/sec
             2.964917103              1,301      cpu-migrations            #    0.329 K/sec
             ...
      
      Fix the condition so we get the proper output:
      
        # perf stat -I 1000 -a --interval-clear
        #           time             counts unit events
             2.359048938        1432.492228      cpu-clock (msec)          #    1.432 CPUs utilized
             2.359048938              7,613      context-switches          #    0.002 M/sec
             2.359048938                419      cpu-migrations            #    0.133 K/sec
             ...
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 9660e08e ("perf stat: Add --interval-clear option")
      Link: http://lkml.kernel.org/r/20180702134202.17745-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 08 Jun 2018, 5 commits
  5. 06 Jun 2018, 1 commit
    • perf stat: Display user and system time · 0ce2da14
      Jiri Olsa authored
      Add support for reading rusage data once the workload is finished, and
      for displaying the system/user time values:
      
        $ perf stat --null perf bench sched pipe
        ...
      
         Performance counter stats for 'perf bench sched pipe':
      
             5.342599256 seconds time elapsed
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      It works only when '-r' is not used, and only for a workload target.
      
      So as of now, for workload targets, we display 3 types of timings. The
      time we measure in perf stat from enable to disable+period:
      
             5.342599256 seconds time elapsed
      
      The time spent in user and system land, displayed only for a workload
      session/target:
      
             2.544434000 seconds user
             4.549691000 seconds sys
      
      Those times are the very same as displayed by the 'time' tool. They are
      returned by the wait4 call via the getrusage struct interface.
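
      For illustration, a minimal standalone sketch of how a parent can
      obtain these values with wait4() and struct rusage (not perf's actual
      code):

        #include <stdio.h>
        #include <sys/resource.h>
        #include <sys/types.h>
        #include <sys/wait.h>

        /* After forking the workload, reap it with wait4() to get rusage. */
        static void print_child_times(pid_t child)
        {
                struct rusage ru;
                int status;

                if (wait4(child, &status, 0, &ru) < 0)
                        return;

                printf("%ld.%06ld seconds user\n",
                       (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec);
                printf("%ld.%06ld seconds sys\n",
                       (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec);
        }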
      
      Committer notes:
      
      Had to rename some variables to avoid this on older systems such as
      centos:6:
      
        builtin-stat.c: In function 'print_footer':
        builtin-stat.c:1831: warning: declaration of 'stime' shadows a global declaration
        /usr/include/time.h:297: warning: shadowed declaration is here
      
      Committer testing:
      
        # perf stat --null time perf bench sched pipe
        # Running 'sched/pipe' benchmark:
        # Executed 1000000 pipe operations between two processes
      
             Total time: 5.526 [sec]
      
               5.526534 usecs/op
                 180945 ops/sec
        1.00user 6.25system 0:05.52elapsed 131%CPU (0avgtext+0avgdata 8056maxresident)k
        0inputs+0outputs (0major+606minor)pagefaults 0swaps
      
         Performance counter stats for 'time perf bench sched pipe':
      
               5.530978744 seconds time elapsed
      
               1.004037000 seconds user
               6.259937000 seconds sys
      
        #
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180605121313.31337-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 26 Apr 2018, 3 commits
  7. 25 Apr 2018, 2 commits
    • perf stat: Fix duplicate PMU name for interval print · 80ee8c58
      Kan Liang authored
      The PMU name is printed repeatedly in interval mode, for example:
      
        perf stat --no-merge -e 'unc_m_clockticks' -a -I 1000
        #           time             counts unit events
           1.001053069        243,702,144      unc_m_clockticks [uncore_imc_4]
           1.001053069        244,268,304      unc_m_clockticks [uncore_imc_2]
           1.001053069        244,427,386      unc_m_clockticks [uncore_imc_0]
           1.001053069        244,583,760      unc_m_clockticks [uncore_imc_5]
           1.001053069        244,738,971      unc_m_clockticks [uncore_imc_3]
           1.001053069        244,880,309      unc_m_clockticks [uncore_imc_1]
           2.002024821        240,818,200      unc_m_clockticks [uncore_imc_4] [uncore_imc_4]
           2.002024821        240,767,812      unc_m_clockticks [uncore_imc_2] [uncore_imc_2]
           2.002024821        240,764,215      unc_m_clockticks [uncore_imc_0] [uncore_imc_0]
           2.002024821        240,759,504      unc_m_clockticks [uncore_imc_5] [uncore_imc_5]
           2.002024821        240,755,992      unc_m_clockticks [uncore_imc_3] [uncore_imc_3]
           2.002024821        240,750,403      unc_m_clockticks [uncore_imc_1] [uncore_imc_1]
      
      On each print, the PMU name is unconditionally appended to
      counter->name.

      Check counter->name first; if the PMU name is already appended, do
      nothing.
      
      Committer notes:
      
      Add and use perf_evsel->uniquified_name bool instead of doing the more
      expensive strstr(event->name, pmu->name).
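
      A hedged sketch of the uniquify step guarded by that flag (the field
      names follow the notes above, otherwise treat them as assumptions):

        static void uniquify_event_name(struct perf_evsel *counter)
        {
                char *new_name;

                /* already uniquified, or no PMU name to append */
                if (counter->uniquified_name || !counter->pmu_name)
                        return;

                if (asprintf(&new_name, "%s [%s]",
                             counter->name, counter->pmu_name) >= 0) {
                        free(counter->name);
                        counter->name = new_name;
                }
                counter->uniquified_name = true;
        }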
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 8c5421c0 ("perf pmu: Display pmu name when printing unmerged events in stat")
      Link: http://lkml.kernel.org/r/1524594014-79243-5-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Print out hint for mixed PMU group error · 30060eae
      Kan Liang authored
      Perf doesn't support mixing events from different PMUs (except software
      events) in a group. In this case, only "<not counted>" or "<not
      supported>" is printed out, with no hint to guide users toward fixing
      the issue.

      Check the PMU types of the events to determine whether they come from
      the same PMU. The check may raise false alarms, e.g. core PMUs can have
      different PMU types, but that should not happen often.
      
      The false alarms can also be tolerated, because:

      - They only happen on the error path.
      - The hint just suggests a possible solution for the issue.
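
      A hedged sketch of such a group check (assuming the usual evlist
      iteration helpers):

        static bool is_mixed_hw_group(struct perf_evsel *counter)
        {
                struct perf_evlist *evlist = counter->evlist;
                u32 pmu_type = counter->attr.type;
                struct perf_evsel *pos;

                if (counter->nr_members < 2)
                        return false;

                evlist__for_each_entry(evlist, pos) {
                        /* software events can be part of any hardware group */
                        if (pos->attr.type == PERF_TYPE_SOFTWARE)
                                continue;
                        if (pmu_type == PERF_TYPE_SOFTWARE) {
                                pmu_type = pos->attr.type;
                                continue;
                        }
                        if (pmu_type != pos->attr.type)
                                return true;
                }
                return false;
        }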
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1524594014-79243-2-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 12 Apr 2018, 1 commit
  9. 17 Mar 2018, 1 commit
    • perf stat: Fix core dump when flag T is used · fca32340
      Thomas Richter authored
      Executing the command 'perf stat -T -- ls' dumps core on x86 and s390.

      Here is the callback chain (captured on x86):
      
       # gdb ./perf
       ....
       (gdb) r stat -T -- ls
      ...
      Program received signal SIGSEGV, Segmentation fault.
      0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
      (gdb) where
       #0  0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
       #1  0x00007ffff56ae484 in asprintf () from /lib64/libc.so.6
       #2  0x00000000004f1982 in __parse_events_add_pmu (parse_state=0x7fffffffd580,
          list=0xbfb970, name=0xbf3ef0 "cpu",
          head_config=0xbfb930, auto_merge_stats=false) at util/parse-events.c:1233
       #3  0x00000000004f1c8e in parse_events_add_pmu (parse_state=0x7fffffffd580,
          list=0xbfb970, name=0xbf3ef0 "cpu",
          head_config=0xbfb930) at util/parse-events.c:1288
       #4  0x0000000000537ce3 in parse_events_parse (_parse_state=0x7fffffffd580,
          scanner=0xbf4210) at util/parse-events.y:234
       #5  0x00000000004f2c7a in parse_events__scanner (str=0x6b66c0
          "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}",
          parse_state=0x7fffffffd580, start_token=258) at util/parse-events.c:1673
       #6  0x00000000004f2e23 in parse_events (evlist=0xbe9990, str=0x6b66c0
          "task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}", err=0x0)
          at util/parse-events.c:1713
       #7  0x000000000044e137 in add_default_attributes () at builtin-stat.c:2281
       #8  0x000000000044f7b5 in cmd_stat (argc=1, argv=0x7fffffffe3b0) at
          builtin-stat.c:2828
       #9  0x00000000004c8b0f in run_builtin (p=0xab01a0 <commands+288>, argc=4,
          argv=0x7fffffffe3b0) at perf.c:297
       #10 0x00000000004c8d7c in handle_internal_command (argc=4,
          argv=0x7fffffffe3b0) at perf.c:349
       #11 0x00000000004c8ece in run_argv (argcp=0x7fffffffe20c,
         argv=0x7fffffffe200) at perf.c:393
       #12 0x00000000004c929c in main (argc=4, argv=0x7fffffffe3b0) at perf.c:537
      (gdb)
      
      It turns out that a NULL pointer is dereferenced. Here are the
      function calls:
      
        ...
        cmd_stat()
        +---> add_default_attributes()
      	+---> parse_events(evsel_list, transaction_attrs, NULL);
      	             3rd parameter set to NULL
      
      The function parse_events(xx, xx, struct parse_events_error *err) dives
      into a bison-generated scanner, first creating the parser state
      information for it:
      
         struct parse_events_state parse_state = {
                      .list   = LIST_HEAD_INIT(parse_state.list),
                      .idx    = evlist->nr_entries,
                      .error  = err,   <--- NULL POINTER !!!
                      .evlist = evlist,
              };
      
      Various functions inside the bison scanner are then called, ending up
      in __parse_events_add_pmu(struct parse_events_state *parse_state, ..)
      with the first parameter being a pointer to the structure defined above.
      
      Now the PMU event name is not found (because perf is executed in a VM)
      and this function tries to create an error message with

         asprintf(&parse_state->error->str, ....)

      which dereferences a NULL pointer and dumps core.
      
      Fix this by providing a pointer to the necessary error information
      instead of NULL. Technically only the else part is needed to avoid the
      core dump, but let's be safe...
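
      A minimal sketch of the shape of the fix (variable names illustrative):

        struct parse_events_error errinfo;

        /* pass a real descriptor so the scanner has somewhere to store
         * its message instead of dereferencing NULL */
        err = parse_events(evsel_list, transaction_attrs, &errinfo);
        if (err) {
                fprintf(stderr, "Cannot set up transaction events\n");
                return -1;
        }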
      Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180308145735.64717-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 08 Mar 2018, 1 commit
    • perf pmu: Display pmu name when printing unmerged events in stat · 8c5421c0
      Agustin Vega-Frias authored
      To simplify the creation of events across multiple instances of the
      same type of PMU, 'perf stat' supports two methods for creating multiple
      events from a single event specification:
      
      1. A prefix or glob can be used in the PMU name.
      2. Aliases, which are listed immediately after the Kernel PMU events
         by perf list, are used.
      
      When the --no-merge option is passed and these events are displayed
      individually, the PMU name is lost and it's not possible to see which
      count corresponds to which PMU:
      
          $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          67      l3cache/read-miss/
                          67      l3cache/read-miss/
                          63      l3cache/read-miss/
                          60      l3cache/read-miss/
      
                 0.001675706 seconds time elapsed
      
          $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          12      l3cache_read_miss
                          17      l3cache_read_miss
                          10      l3cache_read_miss
                           8      l3cache_read_miss
      
                 0.001661305 seconds time elapsed
      
      This change adds the original PMU name to the event. For dynamic PMU
      events, the PMU name is restored in the event name:
      
          $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          63      l3cache_0_3/read-miss/
                          74      l3cache_0_1/read-miss/
                          64      l3cache_0_2/read-miss/
                          74      l3cache_0_0/read-miss/
      
                 0.001675706 seconds time elapsed
      
      For alias events, the PMU name is added after the event name:
      
          $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null
      
           Performance counter stats for 'system wide':
      
                          10      l3cache_read_miss [l3cache_0_3]
                          12      l3cache_read_miss [l3cache_0_1]
                          10      l3cache_read_miss [l3cache_0_2]
                          17      l3cache_read_miss [l3cache_0_0]
      
                 0.001661305 seconds time elapsed
      Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Change-Id: I8056b9eda74bda33e95065056167ad96e97cb1fb
      Link: http://lkml.kernel.org/r/1520345084-42646-3-git-send-email-agustinv@codeaurora.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 06 Mar 2018, 1 commit
  12. 27 Feb 2018, 1 commit
    • perf stat: Ignore error thread when enabling system-wide --per-thread · ab6c79b8
      Jin Yao authored
      If we execute 'perf stat --per-thread' with a non-root account (even
      with kernel.perf_event_paranoid = -1 set), it reports an error:
      
        jinyao@skl:~$ perf stat --per-thread
        Error:
        You may not have permission to collect system-wide stats.
      
        Consider tweaking /proc/sys/kernel/perf_event_paranoid,
        which controls use of the performance events system by
        unprivileged users (without CAP_SYS_ADMIN).
      
        The current value is 2:
      
          -1: Allow use of (almost) all events by all users
              Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
        >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
              Disallow raw tracepoint access by users without CAP_SYS_ADMIN
        >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
        >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
      
        To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
      
                kernel.perf_event_paranoid = -1
      
      Perhaps the ptrace rule doesn't allow tracing some processes, but the
      global --per-thread mode had better ignore such errors and continue
      working on the other threads.
      
      This patch records the index of the failing thread in
      perf_evsel__open() and removes that thread before retrying.
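
      A hedged sketch of the retry loop described above (assuming a
      thread_map__remove() helper that returns 0 on success):

        try_again:
                if (create_perf_stat_counter(counter) < 0) {
                        /* global --per-thread: drop the failing thread, retry */
                        if (target__has_per_thread(&target) &&
                            evsel_list->threads &&
                            evsel_list->threads->err_thread != -1) {
                                if (!thread_map__remove(evsel_list->threads,
                                                        evsel_list->threads->err_thread)) {
                                        evsel_list->threads->err_thread = -1;
                                        goto try_again;
                                }
                        }
                        /* otherwise fall through to the usual error handling */
                }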
      
      For example (run as non-root, with kernel.perf_event_paranoid unset):
      
        jinyao@skl:~$ perf stat --per-thread
        ^C
         Performance counter stats for 'system wide':
      
               vmstat-3458    6.171984   cpu-clock:u (msec) #  0.000 CPUs utilized
                 perf-3670    0.515599   cpu-clock:u (msec) #  0.000 CPUs utilized
               vmstat-3458   1,163,643   cycles:u           #  0.189 GHz
                 perf-3670      40,881   cycles:u           #  0.079 GHz
               vmstat-3458   1,410,238   instructions:u     #  1.21  insn per cycle
                 perf-3670       3,536   instructions:u     #  0.09  insn per cycle
               vmstat-3458     288,937   branches:u         # 46.814 M/sec
                 perf-3670         936   branches:u         #  1.815 M/sec
               vmstat-3458      15,195   branch-misses:u    #  5.26% of all branches
                 perf-3670          76   branch-misses:u    #  8.12% of all branches
      
              12.651675247 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 21 Feb 2018, 1 commit
  14. 16 Feb 2018, 2 commits
    • perf stat: Add support to print counts after a period of time · f1f8ad52
      yuzhoujian authored
      Introduce a new option to print counts after N milliseconds and update
      'perf stat' documentation accordingly.
      
      Shown below is the output of the new option for perf stat.
      
        $ perf stat --time 2000 -e cycles -a
        Performance counter stats for 'system wide':
      
              157,260,423      cycles
      
              2.003060766 seconds time elapsed
      
      We can print the count deltas after N milliseconds with this newly
      introduced option. This option is not supported with the "-I" option.
      
      In addition, according to Kan Liang's patch (19afd104), the monitoring
      overhead for system-wide core events can be very high if the interval
      parameter is below 100ms, and the lower limit is 10ms.

      So the same warning is displayed when the time is set between 10ms and
      100ms, and the minimal time is limited to 10ms. Users can make a
      decision according to their specific cases.
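
      A minimal sketch of the validation described above (messages
      illustrative):

        if (timeout && timeout < 100) {
                if (timeout < 10) {
                        pr_err("timeout must be >= 10ms.\n");
                        return -1;
                }
                pr_warning("timeout < 100ms. The overhead percentage "
                           "could be high in some cases. "
                           "Please proceed with caution.\n");
        }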
      
      Committer notes:
      
      This actually stops the workload after the specified time, then prints
      the counts.
      
      So I renamed the option to --timeout and updated the documentation to
      state that it will not just print the counts after the specified time,
      but will really stop the 'perf stat' session and print the counts.
      
      The rename from 'time' to 'timeout' also fixes the build on systems
      where 'time' is used by glibc and can't be used as a variable name,
      such as centos:5 and centos:6.
      
      Changes since v3:
      - none.
      
      Changes since v2:
      - modify the time check in the __run_perf_stat function to keep some
        consistency with the workload case.
      - add the warning when the time is set between 10ms and 100ms.
      - add the pr_err when the time is set below 10ms.
      
      Changes since v1:
      - none.
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1517217923-8302-3-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add support to print counts for fixed times · db06a269
      yuzhoujian authored
      Introduce a new option to print counts for fixed number of times and
      update 'perf stat' documentation accordingly.
      
      Shown below is the output of the new option for perf stat.
      
        $ perf stat -I 1000 --interval-count 2 -e cycles -a
        #           time             counts unit events
                 1.002827089         93,884,870      cycles
                 2.004231506         56,573,446      cycles
      
      We can print the counts a fixed number of times with this newly
      introduced option. Its usage is a little like 'vmstat', and it should
      be used together with the "-I" option.
      
        $ vmstat -n 1 2
        procs ---------memory-------------- --swap- ----io-- -system-- ------cpu---
         r  b swpd   free   buff   cache    si   so  bi   bo  in   cs us sy id wa st
         0  0    0 78270544 547484 51732076  0   0   0   20    1    1  1  0 99  0 0
         0  0    0 78270512 547484 51732080  0   0   0   16  477 1555  0  0 100 0 0
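
      A minimal sketch of the pairing check described above (option and
      field names illustrative):

        if (stat_config.times && !stat_config.interval) {
                pr_err("interval-count option should be used together "
                       "with interval-print.\n");
                return -1;
        }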
      
      Changes since v3:
      - merge interval_count check and times check to one line.
      - fix the wrong indent in stat.h
      - use stat_config.times instead of 'times' in cmd_stat function.
      
      Changes since v2:
      - none.
      
      Changes since v1:
      - change the name of the new option "times-print" to "interval-count".
      - keep the new option interval specifically.
      Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1517217923-8302-2-git-send-email-ufo19890607@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 27 Dec 2017, 7 commits
    • perf perf: Remove duplicate includes · 3315d14f
      Pravin Shedge authored
      These duplicate includes have been found with scripts/checkincludes.pl
      and then removed manually, to avoid removing false positives.
      Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Resort '--per-thread' result · 29734550
      Jin Yao authored
      Many threads are reported if we enable '--per-thread' globally.

      1. Most of the threads are not counted or have a count of 0.
         This patch removes those threads.

      2. We also re-sort the threads in the display according to their
         counts, which makes it easy for users to spot the hottest
         threads. (A sketch of the comparator follows below.)
      
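      A hedged sketch of a descending comparator for the per-thread values
      (the struct name follows perf's aggregation code; treat it as an
      assumption):

        static int cmp_val(const void *a, const void *b)
        {
                const struct perf_aggr_thread_value *va = a, *vb = b;

                /* descending order: hottest thread first */
                return (vb->val > va->val) - (vb->val < va->val);
        }
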
      For example, the new results would be:
      
      root@skl:/tmp# perf stat --per-thread
      ^C
       Performance counter stats for 'system wide':
      
                  perf-24165              4.302433      cpu-clock (msec)          #    0.001 CPUs utilized
                vmstat-23127              1.562215      cpu-clock (msec)          #    0.000 CPUs utilized
            irqbalance-2780               0.827851      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23111              0.278308      cpu-clock (msec)          #    0.000 CPUs utilized
              thermald-2841               0.230880      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23058              0.207306      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/0:2-19991              0.133983      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:1-18249              0.125636      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                  0.085533      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:2-23146              0.077139      cpu-clock (msec)          #    0.000 CPUs utilized
                 gmain-2700               0.041789      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/4:1-15354              0.028370      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/6:0-17528              0.023895      cpu-clock (msec)          #    0.000 CPUs utilized
          kworker/4:1H-1887               0.013209      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/5:2-31362              0.011627      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/0-11                 0.010892      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/3:2-12870              0.010220      cpu-clock (msec)          #    0.000 CPUs utilized
           ksoftirqd/0-7                  0.008869      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/1-14                 0.008476      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/7-50                 0.002944      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/3-26                 0.002893      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/4-32                 0.002759      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/2-20                 0.002429      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/6-44                 0.001491      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/5-38                 0.001477      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                        10      context-switches          #    0.117 M/sec
         kworker/u16:1-18249                     7      context-switches          #    0.056 M/sec
                  sshd-23111                     4      context-switches          #    0.014 M/sec
                vmstat-23127                     4      context-switches          #    0.003 M/sec
                  perf-24165                     4      context-switches          #    0.930 K/sec
           kworker/0:2-19991                     3      context-switches          #    0.022 M/sec
         kworker/u16:2-23146                     3      context-switches          #    0.039 M/sec
           kworker/4:1-15354                     2      context-switches          #    0.070 M/sec
           kworker/6:0-17528                     2      context-switches          #    0.084 M/sec
                  sshd-23058                     2      context-switches          #    0.010 M/sec
           ksoftirqd/0-7                         1      context-switches          #    0.113 M/sec
            watchdog/0-11                        1      context-switches          #    0.092 M/sec
            watchdog/1-14                        1      context-switches          #    0.118 M/sec
            watchdog/2-20                        1      context-switches          #    0.412 M/sec
            watchdog/3-26                        1      context-switches          #    0.346 M/sec
            watchdog/4-32                        1      context-switches          #    0.362 M/sec
            watchdog/5-38                        1      context-switches          #    0.677 M/sec
            watchdog/6-44                        1      context-switches          #    0.671 M/sec
            watchdog/7-50                        1      context-switches          #    0.340 M/sec
          kworker/4:1H-1887                      1      context-switches          #    0.076 M/sec
              thermald-2841                      1      context-switches          #    0.004 M/sec
                 gmain-2700                      1      context-switches          #    0.024 M/sec
            irqbalance-2780                      1      context-switches          #    0.001 M/sec
           kworker/3:2-12870                     1      context-switches          #    0.098 M/sec
           kworker/5:2-31362                     1      context-switches          #    0.086 M/sec
         kworker/u16:1-18249                     2      cpu-migrations            #    0.016 M/sec
         kworker/u16:2-23146                     2      cpu-migrations            #    0.026 M/sec
             rcu_sched-8                         1      cpu-migrations            #    0.012 M/sec
                  sshd-23058                     1      cpu-migrations            #    0.005 M/sec
                  perf-24165             8,833,385      cycles                    #    2.053 GHz
                vmstat-23127             1,702,699      cycles                    #    1.090 GHz
            irqbalance-2780                739,847      cycles                    #    0.894 GHz
                  sshd-23111               269,506      cycles                    #    0.968 GHz
              thermald-2841                204,556      cycles                    #    0.886 GHz
                  sshd-23058               158,780      cycles                    #    0.766 GHz
           kworker/0:2-19991               112,981      cycles                    #    0.843 GHz
         kworker/u16:1-18249               100,926      cycles                    #    0.803 GHz
             rcu_sched-8                    74,024      cycles                    #    0.865 GHz
         kworker/u16:2-23146                55,984      cycles                    #    0.726 GHz
                 gmain-2700                 34,278      cycles                    #    0.820 GHz
           kworker/4:1-15354                20,665      cycles                    #    0.728 GHz
           kworker/6:0-17528                16,445      cycles                    #    0.688 GHz
           kworker/5:2-31362                 9,492      cycles                    #    0.816 GHz
            watchdog/3-26                    8,695      cycles                    #    3.006 GHz
          kworker/4:1H-1887                  8,238      cycles                    #    0.624 GHz
            watchdog/4-32                    7,580      cycles                    #    2.747 GHz
           kworker/3:2-12870                 7,306      cycles                    #    0.715 GHz
            watchdog/2-20                    7,274      cycles                    #    2.995 GHz
            watchdog/0-11                    6,988      cycles                    #    0.642 GHz
           ksoftirqd/0-7                     6,376      cycles                    #    0.719 GHz
            watchdog/1-14                    5,340      cycles                    #    0.630 GHz
            watchdog/5-38                    4,061      cycles                    #    2.749 GHz
            watchdog/6-44                    3,976      cycles                    #    2.667 GHz
            watchdog/7-50                    3,418      cycles                    #    1.161 GHz
                vmstat-23127             2,511,699      instructions              #    1.48  insn per cycle
                  perf-24165             1,829,908      instructions              #    0.21  insn per cycle
            irqbalance-2780              1,190,204      instructions              #    1.61  insn per cycle
              thermald-2841                143,544      instructions              #    0.70  insn per cycle
                  sshd-23111               128,138      instructions              #    0.48  insn per cycle
                  sshd-23058                57,654      instructions              #    0.36  insn per cycle
             rcu_sched-8                    44,063      instructions              #    0.60  insn per cycle
         kworker/u16:1-18249                42,551      instructions              #    0.42  insn per cycle
           kworker/0:2-19991                25,873      instructions              #    0.23  insn per cycle
         kworker/u16:2-23146                21,407      instructions              #    0.38  insn per cycle
                 gmain-2700                 13,691      instructions              #    0.40  insn per cycle
           kworker/4:1-15354                12,964      instructions              #    0.63  insn per cycle
           kworker/6:0-17528                10,034      instructions              #    0.61  insn per cycle
           kworker/5:2-31362                 5,203      instructions              #    0.55  insn per cycle
           kworker/3:2-12870                 4,866      instructions              #    0.67  insn per cycle
          kworker/4:1H-1887                  3,586      instructions              #    0.44  insn per cycle
           ksoftirqd/0-7                     3,463      instructions              #    0.54  insn per cycle
            watchdog/0-11                    3,135      instructions              #    0.45  insn per cycle
            watchdog/1-14                    3,135      instructions              #    0.59  insn per cycle
            watchdog/2-20                    3,135      instructions              #    0.43  insn per cycle
            watchdog/3-26                    3,135      instructions              #    0.36  insn per cycle
            watchdog/4-32                    3,135      instructions              #    0.41  insn per cycle
            watchdog/5-38                    3,135      instructions              #    0.77  insn per cycle
            watchdog/6-44                    3,135      instructions              #    0.79  insn per cycle
            watchdog/7-50                    3,135      instructions              #    0.92  insn per cycle
                vmstat-23127               539,181      branches                  #  345.139 M/sec
                  perf-24165               375,364      branches                  #   87.245 M/sec
            irqbalance-2780                262,092      branches                  #  316.593 M/sec
              thermald-2841                 31,611      branches                  #  136.915 M/sec
                  sshd-23111                21,874      branches                  #   78.596 M/sec
                  sshd-23058                10,682      branches                  #   51.528 M/sec
             rcu_sched-8                     8,693      branches                  #  101.633 M/sec
         kworker/u16:1-18249                 7,891      branches                  #   62.808 M/sec
           kworker/0:2-19991                 5,761      branches                  #   42.998 M/sec
         kworker/u16:2-23146                 4,099      branches                  #   53.138 M/sec
           kworker/4:1-15354                 2,755      branches                  #   97.110 M/sec
                 gmain-2700                  2,638      branches                  #   63.127 M/sec
           kworker/6:0-17528                 2,216      branches                  #   92.739 M/sec
           kworker/5:2-31362                 1,132      branches                  #   97.360 M/sec
           kworker/3:2-12870                 1,081      branches                  #  105.773 M/sec
          kworker/4:1H-1887                    725      branches                  #   54.887 M/sec
           ksoftirqd/0-7                       707      branches                  #   79.716 M/sec
            watchdog/0-11                      652      branches                  #   59.860 M/sec
            watchdog/1-14                      652      branches                  #   76.923 M/sec
            watchdog/2-20                      652      branches                  #  268.423 M/sec
            watchdog/3-26                      652      branches                  #  225.372 M/sec
            watchdog/4-32                      652      branches                  #  236.318 M/sec
            watchdog/5-38                      652      branches                  #  441.435 M/sec
            watchdog/6-44                      652      branches                  #  437.290 M/sec
            watchdog/7-50                      652      branches                  #  221.467 M/sec
                vmstat-23127                 8,960      branch-misses             #    1.66% of all branches
            irqbalance-2780                  3,047      branch-misses             #    1.16% of all branches
                  perf-24165                 2,876      branch-misses             #    0.77% of all branches
                  sshd-23111                 1,843      branch-misses             #    8.43% of all branches
              thermald-2841                  1,444      branch-misses             #    4.57% of all branches
                  sshd-23058                 1,379      branch-misses             #   12.91% of all branches
         kworker/u16:1-18249                   982      branch-misses             #   12.44% of all branches
             rcu_sched-8                       893      branch-misses             #   10.27% of all branches
         kworker/u16:2-23146                   578      branch-misses             #   14.10% of all branches
           kworker/0:2-19991                   376      branch-misses             #    6.53% of all branches
                 gmain-2700                    280      branch-misses             #   10.61% of all branches
           kworker/6:0-17528                   196      branch-misses             #    8.84% of all branches
           kworker/4:1-15354                   187      branch-misses             #    6.79% of all branches
           kworker/5:2-31362                   123      branch-misses             #   10.87% of all branches
            watchdog/0-11                       95      branch-misses             #   14.57% of all branches
            watchdog/4-32                       89      branch-misses             #   13.65% of all branches
           kworker/3:2-12870                    80      branch-misses             #    7.40% of all branches
            watchdog/3-26                       61      branch-misses             #    9.36% of all branches
          kworker/4:1H-1887                     60      branch-misses             #    8.28% of all branches
            watchdog/2-20                       52      branch-misses             #    7.98% of all branches
           ksoftirqd/0-7                        47      branch-misses             #    6.65% of all branches
            watchdog/1-14                       46      branch-misses             #    7.06% of all branches
            watchdog/7-50                       13      branch-misses             #    1.99% of all branches
            watchdog/5-38                        8      branch-misses             #    1.23% of all branches
            watchdog/6-44                        7      branch-misses             #    1.07% of all branches
      
             3.695150786 seconds time elapsed
      
      root@skl:/tmp# perf stat --per-thread -M IPC,CPI
      ^C
      
       Performance counter stats for 'system wide':
      
                vmstat-23127             2,000,783      inst_retired.any          #      1.5 IPC
              thermald-2841              1,472,670      inst_retired.any          #      1.3 IPC
                  sshd-23111               977,374      inst_retired.any          #      1.2 IPC
                  perf-24163               483,779      inst_retired.any          #      0.2 IPC
                 gmain-2700                341,213      inst_retired.any          #      0.9 IPC
                  sshd-23058               148,891      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3288                 71,210      inst_retired.any          #      0.7 IPC
         kworker/u16:1-18249                39,562      inst_retired.any          #      0.3 IPC
             rcu_sched-8                    14,474      inst_retired.any          #      0.8 IPC
           kworker/0:2-19991                 7,659      inst_retired.any          #      0.2 IPC
           kworker/4:1-15354                 6,714      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3289                  4,839      inst_retired.any          #      0.3 IPC
           kworker/6:0-17528                 3,321      inst_retired.any          #      0.6 IPC
           kworker/5:2-31362                 3,215      inst_retired.any          #      0.5 IPC
           kworker/7:2-23145                 3,173      inst_retired.any          #      0.7 IPC
          kworker/4:1H-1887                  1,719      inst_retired.any          #      0.3 IPC
            watchdog/0-11                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/1-14                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/2-20                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/3-26                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/4-32                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/5-38                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/6-44                    1,479      inst_retired.any          #      0.7 IPC
            watchdog/7-50                    1,479      inst_retired.any          #      0.7 IPC
         kworker/u16:2-23146                 1,408      inst_retired.any          #      0.5 IPC
                  perf-24163             2,249,872      cpu_clk_unhalted.thread
                vmstat-23127             1,352,455      cpu_clk_unhalted.thread
              thermald-2841              1,161,140      cpu_clk_unhalted.thread
                  sshd-23111               807,827      cpu_clk_unhalted.thread
                 gmain-2700                375,535      cpu_clk_unhalted.thread
                  sshd-23058               194,071      cpu_clk_unhalted.thread
         kworker/u16:1-18249               114,306      cpu_clk_unhalted.thread
          rtkit-daemon-3288                103,547      cpu_clk_unhalted.thread
           kworker/0:2-19991                46,550      cpu_clk_unhalted.thread
             rcu_sched-8                    18,855      cpu_clk_unhalted.thread
          rtkit-daemon-3289                 17,549      cpu_clk_unhalted.thread
           kworker/4:1-15354                 8,812      cpu_clk_unhalted.thread
           kworker/5:2-31362                 6,812      cpu_clk_unhalted.thread
          kworker/4:1H-1887                  5,270      cpu_clk_unhalted.thread
           kworker/6:0-17528                 5,111      cpu_clk_unhalted.thread
           kworker/7:2-23145                 4,667      cpu_clk_unhalted.thread
            watchdog/0-11                    4,663      cpu_clk_unhalted.thread
            watchdog/1-14                    4,663      cpu_clk_unhalted.thread
            watchdog/4-32                    4,626      cpu_clk_unhalted.thread
            watchdog/5-38                    4,403      cpu_clk_unhalted.thread
            watchdog/3-26                    3,936      cpu_clk_unhalted.thread
            watchdog/2-20                    3,850      cpu_clk_unhalted.thread
         kworker/u16:2-23146                 2,654      cpu_clk_unhalted.thread
            watchdog/6-44                    2,017      cpu_clk_unhalted.thread
            watchdog/7-50                    2,017      cpu_clk_unhalted.thread
                vmstat-23127             2,000,783      inst_retired.any          #      0.7 CPI
              thermald-2841              1,472,670      inst_retired.any          #      0.8 CPI
                  sshd-23111               977,374      inst_retired.any          #      0.8 CPI
                  perf-24163               495,037      inst_retired.any          #      4.7 CPI
                 gmain-2700                341,213      inst_retired.any          #      1.1 CPI
                  sshd-23058               148,891      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3288                 71,210      inst_retired.any          #      1.5 CPI
         kworker/u16:1-18249                39,562      inst_retired.any          #      2.9 CPI
             rcu_sched-8                    14,474      inst_retired.any          #      1.3 CPI
           kworker/0:2-19991                 7,659      inst_retired.any          #      6.1 CPI
           kworker/4:1-15354                 6,714      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3289                  4,839      inst_retired.any          #      3.6 CPI
           kworker/6:0-17528                 3,321      inst_retired.any          #      1.5 CPI
           kworker/5:2-31362                 3,215      inst_retired.any          #      2.1 CPI
           kworker/7:2-23145                 3,173      inst_retired.any          #      1.5 CPI
          kworker/4:1H-1887                  1,719      inst_retired.any          #      3.1 CPI
            watchdog/0-11                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/1-14                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/2-20                    1,479      inst_retired.any          #      2.6 CPI
            watchdog/3-26                    1,479      inst_retired.any          #      2.7 CPI
            watchdog/4-32                    1,479      inst_retired.any          #      3.1 CPI
            watchdog/5-38                    1,479      inst_retired.any          #      3.0 CPI
            watchdog/6-44                    1,479      inst_retired.any          #      1.4 CPI
            watchdog/7-50                    1,479      inst_retired.any          #      1.4 CPI
         kworker/u16:2-23146                 1,408      inst_retired.any          #      1.9 CPI
                  perf-24163             2,302,323      cycles
                vmstat-23127             1,352,455      cycles
              thermald-2841              1,161,140      cycles
                  sshd-23111               807,827      cycles
                 gmain-2700                375,535      cycles
                  sshd-23058               194,071      cycles
         kworker/u16:1-18249               114,306      cycles
          rtkit-daemon-3288                103,547      cycles
           kworker/0:2-19991                46,550      cycles
             rcu_sched-8                    18,855      cycles
          rtkit-daemon-3289                 17,549      cycles
           kworker/4:1-15354                 8,812      cycles
           kworker/5:2-31362                 6,812      cycles
          kworker/4:1H-1887                  5,270      cycles
           kworker/6:0-17528                 5,111      cycles
           kworker/7:2-23145                 4,667      cycles
            watchdog/0-11                    4,663      cycles
            watchdog/1-14                    4,663      cycles
            watchdog/4-32                    4,626      cycles
            watchdog/5-38                    4,403      cycles
            watchdog/3-26                    3,936      cycles
            watchdog/2-20                    3,850      cycles
         kworker/u16:2-23146                 2,654      cycles
            watchdog/6-44                    2,017      cycles
            watchdog/7-50                    2,017      cycles
      
             2.175726600 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Remove --per-thread pid/tid limitation · 1d9f8d1b
      Jin Yao authored
      Currently, if we execute 'perf stat --per-thread' without specifying a
      pid/tid, perf returns an error:
      
      root@skl:/tmp# perf stat --per-thread
      The --per-thread option is only available when monitoring via -p -t options.
          -p, --pid <pid>       stat events on existing process id
          -t, --tid <tid>       stat events on existing thread id
      
      This patch removes this limitation. If no pid/tid is specified, it
      returns all threads (getting the threads from /proc).

      Note that it doesn't support cpu_list yet, so if a cpu_list is given,
      that case is skipped.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-11-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Update or print per-thread stats · 14e72a21
      Jin Yao authored
      If the stats pointer in the stat_config structure is not NULL, the
      per-thread stats are updated in, or printed from, this buffer.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Allocate shadow stats buffer for threads · 56739444
      Jin Yao authored
      After perf_evlist__create_maps() has been executed, we can get all
      threads from /proc, and via thread_map__nr() we can also get the number
      of threads.
      
      With the number of threads, the patch allocates a buffer which will
      record the shadow stats for these threads.
      
      The buffer pointer is saved in stat_config.
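
      A hedged sketch of that allocation (helper and field names are
      assumptions based on the description):

        static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
        {
                int i;

                /* one runtime_stat per thread, owned by stat_config */
                config->stats = calloc(nthreads, sizeof(struct runtime_stat));
                if (!config->stats)
                        return -1;

                config->stats_num = nthreads;
                for (i = 0; i < nthreads; i++)
                        runtime_stat__init(&config->stats[i]);

                return 0;
        }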
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Print per-thread shadow stats · e0128b30
      Jin Yao authored
      The function perf_stat__print_shadow_stats() is called to print the
      shadow stats from a set of static variables.

      But those static variables are a limitation for supporting per-thread
      shadow stats.

      This patch lets perf_stat__print_shadow_stats() print the shadow stats
      from an input parameter 'st'.

      It no longer gets the values directly from the static variables.
      Instead, it now uses runtime_stat_avg() and runtime_stat_n() to get and
      compute the values.
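
      A hedged sketch of one of those helpers, looking the value up in the
      passed-in stat rather than a file-scope static (the
      saved_value_lookup() signature is assumed):

        static double runtime_stat_avg(struct runtime_stat *st,
                                       enum stat_type type, int ctx, int cpu)
        {
                struct saved_value *v;

                /* look up in 'st' instead of a static table */
                v = saved_value_lookup(NULL, cpu, false, type, ctx, st);
                if (!v)
                        return 0.0;

                return avg_stats(&v->stats);
        }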
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Update per-thread shadow stats · 1fcd0394
      Jin Yao authored
      The function perf_stat__update_shadow_stats() is called to update the
      shadow stats on a set of static variables.

      But those static variables are a limitation when extending this to
      support per-thread shadow stats.

      This patch lets perf_stat__update_shadow_stats() update the shadow
      stats on an input parameter 'st', using update_runtime_stat() to update
      the stats. It no longer updates the static variables directly.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 30 Nov 2017, 1 commit
  17. 31 Oct 2017, 3 commits
  18. 27 Oct 2017, 1 commit
  19. 13 Sep 2017, 2 commits
    • perf stat: Fall weak group back even for EBADF · 35c1980e
      Andi Kleen authored
      It's not possible to run a package event and a per-CPU event in the
      same group. This combination is used by some of the power metrics. They
      work correctly when not using a group.
      
      Normally weak groups should handle that, but in this case EBADF is
      returned instead of the usual EINVAL. (A sketch of the widened check
      follows the examples below.)
      
        $ strace -e perf_event_open ./perf stat -v -e '{cstate_pkg/c2-residency/,msr/tsc/}:W' -a sleep 1
        Using CPUID GenuineIntel-6-3E
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, PERF_FLAG_FD_CLOEXEC) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = -1 EINVAL (Invalid argument)
        perf_event_open({type=0x17 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, -1, 0) = 3
        perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 0, 3, 0) = 4
        perf_event_open({type=0x7 /* PERF_TYPE_??? */, size=PERF_ATTR_SIZE_VER5, config=0, ...}, -1, 1, 0, 0) = -1 EBADF (Bad file descriptor)
      
      and perf errors out.
      
      Make weak groups trigger a fall back for EBADF too. Then this case works correctly:
      
        $ perf stat -v -e '{cstate_pkg/c2-residency/,msr/tsc/}:W' -a sleep 1
        Using CPUID GenuineIntel-6-3E
        Weak group for cstate_pkg/c2-residency//2 failed
        cstate_pkg/c2-residency/: 476709882 1000598460 1000598460
        msr/tsc/: 39625837911 12007369110 12007369110
      
         Performance counter stats for 'system wide':
      
               476,709,882      cstate_pkg/c2-residency/
            39,625,837,911      msr/tsc/
      
               1.000697588 seconds time elapsed
      
        This fixes perf stat -M Power ...
      
        $ perf stat -M Power --metric-only -a sleep 1
      
         Performance counter stats for 'system wide':
      
        Turbo_Utilization  C3_Core_Residency  C6_Core_Residency C7_Core_Residency  C2_Pkg_Residency   C3_Pkg_Residency  C6_Pkg_Residency  C7_Pkg_Residency
             1.0                 0.7                30.0               0.0               0.9                 0.1               0.4                 0.0
      
               1.001240740 seconds time elapsed
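
      The essence of the change is widening the errno test in the open
      fallback path; a hedged sketch (the group-disband step is elided):

        if ((errno == EINVAL || errno == EBADF) &&
            counter->leader != counter &&
            counter->weak_group) {
                /* ... disband the weak group, then retry its members
                 * as individual events ... */
                goto try_again;
        }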
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170905211324.32427-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Update walltime_nsecs_stats in interval mode · b90f1333
      Andi Kleen authored
      Some metrics (like GFLOPs) need walltime_nsecs_stats for each interval.
      Compute it for each interval instead of only at the end.
      
      Pointed out by Jiri.
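
      A hedged sketch of the idea inside the interval tick (wrapper name
      hypothetical; stat_config.interval assumed to be in milliseconds):

        static void update_interval_walltime(unsigned int interval_ms)
        {
                /* drop accumulated history, then record this interval only */
                init_stats(&walltime_nsecs_stats);
                update_stats(&walltime_nsecs_stats,
                             (u64)interval_ms * 1000000ULL);  /* ms -> ns */
        }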
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170831194036.30146-12-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>