1. 27 Dec, 2017 27 commits
    • perf evsel: Fix swap for samples with raw data · f9d8adb3
      Authored by Jiri Olsa
      When we detect a different endianness we swap the event before processing.
      This is tricky for samples because we have no idea what's inside. We treat
      a sample as an array of u64s, swap them, and later swap back the parts
      that are laid out differently.
      
      This also mangles the tracepoint raw data, which ends up with the report
      showing wrong data:
      
        1.95%  comm=Q^B pid=29285 prio=16777216 target_cpu=000
        1.67%  comm=l^B pid=0 prio=16777216 target_cpu=000
      
      Luckily the traceevent library handles the endianness by itself (thank
      you Steven!), so we can pass the RAW data directly in the other
      endianness, and the report then shows:
      
        2.51%  comm=beah-rhts-task pid=1175 prio=120 target_cpu=002
        2.23%  comm=kworker/0:0 pid=11566 prio=120 target_cpu=000
      
      The fix is basically to swap back the raw data if a different endianness
      is detected.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20171129184346.3656-1-jolsa@kernel.org
      [ Add util/memswap.c to python-ext-sources to link missing mem_bswap_64() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f9d8adb3
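
The swap-then-swap-back idea from the commit above can be sketched as follows. This is only an illustration under simplifying assumptions, not the perf source: the names swap_u64(), swap_u64_words() and restore_raw_payload() are hypothetical (the message itself only names mem_bswap_64()), and the raw payload is assumed to be a multiple of 8 bytes long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 64-bit value. */
uint64_t swap_u64(uint64_t v)
{
	v = ((v & 0x00ff00ff00ff00ffULL) << 8) | ((v >> 8) & 0x00ff00ff00ff00ffULL);
	v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
	return (v << 32) | (v >> 32);
}

/* Byte-swap every 64-bit word of a buffer in place (hypothetical helper). */
void swap_u64_words(uint64_t *words, size_t byte_size)
{
	/* Assumption of this sketch: the buffer is a whole number of u64s. */
	assert(byte_size % sizeof(uint64_t) == 0);

	for (size_t i = 0; i < byte_size / sizeof(uint64_t); i++)
		words[i] = swap_u64(words[i]);
}

/*
 * The whole sample was already swapped as an array of u64s. Swapping the
 * raw tracepoint payload a second time restores its recorded byte order,
 * so a consumer that handles endianness itself sees the original bytes.
 */
void restore_raw_payload(uint64_t *raw, size_t raw_byte_size)
{
	swap_u64_words(raw, raw_byte_size);
}
```

Swapping twice is the identity, which is why undoing the generic swap needs no knowledge of the payload's internal layout.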
    • perf probe: Support escaped character in parser · c588d158
      Authored by Masami Hiramatsu
      Support special characters escaped by '\' in the parser. This allows the
      user to specify symbol versions directly, as below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state\\@GLIBC_2.2.5
        Added new event:
          probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_get_state -aR sleep 1
      
        =====
      
      Or, you can use separators in source filename, e.g.
      
        =====
        # ./perf probe -x /opt/test/a.out foo+bar.c:3
        Semantic error :There is non-digit character in offset.
          Error: Command Parse Error.
        =====
      
      Usually a "+" in the source file name causes a parser error, as above, but
      
        =====
        # ./perf probe -x /opt/test/a.out foo\\+bar.c:4
        Added new event:
          probe_a:main         (on @foo+bar.c:4 in /opt/test/a.out)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_a:main -aR sleep 1
        =====
      
      an escaped "\+" allows you to specify it.

      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151309111236.18107.5634753157435343410.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c588d158
    • perf string: Add {strdup,strpbrk}_esc() · 1e9f9e8a
      Authored by Masami Hiramatsu
      To support special characters escaped by '\' in the 'perf probe' event
      parser.

      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275052163.24652.18205979384585484358.stgit@devbox
      [ Split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1e9f9e8a
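
To illustrate what backslash-aware string helpers of this kind have to do, here is a minimal sketch. It is not the code added by the commit: find_unescaped() and dup_unescaped() are hypothetical stand-ins for the strpbrk_esc() and strdup_esc() named in the title, and their exact behavior on corner cases is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a pointer to the first character of str that is in stopset and is
 * not preceded by an escaping backslash, or NULL if there is none.
 */
const char *find_unescaped(const char *str, const char *stopset)
{
	assert(str != NULL && stopset != NULL);

	for (const char *p = str; *p; p++) {
		if (*p == '\\' && p[1] != '\0') {
			p++;	/* skip the escaped character */
			continue;
		}
		if (strchr(stopset, *p))
			return p;
	}
	return NULL;
}

/*
 * Duplicate str, dropping each escaping backslash but keeping the character
 * it escapes. The caller frees the result. Returns NULL on allocation failure.
 */
char *dup_unescaped(const char *str)
{
	char *out = malloc(strlen(str) + 1);
	char *dst = out;

	if (!out)
		return NULL;
	for (const char *p = str; *p; p++) {
		if (*p == '\\' && p[1] != '\0')
			p++;	/* copy the escaped character instead */
		*dst++ = *p;
	}
	*dst = '\0';
	return out;
}
```

With these, a parser can look for the next real separator in `foo\+bar.c:4` (the ':' rather than the escaped '+') and then store the file name as `foo+bar.c`.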
    • perf probe: Find versioned symbols from map · 4b3a2716
      Authored by Masami Hiramatsu
      Commit d8040645 ("perf symbols: Allow user probes on versioned
      symbols") allows the user to find default versioned symbols (with "@@")
      in the map. However, it did not enable normal versioned symbols (with "@")
      for perf-probe.  E.g.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
        Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
          Error: Failed to add events.
        =====
      
      This solves the above issue by improving the perf-probe symbol search
      function, as below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
        Added new event:
          probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_get_state -aR sleep 1
      
        # ./perf probe -l
          probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
        =====

      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4b3a2716
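
One way to picture a symbol search that accepts versioned names is to compare the requested name against the part of the symbol before the first '@'. The sketch below assumes that rule; symbol_matches_ignoring_version() is a hypothetical name and the commit message does not describe the actual matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * True if query names symbol, either exactly (version suffix included) or
 * by its base name, i.e. the part before "@VERSION" or "@@VERSION".
 */
bool symbol_matches_ignoring_version(const char *symbol, const char *query)
{
	size_t base_len;

	assert(symbol != NULL && query != NULL);

	/* An exact match always counts, e.g. "malloc_get_state@GLIBC_2.2.5". */
	if (strcmp(symbol, query) == 0)
		return true;

	/* Otherwise compare the query against the unversioned base name. */
	base_len = strcspn(symbol, "@");
	return strlen(query) == base_len && strncmp(symbol, query, base_len) == 0;
}
```

Comparing lengths as well as bytes keeps "malloc" from matching "malloc_get_state@GLIBC_2.2.5".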
    • perf probe: Add __return suffix for return events · e63c625a
      Authored by Masami Hiramatsu
      Add a __return suffix to function return events automatically. Without
      this, the user has to give the --force option and will see a number
      suffix on each event, like "function_1", which is not easy to recognize.
      Instead, this adds a __return suffix automatically.  E.g.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so 'malloc*%return'
        Added new events:
          probe_libc:malloc_printerr__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_consolidate__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_check__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_hook_ini__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_trim__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_usable_size__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_stats__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_info__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:mallochook__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_get_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_set_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_set_state__return -aR sleep 1
      
        =====

      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275046418.24652.6696011972866498489.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e63c625a
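
The naming rule shown in the output above (function name plus "__return" for return probes) can be sketched like this. make_event_name() is a hypothetical helper written for illustration; it is not taken from the perf source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Write the event name for a probe on func into buf. Return probes get a
 * "__return" suffix so their name cannot clash with the entry probe's name.
 * Returns false if the name did not fit into buf.
 */
bool make_event_name(char *buf, size_t size, const char *func, bool is_return)
{
	int n;

	assert(buf != NULL && func != NULL);

	n = snprintf(buf, size, "%s%s", func, is_return ? "__return" : "");
	return n >= 0 && (size_t)n < size;
}
```

Because entry and return probes on the same function now get different names, no numeric suffix such as "_1" is needed to tell them apart.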
    • perf probe: Cut off the version suffix from event name · a3110cd9
      Authored by Masami Hiramatsu
      Cut off the version suffix (e.g. @GLIBC_2.2.5) from the automatically
      generated event name. This fixes adding wildcard events, as in the case
      below:
      
        =====
        # perf probe -x /lib64/libc-2.25.so malloc*
        Internal error: "malloc_get_state@GLIBC_2" is wrong event name.
          Error: Failed to add events.
        =====
      
      This failure was caused by a symbol with a version suffix.

      With this fix, perf probe automatically cuts off the suffix starting at
      @, as below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc*
        Added new events:
          probe_libc:malloc_printerr (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_consolidate (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_check (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_hook_ini (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc    (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_trim (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_usable_size (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_stats (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_info (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:mallochook (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_get_state (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_set_state (on malloc* in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_set_state -aR sleep 1
      
        =====

      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Reported-by: bhargavb <bhargavaramudu@gmail.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/None
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a3110cd9
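
Cutting the version suffix amounts to copying the symbol name up to the first '@'. A minimal sketch, with strip_version_suffix() as a hypothetical helper name that does not come from the commit:

```c
#include <assert.h>
#include <string.h>

/*
 * Copy symbol into buf without any "@VERSION" or "@@VERSION" suffix.
 * The result is truncated if buf is too small and is always terminated.
 */
void strip_version_suffix(char *buf, size_t size, const char *symbol)
{
	size_t len;

	assert(buf != NULL && symbol != NULL && size > 0);

	len = strcspn(symbol, "@");	/* length of the part before '@' */
	if (len >= size)
		len = size - 1;		/* truncate rather than overflow */
	memcpy(buf, symbol, len);
	buf[len] = '\0';
}
```

Applied to "malloc_get_state@GLIBC_2.2.5" this yields "malloc_get_state", the event name shown in the fixed output above.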
    • perf probe: Add warning message if there is unexpected event name · 9f5c6d87
      Authored by Masami Hiramatsu
      This improves the error message so that the user learns of an event-name
      error before new events are written to the kprobe-events interface.
      
      E.g.
         ======
         # ./perf probe -x /lib64/libc-2.25.so malloc_get_state*
         Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
           Error: Failed to add events.
         ======

      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9f5c6d87
    • perf env: Adopt perf_env__arch() from the annotate code · 4e8fbc1c
      Authored by Arnaldo Carvalho de Melo
      And use it in the libunwind case. It supports both passing a valid
      perf_env, from which the arch to be normalized is extracted, and passing
      NULL, with the same semantics as in the annotate code: get it from
      uname() uts.machine.

      Now the code to generate per arch errno translation tables (int/string)
      can use it to decode perf.data files recorded on a different arch than
      the one where 'perf trace' (or any other analysis tool) runs.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-p2epffgash69w38kvj3ntpc9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4e8fbc1c
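
The "recorded arch if available, else uname()" behavior described above can be sketched as follows. This is an assumption-laden illustration: arch_name() and normalize_arch_name() are hypothetical names, a plain string stands in for the perf_env argument, and the mapping table is a small, partial guess at what normalization could look like, not the table perf uses.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Map a raw machine string to a normalized arch name (partial, illustrative). */
const char *normalize_arch_name(const char *machine)
{
	assert(machine != NULL);

	if (!strcmp(machine, "x86_64") ||
	    (strlen(machine) == 4 && machine[0] == 'i' && !strcmp(machine + 2, "86")))
		return "x86";		/* x86_64, i386 ... i686 */
	if (!strcmp(machine, "aarch64") || !strcmp(machine, "arm64"))
		return "arm64";
	if (!strncmp(machine, "s390", 4))
		return "s390";		/* s390, s390x */
	if (!strncmp(machine, "ppc", 3) || !strncmp(machine, "powerpc", 7))
		return "powerpc";
	return machine;			/* unknown: pass through unchanged */
}

/*
 * Return the normalized arch. Use recorded_arch when the caller has one
 * (for instance from a recorded file's header); with NULL, fall back to
 * the machine the tool is running on. Returns NULL if uname() fails.
 */
const char *arch_name(const char *recorded_arch)
{
	static struct utsname uts;	/* static: the result may point into it */

	if (recorded_arch)
		return normalize_arch_name(recorded_arch);
	if (uname(&uts) < 0)
		return NULL;
	return normalize_arch_name(uts.machine);
}
```

Keeping the NULL fallback in one place means every caller resolves "no recorded environment" the same way.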
    • perf annotate: Use perf_env when obtaining the arch name · 3285deba
      Authored by Arnaldo Carvalho de Melo
      Paving the way to reuse these routines in other areas, like when
      generating errno tables.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-rh1qv051vb8gfdcswskrn53h@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3285deba
    • perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate() · 5449f13c
      Authored by Arnaldo Carvalho de Melo
      To shorten its function signature, since we can get this from 'evsel',
      which is already one of its arguments.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-070eap7t6uicg9c3w086xy2z@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5449f13c
    • perf trace: Use generated syscall table on s390 too · 901bb028
      Authored by Hendrik Brueckner
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      It also enables users to specify wildcards, for example, perf trace -e
      'open*', just as was already possible on x86.

      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-htplh3nbrivi7g3cffbh4fsu@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      901bb028
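
Having the syscall names in a table is what makes wildcards such as 'open*' cheap to support: each name can be tested against the glob. The sketch below uses the POSIX fnmatch() call for that; the table contents, the ids in it and match_syscalls() are made up for the demo and are not taken from perf.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* A tiny stand-in table; names and ids are illustrative only. */
static const char *const demo_syscalls[] = {
	[1] = "exit",
	[2] = "fork",
	[3] = "read",
	[4] = "write",
	[5] = "open",
	[6] = "close",
	[7] = "openat",
};
#define DEMO_MAX_ID 7

/*
 * Store the ids of all table entries whose name matches the glob pattern
 * into ids[] (at most max_ids of them) and return how many were stored.
 */
size_t match_syscalls(const char *glob, int *ids, size_t max_ids)
{
	size_t n = 0;

	assert(glob != NULL && ids != NULL);

	for (int id = 0; id <= DEMO_MAX_ID && n < max_ids; id++) {
		const char *name = demo_syscalls[id];

		/* Unassigned slots are NULL and are skipped. */
		if (name && fnmatch(glob, name, 0) == 0)
			ids[n++] = id;
	}
	return n;
}
```

With this demo table, the pattern "open*" selects the entries for "open" and "openat".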
    • perf s390: Generate system call table from asm/unistd.h · 164a747f
      Authored by Hendrik Brueckner
      This should speed up accessing new system calls introduced with
      the kernel rather than waiting for libaudit updates to include
      them.
      
      Committer testing:
      
        $ rm -rf /tmp/build/perf
        $ mkdir /tmp/build/perf
        $ make srctree=/home/acme/git/perf -C tools/perf/arch/s390 OUTPUT=/tmp/build/perf/ archheaders
        make: Entering directory '/home/acme/git/perf/tools/perf/arch/s390'
        /bin/sh '/home/acme/git/perf/tools/perf/arch/s390/entry/syscalls//mksyscalltbl' 'cc' /home/acme/git/perf/tools/arch/s390/include/uapi/asm/unistd.h > /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
        make: Leaving directory '/home/acme/git/perf/tools/perf/arch/s390'
        $ head -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
        static const char *syscalltbl_s390_64[] = {
      	[1] = "exit",
      	[2] = "fork",
      	[3] = "read",
      	[4] = "write",
        $ tail -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
      	[378] = "s390_guarded_storage",
      	[379] = "statx",
      	[380] = "s390_sthyi",
        };
        #define SYSCALLTBL_S390_64_MAX_ID 380
        $
      
      Now to plug this into 'perf trace' proper.

      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-h5km60rdg3rqxvsys85q50l3@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      164a747f
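
A table in the shape of the generated file quoted above (designated initializers plus a MAX_ID define) can be queried in both directions with very little code. The following sketch copies only the first four entries from that output; syscall_name(), syscall_id() and the *_DEMO names are hypothetical and do not claim to match how 'perf trace' consumes the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the generated file quoted above, cut down to four entries. */
static const char *syscalltbl_demo[] = {
	[1] = "exit",
	[2] = "fork",
	[3] = "read",
	[4] = "write",
};
#define SYSCALLTBL_DEMO_MAX_ID 4

/* Return the name for id, or NULL if the id is out of range or unassigned. */
const char *syscall_name(int id)
{
	if (id < 0 || id > SYSCALLTBL_DEMO_MAX_ID)
		return NULL;
	return syscalltbl_demo[id];
}

/* Return the id for name, or -1 if the table has no such syscall. */
int syscall_id(const char *name)
{
	assert(name != NULL);

	for (int id = 0; id <= SYSCALLTBL_DEMO_MAX_ID; id++) {
		if (syscalltbl_demo[id] && strcmp(syscalltbl_demo[id], name) == 0)
			return id;
	}
	return -1;
}
```

Slots that the initializer does not mention (here index 0) are NULL, so both lookups have to tolerate holes in the table.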
    • tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h · 7af7919f
      Authored by Hendrik Brueckner
      Will be used for generating the syscall id/string translation table.

      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-vjfbfvgjrnqnbdluqd7leo98@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7af7919f
    • perf perf: Remove duplicate includes · 3315d14f
      Authored by Pravin Shedge
      These duplicate includes were found with scripts/checkincludes.pl, but
      they were removed manually to avoid removing false positives.

      Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3315d14f
    • perf test: Handle properly readdir DT_UNKNOWN · 378811ac
      Authored by Jiri Olsa
      Some systems can return DT_UNKNOWN in readdir's struct dirent::d_type and
      we must handle it properly. In this case we can directly check whether
      the entry we found is a directory and skip it.

      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
      [ Split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      378811ac
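
The usual way to cope with DT_UNKNOWN is to trust d_type when the filesystem fills it in and to fall back to stat() when it does not. The sketch below shows that pattern; path_is_directory() and dirent_is_directory() are hypothetical names chosen for the example, not the helpers used by 'perf test'.

```c
/* d_type and the DT_* constants need this on glibc in strict ISO C modes. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* True if path names a directory (symlinks are followed, as with stat()). */
bool path_is_directory(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * True if the entry d, found while reading dir_path, is a directory.
 * Only when d_type is DT_UNKNOWN do we pay for an extra stat() call.
 */
bool dirent_is_directory(const char *dir_path, const struct dirent *d)
{
	char path[PATH_MAX];
	int n;

	assert(dir_path != NULL && d != NULL);

	if (d->d_type != DT_UNKNOWN)
		return d->d_type == DT_DIR;

	n = snprintf(path, sizeof(path), "%s/%s", dir_path, d->d_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return false;	/* path too long to check */
	return path_is_directory(path);
}
```

A readdir() loop can then call dirent_is_directory() on every entry and skip the ones for which it returns true.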
    • perf utils: Move is_directory() to path.h · 06c3f2aa
      Authored by Jiri Olsa
      So that it can be used more widely, like in the next patch, when it will
      be used to fix a bug in 'perf test' handling of dirent.d_type ==
      DT_UNKNOWN.

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
      [ Split from a larger patch, removed needless includes in path.h ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      06c3f2aa
    • perf stat: Resort '--per-thread' result · 29734550
      Authored by Jin Yao
      Many threads are reported if we enable '--per-thread' globally.

      1. Most of the threads are either not counted or have a counting value
      of 0. This patch removes these threads.

      2. We also re-sort the displayed threads by counting value, so that the
      user can easily see the hottest threads.
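
The two steps above (drop the idle threads, then order the rest by count) can be sketched with a compacting pass followed by qsort(). This is an illustration only: struct thread_count and filter_and_sort() are hypothetical, and the sketch treats "not counted" the same as a value of 0, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct thread_count {
	const char *comm;	/* thread name */
	int tid;		/* thread id */
	uint64_t value;		/* counter value for one event */
};

/* qsort() comparator: larger counts sort first. */
static int cmp_count_desc(const void *a, const void *b)
{
	const struct thread_count *ta = a, *tb = b;

	return (ta->value < tb->value) - (ta->value > tb->value);
}

/*
 * Drop entries whose count is zero, sort the remaining ones in descending
 * order of count, and return how many entries were kept. The kept entries
 * occupy the front of the array.
 */
size_t filter_and_sort(struct thread_count *t, size_t n)
{
	size_t kept = 0;

	assert(t != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (t[i].value != 0)
			t[kept++] = t[i];
	}
	qsort(t, kept, sizeof(*t), cmp_count_desc);
	return kept;
}
```

Sorting is done per event, which matches the output below, where each event's block of threads is ordered independently.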
      
      For example, the new results would be:
      
      root@skl:/tmp# perf stat --per-thread
      ^C
       Performance counter stats for 'system wide':
      
                  perf-24165              4.302433      cpu-clock (msec)          #    0.001 CPUs utilized
                vmstat-23127              1.562215      cpu-clock (msec)          #    0.000 CPUs utilized
            irqbalance-2780               0.827851      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23111              0.278308      cpu-clock (msec)          #    0.000 CPUs utilized
              thermald-2841               0.230880      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23058              0.207306      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/0:2-19991              0.133983      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:1-18249              0.125636      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                  0.085533      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:2-23146              0.077139      cpu-clock (msec)          #    0.000 CPUs utilized
                 gmain-2700               0.041789      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/4:1-15354              0.028370      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/6:0-17528              0.023895      cpu-clock (msec)          #    0.000 CPUs utilized
          kworker/4:1H-1887               0.013209      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/5:2-31362              0.011627      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/0-11                 0.010892      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/3:2-12870              0.010220      cpu-clock (msec)          #    0.000 CPUs utilized
           ksoftirqd/0-7                  0.008869      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/1-14                 0.008476      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/7-50                 0.002944      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/3-26                 0.002893      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/4-32                 0.002759      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/2-20                 0.002429      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/6-44                 0.001491      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/5-38                 0.001477      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                        10      context-switches          #    0.117 M/sec
         kworker/u16:1-18249                     7      context-switches          #    0.056 M/sec
                  sshd-23111                     4      context-switches          #    0.014 M/sec
                vmstat-23127                     4      context-switches          #    0.003 M/sec
                  perf-24165                     4      context-switches          #    0.930 K/sec
           kworker/0:2-19991                     3      context-switches          #    0.022 M/sec
         kworker/u16:2-23146                     3      context-switches          #    0.039 M/sec
           kworker/4:1-15354                     2      context-switches          #    0.070 M/sec
           kworker/6:0-17528                     2      context-switches          #    0.084 M/sec
                  sshd-23058                     2      context-switches          #    0.010 M/sec
           ksoftirqd/0-7                         1      context-switches          #    0.113 M/sec
            watchdog/0-11                        1      context-switches          #    0.092 M/sec
            watchdog/1-14                        1      context-switches          #    0.118 M/sec
            watchdog/2-20                        1      context-switches          #    0.412 M/sec
            watchdog/3-26                        1      context-switches          #    0.346 M/sec
            watchdog/4-32                        1      context-switches          #    0.362 M/sec
            watchdog/5-38                        1      context-switches          #    0.677 M/sec
            watchdog/6-44                        1      context-switches          #    0.671 M/sec
            watchdog/7-50                        1      context-switches          #    0.340 M/sec
          kworker/4:1H-1887                      1      context-switches          #    0.076 M/sec
              thermald-2841                      1      context-switches          #    0.004 M/sec
                 gmain-2700                      1      context-switches          #    0.024 M/sec
            irqbalance-2780                      1      context-switches          #    0.001 M/sec
           kworker/3:2-12870                     1      context-switches          #    0.098 M/sec
           kworker/5:2-31362                     1      context-switches          #    0.086 M/sec
         kworker/u16:1-18249                     2      cpu-migrations            #    0.016 M/sec
         kworker/u16:2-23146                     2      cpu-migrations            #    0.026 M/sec
             rcu_sched-8                         1      cpu-migrations            #    0.012 M/sec
                  sshd-23058                     1      cpu-migrations            #    0.005 M/sec
                  perf-24165             8,833,385      cycles                    #    2.053 GHz
                vmstat-23127             1,702,699      cycles                    #    1.090 GHz
            irqbalance-2780                739,847      cycles                    #    0.894 GHz
                  sshd-23111               269,506      cycles                    #    0.968 GHz
              thermald-2841                204,556      cycles                    #    0.886 GHz
                  sshd-23058               158,780      cycles                    #    0.766 GHz
           kworker/0:2-19991               112,981      cycles                    #    0.843 GHz
         kworker/u16:1-18249               100,926      cycles                    #    0.803 GHz
             rcu_sched-8                    74,024      cycles                    #    0.865 GHz
         kworker/u16:2-23146                55,984      cycles                    #    0.726 GHz
                 gmain-2700                 34,278      cycles                    #    0.820 GHz
           kworker/4:1-15354                20,665      cycles                    #    0.728 GHz
           kworker/6:0-17528                16,445      cycles                    #    0.688 GHz
           kworker/5:2-31362                 9,492      cycles                    #    0.816 GHz
            watchdog/3-26                    8,695      cycles                    #    3.006 GHz
          kworker/4:1H-1887                  8,238      cycles                    #    0.624 GHz
            watchdog/4-32                    7,580      cycles                    #    2.747 GHz
           kworker/3:2-12870                 7,306      cycles                    #    0.715 GHz
            watchdog/2-20                    7,274      cycles                    #    2.995 GHz
            watchdog/0-11                    6,988      cycles                    #    0.642 GHz
           ksoftirqd/0-7                     6,376      cycles                    #    0.719 GHz
            watchdog/1-14                    5,340      cycles                    #    0.630 GHz
            watchdog/5-38                    4,061      cycles                    #    2.749 GHz
            watchdog/6-44                    3,976      cycles                    #    2.667 GHz
            watchdog/7-50                    3,418      cycles                    #    1.161 GHz
                vmstat-23127             2,511,699      instructions              #    1.48  insn per cycle
                  perf-24165             1,829,908      instructions              #    0.21  insn per cycle
            irqbalance-2780              1,190,204      instructions              #    1.61  insn per cycle
              thermald-2841                143,544      instructions              #    0.70  insn per cycle
                  sshd-23111               128,138      instructions              #    0.48  insn per cycle
                  sshd-23058                57,654      instructions              #    0.36  insn per cycle
             rcu_sched-8                    44,063      instructions              #    0.60  insn per cycle
         kworker/u16:1-18249                42,551      instructions              #    0.42  insn per cycle
           kworker/0:2-19991                25,873      instructions              #    0.23  insn per cycle
         kworker/u16:2-23146                21,407      instructions              #    0.38  insn per cycle
                 gmain-2700                 13,691      instructions              #    0.40  insn per cycle
           kworker/4:1-15354                12,964      instructions              #    0.63  insn per cycle
           kworker/6:0-17528                10,034      instructions              #    0.61  insn per cycle
           kworker/5:2-31362                 5,203      instructions              #    0.55  insn per cycle
           kworker/3:2-12870                 4,866      instructions              #    0.67  insn per cycle
          kworker/4:1H-1887                  3,586      instructions              #    0.44  insn per cycle
           ksoftirqd/0-7                     3,463      instructions              #    0.54  insn per cycle
            watchdog/0-11                    3,135      instructions              #    0.45  insn per cycle
            watchdog/1-14                    3,135      instructions              #    0.59  insn per cycle
            watchdog/2-20                    3,135      instructions              #    0.43  insn per cycle
            watchdog/3-26                    3,135      instructions              #    0.36  insn per cycle
            watchdog/4-32                    3,135      instructions              #    0.41  insn per cycle
            watchdog/5-38                    3,135      instructions              #    0.77  insn per cycle
            watchdog/6-44                    3,135      instructions              #    0.79  insn per cycle
            watchdog/7-50                    3,135      instructions              #    0.92  insn per cycle
                vmstat-23127               539,181      branches                  #  345.139 M/sec
                  perf-24165               375,364      branches                  #   87.245 M/sec
            irqbalance-2780                262,092      branches                  #  316.593 M/sec
              thermald-2841                 31,611      branches                  #  136.915 M/sec
                  sshd-23111                21,874      branches                  #   78.596 M/sec
                  sshd-23058                10,682      branches                  #   51.528 M/sec
             rcu_sched-8                     8,693      branches                  #  101.633 M/sec
         kworker/u16:1-18249                 7,891      branches                  #   62.808 M/sec
           kworker/0:2-19991                 5,761      branches                  #   42.998 M/sec
         kworker/u16:2-23146                 4,099      branches                  #   53.138 M/sec
           kworker/4:1-15354                 2,755      branches                  #   97.110 M/sec
                 gmain-2700                  2,638      branches                  #   63.127 M/sec
           kworker/6:0-17528                 2,216      branches                  #   92.739 M/sec
           kworker/5:2-31362                 1,132      branches                  #   97.360 M/sec
           kworker/3:2-12870                 1,081      branches                  #  105.773 M/sec
          kworker/4:1H-1887                    725      branches                  #   54.887 M/sec
           ksoftirqd/0-7                       707      branches                  #   79.716 M/sec
            watchdog/0-11                      652      branches                  #   59.860 M/sec
            watchdog/1-14                      652      branches                  #   76.923 M/sec
            watchdog/2-20                      652      branches                  #  268.423 M/sec
            watchdog/3-26                      652      branches                  #  225.372 M/sec
            watchdog/4-32                      652      branches                  #  236.318 M/sec
            watchdog/5-38                      652      branches                  #  441.435 M/sec
            watchdog/6-44                      652      branches                  #  437.290 M/sec
            watchdog/7-50                      652      branches                  #  221.467 M/sec
                vmstat-23127                 8,960      branch-misses             #    1.66% of all branches
            irqbalance-2780                  3,047      branch-misses             #    1.16% of all branches
                  perf-24165                 2,876      branch-misses             #    0.77% of all branches
                  sshd-23111                 1,843      branch-misses             #    8.43% of all branches
              thermald-2841                  1,444      branch-misses             #    4.57% of all branches
                  sshd-23058                 1,379      branch-misses             #   12.91% of all branches
         kworker/u16:1-18249                   982      branch-misses             #   12.44% of all branches
             rcu_sched-8                       893      branch-misses             #   10.27% of all branches
         kworker/u16:2-23146                   578      branch-misses             #   14.10% of all branches
           kworker/0:2-19991                   376      branch-misses             #    6.53% of all branches
                 gmain-2700                    280      branch-misses             #   10.61% of all branches
           kworker/6:0-17528                   196      branch-misses             #    8.84% of all branches
           kworker/4:1-15354                   187      branch-misses             #    6.79% of all branches
           kworker/5:2-31362                   123      branch-misses             #   10.87% of all branches
            watchdog/0-11                       95      branch-misses             #   14.57% of all branches
            watchdog/4-32                       89      branch-misses             #   13.65% of all branches
           kworker/3:2-12870                    80      branch-misses             #    7.40% of all branches
            watchdog/3-26                       61      branch-misses             #    9.36% of all branches
          kworker/4:1H-1887                     60      branch-misses             #    8.28% of all branches
            watchdog/2-20                       52      branch-misses             #    7.98% of all branches
           ksoftirqd/0-7                        47      branch-misses             #    6.65% of all branches
            watchdog/1-14                       46      branch-misses             #    7.06% of all branches
            watchdog/7-50                       13      branch-misses             #    1.99% of all branches
            watchdog/5-38                        8      branch-misses             #    1.23% of all branches
            watchdog/6-44                        7      branch-misses             #    1.07% of all branches
      
             3.695150786 seconds time elapsed
      
      root@skl:/tmp# perf stat --per-thread -M IPC,CPI
      ^C
      
       Performance counter stats for 'system wide':
      
                vmstat-23127             2,000,783      inst_retired.any          #      1.5 IPC
              thermald-2841              1,472,670      inst_retired.any          #      1.3 IPC
                  sshd-23111               977,374      inst_retired.any          #      1.2 IPC
                  perf-24163               483,779      inst_retired.any          #      0.2 IPC
                 gmain-2700                341,213      inst_retired.any          #      0.9 IPC
                  sshd-23058               148,891      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3288                 71,210      inst_retired.any          #      0.7 IPC
         kworker/u16:1-18249                39,562      inst_retired.any          #      0.3 IPC
             rcu_sched-8                    14,474      inst_retired.any          #      0.8 IPC
           kworker/0:2-19991                 7,659      inst_retired.any          #      0.2 IPC
           kworker/4:1-15354                 6,714      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3289                  4,839      inst_retired.any          #      0.3 IPC
           kworker/6:0-17528                 3,321      inst_retired.any          #      0.6 IPC
           kworker/5:2-31362                 3,215      inst_retired.any          #      0.5 IPC
           kworker/7:2-23145                 3,173      inst_retired.any          #      0.7 IPC
          kworker/4:1H-1887                  1,719      inst_retired.any          #      0.3 IPC
            watchdog/0-11                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/1-14                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/2-20                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/3-26                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/4-32                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/5-38                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/6-44                    1,479      inst_retired.any          #      0.7 IPC
            watchdog/7-50                    1,479      inst_retired.any          #      0.7 IPC
         kworker/u16:2-23146                 1,408      inst_retired.any          #      0.5 IPC
                  perf-24163             2,249,872      cpu_clk_unhalted.thread
                vmstat-23127             1,352,455      cpu_clk_unhalted.thread
              thermald-2841              1,161,140      cpu_clk_unhalted.thread
                  sshd-23111               807,827      cpu_clk_unhalted.thread
                 gmain-2700                375,535      cpu_clk_unhalted.thread
                  sshd-23058               194,071      cpu_clk_unhalted.thread
         kworker/u16:1-18249               114,306      cpu_clk_unhalted.thread
          rtkit-daemon-3288                103,547      cpu_clk_unhalted.thread
           kworker/0:2-19991                46,550      cpu_clk_unhalted.thread
             rcu_sched-8                    18,855      cpu_clk_unhalted.thread
          rtkit-daemon-3289                 17,549      cpu_clk_unhalted.thread
           kworker/4:1-15354                 8,812      cpu_clk_unhalted.thread
           kworker/5:2-31362                 6,812      cpu_clk_unhalted.thread
          kworker/4:1H-1887                  5,270      cpu_clk_unhalted.thread
           kworker/6:0-17528                 5,111      cpu_clk_unhalted.thread
           kworker/7:2-23145                 4,667      cpu_clk_unhalted.thread
            watchdog/0-11                    4,663      cpu_clk_unhalted.thread
            watchdog/1-14                    4,663      cpu_clk_unhalted.thread
            watchdog/4-32                    4,626      cpu_clk_unhalted.thread
            watchdog/5-38                    4,403      cpu_clk_unhalted.thread
            watchdog/3-26                    3,936      cpu_clk_unhalted.thread
            watchdog/2-20                    3,850      cpu_clk_unhalted.thread
         kworker/u16:2-23146                 2,654      cpu_clk_unhalted.thread
            watchdog/6-44                    2,017      cpu_clk_unhalted.thread
            watchdog/7-50                    2,017      cpu_clk_unhalted.thread
                vmstat-23127             2,000,783      inst_retired.any          #      0.7 CPI
              thermald-2841              1,472,670      inst_retired.any          #      0.8 CPI
                  sshd-23111               977,374      inst_retired.any          #      0.8 CPI
                  perf-24163               495,037      inst_retired.any          #      4.7 CPI
                 gmain-2700                341,213      inst_retired.any          #      1.1 CPI
                  sshd-23058               148,891      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3288                 71,210      inst_retired.any          #      1.5 CPI
         kworker/u16:1-18249                39,562      inst_retired.any          #      2.9 CPI
             rcu_sched-8                    14,474      inst_retired.any          #      1.3 CPI
           kworker/0:2-19991                 7,659      inst_retired.any          #      6.1 CPI
           kworker/4:1-15354                 6,714      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3289                  4,839      inst_retired.any          #      3.6 CPI
           kworker/6:0-17528                 3,321      inst_retired.any          #      1.5 CPI
           kworker/5:2-31362                 3,215      inst_retired.any          #      2.1 CPI
           kworker/7:2-23145                 3,173      inst_retired.any          #      1.5 CPI
          kworker/4:1H-1887                  1,719      inst_retired.any          #      3.1 CPI
            watchdog/0-11                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/1-14                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/2-20                    1,479      inst_retired.any          #      2.6 CPI
            watchdog/3-26                    1,479      inst_retired.any          #      2.7 CPI
            watchdog/4-32                    1,479      inst_retired.any          #      3.1 CPI
            watchdog/5-38                    1,479      inst_retired.any          #      3.0 CPI
            watchdog/6-44                    1,479      inst_retired.any          #      1.4 CPI
            watchdog/7-50                    1,479      inst_retired.any          #      1.4 CPI
         kworker/u16:2-23146                 1,408      inst_retired.any          #      1.9 CPI
                  perf-24163             2,302,323      cycles
                vmstat-23127             1,352,455      cycles
              thermald-2841              1,161,140      cycles
                  sshd-23111               807,827      cycles
                 gmain-2700                375,535      cycles
                  sshd-23058               194,071      cycles
         kworker/u16:1-18249               114,306      cycles
          rtkit-daemon-3288                103,547      cycles
           kworker/0:2-19991                46,550      cycles
             rcu_sched-8                    18,855      cycles
          rtkit-daemon-3289                 17,549      cycles
           kworker/4:1-15354                 8,812      cycles
           kworker/5:2-31362                 6,812      cycles
          kworker/4:1H-1887                  5,270      cycles
           kworker/6:0-17528                 5,111      cycles
           kworker/7:2-23145                 4,667      cycles
            watchdog/0-11                    4,663      cycles
            watchdog/1-14                    4,663      cycles
            watchdog/4-32                    4,626      cycles
            watchdog/5-38                    4,403      cycles
            watchdog/3-26                    3,936      cycles
            watchdog/2-20                    3,850      cycles
         kworker/u16:2-23146                 2,654      cycles
            watchdog/6-44                    2,017      cycles
            watchdog/7-50                    2,017      cycles
      
             2.175726600 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      29734550
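The derived columns in the output above can be checked by hand. The following is a minimal sketch, assuming that IPC is inst_retired.any divided by cpu_clk_unhalted.thread, that CPI is the inverse ratio, and that the branch-miss percentage is branch-misses divided by branches. The helper name `ratio` is hypothetical, and the counts are copied from the vmstat-23127 rows of the two runs shown above.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


# Counts copied from the vmstat-23127 rows of the output above.
instructions = 2_000_783   # inst_retired.any (second run)
cycles = 1_352_455         # cpu_clk_unhalted.thread (second run)
branches = 539_181         # branches (first run)
branch_misses = 8_960      # branch-misses (first run)

ipc = ratio(instructions, cycles)                   # instructions per cycle
cpi = ratio(cycles, instructions)                   # cycles per instruction
miss_rate = 100.0 * ratio(branch_misses, branches)  # percent of all branches

print(f"{ipc:.1f} IPC, {cpi:.1f} CPI, {miss_rate:.2f}% of all branches")
# prints "1.5 IPC, 0.7 CPI, 1.66% of all branches"
```

The printed values agree with the 1.5 IPC, 0.7 CPI and 1.66% figures that perf reports for that thread.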
    • J
      perf stat: Remove --per-thread pid/tid limitation · 1d9f8d1b
      Jin Yao committed
      Currently, if we execute 'perf stat --per-thread' without specifying a
      pid/tid, perf returns an error:
      
      root@skl:/tmp# perf stat --per-thread
      The --per-thread option is only available when monitoring via -p -t options.
          -p, --pid <pid>       stat events on existing process id
          -t, --tid <tid>       stat events on existing thread id
      
      This patch removes this limitation. If no pid/tid is specified, all
      threads are used (they are read from /proc).

      Note that cpu_list is not supported yet, so the cpu_list case is
      skipped.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-11-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1d9f8d1b
    • J
      perf thread_map: Enumerate all threads from /proc · 73c0ca1e
      Jin Yao committed
      This patch calls thread_map__new_all_cpus() to enumerate all threads
      from /proc if the per-thread flag is enabled.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-10-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      73c0ca1e
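Enumerating all threads from /proc can be illustrated with a short sketch. It assumes the usual Linux layout, in which each numeric directory under /proc is a process and each entry under its task/ subdirectory is one of that process's threads. The function name `enumerate_threads` and its `proc_root` parameter are hypothetical, and this is not the code perf itself uses.

```python
import os


def enumerate_threads(proc_root="/proc"):
    """Return a sorted list of (pid, tid) pairs found under proc_root."""
    threads = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            # Skip non-process entries such as "cpuinfo" or "sys".
            continue
        task_dir = os.path.join(proc_root, entry, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            # The process exited, or is not accessible, during the scan.
            continue
        threads.extend((int(entry), int(tid)) for tid in tids if tid.isdigit())
    return sorted(threads)


if __name__ == "__main__":
    found = enumerate_threads()
    print(f"{len(found)} threads found")
```

Taking the root directory as a parameter keeps the sketch easy to exercise against a fake directory tree.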
    • J
      perf stat: Update or print per-thread stats · 14e72a21
      Jin Yao committed
      If the stats pointer in the stat_config structure is not NULL, the
      per-thread stats are updated in, or printed from, this buffer.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      14e72a21
    • J
      perf stat: Allocate shadow stats buffer for threads · 56739444
      Jin Yao committed
      After perf_evlist__create_maps() has been executed, we can get all
      threads from /proc, and via thread_map__nr() we can also get the
      number of threads.

      With the number of threads known, the patch allocates a buffer that
      records the shadow stats for these threads.

      The buffer pointer is saved in stat_config.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      56739444
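The idea of one shadow-stats slot per thread, with a fallback to a shared slot when no buffer was allocated, might look roughly like the sketch below. All names here (`ShadowStats`, `StatConfig`, `alloc_thread_stats`, `stat_for`, `update`) are hypothetical stand-ins chosen for illustration, not perf's own structures or functions.

```python
class ShadowStats:
    """Hypothetical container: metric name -> accumulated count."""

    def __init__(self):
        self.counts = {}


class StatConfig:
    """Hypothetical stand-in for the config object that holds the buffer."""

    def __init__(self):
        self.stats = None  # None means no per-thread buffer was allocated


def alloc_thread_stats(config, nr_threads):
    # One independent slot per thread, indexed by position in the thread map.
    config.stats = [ShadowStats() for _ in range(nr_threads)]


def stat_for(config, thread_idx, global_stat):
    # Use the per-thread slot when the buffer exists, else the shared one.
    return global_stat if config.stats is None else config.stats[thread_idx]


def update(config, thread_idx, global_stat, name, count):
    st = stat_for(config, thread_idx, global_stat)
    st.counts[name] = st.counts.get(name, 0) + count
```

With this shape, callers do not need to know whether per-thread mode is active; they always go through `stat_for()`.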
    • J
      perf stat: Remove a set of shadow stats static variables · 6a1e2c5c
      Jin Yao committed
      Previous patches restructured the code so that it no longer accesses
      the static variables directly.

      This patch removes these static variables.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6a1e2c5c
    • J
      perf stat: Print per-thread shadow stats · e0128b30
      Jin Yao committed
      The function perf_stat__print_shadow_stats() is called to print the
      shadow stats kept in a set of static variables.

      But these static variables stand in the way of supporting per-thread
      shadow stats.

      This patch lets perf_stat__print_shadow_stats() print the shadow
      stats from an input parameter 'st'.

      It no longer reads values directly from the static variables. Instead,
      it uses runtime_stat_avg() and runtime_stat_n() to get and compute
      the values.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e0128b30
    • J
      perf stat: Update per-thread shadow stats · 1fcd0394
      Jin Yao committed
      The function perf_stat__update_shadow_stats() is called to update the
      shadow stats kept in a set of static variables.

      But these static variables stand in the way of extending the code to
      support per-thread shadow stats.

      This patch lets perf_stat__update_shadow_stats() update the shadow
      stats in an input parameter 'st', using update_runtime_stat() to
      update the stats. It no longer updates the static variables directly
      as before.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1fcd0394
    • J
      perf stat: Create the runtime_stat init/exit function · 8efb2df1
      Jin Yao committed
      These functions mainly initialize and release the rblist that is
      defined in struct runtime_stat.

      The original rblist 'runtime_saved_values' is kept for now so that
      the patch series stays bisectable.

      The rblist 'runtime_saved_values' will be removed in a later patch,
      when the switch is made.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8efb2df1
    • J
      perf stat: Extend rbtree to support per-thread shadow stats · 49cd456a
      Jin Yao committed
      Previously the rbtree was used to link generic metrics.

      This patch adds new ctx/type/stat fields to the rbtree keys, because
      this rbtree will be used to maintain the shadow metrics, replacing
      the original set of static arrays, in order to support per-thread
      shadow stats.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      49cd456a
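A keyed store of running statistics, which is roughly what an rbtree with compound keys provides, can be sketched as follows. This is only an illustration under loud assumptions: a Python dict stands in for the rbtree, the key is a hypothetical (stat_type, ctx, thread) tuple, and the class and method names are invented here rather than taken from perf.

```python
class RunningStat:
    """Running count and mean of a series of samples."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: no need to keep the individual samples.
        self.n += 1
        self.mean += (value - self.mean) / self.n


class RuntimeStatStore:
    """Hypothetical store; a dict stands in for the rbtree."""

    def __init__(self):
        self.values = {}

    def update(self, stat_type, ctx, thread, count):
        key = (stat_type, ctx, thread)
        self.values.setdefault(key, RunningStat()).update(count)

    def avg(self, stat_type, ctx, thread):
        node = self.values.get((stat_type, ctx, thread))
        return node.mean if node else 0.0

    def n(self, stat_type, ctx, thread):
        node = self.values.get((stat_type, ctx, thread))
        return node.n if node else 0


store = RuntimeStatStore()
store.update("cycles", 0, 1, 100)
store.update("cycles", 0, 1, 300)
print(store.avg("cycles", 0, 1), store.n("cycles", 0, 1))  # prints "200.0 2"
```

Because entries are created on first use, a lookup for a metric that was never counted simply yields zero instead of requiring a preallocated array slot.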
    • J
      perf stat: Define a structure for per-thread shadow stats · e5fcc2ab
      Jin Yao committed
      Perf has a set of static variables to record the runtime shadow
      metrics stats.

      These static variables become a limitation if we want to record the
      runtime shadow stats per thread. This patch creates a structure, and
      the next patches will use it to update the runtime shadow stats per
      thread.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5fcc2ab
  2. 18 Dec, 2017 4 commits
    • A
      tools arch s390: Do not include header files from the kernel sources · 10b9baa7
      Arnaldo Carvalho de Melo committed
      Long ago we decided that including files from the kernel git sources
      in source code living under tools/ is verboten, to avoid disturbing
      kernel development (and perf's and other tools/) when, say, a kernel
      hacker adds something, tests everything but tools/, and ends up with
      the tools/ build broken.
      
      This got broken recently by s/390. Fix it by copying
      arch/s390/include/uapi/asm/perf_regs.h to tools/arch/s390/include/uapi/asm/,
      making the copy be used by means of <asm/perf_regs.h>, and updating
      tools/perf/check_headers.sh to make sure we are notified when the
      original changes, so that we can check if anything is needed on the
      tooling side.
      
      This would have been caught by the 'tarpkg' test entry in:
      
      $ make -C tools/perf build-test
      
      when run on an s/390 build system or container.
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: f704ef44 ("s390/perf: add support for perf_regs and libdw")
      Link: https://lkml.kernel.org/n/tip-n57139ic0v9uffx8wdqi3d8a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      10b9baa7
    • B
      perf jvmti: Generate correct debug information for inlined code · ca58d7e6
      Ben Gainey committed
      tools/perf/jvmti is broken insofar as it generates incorrect debug
      information. Specifically, it attributes all debug lines to the
      original method being output, even when some code is inlined from
      elsewhere. This patch fixes the issue.
      
      To test (from within linux/tools/perf):
      
      export JDIR=/usr/lib/jvm/java-8-openjdk-amd64/
      make
      cat << __EOF > Test.java
      public class Test
      {
          private StringBuilder b = new StringBuilder();
      
          private void loop(int i, String... args)
          {
              for (String a : args)
                  b.append(a);
      
              long hc = b.hashCode() * System.nanoTime();
      
              b = new StringBuilder();
              b.append(hc);
      
              System.out.printf("Iteration %d = %d\n", i, hc);
          }
      
          public void run(String... args)
          {
              for (int i = 0; i < 10000; ++i)
              {
                  loop(i, args);
              }
          }
      
          public static void main(String... args)
          {
              Test t = new Test();
              t.run(args);
          }
      }
      __EOF
      $JDIR/bin/javac Test.java
      ./perf record -F 10000 -g -k mono $JDIR/bin/java -agentpath:`pwd`/libperf-jvmti.so Test
      ./perf inject --jit -i perf.data -o perf.data.jitted
      ./perf annotate -i perf.data.jitted --stdio | grep Test\.java: | sort -u
      
      Before this patch, Test.java line numbers get reported that are greater
      than the number of lines in the Test.java file.  They come from the
      source file of the inlined function, e.g. java/lang/String.java:1085.
      For further validation one can examine those lines in the JDK source
      distribution and confirm that they map to inlined functions called by
      Test.java.
      
      After this patch, the filename of the inlined function is output
      rather than the incorrect original source filename.
      Signed-off-by: Ben Gainey <ben.gainey@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Colin King <colin.king@canonical.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 598b7c69 ("perf jit: add source line info support")
      Link: http://lkml.kernel.org/r/20171122182541.d25599a3eb1ada3480d142fa@arm.com
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ca58d7e6
    • J
      perf tools: Fix up build in hardened environments · 61fb26a6
      Jiri Olsa committed
      On Fedora systems the perl and python CFLAGS/LDFLAGS include the
      hardened specs from the redhat-rpm-config package. We apply them only
      to perl/python objects, which makes those objects incompatible with
      the rest of the objects, and the build fails with:
      
        /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
        /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
        /usr/bin/ld: final link failed: Nonrepresentable section on output
        collect2: error: ld returned 1 exit status
        make[2]: *** [Makefile.perf:507: perf] Error 1
        make[1]: *** [Makefile.perf:210: sub-make] Error 2
        make: *** [Makefile:69: all] Error 2
      
      Mainly it's caused by perl/python objects being compiled with:
      
        -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
      
      which makes the final link impossible, because it checks for
      'proper' objects with the following option:
      
        -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61fb26a6
    • J
      perf tools: Use shell function for perl cflags retrieval · 5cfee7a3
      Jiri Olsa committed
      Use the shell function for perl CFLAGS retrieval instead of back
      quotes (``). Both execute a shell with the command, but the shell
      function is more explicit and seems to be the preferred way.
      
      Also, we don't have any other use of back quotes in the perf Makefiles.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171108102739.30338-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5cfee7a3
  3. 15 Dec, 2017 1 commit
  4. 12 Dec, 2017 1 commit
  5. 07 Dec, 2017 1 commit
    • I
      tooling/headers: Synchronize updated s390 and x86 UAPI headers · 34c9ca37
      Ingo Molnar committed
      There were two trivial updates to these upstream UAPI headers:
      
        arch/s390/include/uapi/asm/kvm.h
        arch/s390/include/uapi/asm/kvm_perf.h
        arch/x86/lib/x86-opcode-map.txt
      
      Synchronize them with their tooling copies.
      
      (The x86 opcode map includes a new instruction pattern now.)
      
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      34c9ca37
  6. 06 Dec, 2017 6 commits
    • W
      perf tools: Rename 'backward' to 'overwrite' in evlist, mmap and record · 0b72d69a
      Wang Nan committed
      Remove the backward/forward concept to make it consistent with the
      user interface (the '--overwrite' option).
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Mengting Zhang <zhangmengting@huawei.com>
      Link: http://lkml.kernel.org/r/20171204165107.95327-4-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b72d69a
    • W
      perf mmap: Don't discard prev in backward mode · 7fb4b407
      Wang Nan 提交于
      'perf record' can switch its output data file. The new output should
      only store the data recorded after the switch. However, in overwrite
      backward mode, the new output can still contain data from before the
      switch, which also brings extra overhead.
      
      At the end of mmap_read(), the position of the processed ring buffer is
      saved in md->prev. The next mmap_read() should end at md->prev if it has
      not been overwritten, which avoids processing duplicate data. However,
      md->prev is discarded, so the next mmap_read() has to process the whole
      valid ring buffer, which probably includes old, already processed data.
      
      Avoid calling backward_rb_find_range() when md->prev is still
      available.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mengting Zhang <zhangmengting@huawei.com>
      Link: http://lkml.kernel.org/r/20171204165107.95327-3-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7fb4b407
    • W
      perf mmap: Fix perf backward recording · 71f566a3
      Wang Nan committed
      'perf record' backward recording does not work as expected: it never
      overwrites when the ring buffer gets full.
      
      Test:
      
      Run a busy Python printing task in the background like this:
      
       while True:
           print 123
      
      Send SIGUSR2 to perf to capture a snapshot, then:
      
       # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101520743 ]
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101521251 ]
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101521692 ]
       ^C[ perf record: Woken up 1 times to write data ]
       [ perf record: Dump perf.data.2017110101521936 ]
       [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]
      
       # ./perf script -i ./perf.data.2017110101520743 | head -n3
                   perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
                   perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
                 python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
       # ./perf script -i ./perf.data.2017110101521251 | head -n3
                   perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
                   perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
                 python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
       # ./perf script -i ./perf.data.2017110101521692 | head -n3
                   perf  2717 [000] 12449.310785: raw_syscalls:sys_enter: NR 16 (5, 2400, 0, 59, 100, 0)
                   perf  2717 [000] 12449.310790: raw_syscalls:sys_enter: NR 7 (4112340, 2, ffffffff, 3df, 100, 0)
                 python  2545 [000] 12449.310800:  raw_syscalls:sys_exit: NR 1 = 4
      
      The timestamps never change, even though the background task is an
      infinite loop that can easily overwhelm the ring buffer.
      
      This patch fixes it by forcibly unsetting PROT_WRITE for a backward ring
      buffer, so that all backward ring buffers become overwrite ring buffers.
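      The effect of the PROT_WRITE change can be illustrated with a small
      sketch. The helper name ring_buffer_prot() is hypothetical and is not
      the function used in perf; it only shows the flag selection. The
      underlying rule comes from the perf_event_open() interface: when the
      ring buffer is mapped without PROT_WRITE, user space cannot update
      data_tail, so the kernel is free to overwrite unread data.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/*
 * Return the mmap() protection flags for a perf ring buffer.
 * Hypothetical helper for illustration: a backward buffer is always
 * mapped read-only, which makes it an overwrite ring buffer.
 */
static int ring_buffer_prot(bool backward)
{
    int prot = PROT_READ | PROT_WRITE;

    if (backward)
        prot &= ~PROT_WRITE; /* read-only mapping: kernel may overwrite */

    assert(prot & PROT_READ); /* the buffer must always be readable */
    return prot;
}
```

      The returned value would be passed as the prot argument of mmap() on
      the perf event file descriptor.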
      
      Test result:
      
       # ./perf record --overwrite -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit --exclude-perf -a --switch-output
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101285323 ]
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101290053 ]
       [ perf record: dump data: Woken up 1 times ]
       [ perf record: Dump perf.data.2017110101290446 ]
       ^C[ perf record: Woken up 1 times to write data ]
       [ perf record: Dump perf.data.2017110101290837 ]
       [ perf record: Captured and wrote 0.826 MB perf.data.<timestamp> ]
       # ./perf script -i ./perf.data.2017110101285323 | head -n3
                 python  2545 [000] 11064.268083:  raw_syscalls:sys_exit: NR 1 = 4
                 python  2545 [000] 11064.268084: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
                 python  2545 [000] 11064.268086:  raw_syscalls:sys_exit: NR 1 = 4
       # ./perf script -i ./perf.data.2017110101290 | head -n3
       failed to open ./perf.data.2017110101290: No such file or directory
       # ./perf script -i ./perf.data.2017110101290053 | head -n3
                 python  2545 [000] 11071.564062: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
                 python  2545 [000] 11071.564064:  raw_syscalls:sys_exit: NR 1 = 4
                 python  2545 [000] 11071.564066: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
       # ./perf script -i ./perf.data.2017110101290 | head -n3
       perf.data.2017110101290053  perf.data.2017110101290446  perf.data.2017110101290837
       # ./perf script -i ./perf.data.2017110101290446 | head -n3
                   sshd  1321 [000] 11075.499473:  raw_syscalls:sys_exit: NR 14 = 0
                   sshd  1321 [000] 11075.499474: raw_syscalls:sys_enter: NR 14 (2, 7ffe98899490, 0, 8, 0, 3000)
                   sshd  1321 [000] 11075.499474:  raw_syscalls:sys_exit: NR 14 = 0
       # ./perf script -i ./perf.data.2017110101290837 | head -n3
                 python  2545 [000] 11079.280844:  raw_syscalls:sys_exit: NR 1 = 4
                 python  2545 [000] 11079.280847: raw_syscalls:sys_enter: NR 1 (1, 12cc330, 4, 7fc237280370, 7fc2373d0700, 2c7b0)
                 python  2545 [000] 11079.280850:  raw_syscalls:sys_exit: NR 1 = 4
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Mengting Zhang <zhangmengting@huawei.com>
      Link: http://lkml.kernel.org/r/20171204165107.95327-2-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      71f566a3
    • S
      perf report: Set browser mode right before setup_browser() · 712d36db
      Seokho Song committed
      There is code that prints messages to the screen between the assignment
      of the use_browser variable and the call to setup_browser().
      
      But since the GUI browser is not initialized during that period, all
      messages fail to show if the user passed the --gtk option to perf as GTK
      is not initialized yet.
      
      Reorder the code to assign use_browser variable right before
      setup_browser() is called.
      Signed-off-by: Seokho Song <0xdevssh@gmail.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20171204160244.6332-1-0xdevssh@gmail.com
      Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      712d36db
    • W
      perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv · fbc2844e
      William Cohen committed
      The powerpc cpuid information includes chip revision information.
      Changes between chip revisions are usually minor bug fixes and usually
      do not affect the operation of the performance monitoring hardware.
      
      The original mapfile.csv matching requires enumerating every possible
      cpuid string.  When a new minor chip revision is produced a new entry
      has to be added to the mapfile.csv and the code recompiled to allow perf
      to have the implementation specific perf events for this new minor
      revision.  For users of various distributions of Linux having to wait for
      a new release of the kernel's perf tool to be built with these trivial
      patches is inconvenient.
      
      Using regular expressions rather than exact string matching of the
      entire cpuid string allows developers to write mapfile.csv files that do
      not require patches and recompiles for each of these minor version
      changes.  If special cases need to be made for some particular versions,
      they can be placed earlier in the mapfile.csv file before the more
      general matches.
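      The first-match-wins lookup described above can be sketched with POSIX
      regular expressions. This is an illustration under stated assumptions,
      not the perf implementation: struct cpu_map_entry and lookup_events()
      are hypothetical names, and the patterns in the usage note are invented
      examples rather than real mapfile.csv entries.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* One row of a cpuid-to-events table (hypothetical layout). */
struct cpu_map_entry {
    const char *cpuid_regex; /* POSIX extended regular expression */
    const char *events;      /* name of the event table to use */
};

/*
 * Return the events name of the first entry whose pattern matches the
 * whole cpuid string, or NULL if no entry matches. Entries are tried in
 * order, so special cases must be listed before general patterns.
 */
static const char *lookup_events(const struct cpu_map_entry *map, size_t n,
                                 const char *cpuid)
{
    assert(map != NULL && cpuid != NULL);

    for (size_t i = 0; i < n; i++) {
        regex_t re;
        regmatch_t m;

        if (regcomp(&re, map[i].cpuid_regex, REG_EXTENDED) != 0)
            continue; /* skip a malformed pattern */

        int rc = regexec(&re, cpuid, 1, &m, 0);
        regfree(&re);

        /* Accept only a match that covers the entire cpuid string. */
        if (rc == 0 && m.rm_so == 0 && (size_t)m.rm_eo == strlen(cpuid))
            return map[i].events;
    }
    return NULL;
}
```

      For example, a table listing "004b0100" before "004b[[:xdigit:]]+"
      would give that one revision its own entry, while every other revision
      with the "004b" prefix falls through to the general pattern.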
      Signed-off-by: William Cohen <wcohen@redhat.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shriya <shriyak@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20171204145728.16792-1-wcohen@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fbc2844e
    • S
      perf c2c: Add a tip about cacheline events · 01251952
      Sangwon Hong committed
      Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1512188201-14109-1-git-send-email-qpakzk@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      01251952