1. 23 2月, 2015 12 次提交
  2. 19 2月, 2015 22 次提交
  3. 14 2月, 2015 3 次提交
    • A
      perf trace: Support --events foo:bar --no-syscalls · 726f3234
      Arnaldo Carvalho de Melo 提交于
      I.e. support tracing just tracepoints, without strace like
      raw_syscalls:*.
      
      [acme@ssdandy linux]$ trace --no-sys --ev sched:*exec,sched:*switch,sched:*exit usleep 1
        0.048 (     ): sched:sched_process_exec:filename=/usr/bin/usleep pid=27298 old_pid=27298)
        0.369 (     ): sched:sched_switch:usleep:27298 [120] S ==> swapper/5:0 [120])
        0.452 (     ): sched:sched_process_exit:comm=usleep pid=27298 prio=120)
      [acme@ssdandy linux]$
      
      TODO: remove that (...) thing when --no-syscalls is specified.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-vn0hsixsbhm31b2rpj97r96k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      726f3234
    • A
      perf trace: Allow mixing with other events · 14a052df
      Arnaldo Carvalho de Melo 提交于
      Basically adopting 'perf record' --event command line argument syntax:
      
       # trace -e \!mprotect,mmap,munmap,open,close,read,fstat,access,arch_prctl --event sched:*switch,sched:*exec,sched:*exit usleep 1
        0.048 (        ): sched:sched_process_exec:filename=/bin/usleep pid=24732 old_pid=24732)
        0.078 (0.002 ms): usleep/24732 brk(                          ) = 0x78f000
        0.430 (0.002 ms): usleep/24732 brk(                          ) = 0x78f000
        0.434 (0.003 ms): usleep/24732 brk(brk: 0x7b0000             ) = 0x7b0000
        0.438 (0.001 ms): usleep/24732 brk(                          ) = 0x7b0000
        0.460 (0.004 ms): usleep/24732 nanosleep(rqtp: 0x7ffff3696a40) ...
        0.460 (        ): sched:sched_switch:prev_comm=usleep prev_pid=24732 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
        0.515 (0.058 ms): usleep/24732  ... [continued]: nanosleep()) = 0
        0.520 (0.000 ms): usleep/24732 exit_group(
        0.550 (        ): sched:sched_process_exit:comm=usleep pid=24732 prio=120)
       #
      
      Next steps, probably in this order:
      
      1) Use ordered_events code, the logic in trace needs the events to be
         time ordered when needed, i.e. when multiple CPUs are involved.
      
      2) Callchains!
      
      3) Automatically account for interruptions when saying how long things
         took.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-gpst8mph575yb4wgf91qibyb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      14a052df
    • A
      perf trace: Handle multiple threads better wrt syscalls being intermixed · e596663e
      Arnaldo Carvalho de Melo 提交于
       $ trace time taskset -c 0 usleep 1
         0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
         0.865 ( 0.008 ms): time/16723 execve(arg0: 140733595272004, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = -2
         2.395 ( 1.523 ms): taskset/16723 execve(arg0: 140733595272013, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = 0
         2.411 ( 0.002 ms): taskset/16723 brk(                                                                  ) = 0x1915000
         3.300 ( 0.058 ms): usleep/16723 nanosleep(rqtp: 0x7ffff4ada190                                        ) = 0
       <SNIP>
         3.305 ( 0.000 ms): usleep/16723 exit_group(
         3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
         3.366 ( 0.001 ms): time/16722 rt_sigaction(sig: INT, act: 0x7fff17f44160, oact: 0x7fff17f44200, sigsetsize: 8) = 0
      
      We we're not seeing this line:
      
        0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
      
      just the one when it finishes:
      
        3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
      
      Still some issues left till we move to ordered_samples when multiple
      CPUs/threads are involved...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zq9x30a1ky3djqewqn2v3ja3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e596663e
  4. 13 2月, 2015 3 次提交
    • A
      perf trace: Print thread info when following children · 42052bea
      Arnaldo Carvalho de Melo 提交于
      The default for 'trace workload' is to set perf_event_attr.inherit to 1,
      i.e. to make it equivalent to 'strace -f workload', so we were ending
      with syscalls for multiple processes mixed up, fix it:
      
      Before:
      
        [root@ssdandy ~]# trace -e brk time usleep 1
           0.071 ( 0.002 ms): brk(              ) = 0x100e000
           0.802 ( 0.001 ms): brk(              ) = 0x1d99000
           1.132 ( 0.003 ms): brk(              ) = 0x1d99000
           1.136 ( 0.003 ms): brk(brk: 0x1dba000) = 0x1dba000
           1.140 ( 0.001 ms): brk(              ) = 0x1dba000
        0.00user 0.00system 0:00.00elapsed 63%CPU (0avgtext+0avgdata 528maxresident)k
        0inputs+0outputs (0major+181minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      After:
      
        [root@ssdandy ~]# trace -f -e brk time usleep 1
           0.072 ( 0.002 ms): time/26308 brk(               ) = 0x1e6e000
           0.860 ( 0.001 ms): usleep/26309 brk(             ) = 0xb91000
           1.193 ( 0.003 ms): usleep/26309 brk(             ) = 0xb91000
           1.197 ( 0.003 ms): usleep/26309 brk(brk: 0xbb2000) = 0xbb2000
           1.201 ( 0.001 ms): usleep/26309 brk(             ) = 0xbb2000
        0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 524maxresident)k
        0inputs+0outputs (0major+180minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      BTW: to achieve the 'strace workload' behaviour, i.e. without a explicit
      '-f', one has to use --no-inherit.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      echo Link: http://lkml.kernel.org/n/tip-`ranpwd -l 24`@git.kernel.org
      Link: http://lkml.kernel.org/n/tip-2wu2d5n65msxoq1i7vtcaft2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      42052bea
    • Y
      perf list: Place the header text in its right position · 619a303c
      Yunlong Song 提交于
      The hearer text 'List of pre-defined events (to be used in -e):' is
      placed in an improper function, which causes an abnormal output, e.g.
      'perf list hw' shows no guiding text at all, and 'perf list hw
      L1-dcache*' shows the guiding text incorrectly in the middle of the
      output.
      
      Example
      Before this patch:
      
       $ perf list hw L1-dcache*
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
       List of pre-defined events (to be used in -e):              <-- incorrect position
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      
      After this patch:
      
       $ perf list hw L1-dcache*
      
       List of pre-defined events (to be used in -e):              <-- correct position
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      Signed-off-by: NYunlong Song <yunlong.song@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-8-git-send-email-yunlong.song@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      619a303c
    • K
      perf: Remove the extra validity check on nr_pages · 74390aa5
      Kaixu Xia 提交于
      The function is_power_of_2() also do the check on nr_pages, so the first
      check performed is unnecessary. On the other hand, the key point is to
      ensure @nr_pages is a power-of-two number and mostly @nr_pages is a
      nonzero value, so in the most cases, the function is_power_of_2() will
      be called.
      Signed-off-by: NKaixu Xia <xiakaixu@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/1422352512-75150-1-git-send-email-xiakaixu@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      74390aa5