1. 22 1月, 2015 2 次提交
    • A
      perf tools: Remove EOL whitespaces · 48000a1a
      Arnaldo Carvalho de Melo 提交于
      Janitorial stuff: boredom moment.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      48000a1a
    • S
      perf mem: Enable sampling loads and stores simultaneously · 67121f85
      Stephane Eranian 提交于
      This patch modifies perf mem to default to sampling loads and stores
      simultaneously. It could only do one or the other before yet there was
      no hardware restriction preventing simultaneous collection. With this
      patch, one run is sufficient to collect both.
      
      It is still possible to sample only loads or stores by using the
      -t option:
       $ perf mem -t load rec
       $ perf mem -t load rep
      Or
       $ perf mem -t store rec
       $ perf mem -t store rep
      
      The perf report TUI will show one event at a time. The store output will
      contain a Weight column which will be empty.
      
      In V2, we updated the man pages to reflect the change and also simplify
      the initialization of the argv vector passed to the cmd_*() functions as
      per LKML feedback.
      
      In V3, we fixed typos in the changelog.
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joe Mario <jmario@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Richard Fowles <rfowles@redhat.com>
      Link: http://lkml.kernel.org/r/20141217152355.GA10053@thinkpadSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      67121f85
  2. 09 12月, 2014 1 次提交
  3. 02 12月, 2014 2 次提交
    • A
      perf report: Add --branch-history option · fa94c36c
      Andi Kleen 提交于
      Add a --branch-history option to perf report that changes all the
      settings necessary for using the branches in callstacks.
      
      This is just a short cut to make this nicer to use, it does not enable
      any functionality by itself.
      
      v2: Change sort order. Rename option to --branch-history to
          be less confusing.
      v3: Updates
      v4: Fix conflict with newer perf base
      v5: Port to latest tip
      v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove
          unnecessary branch_mode setting. Use a boolean.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-5-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fa94c36c
    • A
      perf callchain: Support handling complete branch stacks as histograms · 8b7bad58
      Andi Kleen 提交于
      Currently branch stacks can be only shown as edge histograms for
      individual branches. I never found this display particularly useful.
      
      This implements an alternative mode that creates histograms over
      complete branch traces, instead of individual branches, similar to how
      normal callgraphs are handled. This is done by putting it in front of
      the normal callgraph and then using the normal callgraph histogram
      infrastructure to unify them.
      
      This way in complex functions we can understand the control flow that
      lead to a particular sample, and may even see some control flow in the
      caller for short functions.
      
      Example (simplified, of course for such simple code this is usually not
      needed), please run this after the whole patchkit is in, as at this
      point in the patch order there is no --branch-history, that will be
      added in a patch after this one:
      
      tcall.c:
      
      volatile a = 10000, b = 100000, c;
      
      __attribute__((noinline)) f2()
      {
      	c = a / b;
      }
      
      __attribute__((noinline)) f1()
      {
      	f2();
      	f2();
      }
      main()
      {
      	int i;
      	for (i = 0; i < 1000000; i++)
      		f1();
      }
      
      % perf record -b -g ./tsrc/tcall
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
      % perf report --no-children --branch-history
      ...
          54.91%  tcall.c:6  [.] f2                      tcall
                  |
                  |--65.53%-- f2 tcall.c:5
                  |          |
                  |          |--70.83%-- f1 tcall.c:11
                  |          |          f1 tcall.c:10
                  |          |          main tcall.c:18
                  |          |          main tcall.c:18
                  |          |          main tcall.c:17
                  |          |          main tcall.c:17
                  |          |          f1 tcall.c:13
                  |          |          f1 tcall.c:13
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:12
                  |          |          f1 tcall.c:12
                  |          |          f2 tcall.c:7
                  |          |          f2 tcall.c:5
                  |          |          f1 tcall.c:11
                  |          |
                  |           --29.17%-- f1 tcall.c:12
                  |                     f1 tcall.c:12
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:11
                  |                     f1 tcall.c:10
                  |                     main tcall.c:18
                  |                     main tcall.c:18
                  |                     main tcall.c:17
                  |                     main tcall.c:17
                  |                     f1 tcall.c:13
                  |                     f1 tcall.c:13
                  |                     f2 tcall.c:7
                  |                     f2 tcall.c:5
                  |                     f1 tcall.c:12
      
      The default output is unchanged.
      
      This is only implemented in perf report, no change to record or anywhere
      else.
      
      This adds the basic code to report:
      
      - add a new "branch" option to the -g option parser to enable this mode
      - when the flag is set include the LBR into the callstack in machine.c.
      
      The rest of the history code is unchanged and doesn't know the
      difference between LBR entry and normal call entry.
      
      - detect overlaps with the callchain
      - remove small loop duplicates in the LBR
      
      Current limitations:
      
      - The LBR flags (mispredict etc.) are not shown in the history
      and LBR entries have no special marker.
      - It would be nice if annotate marked the LBR entries somehow
      (e.g. with arrows)
      
      v2: Various fixes.
      v3: Merge further patches into this one. Fix white space.
      v4: Improve manpage. Address review feedback.
      v5: Rename functions. Better error message without -g. Fix crash without
          -b.
      v6: Rebase
      v7: Rebase. Use NO_ENTRY in memset.
      v8: Port to latest tip. Move add_callchain_ip to separate
          patch. Skip initial entries in callchain. Minor cleanups.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8b7bad58
  4. 16 11月, 2014 1 次提交
  5. 18 10月, 2014 1 次提交
    • J
      perf script: Add period data column · 535aeaae
      Jiri Olsa 提交于
      Adding period data column to be displayed in perf script.  It's possible
      to get period values using -f option, like:
      
        $ perf script -f comm,tid,time,period,ip,sym,dso
                :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
                :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
                :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
                :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
                    ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
                    ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
                    ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
                    ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
                    ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
                    ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      535aeaae
  6. 16 10月, 2014 1 次提交
  7. 18 9月, 2014 1 次提交
  8. 12 8月, 2014 1 次提交
  9. 25 7月, 2014 1 次提交
  10. 17 7月, 2014 2 次提交
  11. 10 7月, 2014 2 次提交
  12. 27 6月, 2014 2 次提交
    • S
      perf trace: Add possibility to switch off syscall events · e281a960
      Stanislav Fomichev 提交于
      Currently, we may either trace syscalls or syscalls+pagefaults.
      
      We'd like to be able to trace *only* pagefaults and this commit
      implements this feature.
      
      Example:
      
        [root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F -p `pidof xchat`
             0.000 ( 0.000 ms): xchat/4574 majfault [g_unichar_get_script+0x11] => /usr/lib64/libglib-2.0.so.0.3800.2@0xc403b (x.)
             0.202 ( 0.000 ms): xchat/4574 majfault [_cairo_hash_table_lookup+0x53] => 0x2280ff0 (?.)
            20.854 ( 0.000 ms): xchat/4574 majfault [gdk_cairo_set_source_pixbuf+0x110] => /usr/bin/xchat@0x6da1f (x.)
          1022.000 ( 0.000 ms): xchat/4574 majfault [__memcpy_sse2_unaligned+0x29] => 0x7ff5a8ca0400 (?.)
        ^C[root@zoo /]#
      
      Below we can see malloc calls, 'trace' reading symbol tables in libraries to
      resolve symbols, etc.
      
        [root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F all --cpu 1 sleep 10
             0.000 ( 0.000 ms): chrome/26589 minfault [0x1b53129] => /tmp/perf-26589.map@0x33cbcbf7f000 (x.)
            96.477 ( 0.000 ms): libvirtd/947 minfault [copy_user_enhanced_fast_string+0x5] => 0x7f7685bba000 (?k)
           113.164 ( 0.000 ms): Xorg/1063 minfault [0x786da] => 0x7fce52882a3c (?.)
          7162.801 ( 0.000 ms): chrome/3747 minfault [0x8e1a89] => 0xfcaefed0008 (?.)
      <SNIP>
          7773.138 ( 0.000 ms): chrome/3886 minfault [0x8e1a89] => 0xfcb0ce28008 (?.)
          7992.022 ( 0.000 ms): chrome/26574 minfault [0x1b5a708] => 0x3de7b5fc5000 (?.)
          8108.949 ( 0.000 ms): qemu-system-x8/4537 majfault [_int_malloc+0xee] => 0x7faffc466d60 (?.)
          8108.975 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc466d60 (?.)
      <SNIP>
          8148.174 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc4eb500 (?.)
          8270.855 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0xdb] => 0x45d092bc004 (?.)
          8270.869 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0x108] => 0x45d09150000 (?.)
      no symbols found in /usr/lib64/libspice-server.so.1.9.0, maybe install a debug package?
          8273.831 ( 0.000 ms): trace/20198 majfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libspice-server.so.1.9.0@0xdf000 (d.)
      <SNIP>
          8275.121 ( 0.000 ms): trace/20198 minfault [dso__load+0x38] => 0x14fe756 (?.)
      no symbols found in /usr/lib64/libelf-0.158.so, maybe install a debug package?
          8275.142 ( 0.000 ms): trace/20198 minfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libelf-0.158.so@0x0 (d.)
      <SNIP>
        [root@zoo /]#
      Signed-off-by: NStanislav Fomichev <stfomichev@yandex-team.ru>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1403799268-1367-6-git-send-email-stfomichev@yandex-team.ruSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e281a960
    • S
      perf trace: Add support for pagefault tracing · 598d02c5
      Stanislav Fomichev 提交于
      This patch adds optional pagefault tracing support to 'perf trace'.
      
      Using -F/--pf option user can specify whether he wants minor, major or
      all pagefault events to be traced. This patch adds only live mode,
      record and replace will come in a separate patch.
      
      Example output:
      
        1756272.905 ( 0.000 ms): curl/5937 majfault [0x7fa7261978b6] => /usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0@0x85288 (d.)
        1862866.036 ( 0.000 ms): wget/8460 majfault [__clear_user+0x3f] => 0x659cb4 (?k)
      Signed-off-by: NStanislav Fomichev <stfomichev@yandex-team.ru>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1403799268-1367-3-git-send-email-stfomichev@yandex-team.ruSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      598d02c5
  13. 20 6月, 2014 1 次提交
    • D
      perf bench: Add --repeat option · b6f0629a
      Davidlohr Bueso 提交于
      There are a number of benchmarks that do single runs and as a result
      does not really help users gain a general idea of how the workload
      performs. So the user must either manually do multiple runs or just use
      single bogus results.
      
      This option will enable users to specify the amount of runs (arbitrarily
      defaulted to 10, to use the existing benchmarks default) through the
      '--repeat' option.  Add it to perf-bench instead of implementing it
      always in each specific benchmark.
      Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1402942467-10671-2-git-send-email-davidlohr@hp.com
      [ Kept the existing default of 10, changing it to something else should
        be done on separate patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b6f0629a
  14. 10 6月, 2014 1 次提交
  15. 09 6月, 2014 2 次提交
    • D
      perf tools: Add dcacheline sort · 9b32ba71
      Don Zickus 提交于
      In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
      particular sample instruction.  A bunch of those details relate to the data
      address.
      
      One interesting thing you can do with data addresses is to convert them into a unique
      cacheline they belong too.  Organizing these data cachelines into similar groups and sorting
      them can reveal cache contention.
      
      This patch creates an alogorithm based on various sample details that can help group
      entries together into data cachelines and allows 'perf report' to sort on it.
      
      The algorithm relies on having proper mmap2 support in the kernel to help determine
      if the memory map the data address belongs to is private to a pid or globally shared.
      
      The alogortithm is as follows:
      
      o group cpumodes together
      o group entries with discovered maps together
      o sort on major, minor, inode and inode generation numbers
      o if userspace anon, then sort on pid
      o sort on cachelines based on data addresses
      
      The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.
      
      Sample output:
      
       #
       # Samples: 206  of event 'cpu/mem-loads/pp'
       # Total weight : 2534
       # Sort order   : dcacheline,pid
       #
       # Overhead       Samples                                                          Data Cacheline       Command:  Pid
       # ........  ............  ......................................................................  ..................
       #
          13.22%             1  [k] 0xffff88042f08ebc0                                                       swapper:    0
           9.27%             1  [k] 0xffff88082e8cea80                                                       swapper:    0
           3.59%             2  [k] 0xffffffff819ba180                                                       swapper:    0
           0.32%             1  [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0       swapper:    0
           0.32%             1  [k] timekeeper_seq+0xfffffffffffffff8                                        swapper:    0
      
      Note:  Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
      from prematurely tabbing over and mis-aligning.  Not sure what the problem is.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      9b32ba71
    • D
      perf report: Add mem-mode documentation to report command · 75e906c9
      Don Zickus 提交于
      Add mem-mode sorting types and mem-mode itself to perf-report documentation.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1400526833-141779-5-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
      75e906c9
  16. 05 6月, 2014 1 次提交
  17. 01 6月, 2014 2 次提交
  18. 21 5月, 2014 3 次提交
  19. 16 4月, 2014 3 次提交
  20. 14 4月, 2014 2 次提交
  21. 14 3月, 2014 2 次提交
  22. 15 1月, 2014 2 次提交
  23. 13 1月, 2014 2 次提交
  24. 18 12月, 2013 1 次提交
  25. 17 12月, 2013 1 次提交