1. 14 Aug, 2013 1 commit
  2. 12 Aug, 2013 1 commit
  3. 08 Aug, 2013 3 commits
    • M
      perf tools: Add support for pinned modifier · e9a7c414
      Michael Ellerman authored
      This commit adds support for a new modifier "D", which requests that the
      event, or group of events, be pinned to the PMU.
      
      The "p" modifier is already taken for precise, and "P" may be used in
      future to mean "fully precise".
      
      So we use "D", which stands for pinneD - and looks like a padlock, or if
      you're using the ":D" syntax perf smiles at you.
      
      This is an oft-requested feature from our HW folks, who want to be able
      to run a large number of events, but also want 100% accurate results for
      instructions per cycle.
      
      Comparison of results with and without pinning:
      
      $ perf stat -e '{cycles,instructions}:D' -e cycles,instructions,...
      
        79,590,480,683 cycles         #  0.000 GHz
       166,123,716,524 instructions   #  2.09  insns per cycle
                                      #  0.11  stalled cycles per insn
      
        79,352,134,463 cycles         #  0.000 GHz                     [11.11%]
       165,178,301,818 instructions   #  2.08  insns per cycle
                                      #  0.11  stalled cycles per insn [11.13%]
      
      As you can see, although perf does a very good job of scaling the values
      in the non-pinned case, there is still a small discrepancy.
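
The scaling mentioned above can be illustrated with a short calculation. When an event is multiplexed, perf extrapolates the raw count by the ratio of enabled time to running time, and the bracketed percentage in the output is the share of time the event was actually counting. The sketch below is a minimal Python illustration, not perf's own code: the function names and the raw reading are hypothetical, while the IPC inputs are the counts quoted in the comparison.

```python
def scale_count(raw, time_enabled, time_running):
    """Extrapolate a multiplexed counter to the full enabled time."""
    if time_running == 0:
        return 0.0  # the event never got onto the PMU
    return raw * time_enabled / time_running


def ipc(instructions, cycles):
    """Instructions per cycle."""
    return instructions / cycles


# Hypothetical raw reading: the event counted for 1/9 of the enabled time.
print(scale_count(1_000, time_enabled=9_000, time_running=1_000))  # 9000.0

# Counts taken from the pinned and non-pinned runs shown above.
pinned = ipc(166_123_716_524, 79_590_480_683)
scaled = ipc(165_178_301_818, 79_352_134_463)
print(f"{pinned:.2f} {scaled:.2f}")  # 2.09 2.08
```

A pinned group needs no extrapolation because it is counting for the whole run, which is why its result carries no percentage.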
      
      The patch is fairly straightforward. The one detail is that when we have
      a group, we must request pinning only for the group leader.
      Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Tested-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1375795686-4226-1-git-send-email-michael@ellerman.id.au
      [ Use perf_evsel__is_group_leader instead of open coded equivalent, as
        suggested by Jiri Olsa ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf stat: Add support for --initial-delay option · 41191688
      Andi Kleen authored
      When measuring workloads the startup phase -- doing page faults, dynamic
      linking, opening files -- is often very different from the rest of the
      workload.  Especially with smaller kernels and using counter
      multiplexing this can give significant measurement errors.
      
      Multiplexing assumes that the workload is mostly the same over longer
      periods. But at startup there is typically some spike of activity which
      is relatively short.  If many groups are being multiplexed, the one group
      that sees the spike, and whose count is then scaled up over the time
      needed to run all groups, may show a significant error.
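
A toy simulation makes the effect concrete. The sketch below is only an illustration under simplifying assumptions (equal time slices, strict round-robin scheduling, one event per group); the function name and the activity numbers are hypothetical and nothing here is taken from perf's implementation.

```python
def scaled_estimate(activity, n_groups, group):
    """Estimate the total for one group scheduled round-robin.

    activity[i] is the number of events that occur in time slice i. The
    group only counts during slices where i % n_groups == group, and the
    observed sum is scaled up by total slices / slices counted.
    """
    slices = [a for i, a in enumerate(activity) if i % n_groups == group]
    if not slices:
        return 0.0
    return sum(slices) * len(activity) / len(slices)


# Hypothetical workload: a startup spike followed by a steady state.
activity = [1000] + [10] * 99
true_total = sum(activity)

spike_group = scaled_estimate(activity, n_groups=10, group=0)
other_group = scaled_estimate(activity, n_groups=10, group=1)
print(true_total, spike_group, other_group)  # 1990 10900.0 1000.0
```

The group that happens to be scheduled during the spike overestimates the total by a large factor, while every other group underestimates it.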
      
      Also in general it's often not useful to measure the startup, because it
      is so different from the rest.
      
      One way around this is to use interval mode and discard the first
      sample, but this can be awkward because interval mode doesn't support
      intervals of less than 100ms, and also a useful interval is not
      necessarily the same as a useful startup delay.
      
      This patch adds a new --initial-delay / -D option to skip measuring
      during the startup phase. The delay is specified in milliseconds.
      
      Here's a simple example:
      
      perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
      ...
                   3,721 page-faults
      ...
      
      If we just wait 20 ms, the number of page faults drops by about a quarter:
      
      perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
      ...
                   2,823 page-faults
      ...
      
      So we filtered out most of the startup noise from bash.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1375490473-1503-4-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf tools: Add 'S' event/group modifier to read sample value · 3c176311
      Jiri Olsa authored
      Add the 'S' event/group modifier, which specifies that the event values
      are read by PERF_SAMPLE_READ sample type processing instead of using the
      period value offered by lower layers.
      
      There is an additional behaviour change when the 'S' modifier is
      specified on an event group:
      
      Currently all the events within a group generate samples. If the user now
      specifies 'S' as a group modifier, only the leader will trigger
      samples. The rest of the events in the group will have sampling disabled.
      
      As for single events, the values of all events within the group
      (including the leader) are read by PERF_SAMPLE_READ sample type processing.
      
      The following example creates an event group with the cycles and
      cache-misses events, making cycles the group leader and the only event
      that actually samples. The period values of both cycles and cache-misses
      are read by PERF_SAMPLE_READ sample type processing with the
      PERF_FORMAT_GROUP read format.
      
      Example:
      
        $ perf record -e '{cycles,cache-misses}:S' ls
        ...
        $ perf report --group --show-total-period --stdio
        ...
        # Samples: 36  of event 'anon group { cycles, cache-misses }'
        # Event count (approx.): 12585593
        #
        #       Overhead          Period  Command      Shared Object                      Symbol
        # ..............  ..............  .......  .................  ..........................
        #
          19.92%   1.20%  2505936     31       ls  [kernel.kallsyms]  [k] mark_held_locks
          13.74%   0.47%  1729327     12       ls  [kernel.kallsyms]  [k] sched_clock_local
          13.64%  23.72%  1716147    612       ls  ld-2.14.90.so      [.] check_match.10805
          13.12%  23.22%  1650778    599       ls  libc-2.14.90.so    [.] _nl_intern_locale_data
          11.24%  29.19%  1414554    753       ls  [kernel.kallsyms]  [k] sched_clock_cpu
           8.50%   0.35%  1070150      9       ls  [kernel.kallsyms]  [k] check_chain_key
        ...
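
To show what a PERF_FORMAT_GROUP read looks like on the consumer side, here is a small Python sketch that decodes such a buffer. It follows the read_format layout documented for perf_event_open(2): a count of entries, optional enabled and running times, then one value (and optionally one id) per event, leader first. The function name, the little-endian assumption, and the sample buffer are hypothetical; this is not the code perf itself uses.

```python
import struct

# Bit values of perf_event_attr.read_format, from <linux/perf_event.h>.
PERF_FORMAT_TOTAL_TIME_ENABLED = 1 << 0
PERF_FORMAT_TOTAL_TIME_RUNNING = 1 << 1
PERF_FORMAT_ID = 1 << 2
PERF_FORMAT_GROUP = 1 << 3


def decode_group_read(buf, read_format):
    """Decode a PERF_FORMAT_GROUP read buffer (assumed little-endian)."""
    if not read_format & PERF_FORMAT_GROUP:
        raise ValueError("this sketch only handles PERF_FORMAT_GROUP")
    offset = 0

    def next_u64():
        nonlocal offset
        (value,) = struct.unpack_from("<Q", buf, offset)
        offset += 8
        return value

    result = {"nr": next_u64()}
    if read_format & PERF_FORMAT_TOTAL_TIME_ENABLED:
        result["time_enabled"] = next_u64()
    if read_format & PERF_FORMAT_TOTAL_TIME_RUNNING:
        result["time_running"] = next_u64()
    counters = []
    for _ in range(result["nr"]):
        entry = {"value": next_u64()}
        if read_format & PERF_FORMAT_ID:
            entry["id"] = next_u64()
        counters.append(entry)
    result["counters"] = counters
    return result


# Hypothetical buffer: two events (leader first) with made-up values and ids.
fmt = PERF_FORMAT_GROUP | PERF_FORMAT_ID
buf = struct.pack("<5Q", 2, 1000, 7, 3, 8)
decoded = decode_group_read(buf, fmt)
print(decoded["counters"])
```

With the 'S' modifier only the leader triggers a sample, but each sample carries one such block, so the values of every group member are available at that point.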
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/n/tip-iyoinu3axi11mymwnh2b7fxj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 22 Jul, 2013 1 commit
  5. 13 Jul, 2013 4 commits
  6. 09 Jul, 2013 4 commits
  7. 28 May, 2013 3 commits
  8. 01 Apr, 2013 2 commits
  9. 27 Mar, 2013 1 commit
  10. 26 Mar, 2013 2 commits
  11. 16 Mar, 2013 2 commits
    • F
      perf stat: Introduce --repeat forever · a7e191c3
      Frederik Deweerdt authored
      This patch causes 'perf stat --repeat 0' to be interpreted as
      'forever', displaying the stats for every run.
      
      We act as if a single run was requested and reset the stats in each
      iteration. In this mode SIGINT is passed to perf so that the loop can be
      stopped with Ctrl+C.
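
The control flow described above can be sketched in a few lines of Python. This is only an illustration of the idea (a repeat count of 0 means loop until interrupted, reporting and resetting per run), not a transcription of the patch; `run_repeated`, `run_once`, and `report` are hypothetical names.

```python
import itertools


def run_repeated(run_once, repeat, report):
    """Run a workload `repeat` times, or until interrupted if repeat == 0."""
    forever = repeat == 0
    runs = itertools.count() if forever else range(repeat)
    stats = []
    try:
        for _ in runs:
            stats.append(run_once())
            if forever:
                report(stats)  # show the stats for this run alone
                stats = []     # reset before the next iteration
    except KeyboardInterrupt:
        if not forever:
            raise
        return  # Ctrl+C is the normal way out of the endless loop
    report(stats)  # finite mode: one report covering all runs


calls = 0


def fake_workload():
    """Stand-in workload that simulates Ctrl+C on its fourth invocation."""
    global calls
    calls += 1
    if calls > 3:
        raise KeyboardInterrupt
    return calls * 100


reports = []
run_repeated(fake_workload, repeat=0, report=reports.append)
print(reports)  # [[100], [200], [300]]
```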
      Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20130301180227.GA24385@ks398093.ip-192-95-24.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf annotate: Add basic support to event group view · b1dd4432
      Namhyung Kim authored
      Add --group option to enable event grouping.  When enabled, the
      information for all group members is shown together with the leader, so
      non-leader events are skipped.
      
      Only --stdio output is supported for now.  Later patches will add
      further features.
      
       $ perf annotate --group --stdio
       ...
        Percent                 |      Source code & Disassembly of libpthread-2.15.so
       --------------------------------------------------------------------------------
                                :
                                :
                                :
                                :      Disassembly of section .text:
                                :
                                :      000000387dc0aa50 <__pthread_mutex_unlock_usercnt>:
           8.08    2.40    5.29 :        387dc0aa50:   mov    %rdi,%rdx
           0.00    0.00    0.00 :        387dc0aa53:   mov    0x10(%rdi),%edi
           0.00    0.00    0.00 :        387dc0aa56:   mov    %edi,%eax
           0.00    0.80    0.00 :        387dc0aa58:   and    $0x7f,%eax
           3.03    2.40    3.53 :        387dc0aa5b:   test   $0x7c,%dil
           0.00    0.00    0.00 :        387dc0aa5f:   jne    387dc0aaa9 <__pthread_mutex_unlock_use
           0.00    0.00    0.00 :        387dc0aa61:   test   %eax,%eax
           0.00    0.00    0.00 :        387dc0aa63:   jne    387dc0aa85 <__pthread_mutex_unlock_use
           0.00    0.00    0.00 :        387dc0aa65:   and    $0x80,%edi
           0.00    0.00    0.00 :        387dc0aa6b:   test   %esi,%esi
           3.03    5.60    7.06 :        387dc0aa6d:   movl   $0x0,0x8(%rdx)
           0.00    0.00    0.59 :        387dc0aa74:   je     387dc0aa7a <__pthread_mutex_unlock_use
           0.00    0.00    0.00 :        387dc0aa76:   subl   $0x1,0xc(%rdx)
           2.02    5.60    1.18 :        387dc0aa7a:   mov    %edi,%esi
           0.00    0.00    0.00 :        387dc0aa7c:   lock decl (%rdx)
          83.84   83.20   82.35 :        387dc0aa7f:   jne    387dc0aada <_L_unlock_586>
           0.00    0.00    0.00 :        387dc0aa81:   nop
           0.00    0.00    0.00 :        387dc0aa82:   xor    %eax,%eax
           0.00    0.00    0.00 :        387dc0aa84:   retq
       ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1362462812-30885-6-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 15 Feb, 2013 3 commits
  13. 07 Feb, 2013 1 commit
    • S
      perf stat: Add per processor socket count aggregation · d7e7a451
      Stephane Eranian authored
      This patch adds per-processor socket count aggregation for system-wide
      mode measurements. This is a useful mode to detect imbalance between
      sockets.
      
      To enable this mode, use --aggr-socket in addition
      to -a (system-wide).
      
      The output includes the socket number and the number of online
      processors on that socket. This is useful to gauge the amount of
      aggregation.
      
       # ./perf stat -I 1000 -a --aggr-socket -e cycles sleep 2
       #           time socket cpus             counts events
            1.000097680 S0        4          5,788,785 cycles
            2.000379943 S0        4         27,361,546 cycles
            2.001167808 S0        4            818,275 cycles
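
The aggregation itself amounts to summing per-CPU counts by the socket each CPU belongs to and reporting how many CPUs contributed. A minimal Python sketch of that idea follows; the function name, the CPU-to-socket map, and the counts are hypothetical, and perf obtains the real topology from the system rather than from a hand-written dictionary.

```python
from collections import defaultdict


def aggregate_by_socket(cpu_counts, cpu_to_socket):
    """Sum per-CPU counts per socket; return {socket: (n_cpus, total)}."""
    totals = defaultdict(int)
    cpus = defaultdict(int)
    for cpu, count in cpu_counts.items():
        socket = cpu_to_socket[cpu]
        totals[socket] += count
        cpus[socket] += 1
    return {s: (cpus[s], totals[s]) for s in sorted(totals)}


# Hypothetical two-socket machine with two online CPUs per socket.
cpu_to_socket = {0: 0, 1: 0, 2: 1, 3: 1}
cpu_counts = {0: 1_200, 1: 800, 2: 50, 3: 70}

per_socket = aggregate_by_socket(cpu_counts, cpu_to_socket)
for socket, (n_cpus, total) in per_socket.items():
    print(f"S{socket} {n_cpus:>8} {total:>18,} cycles")
```

In this made-up data the first socket accounts for almost all the cycles, which is the kind of imbalance the mode is meant to expose.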
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1360161962-9675-3-git-send-email-eranian@google.com
      [ committer note: Added missing man page entry based on above comments ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  14. 01 Feb, 2013 2 commits
    • N
      perf evlist: Add --group option · e6ab07d0
      Namhyung Kim authored
      Add '-g/--group' option for showing event groups.  For simplicity it is
      currently not compatible with other options.
      
        $ perf evlist --group
        {ref-cycles,cycles}
      
        $ perf evlist
        ref-cycles
        cycles
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1358845787-1350-20-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf report: Add --group option · 01d14f16
      Namhyung Kim authored
      Add --group option to enable event grouping.  When enabled, the
      information for all group members is shown together with the leader.
      
        $ perf report --group
        ...
        # group: {ref-cycles,cycles}
        # ========
        #
        # Samples: 7K of event 'anon group { ref-cycles, cycles }'
        # Event count (approx.): 6876107743
        #
        #         Overhead  Command      Shared Object                      Symbol
        # ................  .......  .................  ..........................
        #
            99.84%  99.76%  noploop  noploop            [.] main
             0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
             0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
             0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
             0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
             0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
             0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
             0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
             0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
             0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
             0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1358845787-1350-18-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 30 Jan, 2013 1 commit
    • S
      perf stat: Add interval printing · 13370a9b
      Stephane Eranian authored
      This patch adds a new printing mode for perf stat: interval printing.
      perf stat can now print event deltas at regular time intervals, which is
      useful for detecting phases in programs.
      
      The -I option enables interval printing. It expects an interval duration
      in milliseconds; the minimum is 100ms. Once activated, perf stat prints
      event deltas since the last printout. All modes are supported.
      
      $ perf stat -I 1000 -e cycles noploop 10
      noploop for 10 seconds
       #           time             counts events
            1.000109853      2,388,560,546 cycles
            2.000262846      2,393,332,358 cycles
            3.000354131      2,393,176,537 cycles
            4.000439503      2,393,203,790 cycles
            5.000527075      2,393,167,675 cycles
            6.000609052      2,393,203,670 cycles
            7.000691082      2,393,175,678 cycles
      
      The output format makes it easy to feed into a plotting program such as
      gnuplot when the -I option is used in combination with the -x option:
      
      $ perf stat -x, -I 1000 -e cycles noploop 10
      noploop for 10 seconds
      1.000084113,2378775498,cycles
      2.000245798,2391056897,cycles
      3.000354445,2392089414,cycles
      4.000459115,2390936603,cycles
      5.000565341,2392108173,cycles
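
As a sketch of how this CSV output might be consumed before plotting, the following Python snippet parses the three-column layout shown above (timestamp, count, event name). The function name is hypothetical, and the parser assumes exactly the layout in this example; other perf versions or options may emit additional columns, and lines that do not fit (such as the workload's own output) are simply skipped.

```python
import csv
import io


def parse_interval_csv(text, sep=","):
    """Parse 'time,count,event' lines into (float, int, str) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=sep):
        if len(fields) < 3:
            continue  # blank lines and workload output
        try:
            rows.append((float(fields[0]), int(fields[1]), fields[2]))
        except ValueError:
            continue  # not a numeric sample line
    return rows


sample = """\
noploop for 10 seconds
1.000084113,2378775498,cycles
2.000245798,2391056897,cycles
"""

rows = parse_interval_csv(sample)
print(rows[0])  # (1.000084113, 2378775498, 'cycles')
```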
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1359460064-3060-3-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 25 Jan, 2013 3 commits
    • A
      perf test: Allow skipping tests · 2ae82878
      Arnaldo Carvalho de Melo authored
      Sometimes a test is problematic for some reason and one wants to skip it,
      for instance:
      
      [root@sandy ~]# perf test
         1: vmlinux symtab matches kallsyms                        : Ok
         2: detect open syscall event                              : Ok
         3: detect open syscall event on all cpus                  : Ok
         4: read samples using the mmap interface                  : Ok
         5: parse events tests                                     :  Warning: bad op token {
          Warning: bad op token {
          Warning: bad op token {
          Warning: bad op token {
          Warning: bad op token {
          Warning: function is_writable_pte not defined
        Segmentation fault (core dumped)
      
      So now we can use -s/--skip while the problematic tests are being fixed,
      allowing us to test all the other entries:
      
        [root@sandy ~]# perf test -s 5
         1: vmlinux symtab matches kallsyms                        : Ok
         2: detect open syscall event                              : Ok
         3: detect open syscall event on all cpus                  : Ok
         4: read samples using the mmap interface                  : Ok
         5: parse events tests                                     : Skip (user override)
         6: x86 rdpmc test                                         : Ok
         7: Validate PERF_RECORD_* events & perf_sample fields     : Ok
         8: Test perf pmu format parsing                           : Ok
         9: Test dso data interface                                : Ok
        10: roundtrip evsel->name check                            : Ok
        11: Check parsing of sched tracepoints fields              : Ok
        12: Generate and check syscalls:sys_enter_open event fields: Ok
        13: struct perf_event_attr setup                           : Ok
        14: Test matching and linking mutliple hists               : Ok
        15: Try 'use perf' in python, checking link problems       : Ok
        [root@sandy ~]#
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-klzd8p57jzdryafqkmlppcb1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf script: Remove workqueue-stats script · 1de7b7e8
      Tom Zanussi authored
      The tracepoints used by the workqueue-stats script no longer exist so
      trying to run the script results in:
      
        # perf script record workqueue-stats
        invalid or unsupported event: 'workqueue:workqueue_creation'
        Run 'perf list' for a list of valid events
      
      So remove the script until it can be reworked using the new workqueue
      tracepoints.
      Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lkml.kernel.org/r/e7a7637d5df9df86887c3bff7683574665ec5360.1358527965.git.tom.zanussi@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf report: Update documentation for sort keys · 9811360e
      Namhyung Kim authored
      Add description of sort keys to the perf-report document and also add
      missing cpu and srcline keys to the command line help string.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1356599507-14226-11-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 12 Dec, 2012 1 commit
    • A
      perf top: Use perf_evlist__config() · 2376c67a
      Arnaldo Carvalho de Melo authored
      Use struct perf_record_opts to specify how to configure the evsel
      perf_event_attrs.
      
      This gets top closer to record in the way it sets up evsels, with the
      aim of sharing more and more to the point that both will be a single
      utility.
      
      In this direction top now uses the same callchain option parsing as
      record and that brings DWARF callchains to top, something that was
      already available for record.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-u03o0bsrqcjgskciso3pvsjr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 09 Dec, 2012 4 commits
  19. 03 Dec, 2012 1 commit