1. 23 Oct, 2013 8 commits
    • A
      perf sched: Make struct perf_sched sched a local variable · 8a39df8f
      Adrian Hunter committed
      Change "struct perf_sched sched" from being global to being local.
      
      The build slowdown cured by f36f83f9 is dealt with in the following
      patch, by programmatically setting perf_sched.curr_pid.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1382427258-17495-12-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      perf bench: Change the procps visible command-name of individual benchmark tests plus cleanups · 4157922a
      Ingo Molnar committed
      Before this patch, looking at 'perf bench sched pipe' behavior over
      'top' only told us that something related to perf is running:
      
            PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
          19934 mingo     20   0 54836 1296  952 R 18.6  0.0   0:00.56 perf
          19935 mingo     20   0 54836  384   36 S 18.6  0.0   0:00.56 perf
      
      After the patch it's clearly visible what's going on:
      
            PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
          19744 mingo     20   0  125m 3536 2644 R 68.2  0.0   0:01.12 sched-pipe
          19745 mingo     20   0  125m 1172  276 R 68.2  0.0   0:01.12 sched-pipe
      
      The benchmark-subsystem name is concatenated with the individual
      testcase name.
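
A minimal sketch of the renaming idea, assuming the name is installed with the Linux prctl(PR_SET_NAME) call (the commit text above does not say which mechanism the patch uses). The helper name set_bench_comm() and the COMM_LEN constant are hypothetical, introduced only for illustration:

```c
#include <stdio.h>
#include <sys/prctl.h>

/* The kernel's comm[] field holds 16 bytes including the NUL terminator. */
#define COMM_LEN 16

/*
 * Hypothetical helper: build "<collection>-<benchmark>" (for example
 * "sched-pipe") in 'out' and install it as the calling thread's name.
 * Returns 0 on success, -1 if the name does not fit or prctl() fails.
 */
static int set_bench_comm(const char *collection, const char *bench,
                          char *out, size_t outsz)
{
	int n = snprintf(out, outsz, "%s-%s", collection, bench);

	if (n < 0 || (size_t)n >= outsz)
		return -1;	/* the concatenated name would be truncated */

	return prctl(PR_SET_NAME, out, 0, 0, 0);
}
```

Tools such as 'top' and 'ps' read the thread name from /proc, so a name set this way is what they display in the COMMAND column.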
      
      Unfortunately 'perf top' does not show the reconfigured name, possibly
      because it caches ->comm[] values and does not recognize changes to
      them?
      
      Also clean up a few bits in builtin-bench.c while at it and reorganize
      the code and the output strings to be consistent.
      
      Use iterators to access the various arrays. Rename 'suites' concept to
      'benchmark collection' and the 'bench_suite' to 'benchmark/bench'. The
      many repetitions of 'suite' made the code harder to read and understand.
      
      The new output is:
      
        comet:~/tip/tools/perf> ./perf bench
        Usage:
              perf bench [<common options>] <collection> <benchmark> [<options>]
      
              # List of all available benchmark collections:
      
               sched: Scheduler and IPC benchmarks
                 mem: Memory access benchmarks
                numa: NUMA scheduling and MM benchmarks
                 all: All benchmarks
      
        comet:~/tip/tools/perf> ./perf bench sched
      
              # List of available benchmarks for collection 'sched':
      
           messaging: Benchmark for scheduling and IPC
                pipe: Benchmark for pipe() between two processes
                 all: Test all scheduler benchmarks
      
        comet:~/tip/tools/perf> ./perf bench mem
      
              # List of available benchmarks for collection 'mem':
      
              memcpy: Benchmark for memcpy()
              memset: Benchmark for memset() tests
                 all: Test all memory benchmarks
      
        comet:~/tip/tools/perf> ./perf bench numa
      
              # List of available benchmarks for collection 'numa':
      
                 mem: Benchmark for NUMA workloads
                 all: Test all NUMA benchmarks
      
      Individual benchmark modules were not touched.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <h.mitake@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20131023123756.GA17871@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Find fentry mcount fuzzed parameter location · 3d918a12
      Masami Hiramatsu committed
      At this point, the -mfentry (mcount function entry) option for gcc fuzzes
      the debuginfo variable locations by skipping the mcount instruction
      offset (on x86, this is a 5-byte call instruction).
      
      This makes variable searching fail at the entry of functions which
      are mcount'ed.
      
      e.g.)
      Available variables at vfs_read
              @<vfs_read+0>
                      (No matched variables)
      
      This patch adds additional location search at the function entry point
      to solve this issue, which tries to find the earliest address for the
      variable location.
      
      Note that this only works with function parameters (formal parameters)
      because any local variables should not exist on the function entry
      address (those are not initialized yet).
      
      With this patch, perf probe shows correct parameters if possible;
       # perf probe --vars vfs_read
       Available variables at vfs_read
               @<vfs_read+0>
                       char*   buf
                       loff_t* pos
                       size_t  count
                       struct file*    file
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20131011071025.15557.13275.stgit@udc4-manage.rcp.hitachi.co.jp
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Support "$vars" meta argument syntax for local variables · 7969ec77
      Masami Hiramatsu committed
      Support "$vars" meta argument syntax for tracing all local variables at
      probe point.
      
      Now you can trace all available local variables (including function
      parameters) at the probe point by passing $vars.
      
       # perf probe --add foo $vars
      
      This automatically finds all local variables at foo() and adds them as
      probe arguments.
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20131011071023.15557.51770.stgit@udc4-manage.rcp.hitachi.co.jp
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: Stop using 'self' in some more places · c824c433
      Arnaldo Carvalho de Melo committed
      As suggested by tglx, 'self' should be replaced by something that is
      more useful.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-fmblhc6tbb99tk1q8vowtsbj@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf test: Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test · 4ac2f1c1
      Arnaldo Carvalho de Melo committed
      [root@sandy ~]# perf test -v 22
      22: Test sample parsing                                    :
      --- start ---
      sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating
      ---- end ----
      Test sample parsing: FAILED!
      [root@sandy ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-cx83wuzz30m10m4s1xt0ocyq@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf test: Clarify the "sample parsing" test entry · 11a4d435
      Arnaldo Carvalho de Melo committed
      Before:
      
        [root@sandy ~]# perf test -v 22
        22: Test sample parsing                                    :
        --- start ---
        sample format has changed - test needs updating
        ---- end ----
        Test sample parsing: FAILED!
        [root@sandy ~]#
      
      After:
      
        [root@sandy ~]# perf test -v 22
        22: Test sample parsing                                    :
        --- start ---
        sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating
        ---- end ----
        Test sample parsing: FAILED!
        [root@sandy ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-8cazc2fpmk70jcbww8c0cobx@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      Merge tag 'perf-core-for-mingo' of... · aa30a2e0
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        * Convert callchain children list to rbtree, greatly reducing the time
          taken for callchain processing, from Namhyung Kim.
      
        * Add --max-stack option to limit callchain stack scan in 'top' and 'report',
          improving callchain processing when reducing the stack depth is an option,
          from Waiman Long.
      
        * Compare dso's also when comparing symbols, to avoid grouping together
          symbols with the same name but on different DSOs, fix from Namhyung Kim.
      
        * 'perf trace' can now use a 'perf probe' wannabe tracepoint to hook into
          the userspace -> kernel pathname copy so that it can map fds to pathnames
          without reading /proc/pid/fd/ symlinks. From Arnaldo Carvalho de Melo.
      
        * 'perf trace' now emits hints as to why tracing is not possible, helping the
          user to set up the system to allow tracing at the desired permission
          granularity, telling whether the problem is due to debugfs not being mounted,
          not enough permission for !root, the /proc/sys/kernel/perf_event_paranoid
          value, etc. From Arnaldo Carvalho de Melo.
      
        * Add missing 'mmap2' in evsel debug print, from Adrian Hunter.
      
        * Add missing decrement in id sample parsing, not a fix per se, just to
          avoid a problem when somebody adds another field, from Adrian Hunter.
      
        * Improve write_output error message in 'perf record', from Adrian Hunter.
      
        * Add missing sample flush for piped events, fix from Adrian Hunter.
      
        * Add missing members to perf_event__attr_swap(), fix from Adrian Hunter.
      
        * Assorted fixes for 32-bit build, from Adrian Hunter.

        * Print addr by default for BTS in 'perf script', from Adrian Hunter.
      
        * Separating data file properties from session, code reorganization from
          Jiri Olsa.
      
        * Show error in 'perf list' if tracepoints not available, from Pekka Enberg.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 22 Oct, 2013 9 commits
  3. 21 Oct, 2013 9 commits
  4. 18 Oct, 2013 4 commits
    • A
      perf evsel: Add missing 'mmap2' from debug print · 40d54ec2
      Adrian Hunter committed
      The struct perf_event_attr now has a 'mmap2' member.  Add it to
      perf_event_attr__fprintf().
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1382099356-4918-2-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Improve messages related to /proc/sys/kernel/perf_event_paranoid · a8f23d8f
      Arnaldo Carvalho de Melo committed
      kernel/events/core.c has:
      
        /*
         * perf event paranoia level:
         *  -1 - not paranoid at all
         *   0 - disallow raw tracepoint access for unpriv
         *   1 - disallow cpu events for unpriv
         *   2 - disallow kernel profiling for unpriv
         */
        int sysctl_perf_event_paranoid __read_mostly = 1;
      
      So, with the default being 1, a non-root user can trace his stuff:
      
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        1
        [acme@zoo ~]$ yes > /dev/null &
        [1] 15338
        [acme@zoo ~]$ trace -p 15338 | head -5
             0.005 ( 0.005 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.045 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.085 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.125 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
             0.165 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --duration 1 sleep 1
          1002.148 (1001.218 ms): nanosleep(rqtp: 0x7fff46c79250                           ) = 0
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -- usleep 1 | tail -5
             0.905 ( 0.002 ms): brk(                                                     ) = 0x1c82000
             0.910 ( 0.003 ms): brk(brk: 0x1ca3000                                       ) = 0x1ca3000
             0.913 ( 0.001 ms): brk(                                                     ) = 0x1ca3000
             0.990 ( 0.059 ms): nanosleep(rqtp: 0x7fffe31a3280                           ) = 0
             0.995 ( 0.000 ms): exit_group(
        [acme@zoo ~]$
      
      But can't do system wide tracing:
      
        [acme@zoo ~]$ trace
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
        [acme@zoo ~]$ trace --cpu 0
        Error:	Operation not permitted.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 1.
        [acme@zoo ~]$
      
      If the paranoid level is >= 2, i.e. turn this perf stuff off for !root users:
      
        [acme@zoo ~]$ sudo sh -c 'echo 2 > /proc/sys/kernel/perf_event_paranoid'
        [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid
        2
        [acme@zoo ~]$
        [acme@zoo ~]$ trace usleep 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
        [acme@zoo ~]$ trace --cpu 1
        Error:	Permission denied.
        Hint:	Check /proc/sys/kernel/perf_event_paranoid setting.
        Hint:	For your workloads it needs to be <= 1
        Hint:	For system wide tracing it needs to be set to -1.
        Hint:	The current value is 2.
        [acme@zoo ~]$
      
      If the user manages to get what he/she wants, convincing root not
      to be paranoid at all...
      
        [root@zoo ~]# echo -1 > /proc/sys/kernel/perf_event_paranoid
        [root@zoo ~]# cat /proc/sys/kernel/perf_event_paranoid
        -1
        [root@zoo ~]#
      
        [acme@zoo ~]$ ps -eo user,pid,comm | grep Xorg
        root       729 Xorg
        [acme@zoo ~]$
        [acme@zoo ~]$ trace -a --duration 0.001 -e \!select,ioctl,writev | grep Xorg  | head -5
            23.143 ( 0.003 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.152 ( 0.004 ms): Xorg/729 read(fd: 31, buf: 0x2544af0, count: 4096     ) = 8
            23.161 ( 0.002 ms): Xorg/729 read(fd: 31, buf: 0x2544af0, count: 4096     ) = -1 EAGAIN Resource temporarily unavailable
            23.175 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
            23.235 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0
        [acme@zoo ~]$
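
The hint logic shown in the sessions above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the function name paranoid_hints() is hypothetical, and keying the "For your workloads" hint on a paranoid value of 2 or more is an assumption drawn from the example output.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the error line and the hints for a failed
 * event open, given the errno value and the current contents of
 * /proc/sys/kernel/perf_event_paranoid. Returns what snprintf() returns.
 */
static int paranoid_hints(int err, int paranoid, char *buf, size_t size)
{
	/* Errors unrelated to permissions get no paranoid hints. */
	if (err != EACCES && err != EPERM)
		return snprintf(buf, size, "Error:\t%s.\n", strerror(err));

	return snprintf(buf, size,
			"Error:\t%s.\n"
			"Hint:\tCheck /proc/sys/kernel/perf_event_paranoid setting.\n"
			"%s"
			"Hint:\tFor system wide tracing it needs to be set to -1.\n"
			"Hint:\tThe current value is %d.\n",
			strerror(err),
			/* Assumption: workload tracing is blocked at level >= 2. */
			paranoid >= 2 ?
			"Hint:\tFor your workloads it needs to be <= 1\n" : "",
			paranoid);
}
```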
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-di28olfwd28rvkox7v3hqhu1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: Introduce filename__read_int helper · 97a07f10
      Arnaldo Carvalho de Melo committed
      Just opens a file and calls atoi() on at most its first 64 bytes.
      
      To read things like /proc/sys/kernel/perf_event_paranoid.
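
A minimal sketch of such a helper, written from the one-line description above rather than from the patch itself. The name read_int_from_file() is hypothetical, and this version reads up to 63 bytes so that the 64-byte buffer can always be NUL-terminated:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the start of 'filename' and convert it with
 * atoi(). Returns -1 if the file cannot be opened or read; a file that
 * really contains "-1" is indistinguishable from an error in this sketch.
 */
static int read_int_from_file(const char *filename)
{
	char line[64];
	int fd = open(filename, O_RDONLY), value = -1;
	ssize_t n;

	if (fd < 0)
		return -1;

	n = read(fd, line, sizeof(line) - 1);
	if (n > 0) {
		line[n] = '\0';		/* read() does not NUL-terminate */
		value = atoi(line);
	}

	close(fd);
	return value;
}
```

Calling read_int_from_file("/proc/sys/kernel/perf_event_paranoid") would return the current paranoid level on a system where that file is readable.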
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-669q04c5tou5pnt8jtiz6y2r@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf evlist: Introduce perf_evlist__strerror_tp method · 6ef068cb
      Arnaldo Carvalho de Melo committed
      Factored out of 'perf trace'; it should be used by other tools that use
      tracepoints.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ramkumar Ramachandra <artagnon@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-lyvtxhchz4ga8fwht15x8wou@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 17 Oct, 2013 1 commit
  6. 16 Oct, 2013 3 commits
    • A
      perf trace: Use vfs_getname hook if available · c522739d
      Arnaldo Carvalho de Melo committed
      Initially it tries to find a probe:vfs_getname that should be set up
      with:
      
       perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
      
      or with slight changes to cope with code flux in the getname_flags code.
      
      In the future, if a "vfs:getname" tracepoint becomes available, then it
      will be preferred.
      
      This is not strictly required: the more expensive method of reading the
      /proc/pid/fd/ symlink will be used when the fd->path array entry is not
      populated by a previous vfs_getname + open syscall ret sequence.
      
      As with any other 'perf probe' probe, the setup must be done just once
      and the probe will be left inactive, waiting for users, be it 'perf
      trace' or any other tool.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Split fd -> pathname array handling · 97119f37
      Arnaldo Carvalho de Melo committed
      So that the part that grows the array as needed is untied from the code
      that reads the /proc/pid/fd symlink and can be used for the vfs_getname
      hook that will set the fd -> path translation too, when available.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ydo5rumyv9hdc1vsfmqamugs@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • P
      perf/x86: Optimize intel_pmu_pebs_fixup_ip() · 9536c8d2
      Peter Zijlstra committed
      There have been reports of high NMI handler overhead, highlighted by
      such kernel messages:
      
        [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
        [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs
      
      Don Zickus analyzed the source of the overhead and reported:
      
       > While there are a few places that are causing latencies, for now I focused on
       > the longest one first.  It seems to be 'copy_from_user_nmi'
       >
       > intel_pmu_handle_irq ->
       >	intel_pmu_drain_pebs_nhm ->
       >		__intel_pmu_drain_pebs_nhm ->
       >			__intel_pmu_pebs_event ->
       >				intel_pmu_pebs_fixup_ip ->
       >					copy_from_user_nmi
       >
       > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
       > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
       > (there are some cases where only 10 iterations are needed to go that high
       > too, but in general over 50 or so).  At this point copy_from_user_nmi
       > seems to account for over 90% of the nmi latency.
      
      The solution to that is to avoid having to call copy_from_user_nmi() for
      every instruction.
      
      Since we already limit the max basic block size, we can easily
      pre-allocate a piece of memory to copy the entire thing into in one
      go.
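
The difference between the two approaches can be illustrated with a small userspace model. Everything here is a stand-in chosen for the example: mock_copy() plays the role of copy_from_user_nmi(), the first byte of each fake "instruction" encodes its length instead of using a real x86 decoder, and MAX_BLOCK is a hypothetical cap on the basic block size.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BLOCK 512	/* hypothetical cap on basic block size */

static unsigned long nr_copies;	/* counts calls to the expensive copy */

/* Stand-in for copy_from_user_nmi(): returns the number of bytes copied. */
static size_t mock_copy(void *dst, const void *src, size_t len)
{
	nr_copies++;
	memcpy(dst, src, len);
	return len;
}

/* Stand-in decoder: the first byte of an "instruction" is its length. */
static size_t insn_len(const unsigned char *insn)
{
	return insn[0];
}

/* Before: one copy per instruction while walking the block. */
static size_t walk_per_insn(const unsigned char *code, size_t size)
{
	unsigned char buf[16];
	size_t off = 0, count = 0;

	while (off < size) {
		size_t chunk = size - off < sizeof(buf) ? size - off : sizeof(buf);

		mock_copy(buf, code + off, chunk);
		if (insn_len(buf) == 0)
			break;	/* malformed input, stop walking */
		off += insn_len(buf);
		count++;
	}
	return count;
}

/* After: copy the whole block once, then decode from the local buffer. */
static size_t walk_bulk(const unsigned char *code, size_t size)
{
	static unsigned char block[MAX_BLOCK];	/* models the pre-allocated memory */
	size_t off = 0, count = 0;

	if (size > MAX_BLOCK || mock_copy(block, code, size) != size)
		return 0;

	while (off < size) {
		if (insn_len(block + off) == 0)
			break;
		off += insn_len(block + off);
		count++;
	}
	return count;
}
```

Both walks visit the same instructions, but the second one calls the copy routine once per block instead of once per instruction, which is the cost the commit message identifies.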
      
      Don reported this test result:
      
       > Your patch made a huge difference in improvement.  The
       > copy_from_user_nmi() no longer hits the million of cycles.  I still
       > have a batch of 100,000-300,000 cycles.  My longest NMI paths used
       > to be dominated by copy_from_user_nmi, now it is not (I have to dig
       > up the new hot path).
      Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
      Cc: jmario@redhat.com
      Cc: acme@infradead.org
      Cc: dave.hansen@linux.intel.com
      Cc: eranian@google.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 15 Oct, 2013 3 commits
    • A
      perf scripting perl: Fix build error on Fedora 12 · 6650b181
      Arnaldo Carvalho de Melo committed
      Cast __u64 to u64 to silence this warning on older distros, such as
      Fedora 12:
      
          CC       /tmp/build/perf/util/scripting-engines/trace-event-perl.o
        cc1: warnings being treated as errors
        util/scripting-engines/trace-event-perl.c: In function ‘perl_process_tracepoint’:
        util/scripting-engines/trace-event-perl.c:285: error: format ‘%lu’ expects type ‘long unsigned int’, but argument 2 has type ‘__u64’
        make[1]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1
        make: *** [install] Error 2
        make: Leaving directory `/home/acme/git/linux/tools/perf'
        [acme@fedora12 linux]$
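
The class of problem in the build log above, a 64-bit value passed to a format specifier whose expected type differs between platforms, is commonly avoided by casting to a type that matches the specifier. The sketch below shows that general technique with a hypothetical format_u64() helper; it is not the change made by this patch, which casts __u64 to perf's own u64 type.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: print a 64-bit unsigned value portably by casting
 * to unsigned long long, which always matches "%llu" regardless of how
 * the platform defines its 64-bit typedefs.
 */
static int format_u64(char *buf, size_t size, uint64_t value)
{
	return snprintf(buf, size, "%llu", (unsigned long long)value);
}
```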
      Reported-by: Waiman Long <Waiman.Long@hp.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Link: http://lkml.kernel.org/n/tip-nlxofdqcdjfm0w9o6bgq4kqv@git.kernel.org
      Link: http://lkml.kernel.org/r/1381265120-58532-1-git-send-email-Waiman.Long@hp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • I
      Merge tag 'perf-core-for-mingo' of... · 1ff9ecf7
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        * kcore annotation improvements, including build-id cache support,
          multi map 'call' instruction navigation fixes, kcore address
          validation, objdump workarounds. From Adrian Hunter.
      
        * 'trace' beautifiers for lots of syscall arguments, from Arnaldo Carvalho de Melo.
      
        * More compact 'trace' output by suppressing zeroed args, from Arnaldo Carvalho de Melo.
      
        * Show thread COMM by default in 'trace', from Arnaldo Carvalho de Melo.
      
        * Show path associated with fd in live sessions, using a 'vfs_getname'
          'perf probe' created dynamic tracepoint or by looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.
      
        * Memory and mmap leak fixes from Chenggang Qin.
      
        * Add option to show full timestamp in 'trace', from David Ahern.
      
        * Add 'record' command in 'trace', to record raw_syscalls:*, from David Ahern.
      
        * Add summary option to dump syscall statistics in 'trace', from David Ahern.
      
        * Fix comm resolution in 'trace' when reading events from file, from David Ahern.
      
        * Improved messages when doing profiling in all or a subset of CPUs
          using a workload as the session delimiter, as in:
      
           'perf stat --cpu 0,2 sleep 10s'
      
          from Arnaldo Carvalho de Melo.
      
        * Add units to nanosec-based counters in 'perf stat', from David Ahern.
      
        * Assorted build fixes from David Ahern and Jiri Olsa.
      
        * 'perf lock' fixes and cleanups, from Davidlohr Bueso.
      
        * Memory leak fixes in 'perf test', from Felipe Pena.
      
        * Build system super speedups, from Ingo Molnar.
      
        * Fix mmap_read event overflow, from Jiri Olsa.
      
        * Code cleanups from Jiri Olsa.
      
        * Allow specifying B/K/M/G unit to the --mmap-pages arguments, from Jiri Olsa.
      
        * Move the GTK support into a separate libperf-gtk.so DSO, that is
          only loaded when --gtk is specified, from Namhyung Kim.
      
        * Fixes for some memory leaks, from Namhyung Kim.
      
        * Fix srcline sort key behavior, from Namhyung Kim.
      
        * Fix failing assertions in numa bench, from Petr Holasek.
      
        * perf bash completion fixes and improvements from Ramkumar Ramachandra.
      
        * Improve error messages in 'trace', providing hints about system configuration
          steps needed for using it, from Ramkumar Ramachandra.
      
        * Remove bogus info when using 'perf stat' -e cycles/instructions, from
          Ramkumar Ramachandra.
      
        * Support for Openembedded/Yocto -dbg packages, from Ricardo Ribalda Delgado.
      
        * Implement addr2line directly using libbfd, from Roberto Vitillo.
      
        * Add new option --ignore-vmlinux for perf top, from Willy Tarreau.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • I
      Merge tag 'v3.12-rc5' into perf/core · 426ee9e3
      Ingo Molnar committed
      Merge Linux v3.12-rc5, to pick up the latest fixes.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. 14 Oct, 2013 3 commits