1. 28 Apr, 2016 18 commits
  2. 27 Apr, 2016 15 commits
    • I
      Merge tag 'perf-core-for-mingo-20160427' of... · a8944c5b
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo-20160427' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - perf trace --pf maj/min/all works with --call-graph: (Arnaldo Carvalho de Melo)
      
        Tracing write syscalls and major page faults with callchains while starting
        firefox, limiting the stack to 5 frames:
      
       # perf trace -e write --pf maj --max-stack 5 firefox
         589.549 ( 0.014 ms): firefox/15377 write(fd: 4, buf: 0x7fff80acc898, count: 151) = 151
                                             [0xfaed] (/usr/lib64/libpthread-2.22.so)
                                             fire_glxtest_process+0x5c (/usr/lib64/firefox/libxul.so)
                                             InstallGdkErrorHandler+0x41 (/usr/lib64/firefox/libxul.so)
                                             XREMain::XRE_mainInit+0x12c (/usr/lib64/firefox/libxul.so)
                                             XREMain::XRE_main+0x1e4 (/usr/lib64/firefox/libxul.so)
         760.704 ( 0.000 ms): firefox/15332 majfault [gtk_tree_view_accessible_get_type+0x0] => /usr/lib64/libgtk-3.so.0.1800.9@0xa0850 (x.)
                                             gtk_tree_view_accessible_get_type+0x0 (/usr/lib64/libgtk-3.so.0.1800.9)
                                             gtk_tree_view_class_intern_init+0x1a54 (/usr/lib64/libgtk-3.so.0.1800.9)
                                             g_type_class_ref+0x6dd (/usr/lib64/libgobject-2.0.so.0.4600.2)
                                             [0x115378] (/usr/lib64/libgnutls.so.30.6.3)
      
        This automagically selects "--call-graph dwarf". Use "--call-graph fp" on systems
        where -fno-omit-frame-pointer was used to build the components of interest, to
        incur less overhead, or tune "--call-graph dwarf" appropriately; see 'perf record --help'.
      
      - Add /proc/sys/kernel/perf_event_max_stack, which defaults to the old hard-coded value
        of PERF_MAX_STACK_DEPTH (127). It is useful for huge callstacks from things like Groovy
        and Ruby, and also for reducing overhead by limiting the depth to a smaller value. Upcoming
        work will allow this to be set per event (Arnaldo Carvalho de Melo)
      
      - Make 'perf trace --min-stack' be honoured by --pf and --event (Arnaldo Carvalho de Melo)
      
      - Make 'perf evlist -v' decode perf_event_attr->branch_sample_type (Arnaldo Carvalho de Melo)
      
         # perf record --call lbr usleep 1
         # perf evlist -v
         cycles:ppp: ... sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, ...
                  branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
         #
      
      - Clear dummy entry accumulated period, fixing such 'perf top/report' output
        as: (Kan Liang)
      
          4769.98%  0.01%  0.00%  0.01%  tchain_edit  [kernel] [k] update_fast_timekeeper
      
      - System calls with pid_t arguments get them augmented with the thread's COMM
        more thoroughly:
      
        # trace -e perf_event_open perf stat -e cycles -p 15608
         6.876 ( 0.014 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15608 (hexchat), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         6.882 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15639 (gmain), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
         6.889 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15640 (gdbus), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
                                                                  ^^^^^^^^^^^^^^^^^^
         ^C
      
      - Fix offline module name mismatch issue in 'perf probe' (Ravi Bangoria)
      
      - Fix module probe issue if no dwarf support in 'perf probe' (Ravi Bangoria)
      
      Assorted fixes:
      
      - Fix off-by-one in write_buildid() (Andrey Ryabinin)
      
      - Fix segfault when printing callchains in 'perf script' (Chris Phlipot)
      
      - Replace assignment with comparison on assert check in 'perf test' entry (Colin Ian King)
      
      - Fix off-by-one comparison in intel-pt code (Colin Ian King)
      
      - Close target file on error path in 'perf probe' (Masami Hiramatsu)
      
      - Set default kprobe group name if not given in 'perf probe' (Masami Hiramatsu)
      
      - Avoid partial perf_event_header reads (Wang Nan)
      
      Infrastructure changes:
      
      - Update x86's syscall_64.tbl copy, adding preadv2 & pwritev2 (Arnaldo Carvalho de Melo)
      
      - Make the x86 clean quiet wrt syscall table removal (Jiri Olsa)
      
      Cleanups:
      
      - Simplify wrapper for LOCK_PI in 'perf bench futex' (Davidlohr Bueso)
      
      - Remove duplicate const qualifier (Eric Engestrom)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a8944c5b
    • A
      perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack · 4cb93446
      Arnaldo Carvalho de Melo committed
      There is an upper limit to what the tooling considers a valid callchain,
      and it was tied to the hardcoded value in the kernel,
      PERF_MAX_STACK_DEPTH (127). Now that this can be tuned via a sysctl,
      make the tooling read it and use that as the upper limit, falling back to
      PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.
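
The read-with-fallback behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch, not the C code from the patch; the function name `read_max_stack` and its `path` parameter are assumptions made for the example.

```python
from pathlib import Path

# Old hardcoded kernel default, as stated in the commit message.
PERF_MAX_STACK_DEPTH = 127


def read_max_stack(path="/proc/sys/kernel/perf_event_max_stack"):
    """Return the sysctl value, or the old default when it cannot be read.

    `path` is a parameter only so the sketch can be exercised against a
    regular file; the name of this helper is hypothetical.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Older kernels do not expose the sysctl: fall back to the default.
        return PERF_MAX_STACK_DEPTH
```

On a kernel without the sysctl the call simply yields 127, so callers do not need to distinguish the two cases.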
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4cb93446
    • A
      perf core: Allow setting up max frame stack depth via sysctl · c5dfd78e
      Arnaldo Carvalho de Melo committed
      The default remains 127, which is good for most cases, and not even hit
      most of the time, but then for some cases, as reported by Brendan, 1024+
      deep frames are appearing on the radar for things like groovy, ruby.
      
      And in some workloads putting a _lower_ cap on this may make sense. A
      per-event cap still needs to be put in place, though.
      
      The new file is:
      
        # cat /proc/sys/kernel/perf_event_max_stack
        127
      
      Changing it:
      
        # echo 256 > /proc/sys/kernel/perf_event_max_stack
        # cat /proc/sys/kernel/perf_event_max_stack
        256
      
      But as soon as there is some event using callchains we get:
      
        # echo 512 > /proc/sys/kernel/perf_event_max_stack
        -bash: echo: write error: Device or resource busy
        #
      
      This is because we only allocate the callchain percpu data structures when there
      is a user, which allows for changing the max easily: it's just a matter
      of having no callchain users at that point.
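
A script that adjusts the limit would need to cope with the "Device or resource busy" case shown above. A minimal Python sketch, assuming a hypothetical helper name `set_max_stack` (writing the real file requires root):

```python
import errno


def set_max_stack(depth, path="/proc/sys/kernel/perf_event_max_stack"):
    """Try to write a new depth.

    Returns True on success and False when the kernel answers EBUSY,
    which the commit message says happens while some event is using
    callchains. Any other error is re-raised. The helper name and the
    `path` parameter are assumptions made for this sketch.
    """
    try:
        with open(path, "w") as f:
            f.write(f"{depth}\n")
        return True
    except OSError as e:
        if e.errno == errno.EBUSY:
            return False
        raise
```

A caller could retry once the callchain users are gone, since the limit can only change while there are none.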
      Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5dfd78e
    • A
      perf bench: Remove one more die() call · c2a218c6
      Arnaldo Carvalho de Melo committed
      Propagate the error instead.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c2a218c6
    • A
      perf tools: Update x86's syscall_64.tbl, adding preadv2 & pwritev2 · 042a1810
      Arnaldo Carvalho de Melo committed
      Introduced in commit 4babf2c5 ("x86: wire up preadv2 and pwritev2").
      
      This will make 'perf trace' aware of them.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      042a1810
    • R
      perf probe: Fix module probe issue if no dwarf support · c61fb959
      Ravi Bangoria committed
      Perf is not able to register a probe in a kernel module when DWARF support
      is not available (and so it falls back to the symtab). Perf passes the full path of
      the module where only the module name is required, which causes the problem.
      This patch fixes the issue.
      
      Before applying patch:
      
        $ dpkg -s libdw-dev
        dpkg-query: package 'libdw-dev' is not installed and no information is...
      
        $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
        Added new event:
          probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko)
      
        You can now use it in all perf tools, such as:
      
        perf record -e probe:kprobe_init -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init
      
        $ sudo ./perf record -a -e probe:kprobe_init
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.105 MB perf.data ]
      
        $ sudo ./perf script 	# No output here
      
      After applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
        Added new event:
          probe:kprobe_init    (on kprobe_init in kprobe_example)
      
        You can now use it in all perf tools, such as:
      
        perf record -e probe:kprobe_init -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/kprobe_init kprobe_example:kprobe_init
      
        $ sudo ./perf record -a -e probe:kprobe_init
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]
      
        $ sudo ./perf script
        insmod 13990 [002]  5961.216833: probe:kprobe_init: ...
        insmod 13995 [002]  5962.889384: probe:kprobe_init: ...
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1461680741-12517-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c61fb959
    • R
      perf probe: Fix offline module name missmatch issue · 63a29613
      Ravi Bangoria committed
      Perf can add a probe on kernel module which has not been loaded yet.
      
      The current implementation finds the module name from path. But if the
      filename is different from the actual module name then perf fails to
      register a probe while loading module because of mismatch in the names.
      
      For example, samples/kobject/kobject-example.ko is loaded as
      kobject_example.
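
The mismatch in this example comes from the kernel build convention of replacing hyphens with underscores in module names. The Python sketch below illustrates only that convention as a heuristic; it is not presented as the method the patch uses, it does not cover a module whose name differs from its file name in other ways, and the function name is hypothetical.

```python
import os


def module_name_from_path(ko_path):
    """Guess the loaded module name from a .ko path (heuristic only).

    Strips the directory and the '.ko' suffix, then maps '-' to '_',
    mirroring how kobject-example.ko shows up as kobject_example in lsmod.
    """
    base = os.path.basename(ko_path)
    if base.endswith(".ko"):
        base = base[: -len(".ko")]
    return base.replace("-", "_")
```

Applied to `/linux/samples/kobject/kobject-example.ko` this gives `kobject_example`, the name that appears in the `lsmod` output.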
      
      Before applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
          Added new event:
            probe:foo_show       (on foo_show in kobject-example)
      
          You can now use it in all perf tools, such as:
      
          perf record -e probe:foo_show -aR sleep 1
      
        $ cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/foo_show kobject-example:foo_show
      
        $ insmod kobject-example.ko
      
        $ lsmod
          Module                  Size  Used by
          kobject_example        16384  0
      
        Generate a read of /sys/kernel/kobject_example/foo while recording data
        with the command below:
        $ sudo ./perf record -e probe:foo_show -a
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.093 MB perf.data ]
      
        $ ./perf report --stdio -F overhead,comm,dso,sym
          Error:
          The perf.data.old file has no samples!
      
      After applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
          Added new event:
            probe:foo_show       (on foo_show in kobject_example)
      
          You can now use it in all perf tools, such as:
      
          perf record -e probe:foo_show -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/foo_show kobject_example:foo_show
      
        $ insmod kobject-example.ko
      
        $ lsmod
          Module                  Size  Used by
          kobject_example        16384  0
      
        Generate a read of /sys/kernel/kobject_example/foo while recording data
        with the command below:
        $ sudo ./perf record -e probe:foo_show -a
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]
      
        $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
          ...
          # Samples: 8  of event 'probe:foo_show'
          # Event count (approx.): 8
          #
          # Overhead  Command  Shared Object      Symbol
          # ........  .......  .................  ............
          #
             100.00%  cat      [kobject_example]  [k] foo_show
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1461680741-12517-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63a29613
    • A
      perf trace: Read thread's COMM from /proc when not set · 073e5fca
      Arnaldo Carvalho de Melo committed
      We get notifications for threads that get created while we're tracing,
      but for preexisting threads we may end up not having synthesized them, like
      when tracing a 'perf trace' session that will use '--pid' to trace some
      other thread.
      
      Besides, we should probably stop synthesizing those records and
      instead read thread information lazily, i.e. only when we need it,
      as done in this patch.
      
      Now the 'pid_t' argument in 'perf_event_open' gets translated to a COMM:
      
        # perf trace -e perf_event_open perf stat -e cycles -p 31601
           0.027 ( 0.027 ms): perf/23393 perf_event_open(attr_uptr: 0x2fdd0d8, pid: 31601 (abrt-dump-journ), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
                                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        ^C
      
      And in other syscalls containing pid_t without thread->comm_set at the
      time of the formatting.
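
The lazy lookup described in this message can be sketched as follows. This Python sketch is illustrative only: the `Thread` class, the `format_pid` helper and the `proc_root` parameter are assumptions made for the example, not names taken from the perf sources.

```python
from pathlib import Path


class Thread:
    """Minimal stand-in for a traced thread whose COMM is loaded on demand."""

    def __init__(self, pid, proc_root="/proc"):
        self.pid = pid
        self._proc_root = proc_root
        self._comm = None  # not read until somebody asks for it

    @property
    def comm(self):
        if self._comm is None:
            try:
                comm_file = Path(self._proc_root) / str(self.pid) / "comm"
                self._comm = comm_file.read_text().rstrip("\n")
            except OSError:
                # The thread may have exited; leave it unset so a later
                # access can try again.
                return None
        return self._comm


def format_pid(thread):
    """Render a pid_t argument as 'pid (comm)', or just 'pid' if unknown."""
    comm = thread.comm
    return f"{thread.pid} ({comm})" if comm else str(thread.pid)
```

The /proc read happens only the first time a pid is formatted, so threads that never appear as a syscall argument cost nothing.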
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ioeps6dlwst17d6oozc9shtk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      073e5fca
    • A
      perf thread: Introduce method to set comm from /proc/pid/self · 2f3027ac
      Arnaldo Carvalho de Melo committed
      Will be used for lazy comm loading in 'perf trace'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-7ogbkuoka1y2qsmcckqxvl5m@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f3027ac
    • A
      tools lib api fs: Add helper to read string from procfs file · 4bd112df
      Arnaldo Carvalho de Melo committed
      To read things like /proc/self/comm.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ztpkbmseidt0hq2psr46o0h9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4bd112df
    • A
      perf trace: Do not beautify the 'pid' parameter as a simple integer · ccd9b2a7
      Arnaldo Carvalho de Melo committed
      Leave it alone so that it ends up assigned to SCA_PID via its type,
      'pid_t', that will look up the pid on the machine thread rb_tree and
      possibly find its COMM.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-r7dujgmhtxxfajuunpt1bkuo@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ccd9b2a7
    • A
      perf trace: Move perf_flags beautifier to tools/perf/trace/beauty/ · 62de344e
      Arnaldo Carvalho de Melo committed
      To reduce the size of builtin-trace.c.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-8r3gmymyn3r0ynt4yuzspp9g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      62de344e
    • M
      perf probe: Set default kprobe group name if it is not given · 2a12ec13
      Masami Hiramatsu committed
      Set kprobe group name as "probe" if it is not given.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090413.11891.95640.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2a12ec13
    • M
      perf probe: Let probe_file__add_event return 0 if succeeded · 6ed0720a
      Masami Hiramatsu committed
      Since other methods return 0 on success (or a file descriptor), let
      probe_file__add_event() return 0 instead of the number of bytes written.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090303.11891.18232.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6ed0720a
    • M
      perf tools: Add lsdir() helper to read a directory · e1ce726e
      Masami Hiramatsu committed
      As a utility function, add lsdir(), which reads the given directory and stores
      the entry names in a strlist.  lsdir() accepts a filter function so that the caller
      can filter out unneeded entries.
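
The idea of a directory listing with a caller-supplied filter can be sketched like this. The Python function below only borrows the name `lsdir` for illustration; its signature, the `keep` callback and the sorted return value are assumptions and do not reproduce the C helper or its strlist type.

```python
import os


def lsdir(path, keep=None):
    """Return the sorted entry names in `path` accepted by `keep`.

    `keep(path, name)` is a hypothetical filter callback; when it is
    None every entry is kept. os.scandir() never yields '.' or '..'.
    """
    names = []
    with os.scandir(path) as entries:
        for entry in entries:
            if keep is None or keep(path, entry.name):
                names.append(entry.name)
    return sorted(names)
```

For instance, `lsdir(some_dir, lambda p, n: n.endswith(".ko"))` would keep only kernel module files, where `some_dir` is any directory the caller chooses.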
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox
      [ Do not use the name 'dirname', it is already used in some distros ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e1ce726e
  3. 26 Apr, 2016 7 commits