1. 28 Apr, 2016 9 commits
  2. 27 Apr, 2016 15 commits
    • I
      Merge tag 'perf-core-for-mingo-20160427' of... · a8944c5b
      Ingo Molnar committed
      Merge tag 'perf-core-for-mingo-20160427' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - perf trace --pf maj/min/all works with --call-graph: (Arnaldo Carvalho de Melo)
      
        Tracing write syscalls and major page faults with callchains while starting
        firefox, limiting the stack to 5 frames:
      
       # perf trace -e write --pf maj --max-stack 5 firefox
         589.549 ( 0.014 ms): firefox/15377 write(fd: 4, buf: 0x7fff80acc898, count: 151) = 151
                                             [0xfaed] (/usr/lib64/libpthread-2.22.so)
                                             fire_glxtest_process+0x5c (/usr/lib64/firefox/libxul.so)
                                             InstallGdkErrorHandler+0x41 (/usr/lib64/firefox/libxul.so)
                                             XREMain::XRE_mainInit+0x12c (/usr/lib64/firefox/libxul.so)
                                             XREMain::XRE_main+0x1e4 (/usr/lib64/firefox/libxul.so)
         760.704 ( 0.000 ms): firefox/15332 majfault [gtk_tree_view_accessible_get_type+0x0] => /usr/lib64/libgtk-3.so.0.1800.9@0xa0850 (x.)
                                             gtk_tree_view_accessible_get_type+0x0 (/usr/lib64/libgtk-3.so.0.1800.9)
                                             gtk_tree_view_class_intern_init+0x1a54 (/usr/lib64/libgtk-3.so.0.1800.9)
                                             g_type_class_ref+0x6dd (/usr/lib64/libgobject-2.0.so.0.4600.2)
                                             [0x115378] (/usr/lib64/libgnutls.so.30.6.3)
      
        This automagically selects "--call-graph dwarf", use "--call-graph fp" on systems
        where -fno-omit-frame-pointer was used to build the components of interest, to
        incur less overhead, or tune "--call-graph dwarf" appropriately, see 'perf record --help'.
      
      - Add /proc/sys/kernel/perf_event_max_stack, which defaults to the old hard-coded value
        of PERF_MAX_STACK_DEPTH (127). Raising it is useful for huge callstacks in things like
        Groovy, Ruby, etc.; lowering it reduces overhead. Upcoming work will allow this to be
        done per event (Arnaldo Carvalho de Melo)
      
      - Make 'perf trace --min-stack' be honoured by --pf and --event (Arnaldo Carvalho de Melo)
      
      - Make 'perf evlist -v' decode perf_event_attr->branch_sample_type (Arnaldo Carvalho de Melo)
      
         # perf record --call lbr usleep 1
         # perf evlist -v
         cycles:ppp: ... sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, ...
                  branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
         #
      
      - Clear dummy entry accumulated period, fixing such 'perf top/report' output
        as: (Kan Liang)
      
          4769.98%  0.01%  0.00%  0.01%  tchain_edit  [kernel] [k] update_fast_timekeeper
      
      - System calls with pid_t arguments get them augmented with the COMM event
        more thoroughly:
      
        # trace -e perf_event_open perf stat -e cycles -p 15608
         6.876 ( 0.014 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15608 (hexchat), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         6.882 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15639 (gmain), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
         6.889 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15640 (gdbus), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
                                                                  ^^^^^^^^^^^^^^^^^^
         ^C
      
      - Fix offline module name mismatch issue in 'perf probe' (Ravi Bangoria)
      
      - Fix module probe issue if no dwarf support in 'perf probe' (Ravi Bangoria)
      
      Assorted fixes:
      
      - Fix off-by-one in write_buildid() (Andrey Ryabinin)
      
      - Fix segfault when printing callchains in 'perf script' (Chris Phlipot)
      
      - Replace assignment with comparison on assert check in 'perf test' entry (Colin Ian King)
      
      - Fix off-by-one comparison in intel-pt code (Colin Ian King)
      
      - Close target file on error path in 'perf probe' (Masami Hiramatsu)
      
      - Set default kprobe group name if not given in 'perf probe' (Masami Hiramatsu)
      
      - Avoid partial perf_event_header reads (Wang Nan)
      
      Infrastructure changes:
      
      - Update x86's syscall_64.tbl copy, adding preadv2 & pwritev2 (Arnaldo Carvalho de Melo)
      
      - Make the x86 clean quiet wrt syscall table removal (Jiri Olsa)
      
      Cleanups:
      
      - Simplify wrapper for LOCK_PI in 'perf bench futex' (Davidlohr Bueso)
      
      - Remove duplicate const qualifier (Eric Engestrom)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a8944c5b
    • A
      perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack · 4cb93446
      Arnaldo Carvalho de Melo committed
      There is an upper limit to what tooling considers a valid callchain,
      and it was tied to the hardcoded value in the kernel,
      PERF_MAX_STACK_DEPTH (127). Now that this can be tuned via a sysctl,
      make the tooling read it and use that as the upper limit, falling back to
      PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4cb93446
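The fallback described in this commit can be sketched as below. This is a minimal illustration, not the perf source: `read_max_stack()` is a hypothetical helper name, and only the sysctl path and the value 127 come from the commit message.

```c
#include <stdio.h>

/* Historical hard-coded kernel limit, used here as the fallback. */
#define PERF_MAX_STACK_DEPTH 127

/*
 * Hypothetical helper: read an integer limit from a sysctl-style file.
 * Returns the parsed value, or 'fallback' if the file is missing or
 * does not contain a positive integer.
 */
static int read_max_stack(const char *path, int fallback)
{
	FILE *fp = fopen(path, "r");
	int value;

	if (fp == NULL)
		return fallback;	/* e.g. older kernel without the sysctl */

	if (fscanf(fp, "%d", &value) != 1 || value <= 0)
		value = fallback;

	fclose(fp);
	return value;
}
```

A caller would pass "/proc/sys/kernel/perf_event_max_stack" as the path and PERF_MAX_STACK_DEPTH as the fallback.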
    • A
      perf core: Allow setting up max frame stack depth via sysctl · c5dfd78e
      Arnaldo Carvalho de Melo committed
      The default remains 127, which is good for most cases, and not even hit
      most of the time, but then for some cases, as reported by Brendan, 1024+
      deep frames are appearing on the radar for things like groovy, ruby.
      
      And in some workloads putting a _lower_ cap on this may make sense. One
      that is per event still needs to be put in place, though.
      
      The new file is:
      
        # cat /proc/sys/kernel/perf_event_max_stack
        127
      
      Changing it:
      
        # echo 256 > /proc/sys/kernel/perf_event_max_stack
        # cat /proc/sys/kernel/perf_event_max_stack
        256
      
      But as soon as there is some event using callchains we get:
      
        # echo 512 > /proc/sys/kernel/perf_event_max_stack
        -bash: echo: write error: Device or resource busy
        #
      
      Because we only allocate the callchain percpu data structures when there
      is a user, which allows for changing the max easily, it's just a matter
      of having no callchain users at that point.
      Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5dfd78e
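The "Device or resource busy" behaviour described in this commit can be modelled in user space roughly as follows. This is a simplified sketch, not kernel code: `struct callchain_state` and `set_max_stack()` are hypothetical names, and the only facts taken from the commit message are that the limit can be changed while there are no callchain users and that the write fails with EBUSY otherwise.

```c
#include <errno.h>

/* Simplified user-space model; all names here are hypothetical. */
struct callchain_state {
	int max_stack;	/* current limit, 127 by default */
	int nr_users;	/* events currently using callchains */
};

/*
 * Mimics a write to the sysctl: the buffers are only allocated while
 * there is a user, so the limit may change only when nobody uses them.
 */
static int set_max_stack(struct callchain_state *st, int new_max)
{
	if (st->nr_users > 0)
		return -EBUSY;	/* buffers already sized for max_stack */

	st->max_stack = new_max;
	return 0;
}
```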
    • A
      perf bench: Remove one more die() call · c2a218c6
      Arnaldo Carvalho de Melo committed
      Propagate the error instead.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c2a218c6
    • A
      perf tools: Update x86's syscall_64.tbl, adding preadv2 & pwritev2 · 042a1810
      Arnaldo Carvalho de Melo committed
      Introduced in commit 4babf2c5 ("x86: wire up preadv2 and pwritev2").
      
      This will make 'perf trace' aware of them.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      042a1810
    • R
      perf probe: Fix module probe issue if no dwarf support · c61fb959
      Ravi Bangoria committed
      Perf is not able to register a probe in a kernel module when dwarf support
      is not there (and so it falls back to symtab). Perf passes the full path of the
      module where only the module name is required, which causes the problem.
      This patch fixes this issue.
      
      Before applying patch:
      
        $ dpkg -s libdw-dev
        dpkg-query: package 'libdw-dev' is not installed and no information is...
      
        $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
        Added new event:
          probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko)
      
        You can now use it in all perf tools, such as:
      
        perf record -e probe:kprobe_init -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init
      
        $ sudo ./perf record -a -e probe:kprobe_init
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.105 MB perf.data ]
      
        $ sudo ./perf script 	# No output here
      
      After applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
        Added new event:
          probe:kprobe_init    (on kprobe_init in kprobe_example)
      
        You can now use it in all perf tools, such as:
      
        perf record -e probe:kprobe_init -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
        p:probe/kprobe_init kprobe_example:kprobe_init
      
        $ sudo ./perf record -a -e probe:kprobe_init
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]
      
        $ sudo ./perf script
        insmod 13990 [002]  5961.216833: probe:kprobe_init: ...
        insmod 13995 [002]  5962.889384: probe:kprobe_init: ...
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1461680741-12517-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c61fb959
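The difference between the two kprobe_events lines in this commit is the full path versus the bare module name. A minimal sketch of deriving the latter from the former is shown below; `module_name_from_path()` is a hypothetical helper, not the function the patch adds.

```c
#include <string.h>

/*
 * Hypothetical helper: copy the module name for 'path' into 'buf',
 * dropping the directory part and a trailing ".ko" suffix.
 * Returns 0 on success, -1 if 'buf' is too small.
 */
static int module_name_from_path(const char *path, char *buf, size_t size)
{
	const char *base = strrchr(path, '/');
	size_t len;

	base = base ? base + 1 : path;	/* skip past the last '/' */
	len = strlen(base);
	if (len > 3 && strcmp(base + len - 3, ".ko") == 0)
		len -= 3;		/* strip the ".ko" suffix */
	if (len + 1 > size)
		return -1;

	memcpy(buf, base, len);
	buf[len] = '\0';
	return 0;
}
```

With the path from the example, /linux/samples/kprobes/kprobe_example.ko, this yields kprobe_example, which is the form shown in the "after" kprobe_events output.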
    • R
      perf probe: Fix offline module name mismatch issue · 63a29613
      Ravi Bangoria committed
      Perf can add a probe on kernel module which has not been loaded yet.
      
      The current implementation finds the module name from path. But if the
      filename is different from the actual module name then perf fails to
      register a probe while loading module because of mismatch in the names.
      
      For example, samples/kobject/kobject-example.ko is loaded as
      kobject_example.
      
      Before applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
          Added new event:
            probe:foo_show       (on foo_show in kobject-example)
      
          You can now use it in all perf tools, such as:
      
          perf record -e probe:foo_show -aR sleep 1
      
        $ cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/foo_show kobject-example:foo_show
      
        $ insmod kobject-example.ko
      
        $ lsmod
          Module                  Size  Used by
          kobject_example        16384  0
      
        Generate read to /sys/kernel/kobject_example/foo while recording data
        with below command
        $ sudo ./perf record -e probe:foo_show -a
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.093 MB perf.data ]
      
        $ ./perf report --stdio -F overhead,comm,dso,sym
          Error:
          The perf.data.old file has no samples!
      
      After applying patch:
      
        $ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
          Added new event:
            probe:foo_show       (on foo_show in kobject_example)
      
          You can now use it in all perf tools, such as:
      
          perf record -e probe:foo_show -aR sleep 1
      
        $ sudo cat /sys/kernel/debug/tracing/kprobe_events
          p:probe/foo_show kobject_example:foo_show
      
        $ insmod kobject-example.ko
      
        $ lsmod
          Module                  Size  Used by
          kobject_example        16384  0
      
        Generate read to /sys/kernel/kobject_example/foo while recording data
        with below command
        $ sudo ./perf record -e probe:foo_show -a
          [ perf record: Woken up 1 times to write data ]
          [ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]
      
        $ sudo ./perf report  --stdio -F overhead,comm,dso,sym
          ...
          # Samples: 8  of event 'probe:foo_show'
          # Event count (approx.): 8
          #
          # Overhead  Command  Shared Object      Symbol
          # ........  .......  .................  ............
          #
             100.00%  cat      [kobject_example]  [k] foo_show
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1461680741-12517-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63a29613
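The mismatch in this commit's example (kobject-example.ko on disk, kobject_example once loaded) can be illustrated as below. The commit message does not say how the patch obtains the real module name, so this sketch only demonstrates why a name taken from the file name fails to match. The dash-to-underscore rule is an assumption generalised from that one example, and both function names are hypothetical.

```c
#include <string.h>

/* Replace every '-' in 'name' with '_', in place (illustrative only). */
static void normalize_module_name(char *name)
{
	for (; *name; name++)
		if (*name == '-')
			*name = '_';
}

/*
 * Return 1 if a name derived from a file name refers to the loaded
 * module 'loaded' once dashes are normalized, 0 otherwise.
 */
static int module_name_matches(const char *from_file, const char *loaded)
{
	char tmp[64];

	if (strlen(from_file) >= sizeof(tmp))
		return 0;	/* too long for this sketch's buffer */

	strcpy(tmp, from_file);
	normalize_module_name(tmp);
	return strcmp(tmp, loaded) == 0;
}
```

A plain strcmp() of "kobject-example" against "kobject_example" fails, which corresponds to the "before" case where the probe never fires.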
    • A
      perf trace: Read thread's COMM from /proc when not set · 073e5fca
      Arnaldo Carvalho de Melo committed
      We get notifications for threads that get created while we're tracing,
      but for preexisting threads we may end up not having synthesized them, like
      when tracing a 'perf trace' session that will use '--pid' to trace some
      other thread.
      
      And besides we should probably stop synthesizing those records and
      instead read thread information in a lazy way, i.e. just when we need,
      like done in this patch:
      
      Now the 'pid_t' argument in 'perf_event_open' gets translated to a COMM:
      
        # perf trace -e perf_event_open perf stat -e cycles -p 31601
           0.027 ( 0.027 ms): perf/23393 perf_event_open(attr_uptr: 0x2fdd0d8, pid: 31601 (abrt-dump-journ), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
                                                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        ^C
      
      And in other syscalls containing pid_t without thread->comm_set at the
      time of the formatting.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ioeps6dlwst17d6oozc9shtk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      073e5fca
    • A
      perf thread: Introduce method to set comm from /proc/pid/self · 2f3027ac
      Arnaldo Carvalho de Melo committed
      Will be used for lazy comm loading in 'perf trace'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-7ogbkuoka1y2qsmcckqxvl5m@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f3027ac
    • A
      tools lib api fs: Add helper to read string from procfs file · 4bd112df
      Arnaldo Carvalho de Melo committed
      To read things like /proc/self/comm.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ztpkbmseidt0hq2psr46o0h9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4bd112df
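A helper of the kind this commit describes might look roughly like the sketch below. The name `read_str_from_file()` and its signature are hypothetical and are not the API added by the patch; the sketch only shows the idea of reading a one-line file such as /proc/self/comm and dropping the trailing newline.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of 'path' into 'buf' and
 * strip the trailing newline. Returns 0 on success, -1 on error.
 */
static int read_str_from_file(const char *path, char *buf, size_t size)
{
	FILE *fp;

	if (size == 0)
		return -1;

	fp = fopen(path, "r");
	if (fp == NULL)
		return -1;

	if (fgets(buf, (int)size, fp) == NULL) {
		fclose(fp);
		return -1;
	}
	fclose(fp);

	buf[strcspn(buf, "\n")] = '\0';	/* cut at the first newline, if any */
	return 0;
}
```

A lazy COMM lookup would call this with "/proc/self/comm", or with the equivalent per-thread file, only when the name is first needed.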
    • A
      perf trace: Do not beautify the 'pid' parameter as a simple integer · ccd9b2a7
      Arnaldo Carvalho de Melo committed
      Leave it alone so that it ends up assigned to SCA_PID via its type,
      'pid_t', that will look up the pid on the machine thread rb_tree and
      possibly find its COMM.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-r7dujgmhtxxfajuunpt1bkuo@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ccd9b2a7
    • A
      perf trace: Move perf_flags beautifier to tools/perf/trace/beauty/ · 62de344e
      Arnaldo Carvalho de Melo committed
      To reduce the size of builtin-trace.c.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-8r3gmymyn3r0ynt4yuzspp9g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      62de344e
    • M
      perf probe: Set default kprobe group name if it is not given · 2a12ec13
      Masami Hiramatsu committed
      Set kprobe group name as "probe" if it is not given.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090413.11891.95640.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2a12ec13
    • M
      perf probe: Let probe_file__add_event return 0 if succeeded · 6ed0720a
      Masami Hiramatsu committed
      Since other methods return 0 on success (or a file descriptor), let
      probe_file__add_event() return 0 instead of the number of bytes written.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090303.11891.18232.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6ed0720a
    • M
      perf tools: Add lsdir() helper to read a directory · e1ce726e
      Masami Hiramatsu committed
      As a utility function, add lsdir(), which reads a given directory and stores
      the entry names in a strlist. lsdir() accepts a filter function so that the
      caller can filter out unneeded entries.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox
      [ Do not use the name 'dirname', it is already used in some distros ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e1ce726e
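A directory lister with a filter callback, in the spirit of the lsdir() helper described in this commit, could be sketched as follows. This is not the perf implementation: `struct name_list` is a hypothetical stand-in for perf's strlist, `list_dir()` and `keep_visible()` are made-up names, and the convention that the filter returns non-zero to keep an entry is an assumption of this sketch.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string list: a growable array of names. */
struct name_list {
	char **names;
	size_t nr;
};

/* Filter callback: return non-zero to keep the entry (sketch convention). */
typedef int (*dirent_filter)(const char *dir, const struct dirent *d);

static void name_list_free(struct name_list *list)
{
	size_t i;

	if (list == NULL)
		return;
	for (i = 0; i < list->nr; i++)
		free(list->names[i]);
	free(list->names);
	free(list);
}

/*
 * Read 'dir' and collect the names of entries accepted by 'filter'
 * (all entries when 'filter' is NULL). Returns NULL if the directory
 * cannot be opened or memory runs out.
 */
static struct name_list *list_dir(const char *dir, dirent_filter filter)
{
	struct name_list *list;
	struct dirent *d;
	DIR *dp = opendir(dir);

	if (dp == NULL)
		return NULL;

	list = calloc(1, sizeof(*list));
	if (list == NULL)
		goto fail;

	while ((d = readdir(dp)) != NULL) {
		size_t len = strlen(d->d_name) + 1;
		char **grown;
		char *copy;

		if (filter && !filter(dir, d))
			continue;	/* filtered out */

		grown = realloc(list->names, (list->nr + 1) * sizeof(*grown));
		if (grown == NULL)
			goto fail;
		list->names = grown;

		copy = malloc(len);
		if (copy == NULL)
			goto fail;
		memcpy(copy, d->d_name, len);
		list->names[list->nr++] = copy;
	}
	closedir(dp);
	return list;

fail:
	closedir(dp);
	name_list_free(list);	/* accepts NULL */
	return NULL;
}

/* Example filter: keep only entries whose name does not start with '.'. */
static int keep_visible(const char *dir, const struct dirent *d)
{
	(void)dir;
	return d->d_name[0] != '.';
}
```

Passing the directory name to the filter lets a callback inspect the full path of an entry if it needs to.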
  3. 26 Apr, 2016 10 commits
  4. 25 Apr, 2016 6 commits
    • A
      perf trace: Make --pf honour --min-stack too · 1df54290
      Arnaldo Carvalho de Melo committed
      To check deeply nested page fault callchains.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-wuji34xx003kr88nmqt6jkgf@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1df54290
    • A
      perf trace: Make --event honour --min-stack too · 7ad35615
      Arnaldo Carvalho de Melo committed
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-shj0fazntmskhjild5i6x73l@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7ad35615
    • C
      perf script: Fix segfault when printing callchains · e557b674
      Chris Phlipot committed
      This fixes a bug caused by an uninitialized callchain cursor. The crash
      first appeared in:
      
      6f736735 ("perf evsel: Require that callchains be resolved before
      calling fprintf_{sym,callchain}")
      
      The callchain cursor is a struct that contains pointers, that when
      uninitialized will cause unpredictable behavior (usually a crash)
      when trying to append to the callchain.
      
      The existing implementation has the following issues:
      
      1. The callchain cursor used is not initialized, resulting in
      	unpredictable behavior when used.
      2. The cursor is declared on the stack. Even if it is properly initialized,
      	the implementation will leak memory when the function returns,
      	since all the references to the callchain_nodes allocated by
      	callchain_cursor_append will be lost when the cursor goes out of
      	scope.
      3. Storing the cursor on the stack is inefficient. Even if memory is
      	properly freed when it goes out of scope, a performance penalty
      	will be incurred due to reallocation of callchain nodes.
      	callchain_cursor_append is designed to avoid these reallocations
      	when an existing cursor is reused.
      
      This patch fixes the crash by replacing cursor_callchain with a reference
      to the global callchain_cursor which also resolves all 3 issues mentioned
      above.
      
      How to reproduce the crash:
      
        $ perf record --call-graph=dwarf stress -t 1 -c 1
        $ perf script > /dev/null
        Segfault
      Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: 6f736735 ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}")
      Link: http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlipot0@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e557b674
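The three issues listed in this commit all follow from how a reusable cursor is meant to work: it must be initialized once, it owns the nodes it allocates, and resetting it keeps those nodes for the next use. A simplified model is sketched below. The struct layout and function names are hypothetical and are not perf's actual callchain_cursor definitions.

```c
#include <stdlib.h>

/* Simplified model of a reusable cursor; not perf's actual structs. */
struct node {
	unsigned long ip;
	struct node *next;
};

struct cursor {
	struct node *first;	/* head of the list of allocated nodes */
	struct node **last;	/* slot where the next entry goes */
	size_t nr;		/* entries valid since the last reset */
	size_t allocated;	/* nodes ever allocated (for illustration) */
};

/* Must run once before first use; skipping it leaves 'last' dangling. */
static void cursor_init(struct cursor *c)
{
	c->first = NULL;
	c->last = &c->first;
	c->nr = 0;
	c->allocated = 0;
}

/* Rewind without freeing, so later appends reuse the existing nodes. */
static void cursor_reset(struct cursor *c)
{
	c->last = &c->first;
	c->nr = 0;
}

/* Append one entry, allocating a node only if none is available. */
static int cursor_append(struct cursor *c, unsigned long ip)
{
	struct node *n = *c->last;

	if (n == NULL) {
		n = calloc(1, sizeof(*n));
		if (n == NULL)
			return -1;
		*c->last = n;
		c->allocated++;
	}
	n->ip = ip;
	c->last = &n->next;
	c->nr++;
	return 0;
}
```

In this model, a cursor that lives on the stack and is thrown away after each call loses every node it allocated, while a single long-lived cursor that is reset between samples allocates only when a callchain is deeper than any seen before.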
    • A
      perf trace: Make --pf maj/min/all use callchains too · 0c3a6ef4
      Arnaldo Carvalho de Melo committed
      Forgot about page faults, a software event, when adding support for callchains,
      fix it:
      
        # trace --no-syscalls --pf maj --call dwarf
           0.000 ( 0.000 ms): Xorg/2068 majfault [sfbSegment1+0x0] => /usr/lib64/xorg/modules/drivers/intel_drv.so@0x11b490 (x.)
                                             sfbSegment1+0x0 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             damagePolySegment+0x77 (/usr/libexec/Xorg)
                                             ProcPolySegment+0xe7 (/usr/libexec/Xorg)
                                             Dispatch+0x25f (/usr/libexec/Xorg)
                                             dix_main+0x3c3 (/usr/libexec/Xorg)
                                             __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                             _start+0x29 (/usr/libexec/Xorg)
           0.257 ( 0.000 ms): Xorg/2068 majfault [miZeroClipLine+0x0] => /usr/libexec/Xorg@0x18e830 (x.)
                                             miZeroClipLine+0x0 (/usr/libexec/Xorg)
                                             _fbSegment+0x2c0 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sfbSegment1+0x67 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             fbPolySegment32+0x361 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             sna_poly_segment+0x743 (/usr/lib64/xorg/modules/drivers/intel_drv.so)
                                             damagePolySegment+0x77 (/usr/libexec/Xorg)
                                             ProcPolySegment+0xe7 (/usr/libexec/Xorg)
                                             Dispatch+0x25f (/usr/libexec/Xorg)
                                             dix_main+0x3c3 (/usr/libexec/Xorg)
                                             __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
                                             _start+0x29 (/usr/libexec/Xorg)
      ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-8h6ssirw5z15qyhy2lwd6f89@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0c3a6ef4
    • A
      perf trace: Extract evsel constructor from perf_evlist__add_pgfault · 0ae537cb
      Arnaldo Carvalho de Melo committed
      Prep work for next patches, where we'll need access to the created
      evsels, to possibly configure callchains.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-2pcgsgnkgellhlcao4aub8tu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0ae537cb
    • A
      perf buildid: Fix off-by-one in write_buildid() · 70a2cba9
      Andrey Ryabinin committed
      write_buildid() increments 'name_len' with the intention of taking into
      account the trailing zero byte. However, 'name_len' was already incremented
      in machine__write_buildid_table() before. So this leads to an
      out-of-bounds read in do_write():
      
        $ ./perf record sleep 0
        [ perf record: Woken up 1 times to write data ]
        =================================================================
        ==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
        READ of size 19 at 0x00000099fc92 thread T0
            #0 0x7f1aa9c7eab4  (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
            #1 0x649c5b in do_write util/header.c:67
            #2 0x649c5b in write_padded util/header.c:82
            #3 0x57e8bc in write_buildid util/build-id.c:239
            #4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
        ...
      
        0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
          '*.LC99' is ascii string '[kernel.kallsyms]'
        ...
      
        Shadow bytes around the buggy address:
          0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9
        =>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9
          0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
      [ Remove the off-by-one at the origin, to keep len(s) == strlen(s) assumption ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      70a2cba9
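The arithmetic behind this off-by-one can be checked with the string from the AddressSanitizer report. The sketch below assumes the convention stated in the maintainer's note, that the caller passes strlen(s) and the trailing zero byte is added exactly once on the writer side. `name_bytes_to_write()` is a hypothetical helper, not a function from the patch.

```c
#include <string.h>

/*
 * Hypothetical writer-side helper: the caller passes strlen(name) and
 * the trailing zero byte is accounted for exactly once, here.
 */
static size_t name_bytes_to_write(size_t name_len)
{
	return name_len + 1;
}

/* The string from the report: 17 characters plus the zero byte = 18. */
static const char kallsyms_name[] = "[kernel.kallsyms]";
```

With strlen(kallsyms_name) as input the result is 18, the size of the global in the report. If the caller had already added one, the result would be 19, matching the reported "READ of size 19".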