1. 26 8月, 2015 5 次提交
    • W
      perf probe: Support probing at absolute address · da15bd9d
      Wang Nan 提交于
      It should be useful to allow 'perf probe' probe at absolute offset of a
      target. For example, when (u)probing at a instruction of a shared object
      in a embedded system where debuginfo is not avaliable but we know the
      offset of that instruction by manually digging.
      
      This patch enables following perf probe command syntax:
      
        # perf probe 0xffffffff811e6615
      
      And
      
        # perf probe /lib/x86_64-linux-gnu/libc-2.19.so 0xeb860
      
      In the above example, we don't need a anchor symbol, so it is possible
      to compute absolute addresses using other methods and then use 'perf
      probe' to create the probing points.
      
      v1 -> v2:
        Drop the leading '+' in cmdline;
        Allow uprobing at offset 0x0;
        Improve 'perf probe -l' result when uprobe at area without debuginfo.
      
      v2 -> v3:
        Split bugfix to a separated patch.
      
      Test result:
      
        # perf probe 0xffffffff8119d175 %ax
        # perf probe sys_write %ax
        # perf probe /lib64/libc-2.18.so 0x0 %ax
        # perf probe /lib64/libc-2.18.so 0x5 %ax
        # perf probe /lib64/libc-2.18.so 0xd8e40 %ax
        # perf probe /lib64/libc-2.18.so __write %ax
        # perf probe /lib64/libc-2.18.so 0xd8e49 %ax
        # cat /sys/kernel/debug/tracing/uprobe_events
      
        p:probe_libc/abs_0 /lib64/libc-2.18.so:0x          (null) arg1=%ax
        p:probe_libc/abs_5 /lib64/libc-2.18.so:0x0000000000000005 arg1=%ax
        p:probe_libc/abs_d8e40 /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax
        p:probe_libc/__write /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax
        p:probe_libc/abs_d8e49 /lib64/libc-2.18.so:0x00000000000d8e49 arg1=%ax
      
        # cat /sys/kernel/debug/tracing/kprobe_events
      
        p:probe/abs_ffffffff8119d175 0xffffffff8119d175 arg1=%ax
        p:probe/sys_write _text+1692016 arg1=%ax
      
        # perf probe -l
      
        Failed to find debug information for address 5
          probe:abs_ffffffff8119d175 (on sys_write+5 with arg1)
          probe:sys_write      (on sys_write with arg1)
          probe_libc:__write   (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1)
          probe_libc:abs_0     (on 0x0 in /lib64/libc-2.18.so with arg1)
          probe_libc:abs_5     (on 0x5 in /lib64/libc-2.18.so with arg1)
          probe_libc:abs_d8e40 (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1)
          probe_libc:abs_d8e49 (on __GI___libc_write+9 in /lib64/libc-2.18.so with arg1)
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440586666-235233-7-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      da15bd9d
    • W
      perf probe: Fix error reported when offset without function · 6c6e024f
      Wang Nan 提交于
      This patch fixes a bug that, when offset is provided but function is
      lost, parse_perf_probe_point() will give a "" string as function name,
      so the checking code at the end of parse_perf_probe_point() become
      useless.  For example:
      
        # perf probe +0x1234
        Failed to find symbol  in kernel
          Error: Failed to add events.
      
      After this patch:
      
        # perf probe +0x1234
        Semantic error :Offset requires an entry function.
          Error: Command Parse Error.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440586666-235233-6-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6c6e024f
    • W
      perf probe: Fix list result when address is zero · be07afe9
      Wang Nan 提交于
      When manually added uprobe point with zero address, 'perf probe -l'
      reports error. For example:
      
        # echo p:probe_libc/abs_0 /path/to/lib.bin:0x0 arg1=%ax > \
                 /sys/kernel/debug/tracing/uprobe_events
      
        # perf probe -l
        Error: Failed to show event list.
      
      Probing at 0x0 is possible and useful when lib.bin is not a normal
      shared object but is manually mapped. However, in this case kernel
      report:
      
        # cat /sys/kernel/debug/tracing/uprobe_events
        p:probe_libc/abs_0 /path/to/lib.bin:0x          (null) arg1=%ax
      
      This patch supports the above kernel output.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440586666-235233-5-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      be07afe9
    • W
      perf probe: Fix list result when symbol can't be found · 614e2fdb
      Wang Nan 提交于
      'perf probe -l' reports error if it is unable find symbol through
      address. Here is an example.
      
        # echo 'p:probe_libc/abs_5 /lib64/libc.so.6:0x5' >
                /sys/kernel/debug/tracing/uprobe_events
        # cat /sys/kernel/debug/tracing/uprobe_events
         p:probe_libc/abs_5 /lib64/libc.so.6:0x0000000000000005
        # perf probe -l
          Error: Failed to show event list
      
      Also, this situation triggers a logical inconsistency in
      convert_to_perf_probe_point() that, it returns ENOMEM but actually it
      never try strdup().
      
      This patch removes !tp->module && !is_kprobe condition, so it always
      uses address to build function name if symbol not found.
      
      Test result:
      
        # perf probe -l
          probe_libc:abs_5     (on 0x5 in /lib64/libc.so.6)
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440586666-235233-4-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      614e2fdb
    • W
      perf probe: Prevent segfault when reading probe point with absolute address · e486367f
      Wang Nan 提交于
      'perf probe -l' panic if there is a manually inserted probing point with
      absolute address. For example:
      
        # echo 'p:probe/abs_ffffffff811e6615 0xffffffff811e6615' > /sys/kernel/debug/tracing/kprobe_events
        # perf probe -l
        Segmentation fault (core dumped)
      
      This patch fix this problem by considering the situation that
      "tp->symbol == NULL" in find_perf_probe_point_from_dwarf() and
      find_perf_probe_point_from_map().
      
      After this patch:
      
        # perf probe -l
        probe:abs_ffffffff811e6615 (on SyS_write+5@fs/read_write.c)
      
      And when debug info is missing:
      
        # rm -rf ~/.debug
        # mv /lib/modules/4.2.0-rc1+/build/vmlinux /lib/modules/4.2.0-rc1+/build/vmlinux.bak
        # perf probe -l
        probe:abs_ffffffff811e6615 (on sys_write+5)
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440509256-193590-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e486367f
  2. 25 8月, 2015 8 次提交
  3. 22 8月, 2015 1 次提交
  4. 21 8月, 2015 6 次提交
    • W
      perf probe: Try to use symbol table if searching debug info failed · 1c0bd0e8
      Wang Nan 提交于
      A problem can occur in a statically linked perf when vmlinux can be found:
      
       # perf probe --add sys_epoll_pwait
       probe-definition(0): sys_epoll_pwait
       symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null)
       0 arguments
       Looking at the vmlinux_path (7 entries long)
       Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols
       Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux
       Try to find probe point from debuginfo.
       Symbol sys_epoll_pwait address found : ffffffff8122bd40
       Matched function: SyS_epoll_pwait
       Failed to get call frame on 0xffffffff8122bd40
       An error occurred in debuginfo analysis (-2).
         Error: Failed to add events. Reason: No such file or directory (Code: -2)
      
      The reason is caused by libdw that, if libdw is statically linked, it
      can't load libebl_{arch}.so reliable.
      
      In this case it is still possible to get the address from
      /proc/kalksyms.  However, perf tries that only when libdw returns
      -EBADF.
      
      This patch gives it another chance to utilize symbol table, even if
      libdw returns an error code other than -EBADF.
      
      After applying this patch:
      
       # perf probe -nv --add sys_epoll_pwait
       probe-definition(0): sys_epoll_pwait
       symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null)
       0 arguments
       Looking at the vmlinux_path (7 entries long)
       Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols
       Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux
       Try to find probe point from debuginfo.
       Symbol sys_epoll_pwait address found : ffffffff8122bd40
       Matched function: SyS_epoll_pwait
       Failed to get call frame on 0xffffffff8122bd40
       An error occurred in debuginfo analysis (-2).
       Trying to use symbols.
       Opening /sys/kernel/debug/tracing/kprobe_events write=1
       Added new event:
       Writing event: p:probe/sys_epoll_pwait _text+2276672
         probe:sys_epoll_pwait (on sys_epoll_pwait)
      
       You can now use it in all perf tools, such as:
      
       	perf record -e probe:sys_epoll_pwait -aR sleep 1
      
      Although libdw returns an error (Failed to get call frame), perf tries
      symbol table and finally gets correct address.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1440151770-129878-2-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1c0bd0e8
    • A
      perf tools: Initialize reference counts in map__clone() · 66671d00
      Arnaldo Carvalho de Melo 提交于
      Map clone was written before we introduced reference counts for
      maps and dsos, so all that was needed was just a copy and then we
      would insert it into the new map_groups instance.
      
      Fix it by, after copying, initializing the map->refcnt, grabbing
      a struct dso refcount and resetting pointers that may be used
      to determine if a map, when deleted, is in a rb_tree.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-pd4mr80o5b9gvk50iineacec@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      66671d00
    • A
      perf tools: Add Intel BTS support · d0170af7
      Adrian Hunter 提交于
      Intel BTS support fits within the new auxtrace infrastructure.  Recording is
      supporting by identifying the Intel BTS PMU, parsing options and setting up
      events.
      
      Decoding is supported by queuing up trace data by thread and then decoding
      synchronously delivering synthesized event samples into the session processing
      for tools to consume.
      
      Committer note:
      
      E.g:
      
        [root@felicio ~]# perf record --per-thread -e intel_bts// ls
        anaconda-ks.cfg  apctest.output  bin  kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm  libexec  lock_page.bpf.c  perf.data  perf.data.old
        [ perf record: Woken up 3 times to write data ]
        [ perf record: Captured and wrote 4.367 MB perf.data ]
        [root@felicio ~]# perf evlist -v
        intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
        dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1
        [root@felicio ~]# perf script # the navigate in the pager to some interesting place:
          ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms])
          ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms])
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com
      [ Merged sample->time fix for bug found after first round of testing on slightly older kernel ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d0170af7
    • A
      perf tools: Fix Intel PT timestamp handling · 81cd60cc
      Adrian Hunter 提交于
      Events that don't sample the timestamp have a timestamp value of -1.
      
      Intel PT processing wasn't taking that into account.
      
      This is particularly noticeable with Intel BTS because timestamps are
      not requested by default.
      
      Then, if the conversion of -1 to TSC results in a small number, the
      processing is unaffected.
      
      However if the conversion results in a big number, then the data is
      processed prematurely before relevant sideband data like mmap events,
      which in turn results in samples with unknown dsos.
      
      Commiter note:
      
      Since BTS wasn't upstream, I split the patch to fold the BTS part with
      the patch introducing it, to avoid having this bug in the commit
      history. PT was already upstream, so this patch contains that part.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1440060692-5585-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      81cd60cc
    • A
      perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy · 133de940
      Adrian Hunter 提交于
      The "/proc/kcore requires CAP_SYS_RAWIO" message comes up all the time
      for 'perf script' if vmlinux is not found and the user isn't root, even
      when the kernel is not being traced and even though the message is only
      really relevant for annotation.
      
      Change it to pr_debug and instead put a note in the message displayed if
      annotation is not possible.
      
      Also, the file being accessed might not be /proc/kcore.  Tools can be
      directed to a different location using the --kallsyms option in which
      case kcore is expected to be in the same directory.  Adjust the message
      so it is not misleading in that case.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1440065260-8802-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      133de940
    • A
      perf script: Fix segfault using --show-mmap-events · 05169df5
      Adrian Hunter 提交于
      Patch "perf script: Don't assume evsel position of tracking events"
      changed 'perf script' to use 'perf_evlist__id2evsel()'. That results
      in a segfault if there is more than 1 event and there are
      synthesized mmap events e.g.
      
      	$ perf record -e cycles,instructions -p$$ sleep 1
      	$ perf script --show-mmap-events
      	Segmentation fault (core dumped)
      
      That happens because these synthesized events have an 'id' of zero
      which does not match any 'evsel'.
      
      Currently, these synthesized events use the sample type of the first
      evsel.
      
      Change 'perf_evlist__id2evsel()' to reflect that which also makes
      it consistent with 'perf_evlist__event2evsel()'.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Fixes: 06b234ec ("perf script: Don't assume evsel position of tracking events")
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1440059205-1765-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      05169df5
  5. 20 8月, 2015 2 次提交
  6. 17 8月, 2015 10 次提交
    • A
      perf tools: Take Intel PT into use · 5efb1d54
      Adrian Hunter 提交于
      To record an AUX area, the weak function auxtrace_record__init() must be
      implemented.
      
      Equally to decode an AUX area, the AUX area tracing type must be added
      to the perf_event__process_auxtrace_info() function.
      
      This patch makes those two changes plus hooks up default config for the
      intel_pt PMU.  Also some brief documentation is provided for using the
      tools with intel_pt.
      
      Commiter note:
      
      E.g:
      
        [root@perf4 ~]# dmesg
        451 [0.405807] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
        [root@perf4 ~]# perf --version
        perf version 4.1.g53874a
        [root@perf4 ~]#  perf record -e intel_pt//u -a sleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.383 MB perf.data ]
        [root@perf4 ~]# perf evlist
        intel_pt//u
        sched:sched_switch
        dummy:u
        [root@perf4 ~]# perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 0  of event 'intel_pt//u'
        # Event count (approx.): 0
        #
        # Overhead  Command  Shared Object  Symbol
        # ........  .......  .............  ......
        #
      
        # Samples: 393  of event 'sched:sched_switch'
        # Event count (approx.): 393
        #
        # Overhead  Command         Shared Object     Symbol
        # ........  ..............  ................  ..............
          49.62%  swapper         [kernel.vmlinux]  [k] __schedule
          10.69%  rcu_sched       [kernel.vmlinux]  [k] __schedule
           6.62%  rcuos/0         [kernel.vmlinux]  [k] __schedule
           5.60%  kworker/0:1     [kernel.vmlinux]  [k] __schedule
           3.56%  rcuos/3         [kernel.vmlinux]  [k] __schedule
           3.05%  kworker/u384:2  [kernel.vmlinux]  [k] __schedule
           2.54%  kworker/2:0     [kernel.vmlinux]  [k] __schedule
           2.54%  tuned           [kernel.vmlinux]  [k] __schedule
        <SNIP>
        # Samples: 0  of event 'dummy:u'
        # Event count (approx.): 0
        #
        # Overhead  Command  Shared Object  Symbol
        # ........  .......  .............  ......
      
        # Samples: 28  of event 'instructions:u'
        # Event count (approx.): 5030172
        #
        # Overhead  Command     Shared Object        Symbol
        # ........  ..........  ...................  ................................
        #
          21.43%  tuned       libpython2.7.so.1.0  [.] PyEval_EvalFrameEx
                       |
                       ---PyEval_EvalFrameEx
                          |
                          |--83.33%-- PyEval_EvalCodeEx
                          |          PyEval_EvalFrameEx
                          |          |
                          |          |--60.00%-- PyEval_EvalCodeEx
                          |          |          PyEval_EvalFrameEx
                          |          |          PyEval_EvalFrameEx
                          |          |
                          |           --40.00%-- PyEval_EvalFrameEx
                          |
                           --16.67%-- PyEval_EvalFrameEx
                                     PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalFrameEx
      
          14.29%  tuned       libpython2.7.so.1.0  [.] _PyType_Lookup
                       |
                       ---_PyType_Lookup
                          _PyObject_GenericGetAttrWithDict
                          PyEval_EvalFrameEx
                          PyEval_EvalCodeEx
                          PyEval_EvalFrameEx
                          PyEval_EvalCodeEx
                          PyEval_EvalFrameEx
                          |
                          |--75.00%-- PyEval_EvalFrameEx
                          |
                           --25.00%-- PyEval_EvalCodeEx
                                     PyEval_EvalFrameEx
                                     PyEval_EvalFrameEx
      
           3.57%  irqbalance  irqbalance           [.] 0x0000000000004038
                  |
                  ---0x4038
                     0x4761
                     0x4761
                     0x4761
                     0x49f1
                     0x2295
      
           3.57%  irqbalance  libc-2.17.so         [.] __GI_____strtoull_l_internal
                  |
                  ---__GI_____strtoull_l_internal
                     0x6f49
                     0x229a
      
           3.57%  irqbalance  libc-2.17.so         [.] __strchrnul
                  |
                  ---__strchrnul
                     vfprintf
                     __vsprintf_chk
                     __sprintf_chk
                     0x2724
                     0x4038
                     0x2331
      
           3.57%  irqbalance  libc-2.17.so         [.] __strstr_sse42
                  |
                  ---__strstr_sse42
                     0x71e0
                     0x229f
      
        # And now to some userspace ftrace on uninstrumented binaries 8-) :
        # Hand edited to make it a bit more compact, replacing /home/acme/bin/perf
        # with /bin/perf:
      
        [root@perf4 ~]# perf script
           perf 8921 [3] 7.310889: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310889: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310890: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310890: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310893: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310893: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310956: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310956: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310961: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310961: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.310968: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.310968: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311040: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.311040: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311046: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
           perf 8921 [3] 7.311046: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
           perf 8921 [3] 7.311050: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
           perf 8921 [3] 7.311050: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
      :
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-8-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5efb1d54
    • A
      perf tools: Add Intel PT support · 90e457f7
      Adrian Hunter 提交于
      Add support for Intel Processor Trace.
      
      Intel PT support fits within the new auxtrace infrastructure.  Recording
      is supporting by identifying the Intel PT PMU, parsing options and
      setting up events.
      
      Decoding is supported by queuing up trace data by cpu or thread and then
      decoding synchronously delivering synthesized event samples into the
      session processing for tools to consume.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-7-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      90e457f7
    • A
      perf tools: Add Intel PT decoder · f4aa0819
      Adrian Hunter 提交于
      Add support for decoding an Intel Processor Trace.
      
      Intel PT trace data must be 'decoded' which involves walking the object
      code and matching the trace data packets.
      
      The decoder requests a buffer of binary data via a get_trace()
      call-back, which it decodes using instruction information which it gets
      via another call-back walk_insn().
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-6-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f4aa0819
    • A
      perf tools: Add Intel PT log · 53af9284
      Adrian Hunter 提交于
      Add a facility to log Intel Processor Trace decoding.  The log is
      intended for debugging purposes only.
      
      The log file name is "intel_pt.log" and is opened in the current
      directory.  The log contains a record of all packets and instructions
      decoded and can get very large (10 MB would be a small one).
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-5-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      53af9284
    • A
      perf tools: Add Intel PT instruction decoder · 237fae79
      Adrian Hunter 提交于
      Add support for decoding instructions for Intel Processor Trace.  The
      kernel x86 instruction decoder is copied for this.
      
      This essentially provides intel_pt_get_insn() which takes a binary
      buffer, uses the kernel's x86 instruction decoder to get details of the
      instruction and then categorizes it for consumption by an Intel PT
      decoder.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1439450095-30122-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      237fae79
    • A
      perf tools: Add Intel PT packet decoder · a4e92590
      Adrian Hunter 提交于
      Add support for decoding Intel Processor Trace packets.
      
      This essentially provides intel_pt_get_packet() which takes a buffer of
      binary data and returns the decoded packet.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a4e92590
    • A
      perf auxtrace: Add Intel PT as an AUX area tracing type · 55ea4ab4
      Adrian Hunter 提交于
      Add the Intel Processor Trace type constant PERF_AUXTRACE_INTEL_PT.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1437150840-31811-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      55ea4ab4
    • A
      perf tools: Add a helper function to probe whether cpu-wide tracing is possible · 83509565
      Adrian Hunter 提交于
      Add a helper function to probe whether cpu-wide tracing is possible.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1439458857-30636-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      83509565
    • A
      perf symbols: Fix annotation of vdso · f0ee3b46
      Adrian Hunter 提交于
      Older kernels attempt to prelink vdso to its virtual address.  To permit
      annotation using objdump, the map__rip_2objdump() calculation must
      result in that same address which we can infer from the start and offset
      of the text section.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1439556606-11297-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f0ee3b46
    • A
      perf annotate: Fix 32-bit compilation error in util/annotate.c · 3d7245b0
      Adrian Hunter 提交于
      Fix the following 32-bit compilation errors:
      
        util/annotate.c: In function ‘addr_map_symbol__account_cycles’:
        util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘u64’ [-Werror=format=]
          pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n",
            ^
        util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘u64’ [-Werror=format=]
        util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘u64’ [-Werror=format=]
      
      These were introduced by the patch:
      
      "perf report: Add infrastructure for a cycles histogram"
      
      Also change the 'saddr' variable from 'unsigned long' to 'u64'
      noting that theoretically we could be processing data captured
      on a 64-bit machine but processing it on a 32-bit machine.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: d4957633 ("perf report: Add infrastructure for a cycles histogram")
      Link: http://lkml.kernel.org/r/1439536294-18241-1-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3d7245b0
  7. 14 8月, 2015 1 次提交
  8. 13 8月, 2015 4 次提交
    • K
      perf report: Show call graph from reference events · 9e207ddf
      Kan Liang 提交于
      Introduce --show-ref-call-graph for perf report to print reference
      callgraph for no callgraph event.
      
      Here is an example.
      
       perf report --show-ref-call-graph --stdio
      
       # To display the perf.data header info, please use
       --header/--header-only options.
       #
       #
       # Total Lost Samples: 0
       #
       # Samples: 5  of event 'cpu/cpu-cycles,call-graph=fp/'
       # Event count (approx.): 144985
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          72.30%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--22.62%-- __GI___libc_nanosleep
                        --77.38%-- [...]
      
      ......
      
       # Samples: 6  of event 'cpu/instructions,call-graph=no/', show reference callgraph
       # Event count (approx.): 172780
       #
       # Children      Self  Command  Shared Object     Symbol
       # ........  ........  .......  ................  ........................................
       #
          73.16%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--31.44%-- __GI___libc_nanosleep
                        --68.56%-- [...]
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9e207ddf
    • K
      perf callchain: Allow disabling call graphs per event · f9db0d0f
      Kan Liang 提交于
      This patch introduce "call-graph=no" to disable per-event callgraph.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,call-graph=fp/,cpu/instructions,call-graph=no/' sleep 1
      
        perf report --stdio
      
        # To display the perf.data header info, please use
        --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 6  of event 'cpu/cpu-cycles,call-graph=fp/'
        # Event count (approx.): 774218
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  ........................................
        #
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--97.30%-- __brk
                       |
                        --2.70%-- mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
      
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] perf_event_mmap
                    |
                    ---perf_event_mmap
                       |
                       |--97.30%-- do_brk
                       |          sys_brk
                       |          entry_SYSCALL_64_fastpath
                       |          __brk
                       |
                        --2.70%-- mmap_region
                                  do_mmap_pgoff
                                  vm_mmap_pgoff
                                  sys_mmap_pgoff
                                  sys_mmap
                                  entry_SYSCALL_64_fastpath
                                  mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
        ......
      
        # Samples: 6  of event 'cpu/instructions,call-graph=no/'
        # Event count (approx.): 359692
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  .................................
        #
           89.03%     0.00%  sleep    [unknown]         [.] 0xffff6598ffff6598
           89.03%     0.00%  sleep    ld-2.17.so        [.] _dl_resolve_conflicts
           89.03%     0.00%  sleep    [kernel.vmlinux]  [k] page_fault
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-2-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f9db0d0f
    • K
      perf callchain: Per-event type selection support · d457c963
      Kan Liang 提交于
      This patchkit adds the ability to set callgraph mode (fp, dwarf, lbr) per
      event. This in term can reduce sampling overhead and the size of the
      perf.data.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
      
       perf evlist -v
       cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
       config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
       inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
       1, exclude_guest: 1, mmap2: 1, comm_exec: 1
       cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
       sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
       disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
       exclude_guest: 1
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d457c963
    • M
      perf probe: Fix to show lines of sys_ functions correctly · 75186a9b
      Masami Hiramatsu 提交于
      "perf probe --lines sys_poll" shows only the first line of sys_poll,
      because the SYSCALL_DEFINE macro:
      
        ----
        SYSCALL_DEFINE*(foo,...)
        {
          body;
        }
        ----
      
        is expanded as below (on debuginfo)
      
        ----
      
        static inline int SYSC_foo(...)
        {
          body;
        }
        int SyS_foo(...) <- is an alias of sys_foo.
        {
          return SYSC_foo(...);
        }
        ----
      
      So, "perf probe --lines sys_foo" decodes SyS_foo function and it also skips
      inlined functions(SYSC_foo) inside the target function because those functions
      are usually defined somewhere else.
      
      To fix this issue, this fix checks whether the inlined function is defined at
      the same point of the target function, and if so, it doesn't skip the inline
      function.
      Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: NMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20150812012406.11811.94691.stgit@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      75186a9b
  9. 12 8月, 2015 1 次提交
  10. 11 8月, 2015 2 次提交