1. 08 12月, 2015 2 次提交
  2. 30 10月, 2015 1 次提交
    • W
      perf bpf: Attach eBPF filter to perf event · 1f45b1d4
      Wang Nan 提交于
      This is the final patch which makes basic BPF filter work. After
      applying this patch, users are allowed to use BPF filter like:
      
       # perf record --event ./hello_world.o ls
      
      A bpf_fd field is appended to 'struct evsel', and setup during the
      callback function add_bpf_event() for each 'probe_trace_event'.
      
      PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF program to a newly
      created perf event. The file descriptor of the eBPF program is passed to
      perf record using previous patches, and stored into evsel->bpf_fd.
      
      It is possible that different perf event are created for one kprobe
      events for different CPUs. In this case, when trying to call the ioctl,
      EEXIST will be return. This patch doesn't treat it as an error.
      
      Committer note:
      
      The bpf proggie used so far:
      
        __attribute__((section("fork=_do_fork"), used))
        int fork(void *ctx)
        {
      	  return 0;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
      
      failed to produce any samples, even with forks happening and it being
      running in system wide mode.
      
      That is because now the filter is being associated, and the code above
      always returns zero, meaning that all forks will be probed but filtered
      away ;-/
      
      Change it to 'return 1;' instead and after that:
      
        # trace --no-syscalls --event /tmp/foo.o
           0.000 perf_bpf_probe:fork:(ffffffff8109be30))
           2.333 perf_bpf_probe:fork:(ffffffff8109be30))
           3.725 perf_bpf_probe:fork:(ffffffff8109be30))
           4.550 perf_bpf_probe:fork:(ffffffff8109be30))
        ^C#
      
      And it works with all tools, including 'perf trace'.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1f45b1d4
  3. 28 10月, 2015 2 次提交
    • W
      perf tools: Enable pre-event inherit setting by config terms · 374ce938
      Wang Nan 提交于
      This patch allows perf record setting event's attr.inherit bit by
      config terms like:
      
        # perf record -e cycles/no-inherit/ ...
        # perf record -e cycles/inherit/ ...
      
      So user can control inherit bit for each event separately.
      
      In following example, a.out fork()s in main then do some complex
      CPU intensive computations in both of its children.
      
      Basic result with and without inherit:
      
        # perf record -e cycles -e instructions ./a.out
        [ perf record: Woken up 9 times to write data ]
        [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
        # perf report --stdio
        # ...
        # Samples: 23K of event 'cycles'
        # Event count (approx.): 23641752891
        ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30428312415
      
        # perf record -i -e cycles -e instructions ./a.out
        [ perf record: Woken up 5 times to write data ]
        [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
        ...
        # Samples: 12K of event 'cycles'
        # Event count (approx.): 11699501775
        ...
        # Samples: 12K of event 'instructions'
        # Event count (approx.): 15058023559
      
      Cancel inherit for one event when globally enable:
      
        # perf record -e cycles/no-inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
        ...
        # Samples: 12K of event 'cycles/no-inherit/'
        # Event count (approx.): 11895759282
       ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30668000441
      
      Enable inherit for one event when globally disable:
      
        # perf record -i -e cycles/inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
        ...
        # Samples: 23K of event 'cycles/inherit/'
        # Event count (approx.): 23285400229
        ...
        # Samples: 11K of event 'instructions'
        # Event count (approx.): 14969050259
      
      Committer note:
      
      One can check if the bit was set, in addition to seeing the result in
      the perf.data file size as above by doing one of:
      
        # perf record -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      So, the inherit bit was set in both, now, if we disable it globally using
      --no-inherit:
      
        # perf record --no-inherit -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
      
      No inherit bit set, then disabling it and setting just on the cycles event:
      
        # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
        # perf evlist -v
        cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      We can see it as well in by using a more verbose level of debug messages in
      the tool that sets up the perf_event_attr, 'perf record' in this case:
      
        [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          config                           0x1
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
      
      <SNIP>
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
      [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      374ce938
    • J
      perf evsel: Move id_offset out of struct perf_evsel union member · af339981
      Jiri Olsa 提交于
      Because the 'perf stat record' patches will use the id_offset member
      together with the priv pointer.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NKan Liang <kan.liang@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      af339981
  4. 06 10月, 2015 1 次提交
  5. 15 9月, 2015 2 次提交
  6. 14 9月, 2015 1 次提交
  7. 29 8月, 2015 1 次提交
  8. 13 8月, 2015 1 次提交
    • K
      perf callchain: Per-event type selection support · d457c963
      Kan Liang 提交于
      This patchkit adds the ability to set callgraph mode (fp, dwarf, lbr) per
      event. This in term can reduce sampling overhead and the size of the
      perf.data.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
      
       perf evlist -v
       cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
       config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
       inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
       1, exclude_guest: 1, mmap2: 1, comm_exec: 1
       cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
       sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
       disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
       exclude_guest: 1
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d457c963
  9. 11 8月, 2015 1 次提交
  10. 09 8月, 2015 1 次提交
  11. 05 8月, 2015 1 次提交
  12. 30 7月, 2015 2 次提交
  13. 21 7月, 2015 1 次提交
    • W
      perf record: Apply filter to all events in a glob matching · 15bfd2cc
      Wang Nan 提交于
      There is an old problem in perf's filter applying which first posted at
      Sep. 2014 at https://lkml.org/lkml/2014/9/9/944 that, if passing
      multiple events in a glob matching expression in cmdline then add
      '--filter' after them, the filter will be applied on only the last one.
      
      For example:
      
       # dd if=/dev/zero of=/dev/null &
       [1] 464
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.239 MB perf.data (2094 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 2K of event 'syscalls:sys_enter_read'
       # Event count (approx.): 2092
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      
      In this example, filter only applied on 'syscalls:sys_exit_read', and
      there's no way to set filter for ''syscalls:sys_enter_read'.
      
      This patch adds a 'cmdline_group_boundary' for 'struct evsel', and
      apply filter on all events between two boundary marks.
      
      After applying this patch:
      
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.031 MB perf.data (3 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 1  of event 'syscalls:sys_enter_read'
       # Event count (approx.): 1
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Reported-by: NBrendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1436513770-8896-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      15bfd2cc
  14. 06 7月, 2015 3 次提交
    • A
      perf evsel: Introduce append_filter() method · 64ec84f5
      Arnaldo Carvalho de Melo 提交于
      To allow building filters in evsel->filter, that will eventually be
      applied via perf_evsel__apply_filter().
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-sjfoes3pycx7nlpmgedca13v@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      64ec84f5
    • A
      perf evsel: Introduce set_filter method · 12467ae4
      Arnaldo Carvalho de Melo 提交于
      Replaces existing filter string with the one provided.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-jst49z83li0yx3g18o54u51a@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      12467ae4
    • A
      perf evsel: Rename set_filter to apply_filter · f47805a2
      Arnaldo Carvalho de Melo 提交于
      We need to be able to go on constructing a complex filter in multiple
      stages, since we can only set one filter per event.
      
      For instance, we need to be able, in 'perf trace' to filter by the
      'common_pid' field all the time, if only for the tracer itself, to
      avoid a feedback loop, and, in addition, we may want to filter the
      raw_syscalls:sys_{enter,exit} events by its 'id' filter, when using
      'perf trace -e open,close' or 'perf trace -e !open,close', i.e. when
      we are interested in just a subset of syscalls or when we are not
      interested in it.
      
      So we will have:
      
         perf_evsel__set_filter(evsel, char *filter)
      
             Replaces whatever is in evsel->filter.
      
         perf_evsel__append_filter(evsel, const char *op, char *filter)
      
             Appends, using op ("&&" or "||") with what is in evsel->filter.
      
         perf_evsel__apply_filter(evsel, filter):
      
              That actually applies a filter, be it the one being
              constructed in evsel->filter, or any other, for tools
              with more specific ways to build the filter, issuing
              the appropriate ioctl for all the evsel fds.
      
      The same changes will be made to the evlist__{set,apply} variants to
      keep everything consistent.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2s5z9xtpnc2lwio3cv5x0jek@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f47805a2
  15. 26 6月, 2015 6 次提交
  16. 16 6月, 2015 1 次提交
  17. 18 5月, 2015 1 次提交
    • A
      perf tools: Elliminate alignment holes · 86066064
      Arnaldo Carvalho de Melo 提交于
      perf_evsel:
      
      Before:
      
      	/* size: 320, cachelines: 5, members: 35 */
      	/* sum members: 304, holes: 3, sum holes: 16 */
      
      After:
      
      	/* size: 304, cachelines: 5, members: 35 */
      	/* last cacheline: 48 bytes */
      
      perf_evlist:
      
      Before:
      
      	/* size: 2544, cachelines: 40, members: 17 */
      	/* sum members: 2533, holes: 2, sum holes: 11 */
      	/* last cacheline: 48 bytes */
      
      After:
      
      	/* size: 2536, cachelines: 40, members: 17 */
      	/* sum members: 2533, holes: 1, sum holes: 3 */
      	/* last cacheline: 40 bytes */
      
      timechart:
      
      Before:
      
      	/* size: 288, cachelines: 5, members: 21 */
      	/* sum members: 271, holes: 2, sum holes: 10 */
      	/* padding: 7 */
      	/* last cacheline: 32 bytes */
      
      After:
      
      	/* size: 272, cachelines: 5, members: 21 */
      	/* sum members: 271, holes: 1, sum holes: 1 */
      	/* last cacheline: 16 bytes */
      
      thread:
      
      Before:
      
      	/* size: 112, cachelines: 2, members: 15 */
      	/* sum members: 101, holes: 2, sum holes: 11 */
      	/* last cacheline: 48 bytes */
      
      After:
      
      	/* size: 104, cachelines: 2, members: 15 */
      	/* sum members: 101, holes: 1, sum holes: 3 */
      	/* last cacheline: 40 bytes */
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-a543w7zjl9yyrg9nkf1teukp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      86066064
  18. 08 4月, 2015 1 次提交
    • P
      perf tools: Merge all perf_event_attr print functions · 2c5e8c52
      Peter Zijlstra 提交于
      Currently there's 3 (that I found) different and incomplete
      implementations of printing perf_event_attr.
      
      This is quite silly. Merge the lot.
      
      While this patch does not retain the exact form all printing that I
      found is debug output and thus it should not be critical.
      
      Also, I cannot find a single print_event_desc() caller.
      
      Pre:
      
       $ perf record -vv -e cycles -- sleep 1
       ------------------------------------------------------------
       perf_event_attr:
        type                0
        size                104
        config              0
        sample_period       4000
        sample_freq         4000
        sample_type         0x107
        read_format         0
        disabled            1    inherit             1
        pinned              0    exclusive           0
        exclude_user        0    exclude_kernel      0
        exclude_hv          0    exclude_idle        0
        mmap                1    comm                1
        mmap2               1    comm_exec           1
        freq                1    inherit_stat        0
        enable_on_exec      1    task                1
        watermark           0    precise_ip          0
        mmap_data           0    sample_id_all       1
        exclude_host        0    exclude_guest       1
        excl.callchain_kern 0    excl.callchain_user 0
        wakeup_events       0
        wakeup_watermark    0
        bp_type             0
        bp_addr             0
        config1             0
        bp_len              0
        config2             0
        branch_sample_type  0
        sample_regs_user    0
        sample_stack_user   0
        sample_regs_intr    0
       ------------------------------------------------------------
      
       $ perf evlist  -vv
       cycles: sample_freq=4000, size: 104, sample_type: IP|TID|TIME|PERIOD,
       disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, comm_exec: 1,
       freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1
      
       Post:
      
       $ ./perf record -vv -e cycles -- sleep 1
       ------------------------------------------------------------
       perf_event_attr:
        size                             112
        { sample_period, sample_freq }   4000
        sample_type                      IP|TID|TIME|PERIOD
        disabled                         1
        inherit                          1
        mmap                             1
        comm                             1
        freq                             1
        enable_on_exec                   1
        task                             1
        sample_id_all                    1
        exclude_guest                    1
        mmap2                            1
        comm_exec                        1
      ------------------------------------------------------------
      
       $ ./perf evlist  -vv
       cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq:
       1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1,
       mmap2: 1, comm_exec: 1
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20150407091150.644238729@infradead.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2c5e8c52
  19. 03 4月, 2015 1 次提交
  20. 19 2月, 2015 1 次提交
    • K
      perf tools: Construct LBR call chain · 384b6055
      Kan Liang 提交于
      LBR call stack only has user-space callchains. It is output in the
      PERF_SAMPLE_BRANCH_STACK data format. For kernel callchains, it's
      still in the form of PERF_SAMPLE_CALLCHAIN.
      
      The perf tool has to handle both data sources to construct a
      complete callstack.
      
      For the "perf report -D" option, both lbr and fp information will be
      displayed.
      
      A new call chain recording option "lbr" is introduced into the perf
      tool for LBR call stack. The user can use --call-graph lbr to get
      the call stack information from hardware.
      
      Here are some examples.
      
      When profiling bc(1) on Fedora 19:
      
        echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph lbr bc -l < cmd
      
      If enabling LBR, perf report output looks like:
      
          50.36%       bc  bc                 [.] bc_divide
                       |
                       --- bc_divide
                           execute
                           run_code
                           yyparse
                           main
                           __libc_start_main
                           _start
          33.66%       bc  bc                 [.] _one_mult
                       |
                       --- _one_mult
                           bc_divide
                           execute
                           run_code
                           yyparse
                           main
                           __libc_start_main
                           _start
           7.62%       bc  bc                 [.] _bc_do_add
                       |
                       --- _bc_do_add
                          |
                          |--99.89%-- 0x2000186a8
                           --0.11%-- [...]
           6.83%       bc  bc                 [.] _bc_do_sub
                       |
                       --- _bc_do_sub
                          |
                          |--99.94%-- bc_add
                          |          execute
                          |          run_code
                          |          yyparse
                          |          main
                          |          __libc_start_main
                          |          _start
                           --0.06%-- [...]
           0.46%       bc  libc-2.17.so       [.] __memset_sse2
                       |
                       --- __memset_sse2
                          |
                          |--54.13%-- bc_new_num
                          |          |
                          |          |--51.00%-- bc_divide
                          |          |          execute
                          |          |          run_code
                          |          |          yyparse
                          |          |          main
                          |          |          __libc_start_main
                          |          |          _start
                          |          |
                          |          |--30.46%-- _bc_do_sub
                          |          |          bc_add
                          |          |          execute
                          |          |          run_code
                          |          |          yyparse
                          |          |          main
                          |          |          __libc_start_main
                          |          |          _start
                          |          |
                          |           --18.55%-- _bc_do_add
                          |                     bc_add
                          |                     execute
                          |                     run_code
                          |                     yyparse
                          |                     main
                          |                     __libc_start_main
                          |                     _start
                          |
                           --45.87%-- bc_divide
                                     execute
                                     run_code
                                     yyparse
                                     main
                                     __libc_start_main
                                     _start
      
      If using FP, perf report output looks like:
      
        echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph fp bc -l < cmd
      
          50.49%       bc  bc                 [.] bc_divide
                       |
                       --- bc_divide
          33.57%       bc  bc                 [.] _one_mult
                       |
                       --- _one_mult
           7.61%       bc  bc                 [.] _bc_do_add
                       |
                       --- _bc_do_add
                           0x2000186a8
           6.88%       bc  bc                 [.] _bc_do_sub
                       |
                       --- _bc_do_sub
           0.42%       bc  libc-2.17.so       [.] __memcpy_ssse3_back
                       |
                       --- __memcpy_ssse3_back
      
      If using LBR, perf report -D output looks like:
      
      3458145275743 0x2fd750 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x2): 9748/9748: 0x408ea8 period: 609644 addr: 0
      ... LBR call chain: nr:8
      .....  0: fffffffffffffe00
      .....  1: 0000000000408e50
      .....  2: 000000000040a458
      .....  3: 000000000040562e
      .....  4: 0000000000408590
      .....  5: 00000000004022c0
      .....  6: 00000000004015dd
      .....  7: 0000003d1cc21b43
      ... FP chain: nr:2
      .....  0: fffffffffffffe00
      .....  1: 0000000000408ea8
       ... thread: bc:9748
       ...... dso: /usr/bin/bc
      
      The LBR call stack has the following known limitations:
      
       - Zero length calls are not filtered out by the hardware
      
       - Exception handing such as setjmp/longjmp will have calls/returns not
         match
      
       - Pushing different return address onto the stack will have
         calls/returns not match
      
       - If callstack is deeper than the LBR, only the last entries are
         captured
      Tested-by: NJiri Olsa <jolsa@kernel.org>
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Simon Que <sque@chromium.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1420482185-29830-3-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      384b6055
  21. 02 12月, 2014 2 次提交
  22. 25 11月, 2014 5 次提交
  23. 29 10月, 2014 2 次提交
    • A
      perf tools: Add id index · 3c659eed
      Adrian Hunter 提交于
      Add an index of the event identifiers, in preparation for Intel PT.
      
      The event id (also called the sample id) is a unique number
      allocated by the kernel to the event created by perf_event_open().  Events
      can include the event id by having a sample type including PERF_SAMPLE_ID or
      PERF_SAMPLE_IDENTIFIER.
      
      Currently the main use of the event id is to match an event back to the
      evsel to which it belongs i.e. perf_evlist__id2evsel()
      
      The purpose of this patch is to make it possible to match an event back to
      the mmap from which it was read.  The reason that is useful is because the
      mmap represents a time-ordered context (either for a cpu or for a thread).
      Intel PT decodes trace information on that basis.  In full-trace mode, that
      information can be recorded when the Intel PT trace is read, but in
      sample-mode the Intel PT trace data is embedded in a sample and it is in
      that case that the "id index" is needed.
      
      So the mmaps are numbered (idx) and the cpu and tid recorded against the id
      by perf_evlist__set_sid_idx() which is called by perf_evlist__mmap_per_evsel().
      
      That information is recorded on the perf.data file in the new "id index".
      idx, cpu and tid are added to struct perf_sample_id (which is the node of
      evlist's hash table to match ids to evsels).  The information can be
      retrieved using perf_evlist__id2sid().  Note however this all depends on
      having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER,
      otherwise ids are not recorded.
      
      The "id index" is a synthesized event record which will be created when
      Intel PT sampling is used by calling perf_event__synthesize_id_index().
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1414417770-18602-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3c659eed
    • A
      perf tools: Add facility to export data in database-friendly way · 0db15b1e
      Adrian Hunter 提交于
      This patch introduces an abstraction for exporting sample data in a
      database-friendly way.  The abstraction does not implement the actual
      output.  A subsequent patch takes this facility into use for extending
      the script interface.
      
      The abstraction is needed because static data like symbols, dsos, comms
      etc need to be exported only once.  That means allocating them a unique
      identifier and recording it on each structure.  The member 'db_id' is
      used for that.  'db_id' is just a 64-bit sequence number.
      
      Exporting centres around the db_export__sample() function which exports
      the associated data structures if they have not yet been allocated a
      db_id.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1414061124-26830-6-git-send-email-adrian.hunter@intel.com
      [ committer note: Stash db_id using symbol_conf.priv_size + symbol__priv() and foo->priv areas ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0db15b1e