1. 12 Apr, 2016 2 commits
  2. 02 Apr, 2016 1 commit
    • perf bpf: Add sample types for 'bpf-output' event · d37ba880
      Committed by Wang Nan
      Before this patch we can see very large timestamps in the events before
      the 'bpf-output' event. For example:
      
        # perf trace -vv -T --ev sched:sched_switch \
                            --ev bpf-output/no-inherit,name=evt/ \
                            --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                            usleep 10
        ...
        18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ...
        18446744073709.551 (         ): evt:Raise a BPF event!..)
        179427791.076 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
        179427791.081 (         ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120])
        ...
      
      We can also see the differences between bpf-output events and
      tracepoint events:
      
      For bpf-output events:
         sample_type                    IP|TID|RAW|IDENTIFIER
      
      For tracepoint events:
         sample_type                    IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER
      
      This patch fixes these differences by adding more sample types to
      bpf-output events.
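
The difference between the two sample_type lines above can be checked mechanically. The following is a minimal sketch, not perf code: the bit values are taken from enum perf_event_sample_format in include/uapi/linux/perf_event.h, while the helper names (parse_sample_type, sample_type_names, format_ms) are made up for illustration. It also shows that the huge value in the "before" output is what an all-ones 64-bit nanosecond timestamp looks like when printed in milliseconds, which is consistent with a sample that carries no TIME field (the all-ones sentinel is an assumption about the consumer, not something the commit states).

```python
# Bit values follow enum perf_event_sample_format in the kernel's
# include/uapi/linux/perf_event.h; only the names used here are listed.
PERF_SAMPLE_BITS = {
    "IP": 1 << 0,
    "TID": 1 << 1,
    "TIME": 1 << 2,
    "ID": 1 << 6,
    "CPU": 1 << 7,
    "PERIOD": 1 << 8,
    "RAW": 1 << 10,
    "IDENTIFIER": 1 << 16,
}


def parse_sample_type(text):
    """Turn a string such as 'IP|TID|RAW' into a bitmask."""
    mask = 0
    for name in text.split("|"):
        mask |= PERF_SAMPLE_BITS[name.strip()]
    return mask


def sample_type_names(mask):
    """List the known sample_type names set in mask, lowest bit first."""
    return [name for name, bit in PERF_SAMPLE_BITS.items() if mask & bit]


def format_ms(ns):
    """Format nanoseconds as milliseconds with three decimals (truncating)."""
    return f"{ns // 1_000_000}.{(ns % 1_000_000) // 1_000:03d}"


# The two sample_type strings quoted in the commit message.
BPF_OUTPUT_BEFORE = parse_sample_type("IP|TID|RAW|IDENTIFIER")
TRACEPOINT = parse_sample_type("IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER")

# Sample types the tracepoint events have but bpf-output lacked.
MISSING = sample_type_names(TRACEPOINT & ~BPF_OUTPUT_BEFORE)
print(MISSING)  # ['TIME', 'CPU', 'PERIOD']

# Assumption: a missing timestamp is represented as all-ones (u64).
UINT64_MAX = 2**64 - 1
print(format_ms(UINT64_MAX))  # 18446744073709.551
```

The second printed value matches the timestamp shown on the nanosleep and evt lines in the "before" output.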
      
      After this patch:
      
        # perf trace -vv -T --ev sched:sched_switch \
                            --ev bpf-output/no-inherit,name=evt/ \
                            --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                            usleep 10
        ...
        179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ...
        179877370.878 (         ): evt:Raise a BPF event!..)
        179877370.878 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
        179877370.882 (         ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120])
        179877370.945 (         ): evt:Raise a BPF event!..)
        ...
      
        # ./perf trace -vv -T --ev sched:sched_switch \
                              --ev bpf-output/no-inherit,name=evt/ \
                              --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                              usleep 10 2>&1 | grep sample_type
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
      
      The 'IDENTIFIER' info is not required because all events have the same
      sample_type.
      
      Committer notes:
      
      Further testing, on top of the changes making 'perf trace' avoid samples
      from events without PERF_SAMPLE_TIME:
      
      Before:
      
        # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
        <SNIP>
          0.560 ( 0.001 ms): brk(                                                   ) = 0x55e5a1df8000
          18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ...
          18446640227439.430 (         ): evt:Raise a BPF event!..)
          0.576 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
          18446640227439.430 (         ): evt:Raise a BPF event!..)
          0.645 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
          0.646 ( 0.076 ms):  ... [continued]: nanosleep()) = 0
        #
      
      After:
      
        # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
        <SNIP>
           0.292 ( 0.001 ms): brk(                          ) = 0x55c7cd6e1000
           0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ...
           0.302 (         ): evt:Raise a BPF event!..)
           0.303 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
           0.397 (         ): evt:Raise a BPF event!..)
           0.397 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
           0.398 ( 0.100 ms):  ... [continued]: nanosleep()) = 0
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 23 Mar, 2016 1 commit
  4. 23 Feb, 2016 1 commit
    • perf tools: Introduce bpf-output event · 03e0a7df
      Committed by Wang Nan
      Commit a43eec30 ("bpf: introduce bpf_perf_event_output() helper")
      adds a helper to enable a BPF program to output data to a perf ring
      buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This
      patch enables perf to create events of that type. Now a perf user can
      use the following cmdline to receive output data from BPF programs:
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script
           perf 1560 [004] 347747.086295:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086300:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086315:  evt: ffffffff811fd201 sys_write ...
                  ...
      
      Test result:
      
        # cat test_bpf_output.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
       	unsigned int type;
       	unsigned int key_size;
       	unsigned int value_size;
       	unsigned int max_entries;
        };
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
       	(void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
       	(void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
       	(void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
       	(void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
       	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
       	.key_size = sizeof(int),
       	.value_size = sizeof(u32),
       	.max_entries = __NR_CPUS__,
        };
      
        SEC("func_write=sys_write")
        int func_write(void *ctx)
        {
       	struct {
       		u64 ktime;
       		int cpuid;
       	} __attribute__((packed)) output_data;
       	char error_data[] = "Error: failed to output: %d\n";
      
       	output_data.cpuid = get_smp_processor_id();
       	output_data.ktime = ktime_get_ns();
       	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
       				    &output_data, sizeof(output_data));
       	if (err)
       		trace_printk(error_data, sizeof(error_data), err);
       	return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************ END ***************************/
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script | grep ls
           ls  2242 [003] 347851.557563:   evt: ffffffff811fd201 sys_write ...
           ls  2242 [003] 347851.557571:   evt: ffffffff811fd201 sys_write ...
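
On the consumer side, the payload written by func_write above is the packed struct output_data: a u64 ktime followed by an int cpuid, 12 bytes with no padding. A minimal sketch of decoding such a payload follows. It assumes a little-endian machine and that the raw bytes have already been extracted from the sample (how to obtain them is not covered here); decode_output_data and the example values are hypothetical.

```python
import struct

# Mirrors 'struct { u64 ktime; int cpuid; } __attribute__((packed))'.
# '<' selects little-endian with no alignment padding (an assumption
# about the recording machine).
OUTPUT_DATA = struct.Struct("<Qi")


def decode_output_data(raw):
    """Decode the leading bytes of a raw bpf-output payload."""
    if len(raw) < OUTPUT_DATA.size:
        raise ValueError("payload shorter than output_data")
    ktime_ns, cpuid = OUTPUT_DATA.unpack_from(raw)
    return {"ktime_ns": ktime_ns, "cpuid": cpuid}


# Hypothetical payload; the trailing zero bytes stand in for any
# alignment padding that may follow the struct in a raw sample.
payload = OUTPUT_DATA.pack(347851557563000, 3) + b"\x00" * 4

print(OUTPUT_DATA.size)  # 12
print(decode_output_data(payload))
```

Using unpack_from rather than unpack keeps the decoder tolerant of trailing bytes after the struct.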
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 18 Feb, 2016 1 commit
    • perf record: Add --all-user/--all-kernel options · 85723885
      Committed by Jiri Olsa
      Allow the user to easily switch all events to user or kernel space with
      the simple --all-user or --all-kernel options.
      
      This will be handy in the perf mem/c2c wrappers for switching monitoring
      modes easily.
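
A minimal sketch of the intended behaviour, assuming the options map onto the exclude_kernel and exclude_user bits of perf_event_attr (those two attribute bits exist; the mapping and the helper name apply_space_options are illustrative, not taken from the patch):

```python
def apply_space_options(attr, all_user=False, all_kernel=False):
    """Return a copy of attr restricted to user or kernel space.

    attr is a plain dict standing in for struct perf_event_attr.
    """
    if all_user and all_kernel:
        # The committer made the two options mutually exclusive.
        raise ValueError("--all-user cannot be used with --all-kernel")
    result = dict(attr)
    if all_user:
        result["exclude_kernel"] = 1
        result["exclude_user"] = 0
    if all_kernel:
        result["exclude_kernel"] = 0
        result["exclude_user"] = 1
    return result


base = {"exclude_kernel": 0, "exclude_user": 0}
print(apply_space_options(base, all_user=True))
print(apply_space_options(base, all_kernel=True))
```

With --all-user no kernel samples should be recorded, which is what the empty `grep '\[k\]'` result in the committer's test below shows.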
      
      Committer note:
      
      Testing it:
      
        # perf record --all-kernel --all-user -a sleep 2
         Error: option `all-user' cannot be used with all-kernel
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
              --all-user        Configure all used events to run in user space.
              --all-kernel      Configure all used events to run in kernel space.
        # perf record --all-user --all-kernel -a sleep 2
         Error: option `all-kernel' cannot be used with all-user
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
              --all-kernel      Configure all used events to run in kernel space.
              --all-user        Configure all used events to run in user space.
        # perf record --all-user -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ]
        # perf report | grep '\[k\]'
        # perf record --all-kernel -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ]
        # perf report | grep '\[\.\]'
        #
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org
      [ Made those options to be mutually exclusive ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 26 Jan, 2016 1 commit
  7. 09 Jan, 2016 1 commit
    • perf evlist: Add --trace-fields option to show trace fields · 775d8a1b
      Committed by Namhyung Kim
      To use dynamic sort keys, it might be good to add an option to see the
      list of field names.
      
        $ perf evlist -i perf.data.sched
        sched:sched_switch
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_runtime
        sched:sched_process_fork
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_migrate_task
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      
        $ perf evlist -i perf.data.sched --trace-fields
        sched:sched_switch: trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
        sched:sched_stat_wait: trace_fields: comm,pid,delay
        sched:sched_stat_sleep: trace_fields: comm,pid,delay
        sched:sched_stat_iowait: trace_fields: comm,pid,delay
        sched:sched_stat_runtime: trace_fields: comm,pid,runtime,vruntime
        sched:sched_process_fork: trace_fields: parent_comm,parent_pid,child_comm,child_pid
        sched:sched_wakeup: trace_fields: comm,pid,prio,success,target_cpu
        sched:sched_wakeup_new: trace_fields: comm,pid,prio,success,target_cpu
        sched:sched_migrate_task: trace_fields: comm,pid,prio,orig_cpu,dest_cpu
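
Since the point of the option is to discover field names for dynamic sort keys, the output is easy to post-process. Below is a minimal sketch of a parser for the non-verbose format shown above; parse_trace_fields is a hypothetical helper, and lines in any other format (such as the tip line or the -v output) are simply skipped.

```python
def parse_trace_fields(output):
    """Map each event name to its list of trace field names."""
    fields = {}
    for line in output.splitlines():
        # Event names contain ':' themselves, so split on the full marker.
        event, sep, names = line.strip().partition(": trace_fields: ")
        if not sep:
            continue  # not a '<event>: trace_fields: ...' line
        fields[event] = names.split(",")
    return fields


# Two lines copied from the output above, plus the tip line.
sample_output = """\
sched:sched_switch: trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
sched:sched_stat_wait: trace_fields: comm,pid,delay
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
"""

parsed = parse_trace_fields(sample_output)
print(parsed["sched:sched_stat_wait"])  # ['comm', 'pid', 'delay']
```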
      
      Committer notes:
      
      For another file, in verbose mode:
      
        # perf evlist -v --trace-fields
        sched:sched_switch: type: 2, size: 112, config: 0x10b, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
        #
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1452125549-1511-5-git-send-email-namhyung@kernel.org
      [ Replaced 'trace_fields=' with 'trace_fields: ' to make the output consistent in -v mode ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 14 Dec, 2015 1 commit
  9. 08 Dec, 2015 2 commits
  10. 27 Nov, 2015 1 commit
  11. 30 Oct, 2015 1 commit
    • perf bpf: Attach eBPF filter to perf event · 1f45b1d4
      Committed by Wang Nan
      This is the final patch that makes the basic BPF filter work. After
      applying this patch, users can use a BPF filter like this:
      
       # perf record --event ./hello_world.o ls
      
      A bpf_fd field is appended to 'struct evsel' and set up during the
      callback function add_bpf_event() for each 'probe_trace_event'.
      
      The PERF_EVENT_IOC_SET_BPF ioctl is used to attach the eBPF program to a
      newly created perf event. The file descriptor of the eBPF program is
      passed to perf record by the previous patches and stored in evsel->bpf_fd.
      
      It is possible that different perf events are created for one kprobe
      event on different CPUs. In this case, when trying to call the ioctl,
      EEXIST will be returned. This patch doesn't treat it as an error.
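
The EEXIST handling can be sketched as follows. This is an illustration of the error policy only: the set_bpf callback stands in for the PERF_EVENT_IOC_SET_BPF ioctl, and attach_bpf_to_events and make_fake_set_bpf are hypothetical names.

```python
import errno


def attach_bpf_to_events(event_fds, bpf_fd, set_bpf):
    """Attach bpf_fd to every event fd, tolerating EEXIST.

    Returns the number of calls that actually attached the program.
    """
    attached = 0
    for fd in event_fds:
        try:
            set_bpf(fd, bpf_fd)
        except OSError as err:
            if err.errno == errno.EEXIST:
                continue  # already attached through another per-CPU event
            raise  # any other failure is still an error
        attached += 1
    return attached


def make_fake_set_bpf():
    """Fake ioctl: the first call succeeds, later ones fail with EEXIST."""
    state = {"attached": False}

    def set_bpf(event_fd, bpf_fd):
        if state["attached"]:
            raise OSError(errno.EEXIST, "program already attached")
        state["attached"] = True

    return set_bpf


# Four per-CPU events backed by one kprobe (made-up fd numbers).
print(attach_bpf_to_events([10, 11, 12, 13], 5, make_fake_set_bpf()))  # 1
```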
      
      Committer note:
      
      The bpf proggie used so far:
      
        __attribute__((section("fork=_do_fork"), used))
        int fork(void *ctx)
        {
      	  return 0;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
      
      failed to produce any samples, even with forks happening and perf
      running in system-wide mode.
      
      That is because now the filter is being associated, and the code above
      always returns zero, meaning that all forks will be probed but filtered
      away ;-/
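
The filter semantics described here can be modelled in a few lines. This is a toy model under the assumption stated in the note (a zero return filters the hit away, a non-zero return keeps it); filter_hits and the two filter functions are hypothetical.

```python
def filter_hits(hits, bpf_filter):
    """Keep only the probe hits for which the filter returns non-zero."""
    return [hit for hit in hits if bpf_filter(hit) != 0]


def always_zero(ctx):
    return 0  # like the proggie above: every hit is filtered away


def always_one(ctx):
    return 1  # like the 'return 1;' variant: every hit produces a sample


forks = ["fork#1", "fork#2", "fork#3"]
print(filter_hits(forks, always_zero))  # []
print(filter_hits(forks, always_one))   # ['fork#1', 'fork#2', 'fork#3']
```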
      
      Change it to 'return 1;' instead and after that:
      
        # trace --no-syscalls --event /tmp/foo.o
           0.000 perf_bpf_probe:fork:(ffffffff8109be30))
           2.333 perf_bpf_probe:fork:(ffffffff8109be30))
           3.725 perf_bpf_probe:fork:(ffffffff8109be30))
           4.550 perf_bpf_probe:fork:(ffffffff8109be30))
        ^C#
      
      And it works with all tools, including 'perf trace'.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 28 Oct, 2015 1 commit
    • perf tools: Enable pre-event inherit setting by config terms · 374ce938
      Committed by Wang Nan
      This patch allows perf record to set an event's attr.inherit bit via
      config terms like:
      
        # perf record -e cycles/no-inherit/ ...
        # perf record -e cycles/inherit/ ...
      
      So the user can control the inherit bit for each event separately.
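
The precedence between the global setting and the per-event term can be sketched like this. It is a model of the behaviour shown in the examples of this commit message, not the perf implementation; resolve_inherit is a hypothetical helper.

```python
def resolve_inherit(no_inherit_global, term=None):
    """Return the effective inherit bit for one event.

    no_inherit_global: True when -i/--no-inherit was given.
    term: 'inherit', 'no-inherit', or None when the event has no such term.
    """
    inherit = not no_inherit_global  # global default
    if term == "inherit":
        inherit = True  # per-event term wins over the global setting
    elif term == "no-inherit":
        inherit = False
    elif term is not None:
        raise ValueError(f"unknown term: {term}")
    return inherit


# perf record -e cycles/no-inherit/ -e instructions ./a.out
print(resolve_inherit(False, "no-inherit"), resolve_inherit(False))  # False True
# perf record -i -e cycles/inherit/ -e instructions ./a.out
print(resolve_inherit(True, "inherit"), resolve_inherit(True))  # True False
```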
      
      In the following example, a.out fork()s in main and then does some
      complex, CPU-intensive computations in both of its children.
      
      Basic result with and without inherit:
      
        # perf record -e cycles -e instructions ./a.out
        [ perf record: Woken up 9 times to write data ]
        [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
        # perf report --stdio
        # ...
        # Samples: 23K of event 'cycles'
        # Event count (approx.): 23641752891
        ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30428312415
      
        # perf record -i -e cycles -e instructions ./a.out
        [ perf record: Woken up 5 times to write data ]
        [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
        ...
        # Samples: 12K of event 'cycles'
        # Event count (approx.): 11699501775
        ...
        # Samples: 12K of event 'instructions'
        # Event count (approx.): 15058023559
      
      Cancel inherit for one event when it is enabled globally:
      
        # perf record -e cycles/no-inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
        ...
        # Samples: 12K of event 'cycles/no-inherit/'
        # Event count (approx.): 11895759282
       ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30668000441
      
      Enable inherit for one event when it is disabled globally:
      
        # perf record -i -e cycles/inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
        ...
        # Samples: 23K of event 'cycles/inherit/'
        # Event count (approx.): 23285400229
        ...
        # Samples: 11K of event 'instructions'
        # Event count (approx.): 14969050259
      
      Committer note:
      
      One can check if the bit was set, in addition to seeing the result in
      the perf.data file size as above by doing one of:
      
        # perf record -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      So the inherit bit was set in both. Now, if we disable it globally using
      --no-inherit:
      
        # perf record --no-inherit -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
      
      No inherit bit set. Now, disabling it globally and setting it just on the cycles event:
      
        # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
        # perf evlist -v
        cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      We can see it as well by using a more verbose level of debug messages in
      the tool that sets up the perf_event_attr, 'perf record' in this case:
      
        [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          config                           0x1
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
      
      <SNIP>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
      [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 22 Oct, 2015 1 commit
  14. 06 Oct, 2015 2 commits
  15. 15 Sep, 2015 2 commits
  16. 14 Sep, 2015 1 commit
  17. 01 Sep, 2015 1 commit
    • perf record: Add ability to name registers to record · bcc84ec6
      Committed by Stephane Eranian
      This patch modifies the -I/--intr-regs option to enable passing the names
      of the registers to sample on interrupt. Registers can be specified by
      their symbolic names. For instance on x86, --intr-regs=ax,si.
      
      The motivation is to reduce the size of the perf.data file and the
      overhead of sampling by only collecting the registers useful to a
      specific analysis. For instance, for value profiling, sampling only the
      registers used to pass arguments to functions.
      
      With no parameter, --intr-regs still records all possible registers
      based on the architecture.
      
      To name registers, it is necessary to use the long form of the option,
      i.e., --intr-regs:
      
        $ perf record --intr-regs=si,di,r8,r9 .....
      
      To record any possible registers:
      
        $ perf record -I .....
        $ perf report --intr-regs ...
      
      To display the registers, one can use perf report -D
      
      To list the available registers:
      
        $ perf record --intr-regs=\?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
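
A minimal sketch of turning such a register list into a sampling bitmask. The bit positions are an assumption based on the x86 register enum in arch/x86/include/uapi/asm/perf_regs.h (AX to SS in bits 0 to 11, R8 to R15 in bits 16 to 23, with the segment registers in between not offered by the listing above); parse_intr_regs is a hypothetical helper, not the perf parser.

```python
# Names exactly as printed by 'perf record --intr-regs=\?' above.
AVAILABLE = ("AX BX CX DX SI DI BP SP IP FLAGS CS SS "
             "R8 R9 R10 R11 R12 R13 R14 R15").split()

# Assumed bit positions: AX..SS -> 0..11, R8..R15 -> 16..23.
REG_BIT = {name: bit for bit, name in enumerate(AVAILABLE[:12])}
REG_BIT.update({name: 16 + i for i, name in enumerate(AVAILABLE[12:])})


def parse_intr_regs(spec):
    """Convert 'si,di,r8,r9' into a register bitmask (case-insensitive)."""
    mask = 0
    for name in spec.split(","):
        key = name.strip().upper()
        if key not in REG_BIT:
            raise ValueError(f"unknown register: {name!r}")
        mask |= 1 << REG_BIT[key]
    return mask


print(hex(parse_intr_regs("si,di,r8,r9")))  # 0x30030
```

Sampling four registers instead of all twenty is where the reduction in perf.data size and sampling overhead would come from.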
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 29 Aug, 2015 1 commit
  19. 13 Aug, 2015 2 commits
    • perf callchain: Allow disabling call graphs per event · f9db0d0f
      Committed by Kan Liang
      This patch introduces "call-graph=no" to disable the call graph per event.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,call-graph=fp/,cpu/instructions,call-graph=no/' sleep 1
      
        perf report --stdio
      
        # To display the perf.data header info, please use
        --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 6  of event 'cpu/cpu-cycles,call-graph=fp/'
        # Event count (approx.): 774218
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  ........................................
        #
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] entry_SYSCALL_64_fastpath
                    |
                    ---entry_SYSCALL_64_fastpath
                       |
                       |--97.30%-- __brk
                       |
                        --2.70%-- mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
      
          61.94%     0.00%  sleep    [kernel.vmlinux]  [k] perf_event_mmap
                    |
                    ---perf_event_mmap
                       |
                       |--97.30%-- do_brk
                       |          sys_brk
                       |          entry_SYSCALL_64_fastpath
                       |          __brk
                       |
                        --2.70%-- mmap_region
                                  do_mmap_pgoff
                                  vm_mmap_pgoff
                                  sys_mmap_pgoff
                                  sys_mmap
                                  entry_SYSCALL_64_fastpath
                                  mmap64
                                  _dl_check_map_versions
                                  _dl_check_all_versions
        ......
      
        # Samples: 6  of event 'cpu/instructions,call-graph=no/'
        # Event count (approx.): 359692
        #
        # Children      Self  Command  Shared Object     Symbol
        # ........  ........  .......  ................  .................................
        #
           89.03%     0.00%  sleep    [unknown]         [.] 0xffff6598ffff6598
           89.03%     0.00%  sleep    ld-2.17.so        [.] _dl_resolve_conflicts
           89.03%     0.00%  sleep    [kernel.vmlinux]  [k] page_fault
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-2-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Per-event type selection support · d457c963
      Committed by Kan Liang
      This patchkit adds the ability to set the callgraph mode (fp, dwarf, lbr)
      per event. This in turn can reduce sampling overhead and the size of the
      perf.data file.
      
      Here is an example.
      
        perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1
      
       perf evlist -v
       cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112,
       config: 0x3c, { sample_period, sample_freq }: 1000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1,
       inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all:
       1, exclude_guest: 1, mmap2: 1, comm_exec: 1
       cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, {
       sample_period, sample_freq }: 4000, sample_type:
       IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID,
       disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1,
       exclude_guest: 1
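
To make the per-event terms concrete, here is a minimal sketch that splits an event string like the one above into its terms and lists the sample_type bits each call-graph mode implies. The fp and lbr mappings are read off the evlist output above, and "no" reflects the call-graph=no commit; dwarf is left out because its bits are not shown here. parse_events and callgraph_sample_bits are hypothetical helpers, far simpler than perf's real event parser.

```python
import re


def parse_events(spec):
    """Split 'pmu/term,key=value/,pmu/.../' into (pmu, terms) pairs."""
    events = []
    for pmu, body in re.findall(r"([\w-]+)/([^/]*)/", spec):
        terms = {}
        for term in body.split(","):
            key, _, value = term.partition("=")
            terms[key] = value or None  # bare terms such as the event name
        events.append((pmu, terms))
    return events


# sample_type bits implied by each mode, per the outputs quoted above.
CALLGRAPH_BITS = {
    "fp": ["CALLCHAIN"],
    "lbr": ["CALLCHAIN", "BRANCH_STACK"],
    "no": [],
}


def callgraph_sample_bits(terms):
    """Return the extra sample_type names for an event's call-graph term."""
    mode = terms.get("call-graph")
    return CALLGRAPH_BITS[mode] if mode is not None else []


spec = ("cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,"
        "cpu/instructions,call-graph=lbr/")
for pmu, terms in parse_events(spec):
    print(pmu, terms, callgraph_sample_bits(terms))
```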
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  20. 11 Aug, 2015 2 commits
  21. 10 Aug, 2015 1 commit
  22. 06 Aug, 2015 1 commit
  23. 05 Aug, 2015 1 commit
  24. 30 Jul, 2015 2 commits
  25. 24 Jul, 2015 2 commits
  26. 21 Jul, 2015 1 commit
    • perf record: Apply filter to all events in a glob matching · 15bfd2cc
      Committed by Wang Nan
      There is an old problem in how perf applies filters, first reported in
      Sep. 2014 at https://lkml.org/lkml/2014/9/9/944: if multiple events are
      passed as a glob matching expression on the cmdline and '--filter' is
      added after them, the filter is applied only to the last one.
      
      For example:
      
       # dd if=/dev/zero of=/dev/null &
       [1] 464
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.239 MB perf.data (2094 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 2K of event 'syscalls:sys_enter_read'
       # Event count (approx.): 2092
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      
      In this example, the filter is only applied to 'syscalls:sys_exit_read',
      and there's no way to set a filter for 'syscalls:sys_enter_read'.
      
      This patch adds a 'cmdline_group_boundary' to 'struct evsel' and
      applies the filter to all events between two boundary marks.
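
The boundary idea can be sketched as a backwards walk over the event list. This is one reading of the description, with the assumption (not stated in the commit message) that the mark is placed on the last event produced by each -e argument; apply_filter_to_last_group and the dict layout are hypothetical.

```python
def apply_filter_to_last_group(evlist, filter_str):
    """Apply filter_str to every event of the most recent -e group.

    evlist is a list of dicts with 'name', 'cmdline_group_boundary' and
    'filter' keys. Walk back from the last event and stop on reaching
    the boundary mark that ends the previous group.
    """
    if not evlist:
        return
    i = len(evlist) - 1
    while True:
        evlist[i]["filter"] = filter_str
        if i == 0:
            break
        i -= 1
        if evlist[i]["cmdline_group_boundary"]:
            break  # this event belongs to an earlier -e argument


def make_event(name, boundary):
    return {"name": name, "cmdline_group_boundary": boundary, "filter": None}


# -e cycles -e 'syscalls:sys_*_read' --filter 'common_pid != 464'
evlist = [
    make_event("cycles", True),
    make_event("syscalls:sys_enter_read", False),
    make_event("syscalls:sys_exit_read", True),
]
apply_filter_to_last_group(evlist, "common_pid != 464")
print([(ev["name"], ev["filter"]) for ev in evlist])
```

In this model both syscalls events receive the filter while cycles, which came from an earlier -e argument, is left untouched.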
      
      After applying this patch:
      
       # perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
       [ perf record: Woken up 1 times to write data ]
       [ perf record: Captured and wrote 0.031 MB perf.data (3 samples) ]
       # perf report --stdio | tee
       ...
       # Samples: 1  of event 'syscalls:sys_enter_read'
       # Event count (approx.): 1
       ...
       # Samples: 2  of event 'syscalls:sys_exit_read'
       # Event count (approx.): 2
       ...
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1436513770-8896-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  27. 06 Jul, 2015 4 commits
  28. 26 Jun, 2015 2 commits