1. 25 2月, 2016 4 次提交
  2. 24 2月, 2016 15 次提交
  3. 23 2月, 2016 11 次提交
    • A
      perf tools: Dont stop PMU parsing on alias parse error · 940db6dc
      Andi Kleen 提交于
      When an error happens during alias parsing currently the complete
      parsing of all attributes of the PMU is stopped. This is breaks old perf
      on a newer kernel that may have not-yet-know alias attributes (such as
      .scale or .per-pkg).
      
      Continue when some attribute is unparseable.
      
      This is IMHO a stable candidate and should be backported to older
      versions to avoid problems with newer kernels.
      
      v2: Print warnings when something goes wrong.
      v3: Change warning to debug output
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: stable@vger.kernel.org # v3.6+
      Link: http://lkml.kernel.org/r/1455749095-18358-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      940db6dc
    • J
      perf script: Display addr/data_src/weight columns for raw events · ff7b1915
      Jiri Olsa 提交于
      Adding addr/data_src/weight columns for raw events.
      
      Example:
        $ perf script
        ...
        true 11883 322960.489590: ...  ffff8801aa0b8400        68501042             246 ffffffff813b2cd
        true 11883 322960.489600: ...  ffff8800b90b38d8        68501042             251 ffffffff811d0b7
        true 11883 322960.489612: ...  ffff880196893130        6a100142              94 ffffffff8177fb8
        true 11883 322960.489637: ...  ffff880164277b40        68100842             101 ffffffff813b2cd
        true 11883 322960.489683: ...  ffff880035d3d818        68501042             201 ffffffff811d0b7
        true 11883 322960.489733: ...      7fb9616efcf0        68100242             199     7fb961aaba9
        true 11883 322960.489818: ...  ffffea000481c39c        6a100142             122 ffffffff811b634
      
                                       ^^^^^^^^^^^^^^^^        ^^^^^^^^             ^^^
                                                   addr        data_src          weight
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-23-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ff7b1915
    • J
      perf script: Add data_src and weight column definitions · 94ddddfa
      Jiri Olsa 提交于
      Adding data_src and weight column definitions, so it's displayed for
      related sample types.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-22-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      94ddddfa
    • J
      perf tools: Use ARRAY_SIZE in mem sort display functions · b19a1b6a
      Jiri Olsa 提交于
      There's no need to define extra macros for that.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-13-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b19a1b6a
    • J
      perf mem: Add -e record option · ce1e22b0
      Jiri Olsa 提交于
      Adding -e option for perf mem record command, to be able to specify
      memory event directly.
      
      Get list of available events:
      
        $ perf mem record -e list
        ldlat-loads
        ldlat-stores
      
      Monitor ldlat-loads:
        $ perf mem record -e ldlat-loads true
      
      Committer notes:
      
      Further testing:
      
        # perf mem record -e ldlat-loads true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data (10 samples) ]
        # perf evlist
        cpu/mem-loads,ldlat=30/P
        #
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-6-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ce1e22b0
    • J
      perf tools: Add monitored events array · acbe613e
      Jiri Olsa 提交于
      It will ease up configuration of memory events and addition of other
      memory events in following patches.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-5-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      acbe613e
    • J
      perf tools: Introduce cl_offset function · d3927110
      Jiri Olsa 提交于
      It'll be used in following patches.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-4-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d3927110
    • J
      perf tools: Make cl_address global · e95cf700
      Jiri Olsa 提交于
      It'll be used in following patches.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e95cf700
    • D
      tools lib traceevent: Implement '%' operation · 0e47b38d
      Daniel Bristot de Oliveira 提交于
      The operation '%' is not implemented on event-parse.c, causing an error
      when parsing events with '%' the operation in its printk format. For
      example,
      
        # perf record -e sched:sched_deadline_yield ~/yield-test
          Warning: [sched:sched_deadline_yield] unknown op '%'
        ....
        # perf script
          Warning: [sched:sched_deadline_yield] unknown op '%'
              test  1641 [006]  3364.109319: sched:sched_deadline_yield: \
                              [FAILED TO PARSE] now=3364109314595        \
                              deadline=3364139295135 runtime=19975597
      
      This patch implements the '%' operation. With this patch, we see the
      correct output:
      
        # perf record -e sched:sched_deadline_yield ~/yield-test
          No Warning
      
        # perf script
              yield-test  4005 [001]  4623.650978: sched:sched_deadline_yield: \
                      now=4623.650974050                                       \
                      deadline=4623.680957364 remaining_runtime=19979611
      Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Juri Lelli <juri.lelli@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
      Link: http://lkml.kernel.org/r/5c96a395c56cea6d3d13d949051bdece86cc26e0.1456157869.git.bristot@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0e47b38d
    • W
      perf tools: Introduce bpf-output event · 03e0a7df
      Wang Nan 提交于
      Commit a43eec30 ("bpf: introduce bpf_perf_event_output() helper")
      adds a helper to enable a BPF program to output data to a perf ring
      buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This
      patch enables perf to create events of that type. Now a perf user can
      use the following cmdline to receive output data from BPF programs:
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script
           perf 1560 [004] 347747.086295:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086300:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086315:  evt: ffffffff811fd201 sys_write ...
                  ...
      
      Test result:
      
        # cat test_bpf_output.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
       	unsigned int type;
       	unsigned int key_size;
       	unsigned int value_size;
       	unsigned int max_entries;
        };
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
       	(void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
       	(void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
       	(void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
       	(void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
       	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
       	.key_size = sizeof(int),
       	.value_size = sizeof(u32),
       	.max_entries = __NR_CPUS__,
        };
      
        SEC("func_write=sys_write")
        int func_write(void *ctx)
        {
       	struct {
       		u64 ktime;
       		int cpuid;
       	} __attribute__((packed)) output_data;
       	char error_data[] = "Error: failed to output: %d\n";
      
       	output_data.cpuid = get_smp_processor_id();
       	output_data.ktime = ktime_get_ns();
       	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
       				    &output_data, sizeof(output_data));
       	if (err)
       		trace_printk(error_data, sizeof(error_data), err);
       	return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************ END ***************************/
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script | grep ls
           ls  2242 [003] 347851.557563:   evt: ffffffff811fd201 sys_write ...
           ls  2242 [003] 347851.557571:   evt: ffffffff811fd201 sys_write ...
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      03e0a7df
    • W
      perf tools: Apply tracepoint event definition options to BPF script · 95088a59
      Wang Nan 提交于
      Users can pass options to tracepoints defined in the BPF script.  For
      example:
      
        # perf record -e ./test.c/no-inherit/ bash
        # dd if=/dev/zero of=/dev/null count=10000
        # exit
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]
      
        (no-inherit works, only the sys_read issued by bash are captured, at
         least 10000 sys_read issued by dd are skipped.)
      
      test.c:
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        SEC("func=sys_read")
        int bpf_func__sys_read(void *ctx)
        {
            return 1;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
      
      no-inherit is applied to the kprobe event defined in test.c.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-10-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      95088a59
  4. 22 2月, 2016 10 次提交
    • W
      perf tools: Enable indices setting syntax for BPF map · e571e029
      Wang Nan 提交于
      This patch introduces a new syntax to perf event parser:
      
       # perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2
      
      By utilizing the basic facilities in bpf-loader.c which allow setting
      different slots in a BPF map separately, the newly introduced syntax
      allows perf to control specific elements in a BPF map.
      
      Test result:
      
        # cat ./test_bpf_map_3.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        #define SEC(NAME) __attribute__((section(NAME), used))
        struct bpf_map_def {
      	unsigned int type;
      	unsigned int key_size;
      	unsigned int value_size;
      	unsigned int max_entries;
        };
        static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
       	(void *)BPF_FUNC_map_lookup_elem;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
       	(void *)BPF_FUNC_trace_printk;
        struct bpf_map_def SEC("maps") channel = {
       	.type = BPF_MAP_TYPE_ARRAY,
       	.key_size = sizeof(int),
       	.value_size = sizeof(unsigned char),
       	.max_entries = 100,
        };
        SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
        int func(void *ctx, int err, long nsec)
        {
       	char fmt[] = "%ld\n";
       	long usec = nsec * 0x10624dd3 >> 38; // nsec / 1000
       	int key = (int)usec;
       	unsigned char *pval = map_lookup_elem(&channel, &key);
      
       	if (!pval)
       		return 0;
       	trace_printk(fmt, sizeof(fmt), (unsigned char)*pval);
       	return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
      Normal case:
      
        # echo "" > /sys/kernel/debug/tracing/trace
        # ./perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
        # cat /sys/kernel/debug/tracing/trace | grep usleep
                  usleep-405   [004] d... 2745423.547822: : 101
        # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
        # ./perf record -e './test_bpf_map_3.c/map:channel.value[0...9,20...29]=102,map:channel.value[10...19]=103/' usleep 15
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
        # cat /sys/kernel/debug/tracing/trace | grep usleep
                  usleep-405   [004] d... 2745423.547822: : 101
                  usleep-655   [006] d... 2745434.122814: : 102
                  usleep-904   [006] d... 2745439.916264: : 103
        # ./perf record -e './test_bpf_map_3.c/map:channel.value[all]=104/' usleep 99
        # cat /sys/kernel/debug/tracing/trace | grep usleep
                  usleep-405   [004] d... 2745423.547822: : 101
                  usleep-655   [006] d... 2745434.122814: : 102
                  usleep-904   [006] d... 2745439.916264: : 103
                  usleep-1537  [003] d... 2745538.053737: : 104
      
      Error case:
      
        # ./perf record -e './test_bpf_map_3.c/map:channel.value[10...1000]=104/' usleep 99
        event syntax error: '..annel.value[10...1000]=104/'
                                         \___ Index too large
        Hint:	Valid config terms:
            	map:[<arraymap>].value<indices>=[value]
            	map:[<eventmap>].event<indices>=[event]
      
            	where <indices> is something like [0,3...5] or [all]
            	(add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-9-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e571e029
    • W
      perf tools: Support setting different slots in a BPF map separately · 2d055bf2
      Wang Nan 提交于
      This patch introduces basic facilities to support config different slots
      in a BPF map one by one.
      
      array.nr_ranges and array.ranges are introduced into 'struct
      parse_events_term', where ranges is an array of indices range (start,
      length) which will be configured by this config term. nr_ranges is the
      size of the array. The array is passed to 'struct bpf_map_priv'.  To
      indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a
      new key type. bpf_map_config_foreach_key() is extended to iterate over
      those indices instead of all possible keys.
      
      Code in this commit will be enabled by following commit which enables
      the indices syntax for array configuration.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2d055bf2
    • W
      perf tools: Enable passing event to BPF object · 7630b3e2
      Wang Nan 提交于
      A new syntax is added to the parser so that the user can access
      predefined perf events in BPF objects.
      
      After this patch, BPF programs for perf are finally able to utilize
      bpf_perf_event_read() introduced in commit 35578d79 ("bpf: Implement
      function bpf_perf_event_read() that get the selected hardware PMU
      counter").
      
      Test result:
      
        # cat test_bpf_map_2.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        #define SEC(NAME) __attribute__((section(NAME), used))
        struct bpf_map_def {
            unsigned int type;
            unsigned int key_size;
            unsigned int value_size;
            unsigned int max_entries;
        };
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
            (void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
            (void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_read)(struct bpf_map_def *, int) =
            (void *)BPF_FUNC_perf_event_read;
      
        struct bpf_map_def SEC("maps") pmu_map = {
            .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
            .key_size = sizeof(int),
            .value_size = sizeof(int),
            .max_entries = __NR_CPUS__,
        };
        SEC("func_write=sys_write")
        int func_write(void *ctx)
        {
            unsigned long long val;
            char fmt[] = "sys_write:        pmu=%llu\n";
            val = perf_event_read(&pmu_map, get_smp_processor_id());
            trace_printk(fmt, sizeof(fmt), val);
            return 0;
        }
      
        SEC("func_write_return=sys_write%return")
        int func_write_return(void *ctx)
        {
            unsigned long long val = 0;
            char fmt[] = "sys_write_return: pmu=%llu\n";
            val = perf_event_read(&pmu_map, get_smp_processor_id());
            trace_printk(fmt, sizeof(fmt), val);
            return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
      Normal case:
      
        # echo "" > /sys/kernel/debug/tracing/trace
        # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls /
        [SNIP]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data (7 samples) ]
        # cat /sys/kernel/debug/tracing/trace | grep ls
                      ls-17066 [000] d... 938449.863301: : sys_write:        pmu=1157327
                      ls-17066 [000] dN.. 938449.863342: : sys_write_return: pmu=1225218
                      ls-17066 [000] d... 938449.863349: : sys_write:        pmu=1241922
                      ls-17066 [000] dN.. 938449.863369: : sys_write_return: pmu=1267445
      
      Normal case (system wide):
      
        # echo "" > /sys/kernel/debug/tracing/trace
        # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.811 MB perf.data (120 samples) ]
      
        # cat /sys/kernel/debug/tracing/trace | grep -v '18446744073709551594' | grep -v perf | head -n 20
        [SNIP]
        #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
        #              | |       |   ||||       |         |
                   gmain-30828 [002] d... 2740551.068992: : sys_write:        pmu=84373
                   gmain-30828 [002] d... 2740551.068992: : sys_write_return: pmu=87696
                   gmain-30828 [002] d... 2740551.068996: : sys_write:        pmu=100658
                   gmain-30828 [002] d... 2740551.068997: : sys_write_return: pmu=102572
      
      Error case 1:
      
        # perf record -e './test_bpf_map_2.c' ls /
        [SNIP]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.014 MB perf.data ]
        # cat /sys/kernel/debug/tracing/trace | grep ls
                      ls-17115 [007] d... 2724279.665625: : sys_write:        pmu=18446744073709551614
                      ls-17115 [007] dN.. 2724279.665651: : sys_write_return: pmu=18446744073709551614
                      ls-17115 [007] d... 2724279.665658: : sys_write:        pmu=18446744073709551614
                      ls-17115 [007] dN.. 2724279.665677: : sys_write_return: pmu=18446744073709551614
      
        (18446744073709551614 is 0xfffffffffffffffe (-2))
      
      Error case 2:
      
        # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=evt/' -a
        event syntax error: '..ps:pmu_map.event=evt/'
                                          \___ Event not found for map setting
      
        Hint:	Valid config terms:
             	map:[<arraymap>].value=[value]
             	map:[<eventmap>].event=[event]
        [SNIP]
      
      Error case 3:
        # ls /proc/2348/task/
        2348  2505  2506  2507  2508
        # perf record -i -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' -p 2348
        ERROR: Apply config to BPF failed: Cannot set event to BPF map in multi-thread tracing
      
      Error case 4:
        # perf record -e cycles -e './test_bpf_map_2.c/map:pmu_map.event=cycles/' ls /
        ERROR: Apply config to BPF failed: Doesn't support inherit event (Hint: use -i to turn off inherit)
      
      Error case 5:
        # perf record -i -e raw_syscalls:sys_enter -e './test_bpf_map_2.c/map:pmu_map.event=raw_syscalls:sys_enter/' ls
        ERROR: Apply config to BPF failed: Can only put raw, hardware and BPF output event into a BPF map
      
      Error case 6:
        # perf record -i -e './test_bpf_map_2.c/map:pmu_map.event=123/' ls /
        event syntax error: '.._map.event=123/'
                                          \___ Incorrect value type for map
        [SNIP]
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-7-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7630b3e2
    • W
      perf record: Apply config to BPF objects before recording · 8690a2a7
      Wang Nan 提交于
      bpf__apply_obj_config() is introduced as the core API to apply object
      config options to all BPF objects. This patch also does the real work
      for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value
      stored in map's private field into the BPF map.
      
      This patch is required because we are not always able to set all BPF
      config during parsing. Further patch will set events created by perf to
      BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until
      perf_evsel__open().
      
      bpf_map_foreach_key() is introduced to iterate over each key needs to be
      configured. This function would be extended to support more map types
      and different key settings.
      
      In perf record, before start recording, call bpf__apply_config() to turn
      on all BPF config options.
      
      Test result:
      
        # cat ./test_bpf_map_1.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        #define SEC(NAME) __attribute__((section(NAME), used))
        struct bpf_map_def {
            unsigned int type;
            unsigned int key_size;
            unsigned int value_size;
            unsigned int max_entries;
        };
        static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
            (void *)BPF_FUNC_map_lookup_elem;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
            (void *)BPF_FUNC_trace_printk;
        struct bpf_map_def SEC("maps") channel = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(int),
            .value_size = sizeof(int),
            .max_entries = 1,
        };
        SEC("func=sys_nanosleep")
        int func(void *ctx)
        {
            int key = 0;
            char fmt[] = "%d\n";
            int *pval = map_lookup_elem(&channel, &key);
            if (!pval)
                return 0;
            trace_printk(fmt, sizeof(fmt), *pval);
            return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
        # echo "" > /sys/kernel/debug/tracing/trace
        # ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
        # cat /sys/kernel/debug/tracing/trace
        # tracer: nop
        #
        # entries-in-buffer/entries-written: 1/1   #P:8
        [SNIP]
        #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
        #              | |       |   ||||       |         |
                   usleep-18593 [007] d... 2394714.395539: : 11
        # ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
        # cat /sys/kernel/debug/tracing/trace
        # tracer: nop
        #
        # entries-in-buffer/entries-written: 1/1   #P:8
        [SNIP]
        #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
        #              | |       |   ||||       |         |
                   usleep-18593 [007] d... 2394714.395539: : 11
                   usleep-19000 [006] d... 2394831.057840: : 101
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8690a2a7
    • W
      perf tools: Enable BPF object configure syntax · a34f3be7
      Wang Nan 提交于
      This patch adds the final step for BPF map configuration. A new syntax
      is appended into parser so user can config BPF objects through '/' '/'
      enclosed config terms.
      
      After this patch, following syntax is available:
      
        # perf record -e ./test_bpf_map_1.c/map:channel.value=10/ ...
      
      It would takes effect after appling following commits.
      
      Test result:
      
        # cat ./test_bpf_map_1.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        #define SEC(NAME) __attribute__((section(NAME), used))
        struct bpf_map_def {
            unsigned int type;
            unsigned int key_size;
            unsigned int value_size;
            unsigned int max_entries;
        };
        static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
            (void *)BPF_FUNC_map_lookup_elem;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
            (void *)BPF_FUNC_trace_printk;
        struct bpf_map_def SEC("maps") channel = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(int),
            .value_size = sizeof(int),
            .max_entries = 1,
        };
        SEC("func=sys_nanosleep")
        int func(void *ctx)
        {
            int key = 0;
            char fmt[] = "%d\n";
            int *pval = map_lookup_elem(&channel, &key);
            if (!pval)
                return 0;
            trace_printk(fmt, sizeof(fmt), *pval);
            return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      
       - Normal case:
        # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data ]
      
       - Error case:
      
        # ./perf record -e './test_bpf_map_1.c/map:channel.value/' usleep 10
        event syntax error: '..ps:channel:value/'
                                         \___ Config value not set (missing '=')
        Hint:	Valid config term:
               map:[<arraymap>]:value=[value]
               (add -v to see detail)
        Run 'perf list' for a list of valid events
      
        Usage: perf record [<options>] [<command>]
           or: perf record [<options>] -- <command> [<options>]
      
           -e, --event <event>   event selector. use 'perf list' to list available events
      
        # ./perf record -e './test_bpf_map_1.c/xmap:channel.value=10/' usleep 10
        event syntax error: '..pf_map_1.c/xmap:channel.value=10/'
                                          \___ Invalid object config option
        [SNIP]
      
        # ./perf record -e './test_bpf_map_1.c/map:xchannel.value=10/' usleep 10
        event syntax error: '..p_1.c/map:xchannel.value=10/'
                                          \___ Target map not exist
        [SNIP]
      
        # ./perf record -e './test_bpf_map_1.c/map:channel.xvalue=10/' usleep 10
        event syntax error: '..ps:channel.xvalue=10/'
                                          \___ Invalid object map config option
        [SNIP]
      
        # ./perf record -e './test_bpf_map_1.c/map:channel.value=x10/' usleep 10
        event syntax error: '..nnel.value=x10/'
                                          \___ Incorrect value type for map
        [SNIP]
      
        Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c:
      
        # ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
        event syntax error: '..ps:channel.value=10/'
                                          \___ Can't use this config term to this type of map
      
        Hint:	Valid config term:
            	map:[<arraymap>].value=[value]
            	(add -v to see detail)
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      [for parser part]
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-5-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a34f3be7
    • W
      perf bpf: Add API to set values to map entries in a bpf object · 066dacbf
      Wang Nan 提交于
      bpf__config_obj() is introduced as a core API to config BPF object after
      loading. One configuration option of maps is introduced. After this
      patch BPF object can accept assignments like:
      
        map:my_map.value=1234
      
      (map.my_map.value looks pretty. However, there's a small but hard to fix
      problem related to flex's greedy matching. Please see [1].  Choose ':'
      to avoid it in a simpler way.)
      
      This patch is more complex than the work it does because the
      consideration of extension. In designing BPF map configuration, the
      following things should be considered:
      
       1. Array indices selection: perf should allow user setting different
          value for different slots in an array, with syntax like:
          map:my_map.value[0,3...6]=1234;
      
       2. A map should be set by different config terms, each for a part
          of it. For example, set each slot to the pid of a thread;
      
       3. Type of value: integer is not the only valid value type. A perf
          counter can also be put into a map after commit 35578d79
          ("bpf: Implement function bpf_perf_event_read() that get the
            selected hardware PMU counter")
      
       4. For a hash table, it should be possible to use a string or other
          value as a key;
      
       5. It is possible that map configuration is unable to be setup
          during parsing. A perf counter is an example.
      
      Therefore, this patch does the following:
      
       1. Instead of updating map element during parsing, this patch stores
          map config options in 'struct bpf_map_priv'. Following patches
          will apply those configs at an appropriate time;
      
       2. Link map operations in a list so a map can have multiple config
          terms attached, so different parts can be configured separately;
      
       3. Make 'struct bpf_map_priv' extensible so that the following patches
          can add new types of keys and operations;
      
       4. Use bpf_obj_config__map_funcs array to support more map config options.
      
      Since the patch changing the event parser to parse BPF object config is
      relative large, I've put it in another commit. Code in this patch can be
      tested after applying the next patch.
      
      [1] http://lkml.kernel.org/g/564ED621.4050500@huawei.comSigned-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-4-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com>
      [ Changes "maps:my_map.value" to "map:my_map.value", improved error messages ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      066dacbf
    • N
      perf tools: Fix assertion failure on dynamic entry · dd42baf1
      Namhyung Kim 提交于
      The dynamic entry is created for each field in a tracepoint event.
      Since they have no fixed hpp format index, it should skip when
      perf_hpp__reset_width() is called.
      
      This caused following assertion failure..
      
        $ perf record -e sched:sched_switch -a sleep 1
      
        $ perf report -s comm,next_pid --stdio
        perf: ui/hist.c:651: perf_hpp__reset_width:
          Assertion `!(fmt->idx >= PERF_HPP__MAX_INDEX)' failed.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1456064558-13086-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      dd42baf1
    • N
      perf tools: Fix column width setting on 'trace' sort key · 0c0af78d
      Namhyung Kim 提交于
      It missed to update column length of the 'trace' sort key in the
      hists__calc_col_len() so it might truncate the output.  It calculated
      the column length in the ->cmp() callback originally but it doesn't
      guarantee it's called always.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1456064558-13086-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0c0af78d
    • N
      perf tools: Fix alignment on some sort keys · 2960ed6f
      Namhyung Kim 提交于
      The srcline, srcfile and trace sort keys can have long entries.  With
      commit 89fee709 ("perf hists: Do column alignment on the format
      iterator"), it now aligns output with hist_entry__snprintf_alignment().
      So each (possibly long) sort entries don't need to do it themselves.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1456101153-14519-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2960ed6f
    • N
      perf tools: Update srcline/file if needed · cecaec63
      Namhyung Kim 提交于
      Normally the hist entry's srcline and/or srcfile is set during sorting.
      However sometime it's possible to a hist entry's srcline is not set yet
      after the sorting.  This is because the entry is so unique and other
      sort keys already make it distinct.  Then the srcline/file sort didn't
      have a chance to be called during the sorting.  In that case it has NULL
      srcline/srcfile field and shows nothing.
      
      Before:
      
        $ perf report -s comm,sym,srcline
        ...
        Overhead  Command       Symbol
        -----------------------------------------------------------------
          34.42%  swapper       [k] intel_idle          intel_idle.c:0
           2.44%  perf          [.] __poll_nocancel     (null)
           1.70%  gnome-shell   [k] fw_domains_get      (null)
           1.04%  Xorg          [k] sock_poll           (null)
      
      After:
      
          34.42%  swapper       [k] intel_idle          intel_idle.c:0
           2.44%  perf          [.] __poll_nocancel     .:0
           1.70%  gnome-shell   [k] fw_domains_get      fw_domains_get+42
           1.04%  Xorg          [k] sock_poll           socket.c:0
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1456101111-14400-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cecaec63