1. 19 8月, 2016 1 次提交
    • A
      perf evsel: Do not access outside hw cache name arrays · c53412ee
      Arnaldo Carvalho de Melo 提交于
      We have to check if the values are >= *_MAX, not just >, fix it.
      
      From the bugzilla report:
      
      ''In file /tools/perf/util/evsel.c  function __perf_evsel__hw_cache_name
      it appears that there is a bug that reads beyond the end of the buffer.
      The statement "if (type > PERF_COUNT_HW_CACHE_MAX)" allows type to be
      equal to the maximum value. Later, when statement "if
      (!perf_evsel__is_cache_op_valid(type, op))" is executed, the function
      can access array perf_evsel__hw_cache_stat[type] beyond the end of the
      buffer.
      
      It appears to me that the statement "if (type > PERF_COUNT_HW_CACHE_MAX)"
      should be "if (type >= PERF_COUNT_HW_CACHE_MAX)"
      
      Bug found with Coverity and manual code review. No attempts were made to
      execute the code with a maximum type value.''
      
      Committer note:
      
      Testing it:
      
        $ perf record -e $(echo $(perf list cache | cut -d \[ -f1) | sed 's/ /,/g') usleep 1
        [ perf record: Woken up 16 times to write data ]
        [ perf record: Captured and wrote 0.023 MB perf.data (34 samples) ]
        $ perf evlist
        L1-dcache-load-misses
        L1-dcache-loads
        L1-dcache-stores
        L1-icache-load-misses
        LLC-load-misses
        LLC-loads
        LLC-store-misses
        LLC-stores
        branch-load-misses
        branch-loads
        dTLB-load-misses
        dTLB-loads
        dTLB-store-misses
        dTLB-stores
        iTLB-load-misses
        iTLB-loads
        node-load-misses
        node-loads
        node-store-misses
        node-stores
        $ perf list cache
      
        List of pre-defined events (to be used in -e):
      
          L1-dcache-load-misses        [Hardware cache event]
          L1-dcache-loads              [Hardware cache event]
          L1-dcache-stores             [Hardware cache event]
          L1-icache-load-misses        [Hardware cache event]
          LLC-load-misses              [Hardware cache event]
          LLC-loads                    [Hardware cache event]
          LLC-store-misses             [Hardware cache event]
          LLC-stores                   [Hardware cache event]
          branch-load-misses           [Hardware cache event]
          branch-loads                 [Hardware cache event]
          dTLB-load-misses             [Hardware cache event]
          dTLB-loads                   [Hardware cache event]
          dTLB-store-misses            [Hardware cache event]
          dTLB-stores                  [Hardware cache event]
          iTLB-load-misses             [Hardware cache event]
          iTLB-loads                   [Hardware cache event]
          node-load-misses             [Hardware cache event]
          node-loads                   [Hardware cache event]
          node-store-misses            [Hardware cache event]
          node-stores                  [Hardware cache event]
        $
      Reported-by: NBrian Sweeney <bsweeney@lgsinnovations.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=153351Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c53412ee
  2. 03 8月, 2016 1 次提交
  3. 29 7月, 2016 1 次提交
  4. 16 7月, 2016 2 次提交
    • W
      perf tools: Enable overwrite settings · 626a6b78
      Wang Nan 提交于
      This patch allows following config terms and option:
      
      Globally setting events to overwrite;
      
        # perf record --overwrite ...
      
      Set specific events to be overwrite or no-overwrite.
      
        # perf record --event cycles/overwrite/ ...
        # perf record --event cycles/no-overwrite/ ...
      
      Add missing config terms and update the config term array size because
      the longest string length has changed.
      
      For overwritable events, it automatically selects attr.write_backward
      since perf requires it to be backward for reading.
      
      Test result:
      
        # perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
        # perf evlist -v
        syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.comSigned-off-by: NHe Kuang <hekuang@huawei.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      626a6b78
    • A
      perf evlist: Drop redundant evsel->overwrite indicator · 32a951b4
      Arnaldo Carvalho de Melo 提交于
      evsel->overwrite indicator means an event should be put into
      overwritable ring buffer. In current implementation, it equals to
      evsel->attr.write_backward. To reduce compliexity, remove
      evsel->overwrite, use evsel->attr.write_backward instead.
      
      In addition, in __perf_evsel__open(), if kernel doesn't support
      write_backward and user explicitly set it in evsel, don't fallback
      like other missing feature, since it is meaningless to fall back to
      a forward ring buffer in this case: we are unable to stably read
      from an forward overwritable ring buffer.
      
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nilay Vaish <nilayvaish@gmail.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1468485287-33422-2-git-send-email-wangnan0@huawei.comSigned-off-by: NWang Nan <wangnan0@huawei.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      32a951b4
  5. 13 7月, 2016 2 次提交
  6. 30 6月, 2016 1 次提交
  7. 22 6月, 2016 1 次提交
  8. 04 6月, 2016 1 次提交
  9. 03 6月, 2016 1 次提交
  10. 30 5月, 2016 1 次提交
    • A
      perf tools: Per event max-stack settings · 792d48b4
      Arnaldo Carvalho de Melo 提交于
      The tooling counterpart, now it is possible to do:
      
        # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ]
        # perf evlist -v
        sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10
        cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4
        cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      
      Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should
      be further configurable by means of some .perfconfig knob.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      792d48b4
  11. 21 5月, 2016 1 次提交
  12. 13 5月, 2016 2 次提交
    • A
      perf evsel: Handle EACCESS + perf_event_paranoid=2 in fallback() · 08094828
      Arnaldo Carvalho de Melo 提交于
      Now with the default for the kernel.perf_event_paranoid sysctl being 2 [1]
      we need to fall back to :u, i.e. to set perf_event_attr.exclude_kernel
      to 1.
      
      Before:
      
        [acme@jouet linux]$ perf record usleep 1
        Error:
        You may not have permission to collect stats.
      
        Consider tweaking /proc/sys/kernel/perf_event_paranoid,
        which controls use of the performance events system by
        unprivileged users (without CAP_SYS_ADMIN).
      
        The current value is 2:
      
          -1: Allow use of (almost) all events by all users
        >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
        >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
        >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
        [acme@jouet linux]$
      
      After:
      
        [acme@jouet linux]$ perf record usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
        [acme@jouet linux]$ perf evlist
        cycles:u
        [acme@jouet linux]$ perf evlist -v
        cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        [acme@jouet linux]$
      
      And if the user turns on verbose mode, an explanation will appear:
      
        [acme@jouet linux]$ perf record -v usleep 1
        Warning:
        kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
        mmap size 528384B
        [ perf record: Woken up 1 times to write data ]
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.6.0-rc7+/build/vmlinux for symbols
        [ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
        [acme@jouet linux]$
      
      [1] 0161028b ("perf/core: Change the default paranoia level to 2")
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      08094828
    • A
      perf evsel: Improve EPERM error handling in open_strerror() · 7d173913
      Arnaldo Carvalho de Melo 提交于
      We were showing a hardcoded default value for the kernel.perf_event_paranoid
      sysctl, now that it became more paranoid (1 -> 2 [1]), this would need to be
      updated, instead show the current value:
      
        [acme@jouet linux]$ perf record ls
        Error:
        You may not have permission to collect stats.
      
        Consider tweaking /proc/sys/kernel/perf_event_paranoid,
        which controls use of the performance events system by
        unprivileged users (without CAP_SYS_ADMIN).
      
        The current value is 2:
      
          -1: Allow use of (almost) all events by all users
        >= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
        >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
        >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
        [acme@jouet linux]$
      
      [1] 0161028b ("perf/core: Change the default paranoia level to 2")
      Reported-by: NIngo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-0gc4rdpg8d025r5not8s8028@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7d173913
  13. 10 5月, 2016 1 次提交
    • A
      perf evsel: Print state of perf_event_attr.write_backward · 0a241ef4
      Arnaldo Carvalho de Melo 提交于
      Now we can see if it is set when using verbose mode in various tools,
      such as 'perf test':
      
        # perf test -vv back
        45: Test backward reading from ring buffer                   :
        --- start ---
        <SNIP>
        ------------------------------------------------------------
        perf_event_attr:
          type                             2
          size                             112
          config                           0x98
          { sample_period, sample_freq }   1
          sample_type                      IP|TID|TIME|CPU|PERIOD|RAW
          disabled                         1
          mmap                             1
          comm                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          write_backward                   1
        ------------------------------------------------------------
        sys_perf_event_open: pid 20911  cpu -1  group_fd -1  flags 0x8
        <SNIP>
        ---- end ----
        Test backward reading from ring buffer: Ok
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0a241ef4
  14. 28 4月, 2016 2 次提交
  15. 26 4月, 2016 1 次提交
    • A
      perf evlist: Decode perf_event_attr->branch_sample_type · a213b92e
      Arnaldo Carvalho de Melo 提交于
      While trying to use --call-graph lbr in 'perf trace', since we only are
      interested in the callchain for userspace, up to the callchain, I found
      that 'perf evlist' is not decoding the branch_sample_type field, fix it.
      
      Before:
      
        # perf record --call-graph lbr usleep 1
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
        sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
        disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
        precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
        comm_exec: 1, branch_sample_type: 51201
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
      
      After:
      
        # perf evlist -v
        cycles:ppp: size: 112, { sample_period, sample_freq }: 4000,
        sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|BRANCH_STACK,
        disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1,
        precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1,
        comm_exec: 1, branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-hozai7974u0ulgx13k96fcaw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a213b92e
  16. 18 4月, 2016 1 次提交
  17. 15 4月, 2016 5 次提交
  18. 13 4月, 2016 1 次提交
  19. 12 4月, 2016 2 次提交
  20. 02 4月, 2016 1 次提交
    • W
      perf bpf: Add sample types for 'bpf-output' event · d37ba880
      Wang Nan 提交于
      Before this patch we can see very large time in the events before the
      'bpf-output' event. For example:
      
        # perf trace -vv -T --ev sched:sched_switch \
                            --ev bpf-output/no-inherit,name=evt/ \
                            --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                            usleep 10
        ...
        18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ...
        18446744073709.551 (         ): evt:Raise a BPF event!..)
        179427791.076 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
        179427791.081 (         ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120])
        ...
      
      We can also see the differences between bpf-output events and
      breakpoint events:
      
      For bpf output event:
         sample_type                    IP|TID|RAW|IDENTIFIER
      
      For tracepoint events:
         sample_type                    IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER
      
      This patch fix this differences by adding more sample type for
      bpf-output events.
      
      After this patch:
      
        # perf trace -vv -T --ev sched:sched_switch \
                            --ev bpf-output/no-inherit,name=evt/ \
                            --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                            usleep 10
        ...
        179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ...
        179877370.878 (         ): evt:Raise a BPF event!..)
        179877370.878 (         ): perf_bpf_probe:func_begin:(ffffffff810eb9a0))
        179877370.882 (         ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120])
        179877370.945 (         ): evt:Raise a BPF event!..)
        ...
      
        # ./perf trace -vv -T --ev sched:sched_switch \
                              --ev bpf-output/no-inherit,name=evt/ \
                              --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                              usleep 10 2>&1 | grep sample_type
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
        sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
      
      The 'IDENTIFIER' info is not required because all events have the same
      sample_type.
      
      Committer notes:
      
      Further testing, on top of the changes making 'perf trace' avoid samples
      from events without PERF_SAMPLE_TIME:
      
      Before:
      
        # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
        <SNIP>
          0.560 ( 0.001 ms): brk(                                                   ) = 0x55e5a1df8000
          18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ...
          18446640227439.430 (         ): evt:Raise a BPF event!..)
          0.576 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
          18446640227439.430 (         ): evt:Raise a BPF event!..)
          0.645 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
          0.646 ( 0.076 ms):  ... [continued]: nanosleep()) = 0
        #
      
      After:
      
        # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10
        <SNIP>
           0.292 ( 0.001 ms): brk(                          ) = 0x55c7cd6e1000
           0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ...
           0.302 (         ): evt:Raise a BPF event!..)
           0.303 (         ): perf_bpf_probe:func_begin:(ffffffff81112460))
           0.397 (         ): evt:Raise a BPF event!..)
           0.397 (         ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
           0.398 ( 0.100 ms):  ... [continued]: nanosleep()) = 0
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Reported-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d37ba880
  21. 23 3月, 2016 1 次提交
  22. 23 2月, 2016 1 次提交
    • W
      perf tools: Introduce bpf-output event · 03e0a7df
      Wang Nan 提交于
      Commit a43eec30 ("bpf: introduce bpf_perf_event_output() helper")
      adds a helper to enable a BPF program to output data to a perf ring
      buffer through a new type of perf event, PERF_COUNT_SW_BPF_OUTPUT. This
      patch enables perf to create events of that type. Now a perf user can
      use the following cmdline to receive output data from BPF programs:
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script
           perf 1560 [004] 347747.086295:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086300:  evt: ffffffff811fd201 sys_write ...
           perf 1560 [004] 347747.086315:  evt: ffffffff811fd201 sys_write ...
                  ...
      
      Test result:
      
        # cat test_bpf_output.c
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
       	unsigned int type;
       	unsigned int key_size;
       	unsigned int value_size;
       	unsigned int max_entries;
        };
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
       	(void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
       	(void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
       	(void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
       	(void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
       	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
       	.key_size = sizeof(int),
       	.value_size = sizeof(u32),
       	.max_entries = __NR_CPUS__,
        };
      
        SEC("func_write=sys_write")
        int func_write(void *ctx)
        {
       	struct {
       		u64 ktime;
       		int cpuid;
       	} __attribute__((packed)) output_data;
       	char error_data[] = "Error: failed to output: %d\n";
      
       	output_data.cpuid = get_smp_processor_id();
       	output_data.ktime = ktime_get_ns();
       	int err = perf_event_output(ctx, &channel, get_smp_processor_id(),
       				    &output_data, sizeof(output_data));
       	if (err)
       		trace_printk(error_data, sizeof(error_data), err);
       	return 0;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************ END ***************************/
      
        # perf record -a -e bpf-output/no-inherit,name=evt/ \
                          -e ./test_bpf_output.c/map:channel.event=evt/ ls /
        # perf script | grep ls
           ls  2242 [003] 347851.557563:   evt: ffffffff811fd201 sys_write ...
           ls  2242 [003] 347851.557571:   evt: ffffffff811fd201 sys_write ...
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Cody P Schafer <dev@codyps.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kirill Smelkov <kirr@nexedi.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456132275-98875-11-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      03e0a7df
  23. 18 2月, 2016 1 次提交
    • J
      perf record: Add --all-user/--all-kernel options · 85723885
      Jiri Olsa 提交于
      Allow user to easily switch all events to user or kernel space with simple
      --all-user or --all-kernel options.
      
      This will be handy within perf mem/c2c wrappers to switch easily monitoring
      modes.
      
      Committer note:
      
      Testing it:
      
        # perf record --all-kernel --all-user -a sleep 2
         Error: option `all-user' cannot be used with all-kernel
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
              --all-user        Configure all used events to run in user space.
              --all-kernel      Configure all used events to run in kernel space.
        # perf record --all-user --all-kernel -a sleep 2
         Error: option `all-kernel' cannot be used with all-user
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
              --all-kernel      Configure all used events to run in kernel space.
              --all-user        Configure all used events to run in user space.
        # perf record --all-user -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ]
        # perf report | grep '\[k\]'
        # perf record --all-kernel -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ]
        # perf report | grep '\[\.\]'
        #
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org
      [ Made those options to be mutually exclusive ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      85723885
  24. 26 1月, 2016 1 次提交
  25. 09 1月, 2016 1 次提交
    • N
      perf evlist: Add --trace-fields option to show trace fields · 775d8a1b
      Namhyung Kim 提交于
      To use dynamic sort keys, it might be good to add an option to see the
      list of field names.
      
        $ perf evlist -i perf.data.sched
        sched:sched_switch
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_runtime
        sched:sched_process_fork
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_migrate_task
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      
        $ perf evlist -i perf.data.sched --trace-fields
        sched:sched_switch: trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
        sched:sched_stat_wait: trace_fields: comm,pid,delay
        sched:sched_stat_sleep: trace_fields: comm,pid,delay
        sched:sched_stat_iowait: trace_fields: comm,pid,delay
        sched:sched_stat_runtime: trace_fields: comm,pid,runtime,vruntime
        sched:sched_process_fork: trace_fields: parent_comm,parent_pid,child_comm,child_pid
        sched:sched_wakeup: trace_fields: comm,pid,prio,success,target_cpu
        sched:sched_wakeup_new: trace_fields: comm,pid,prio,success,target_cpu
        sched:sched_migrate_task: trace_fields: comm,pid,prio,orig_cpu,dest_cpu
      
      Committer notes:
      
      For another file, in verbose mode:
      
        # perf evlist -v --trace-fields
        sched:sched_switch: type: 2, size: 112, config: 0x10b, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, trace_fields: prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
        #
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1452125549-1511-5-git-send-email-namhyung@kernel.org
      [ Replaced 'trace_fields=' with 'trace_fields: ' to make the output consistent in -v mode ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      775d8a1b
  26. 14 12月, 2015 1 次提交
  27. 08 12月, 2015 2 次提交
  28. 27 11月, 2015 1 次提交
  29. 30 10月, 2015 1 次提交
    • W
      perf bpf: Attach eBPF filter to perf event · 1f45b1d4
      Wang Nan 提交于
      This is the final patch which makes basic BPF filter work. After
      applying this patch, users are allowed to use BPF filter like:
      
       # perf record --event ./hello_world.o ls
      
      A bpf_fd field is appended to 'struct evsel', and setup during the
      callback function add_bpf_event() for each 'probe_trace_event'.
      
      PERF_EVENT_IOC_SET_BPF ioctl is used to attach eBPF program to a newly
      created perf event. The file descriptor of the eBPF program is passed to
      perf record using previous patches, and stored into evsel->bpf_fd.
      
      It is possible that different perf event are created for one kprobe
      events for different CPUs. In this case, when trying to call the ioctl,
      EEXIST will be return. This patch doesn't treat it as an error.
      
      Committer note:
      
      The bpf proggie used so far:
      
        __attribute__((section("fork=_do_fork"), used))
        int fork(void *ctx)
        {
      	  return 0;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
      
      failed to produce any samples, even with forks happening and it being
      running in system wide mode.
      
      That is because now the filter is being associated, and the code above
      always returns zero, meaning that all forks will be probed but filtered
      away ;-/
      
      Change it to 'return 1;' instead and after that:
      
        # trace --no-syscalls --event /tmp/foo.o
           0.000 perf_bpf_probe:fork:(ffffffff8109be30))
           2.333 perf_bpf_probe:fork:(ffffffff8109be30))
           3.725 perf_bpf_probe:fork:(ffffffff8109be30))
           4.550 perf_bpf_probe:fork:(ffffffff8109be30))
        ^C#
      
      And it works with all tools, including 'perf trace'.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1f45b1d4
  30. 28 10月, 2015 1 次提交
    • W
      perf tools: Enable pre-event inherit setting by config terms · 374ce938
      Wang Nan 提交于
      This patch allows perf record setting event's attr.inherit bit by
      config terms like:
      
        # perf record -e cycles/no-inherit/ ...
        # perf record -e cycles/inherit/ ...
      
      So user can control inherit bit for each event separately.
      
      In following example, a.out fork()s in main then do some complex
      CPU intensive computations in both of its children.
      
      Basic result with and without inherit:
      
        # perf record -e cycles -e instructions ./a.out
        [ perf record: Woken up 9 times to write data ]
        [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
        # perf report --stdio
        # ...
        # Samples: 23K of event 'cycles'
        # Event count (approx.): 23641752891
        ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30428312415
      
        # perf record -i -e cycles -e instructions ./a.out
        [ perf record: Woken up 5 times to write data ]
        [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
        ...
        # Samples: 12K of event 'cycles'
        # Event count (approx.): 11699501775
        ...
        # Samples: 12K of event 'instructions'
        # Event count (approx.): 15058023559
      
      Cancel inherit for one event when globally enable:
      
        # perf record -e cycles/no-inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
        ...
        # Samples: 12K of event 'cycles/no-inherit/'
        # Event count (approx.): 11895759282
       ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30668000441
      
      Enable inherit for one event when globally disable:
      
        # perf record -i -e cycles/inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
        ...
        # Samples: 23K of event 'cycles/inherit/'
        # Event count (approx.): 23285400229
        ...
        # Samples: 11K of event 'instructions'
        # Event count (approx.): 14969050259
      
      Committer note:
      
      One can check if the bit was set, in addition to seeing the result in
      the perf.data file size as above by doing one of:
      
        # perf record -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      So, the inherit bit was set in both, now, if we disable it globally using
      --no-inherit:
      
        # perf record --no-inherit -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
      
      No inherit bit set, then disabling it and setting just on the cycles event:
      
        # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
        # perf evlist -v
        cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      We can see it as well in by using a more verbose level of debug messages in
      the tool that sets up the perf_event_attr, 'perf record' in this case:
      
        [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          config                           0x1
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
      
      <SNIP>
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
      [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      374ce938