1. 19 Nov, 2015 4 commits
    • W
      perf bpf: Allow BPF program config probing options · 03e01f56
      Wang Nan committed
      By extending the syntax of BPF object section names, this patch allows users to
      configure probing options, as they can in 'perf probe'.
      
      The error message in 'perf probe' is also updated.
      
      Test result:
      
      For the following BPF file, test_probe_glob.c:
      
        # cat test_probe_glob.c
        __attribute__((section("inlines=no;func=SyS_dup?"), used))
      
        int func(void *ctx)
        {
      	  return 1;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
        #
        # ./perf record  -e ./test_probe_glob.c ls /
        ...
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        # ./perf evlist
        perf_bpf_probe:func_1
        perf_bpf_probe:func
      
      After changing "inlines=no" to "inlines=yes":
      
        # ./perf record  -e ./test_probe_glob.c ls /
        ...
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        # ./perf evlist
        perf_bpf_probe:func_3
        perf_bpf_probe:func_2
        perf_bpf_probe:func_1
        perf_bpf_probe:func
      
      Then test 'force':
      
      Use the following program:
      
        # cat test_probe_force.c
        __attribute__((section("func=sys_write"), used))
      
        int funca(void *ctx)
        {
      	  return 1;
        }
      
        __attribute__((section("force=yes;func=sys_write"), used))
      
        int funcb(void *ctx)
        {
        	return 1;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
        #
      
        # perf record -e ./test_probe_force.c usleep 1
        Error: event "func" already exists.
         Hint: Remove existing event by 'perf probe -d'
             or force duplicates by 'perf probe -f'
             or set 'force=yes' in BPF source.
        event syntax error: './test_probe_force.c'
                             \___ Probe point exist. Try 'perf probe -d "*"' and set 'force=yes'
      
        (add -v to see detail)
        ...
      
      Then replace 'force=no' with 'force=yes':
      
        # vim test_probe_force.c
        # perf record -e ./test_probe_force.c usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data ]
        # perf evlist
        perf_bpf_probe:func_1
        perf_bpf_probe:func
        #
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-7-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf bpf: Allow attaching BPF programs to modules symbols · 5dbd16c0
      Wang Nan committed
      By extending the syntax of BPF object section names, this patch allows
      users to attach BPF programs to symbols in modules. For example:
      
        SEC("module=i915;"
            "parse_cmds=i915_parse_cmds")
        int parse_cmds(void *ctx)
        {
            return 1;
        }
      
      The implementation is very simple: as 'perf probe' does for a module,
      fill the 'uprobe' field in 'struct perf_probe_event'. The other parts are
      done automatically.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-5-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf bpf: Allow BPF program attach to uprobe events · 361f2b1d
      Wang Nan committed
      This patch adds a new syntax to the BPF object section name to support
      probing at uprobe events. Now we can use a BPF program like this:
      
        SEC(
        "exec=/lib64/libc.so.6;"
        "libcwrite=__write"
        )
        int libcwrite(void *ctx)
        {
            return 1;
        }
      
      Here, in the section name of a program, 'key=value' style options can
      precede the main config string. For now the only option key is "exec",
      for uprobes.
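
      As a rough illustration of the section name syntax described above, the
      sketch below splits a name on ';' and treats the last term as the main
      config string and everything before it as 'key=value' options. The helper
      name split_section_name() and the "last term is the main config" rule are
      assumptions made for this example; they are not taken from perf's loader.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not perf's API): split "opt=val;...;name=target"
 * into an options part and a main config part.
 * Assumption: the last ';'-separated term is the main config string.
 * Returns 0 on success, -1 if an output buffer is too small.
 */
int split_section_name(const char *sec, char *opts, size_t opts_sz,
                       char *config, size_t config_sz)
{
    const char *last;
    const char *main_cfg;
    size_t opts_len;

    assert(sec != NULL && opts != NULL && config != NULL);

    last = strrchr(sec, ';');                  /* last separator, if any */
    opts_len = last ? (size_t)(last - sec) : 0;
    main_cfg = last ? last + 1 : sec;

    if (opts_len >= opts_sz || strlen(main_cfg) >= config_sz)
        return -1;

    memcpy(opts, sec, opts_len);               /* everything before the last ';' */
    opts[opts_len] = '\0';
    strcpy(config, main_cfg);                  /* size was checked above */
    return 0;
}
```

      With the section name from the example above, the options part would be
      "exec=/lib64/libc.so.6" and the main config string "libcwrite=__write".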
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-4-git-send-email-wangnan0@huawei.com
      [ Changed the separator from \n to ; ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      tools: Adopt memdup() from tools/perf, moving it to tools/lib/string.c · 4ddd3274
      Arnaldo Carvalho de Melo committed
      That will contain more string functions with counterparts, sometimes
      verbatim copies, in the kernel.
      Acked-by: Wang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-rah6g97kn21vfgmlramorz6o@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 13 Nov, 2015 6 commits
    • W
      perf probe: Clear probe_trace_event when add_probe_trace_event() fails · 092b1f0b
      Wang Nan committed
      When probing with a glob, errors in add_probe_trace_event() won't be
      passed to debuginfo__find_trace_events() because the return value is
      modified by probe_point_search_cb(). This causes a segfault if perf fails
      to find an argument for a probe point matched by the glob. For example:
      
        # ./perf probe -v -n 'SyS_dup? oldfd'
        probe-definition(0): SyS_dup? oldfd
        symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
        parsing arg: oldfd into oldfd
        1 arguments
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
        Try to find probe point from debuginfo.
        Matched function: SyS_dup3
        found inline addr: 0xffffffff812095c0
        Probe point found: SyS_dup3+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        found inline addr: 0xffffffff812096d4
        Probe point found: SyS_dup2+36
        Searching 'oldfd' variable in context.
        Failed to find 'oldfd' in this function.
        Matched function: SyS_dup3
        Probe point found: SyS_dup3+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        Matched function: SyS_dup2
        Probe point found: SyS_dup2+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        Found 4 probe_trace_events.
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
        Segmentation fault (core dumped)
        #
      
      This patch ensures that add_probe_trace_event() doesn't touch
      tf->ntevs and tf->tevs if those functions fail.
      
      After the patch:
      
        # perf probe  'SyS_dup? oldfd'
        Failed to find 'oldfd' in this function.
        Added new events:
          probe:SyS_dup3       (on SyS_dup? with oldfd)
          probe:SyS_dup3_1     (on SyS_dup? with oldfd)
          probe:SyS_dup2       (on SyS_dup? with oldfd)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe:SyS_dup2 -aR sleep 1
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447417761-156094-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Fix memory leaking on failure by clearing all probe_trace_events · 0196e787
      Masami Hiramatsu committed
      Fix a memory leak on the debuginfo__find_trace_events() failure path,
      which frees an array of probe_trace_events but doesn't clear the
      allocated sub-structures and strings.
      
      So, before doing zfree(tevs), clear all the array elements which may
      have allocated resources.
      Reported-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447417761-156094-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf buildid-list: Requires ordered events · 1216b65c
      Adrian Hunter committed
      'perf buildid-list' processes events to determine hits (i.e. with the
      with-hits option). That may not work if events are not sorted in order:
      MMAP events must be processed before the samples that depend on them so
      that sample processing can 'hit' the DSO to which the MMAP refers.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1447408112-1920-3-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf symbols: Fix dso lookup by long name and missing buildids · e266a753
      Adrian Hunter committed
      Commit 4598a0a6 ("perf symbols: Improve DSO long names lookup speed
      with rbtree") added a tree to look up dsos by long name.  That tree gets
      corrupted whenever a dso long name is changed because the tree is not
      updated.
      
      One effect of that is buildid-list does not work with the 'with-hits'
      option because dso lookup fails and results in two structs for the same
      dso.  The first has the buildid but no hits, the second has hits but no
      buildid. e.g.
      
      Before:
      
        $ tools/perf/perf record ls
        arch     certs    CREDITS  Documentation  firmware  include
        ipc      Kconfig  lib      Makefile       net       REPORTING-BUGS
        scripts  sound    usr      block          COPYING   crypto
        drivers  fs       init     Kbuild         kernel    MAINTAINERS
        mm       README   samples  security       tools     virt
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ]
        $ tools/perf/perf buildid-list
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
        $ tools/perf/perf buildid-list -H
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so
      
      After:
      
        $ tools/perf/perf buildid-list -H
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
      
      The fix is to record the root of the tree on the dso so that
      dso__set_long_name() can update the tree when the long name changes.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Fixes: 4598a0a6 ("perf symbols: Improve DSO long names lookup speed with rbtree")
      Link: http://lkml.kernel.org/r/1447408112-1920-2-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf symbols: Allow forcing reading of non-root owned files by root · 2059fc7a
      Arnaldo Carvalho de Melo committed
      When the root user tries to read a file owned by some other user we get:
      
        # ls -la perf.data
        -rw-------. 1 acme acme 20032 Nov 12 15:50 perf.data
        # perf report
        File perf.data not owned by current user or root (use -f to override)
        # perf report -f | grep -v ^# | head -2
          30.96%  ls       [kernel.vmlinux]  [k] do_set_pte
          28.24%  ls       libc-2.20.so      [.] intel_check_word
        #
      
      That override wasn't available when the symbol code tried to read a JIT
      map, where the same check was done but no forcing was possible. Fix it.
      Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://permalink.gmane.org/gmane.linux.kernel.perf.user/2380
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf symbols: Rebuild rbtree when adjusting symbols for kcore · 866548dd
      Adrian Hunter committed
      Normally symbols are read from the DSO and adjusted, if need be, so that
      the symbol start matches the file offset in the DSO file (we want the
      file offset because that is what we know from MMAP events). That is done
      by dso__load_sym() which inserts the symbols *after* adjusting them.
      
      In the case of kcore, the symbols have been read from kallsyms and the
      symbol start is the memory address. The symbols have to be adjusted to
      match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
      but now the adjustment is being done *after* the symbols have been
      inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
      changing the symbol start would not change the order in the rbtree -
      which is, of course, not guaranteed.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Wang Nan <wangnan0@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/563CB241.2090701@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 12 Nov, 2015 3 commits
  4. 07 Nov, 2015 4 commits
    • W
      perf test: Add 'perf test BPF' · ba1fae43
      Wang Nan committed
      This patch adds a BPF testcase for testing BPF event filtering.
      
      By utilizing the result of 'perf test LLVM', this patch compiles the
      eBPF sample program and then tests its ability. The BPF script in 'perf
      test LLVM' lets only 50% of the samples generated by epoll_pwait() be
      captured. This patch runs that system call 111 times, so the result
      should contain 56 samples.
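
      The expected count can be checked with a small model of a filter that
      passes every other call. This is only a sketch: the function name
      count_captured_samples() is made up for this example, and it assumes the
      first call is the one that passes, which is what makes 111 calls yield
      56 samples rather than 55.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of a filter that passes every other call.
 * Assumption: the first call is passed, the second dropped, and so on.
 */
int count_captured_samples(int calls)
{
    bool pass = true;   /* state of the alternating filter */
    int captured = 0;

    assert(calls >= 0);
    for (int i = 0; i < calls; i++) {
        if (pass)
            captured++;
        pass = !pass;   /* flip the state after every call */
    }
    return captured;
}
```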
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446817783-86722-8-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf bpf: Improve BPF related error messages · d3e0ce39
      Wang Nan committed
      A series of bpf loader related error codes were introduced to help error
      reporting. Functions were improved to return these new error codes.
      
      Functions which return pointers were adjusted to encode error codes into
      the return value using the ERR_PTR() interface.
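
      For readers unfamiliar with the ERR_PTR() idiom, the sketch below shows
      the idea in plain user-space C: a small negative errno value is stored
      in the pointer itself, and the caller tests for it before dereferencing.
      The demo_* names are simplified stand-ins written for this example, not
      the kernel's or perf's definitions, and demo_open_object() is a
      hypothetical loader.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095   /* largest errno value that can be encoded */

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long error)
{
    assert(error < 0 && error >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Decode the errno value stored in an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static inline bool demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical loader: returns a valid pointer or an encoded error. */
void *demo_open_object(const char *path)
{
    static int dummy_object;   /* stands in for a real object */

    if (!path || !*path)
        return demo_err_ptr(-EINVAL);
    return &dummy_object;
}
```

      A caller would check demo_is_err() on the result and, if it is set, use
      demo_ptr_err() to recover the error code, so a single return value
      carries either the object or the reason for failure.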
      
      bpf_loader_strerror() was improved to convert these error codes to
      strings. It checks the error codes and calls libbpf_strerror() and
      strerror_r() accordingly, so callers don't need to consider checking the
      range of the error code.
      
      In bpf__strerror_load(), print the version of the running kernel and the
      object's 'version' section to tell the user how to fix the program.
      
      v1 -> v2:
       Use macro for error code.
      
       Fetch error message based on array index, eliminate for-loop.
      
       Print version strings.
      
      Before:
      
        # perf record -e ./test_kversion_nomatch_program.o sleep 1
        event syntax error: './test_kversion_nomatch_program.o'
                             \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
        SKIP
      
      After:
      
        # perf record -e ./test_kversion_nomatch_program.o ls
        event syntax error: './test_kversion_nomatch_program.o'
                             \___ 'version' (4.4.0) doesn't match running kernel (4.3.0)
        SKIP
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446818289-87444-1-git-send-email-wangnan0@huawei.com
      [ Add 'static inline' to bpf__strerror_prepare_load() when LIBBPF is disabled ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf tools: Make fetch_kernel_version() publicly available · 07bc5c69
      Wang Nan committed
      There are 2 places in llvm-utils.c which find kernel version information
      through uname. This patch extracts the uname related code into a
      fetch_kernel_version() function and puts it into util.h so it can be
      reused.
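
      A minimal sketch of the uname-based lookup described above, assuming the
      release string starts with "major.minor.patch". Both helper names are
      hypothetical and do not reflect the signature of fetch_kernel_version();
      the packing into one integer follows the same layout as the 0x40300
      value (4.3.0) used in the 'version' sections of the BPF examples.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: "4.3.0-rc4+" -> 0x40300, or 0 on parse failure. */
unsigned int release_to_version_code(const char *release)
{
    unsigned int major, minor, patch;

    assert(release != NULL);
    if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) != 3)
        return 0;
    /* Same packing as the 'version' section value, e.g. 0x40300 for 4.3.0. */
    return (major << 16) | (minor << 8) | patch;
}

/* Hypothetical helper: version code of the running kernel, 0 on failure. */
unsigned int running_kernel_version_code(void)
{
    struct utsname uts;

    if (uname(&uts) < 0)
        return 0;
    return release_to_version_code(uts.release);
}
```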
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446818135-87310-1-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      bpf tools: Improve libbpf error reporting · 6371ca3b
      Wang Nan committed
      In this patch, a series of libbpf specific error numbers and
      libbpf_strerror() are introduced to help report errors.
      
      Functions are updated to pass the correct error number through the
      CHECK_ERR() macro.
      
      All users of bpf_object__open{_buffer}() and bpf_program__title() in
      perf are modified accordingly. In addition, due to the error codes
      changing, bpf__strerror_load() is also modified to use them.
      
      bpf__strerror_head() is also changed accordingly so it can parse libbpf
      errors. bpf_loader_strerror() is introduced for that purpose, and will
      be improved by the following patch.
      
      load_program() is improved not to dump log buffer if it is empty. log
      buffer is also used to deduce whether the error was caused by an invalid
      program or other problem.
      
      v1 -> v2:
      
       - Using macro for error code.
      
       - Fetch error message based on array index, eliminate for-loop.
      
       - Use log buffer to detect the reason of failure. Three new error codes
         are introduced to replace LIBBPF_ERRNO__LOAD.
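
      The "array index instead of for-loop" approach from the list above can
      be sketched as follows. The enum names, the base value 4000 and the
      function demo_strerror() are assumptions made for this example, not
      libbpf's definitions; only the three message strings are taken from the
      v2 output shown in this commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical error numbers; the base value is an assumption. */
enum demo_errno {
    DEMO_ERRNO__START = 4000,
    DEMO_ERRNO__VERIFY = DEMO_ERRNO__START, /* verifier rejected the program */
    DEMO_ERRNO__KVER,                       /* kernel version mismatch */
    DEMO_ERRNO__PROG2BIG,                   /* program too big */
    DEMO_ERRNO__END,
};

/* One message per code, indexed by (code - DEMO_ERRNO__START). */
static const char *const demo_errmsg[DEMO_ERRNO__END - DEMO_ERRNO__START] = {
    [DEMO_ERRNO__VERIFY - DEMO_ERRNO__START]   = "Kernel verifier blocks program loading",
    [DEMO_ERRNO__KVER - DEMO_ERRNO__START]     = "Incorrect kernel version",
    [DEMO_ERRNO__PROG2BIG - DEMO_ERRNO__START] = "Program too big",
};

/* Copy the message for err into buf; returns 0 if err is known, -1 if not. */
int demo_strerror(int err, char *buf, size_t size)
{
    const char *msg = "Unknown error";
    int ret = -1;

    assert(buf != NULL && size > 0);
    if (err < 0)
        err = -err;     /* accept both positive and negative codes */
    if (err >= DEMO_ERRNO__START && err < DEMO_ERRNO__END) {
        msg = demo_errmsg[err - DEMO_ERRNO__START];   /* direct lookup, no loop */
        ret = 0;
    }
    strncpy(buf, msg, size - 1);
    buf[size - 1] = '\0';
    return ret;
}
```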
      
      In v1:
      
        # perf record -e ./test_ill_program.o ls
        event syntax error: './test_ill_program.o'
                             \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
        SKIP
      
        # perf record -e ./test_kversion_nomatch_program.o ls
        event syntax error: './test_kversion_nomatch_program.o'
                             \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
        SKIP
      
        # perf record -e ./test_big_program.o ls
        event syntax error: './test_big_program.o'
                             \___ Failed to load program: Validate your program and check 'license'/'version' sections in your object
        SKIP
      
      In v2:
      
        # perf record -e ./test_ill_program.o ls
        event syntax error: './test_ill_program.o'
                             \___ Kernel verifier blocks program loading
        SKIP
      
        # perf record -e ./test_kversion_nomatch_program.o
        event syntax error: './test_kversion_nomatch_program.o'
                             \___ Incorrect kernel version
        SKIP
        (Will be further improved by following patches)
      
        # perf record -e ./test_big_program.o
        event syntax error: './test_big_program.o'
                             \___ Program too big
        SKIP
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446817783-86722-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 06 Nov, 2015 2 commits
  6. 05 Nov, 2015 5 commits
  7. 03 Nov, 2015 1 commit
    • W
      perf bpf: Mute libbpf when '-v' not set · 7a011946
      Wang Nan committed
      According to [1], libbpf should be muted. This patch resets the info and
      warning message levels to ensure libbpf doesn't output anything even
      if an error happens.
      
      [1] http://lkml.kernel.org/r/20151020151255.GF5119@kernel.org
      
      Committer note:
      
      Before:
      
      Testing it with an incompatible kernel version in the .c file that
      generated foo.o:
      
        [root@zoo ~]# perf record -e /tmp/foo.o sleep 1
        libbpf: load bpf program failed: Invalid argument
        libbpf: -- BEGIN DUMP LOG ---
        libbpf:
      
        libbpf: -- END LOG --
        libbpf: failed to load program 'fork=_do_fork'
        libbpf: failed to load object '/tmp/foo.o'
        event syntax error: '/tmp/foo.o'
                             \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        [root@zoo ~]#
      
      After:
      
        [root@zoo ~]# perf record -e /tmp/foo.o sleep 1
        event syntax error: '/tmp/foo.o'
                             \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        [root@zoo ~]#
      
      This, BTW, needs fixing to emit a proper message by validating the
      version in the foo.o "version" ELF section against the running kernel,
      warning the user instead of asking the kernel to load a binary that it
      will refuse due to a mismatched kernel version.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446547486-229499-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 30 Oct, 2015 3 commits
    • R
      perf unwind: Pass symbol source to libunwind · 7ed4915a
      Rabin Vincent committed
      Even if --symfs is used to point to the debug binaries, we send in the
      non-debug filenames to libunwind, which leads to libunwind not finding
      the debug frame.  Fix this by preferring the file in --symfs, if it is
      available.
      Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Rabin Vincent <rabinv@axis.com>
      Link: http://lkml.kernel.org/r/1446104978-26429-1-git-send-email-rabin.vincent@axis.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf tools: Compile scriptlets to BPF objects when passing '.c' to --event · d509db04
      Wang Nan committed
      This patch provides infrastructure for passing source files to --event
      directly using:
      
       # perf record --event bpf-file.c command
      
      This patch does the following:
      
       1) Allow passing a '.c' file to '--event'. parse_events_load_bpf() is
          extended so the caller can tell it whether the passed file is a
          source file or an object.
      
       2) llvm__compile_bpf() is called to compile the '.c' file, and the result
          is saved in memory. bpf_object__open_buffer() is used to load the
          in-memory object.
      
      Introduces a bpf-script-example.c so we can manually test it:
      
       # perf record --clang-opt "-DLINUX_VERSION_CODE=0x40200" --event ./bpf-script-example.c sleep 1
      
      Note that '--clang-opt' must be placed before '--event'.
      
      Further patches will merge it into a testcase so it can be tested automatically.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-10-git-send-email-wangnan0@huawei.com
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf bpf: Attach eBPF filter to perf event · 1f45b1d4
      Wang Nan committed
      This is the final patch which makes basic BPF filter work. After
      applying this patch, users are allowed to use BPF filter like:
      
       # perf record --event ./hello_world.o ls
      
      A bpf_fd field is appended to 'struct evsel', and set up during the
      callback function add_bpf_event() for each 'probe_trace_event'.
      
      The PERF_EVENT_IOC_SET_BPF ioctl is used to attach an eBPF program to a
      newly created perf event. The file descriptor of the eBPF program is
      passed to perf record using previous patches, and stored into
      evsel->bpf_fd.
      
      It is possible that different perf events are created for one kprobe
      event for different CPUs. In this case, when trying to call the ioctl,
      EEXIST will be returned. This patch doesn't treat it as an error.
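
      The attach step and the EEXIST handling described above might look
      roughly like the sketch below. The helper name attach_bpf_filter() is
      hypothetical and this is not the code perf uses; it only shows the
      ioctl call with EEXIST tolerated, assuming a Linux system whose headers
      define PERF_EVENT_IOC_SET_BPF.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: attach BPF program bpf_fd to perf event perf_fd.
 * Returns 0 on success or when a program is already attached (EEXIST),
 * and a negative errno value on any other failure.
 */
int attach_bpf_filter(int perf_fd, int bpf_fd)
{
    if (ioctl(perf_fd, PERF_EVENT_IOC_SET_BPF, bpf_fd) == 0)
        return 0;
    if (errno == EEXIST)    /* already attached: not treated as an error */
        return 0;
    return -errno;
}
```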
      
      Committer note:
      
      The bpf proggie used so far:
      
        __attribute__((section("fork=_do_fork"), used))
        int fork(void *ctx)
        {
      	  return 0;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
      
      failed to produce any samples, even with forks happening and the tool
      running in system wide mode.
      
      That is because now the filter is being associated, and the code above
      always returns zero, meaning that all forks will be probed but filtered
      away ;-/
      
      Change it to 'return 1;' instead and after that:
      
        # trace --no-syscalls --event /tmp/foo.o
           0.000 perf_bpf_probe:fork:(ffffffff8109be30))
           2.333 perf_bpf_probe:fork:(ffffffff8109be30))
           3.725 perf_bpf_probe:fork:(ffffffff8109be30))
           4.550 perf_bpf_probe:fork:(ffffffff8109be30))
        ^C#
      
      And it works with all tools, including 'perf trace'.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-8-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 29 Oct, 2015 2 commits
    • W
      perf bpf: Collect perf_evsel in BPF object files · 4edf30e3
      Wang Nan committed
      This patch creates a 'struct perf_evsel' for every probe in the BPF
      object file(s) and fills 'struct evlist' with them. The previously
      introduced dummy event is now removed. After this patch, the following
      command:
      
       # perf record --event filter.o ls
      
      Can trace on each of the probes defined in filter.o.
      
      The core of this patch is bpf__foreach_tev(), which calls a callback
      function for each 'struct probe_trace_event' of a bpf program,
      together with its associated file descriptor. The add_bpf_event()
      callback creates evsels by calling parse_events_add_tracepoint().
      
      Since bpf-loader.c will not be built if libbpf is turned off, an empty
      bpf__foreach_tev() is defined in bpf-loader.h to avoid build errors.
      
      Committer notes:
      
      Before:
      
        # /tmp/oldperf record --event /tmp/foo.o -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.198 MB perf.data ]
        # perf evlist
        /tmp/foo.o
        # perf evlist -v
        /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
        sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
        inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
        exclude_guest: 1, mmap2: 1, comm_exec: 1
      
      I.e. we create just the PERF_TYPE_SOFTWARE (type: 1),
      PERF_COUNT_SW_DUMMY (config: 0x9) event. Now, with this patch:
      
        # perf record --event /tmp/foo.o -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.210 MB perf.data ]
        # perf evlist -v
        perf_bpf_probe:fork: type: 2, size: 112, config: 0x6bd, { sample_period,
        sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1,
        inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest:
        1, mmap2: 1, comm_exec: 1
        #
      
      We now have a PERF_TYPE_TRACEPOINT (type: 2) event whose config is 0x6bd,
      which is how, after setting up the event via the kprobes interface, the
      'perf_bpf_probe:fork' event is accessible via the perf_event_open
      syscall. This is all transient, as soon as the 'perf record' session
      ends, these probes will go away.
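
      To help read the 'type' and 'config' fields in the 'perf evlist -v' output above, here is a minimal decoding sketch. The numeric values mirror the perf_event_open() UAPI (software is type 1, tracepoint is type 2, and the dummy software counter is config 0x9); the function and constant names ending in `_sketch` are made up for this example.

```c
#include <stdint.h>

/* Values mirror enum perf_type_id from the perf_event_open() UAPI. */
enum {
	PERF_TYPE_HARDWARE_sketch   = 0,
	PERF_TYPE_SOFTWARE_sketch   = 1,
	PERF_TYPE_TRACEPOINT_sketch = 2,
};

/* Mirrors PERF_COUNT_SW_DUMMY. */
#define PERF_COUNT_SW_DUMMY_sketch 9

/* Return a short human-readable description of an event's type/config. */
const char *describe_event_sketch(uint32_t type, uint64_t config)
{
	switch (type) {
	case PERF_TYPE_HARDWARE_sketch:
		return "hardware";
	case PERF_TYPE_SOFTWARE_sketch:
		return config == PERF_COUNT_SW_DUMMY_sketch ? "software dummy"
							    : "software";
	case PERF_TYPE_TRACEPOINT_sketch:
		/* For tracepoints, config carries the tracepoint id. */
		return "tracepoint";
	default:
		return "other";
	}
}
```

      With this, the "before" event (type 1, config 0x9) decodes as a software dummy event and the "after" event (type 2, config 0x6bd) as a tracepoint.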
      
      To see what it looks like, let's try a never-ending session, one that
      expects a control+C to end:
      
        # perf record --event /tmp/foo.o -a
      
      So, with that in place, we can use 'perf probe' to see what is in place:
      
        # perf probe -l
          perf_bpf_probe:fork  (on _do_fork@acme/git/linux/kernel/fork.c)
      
      We also can use debugfs:
      
        [root@felicio ~]# cat /sys/kernel/debug/tracing/kprobe_events
        p:perf_bpf_probe/fork _text+638512
      
      Ok, now let's stop and see if we got some forks:
      
        [root@felicio linux]# perf record --event /tmp/foo.o -a
        ^C[ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.325 MB perf.data (111 samples) ]
      
        [root@felicio linux]# perf script
            sshd  1271 [003] 81797.507678: perf_bpf_probe:fork: (ffffffff8109be30)
            sshd 18309 [000] 81797.524917: perf_bpf_probe:fork: (ffffffff8109be30)
            sshd 18309 [001] 81799.381603: perf_bpf_probe:fork: (ffffffff8109be30)
            sshd 18309 [001] 81799.408635: perf_bpf_probe:fork: (ffffffff8109be30)
        <SNIP>
      
      Sure enough, we have 111 forks :-)
      
      Callchains seem to work as well:
      
        # perf report --stdio --no-child
        # To display the perf.data header info, please use --header/--header-only options.
        #
        # Total Lost Samples: 0
        #
        # Samples: 562  of event 'perf_bpf_probe:fork'
        # Event count (approx.): 562
        #
        # Overhead  Command   Shared Object     Symbol
        # ........  ........  ................  ............
        #
            44.66%  sh        [kernel.vmlinux]  [k] _do_fork
                          |
                          ---_do_fork
                             entry_SYSCALL_64_fastpath
                             __libc_fork
                             make_child
      
          26.16%  make      [kernel.vmlinux]  [k] _do_fork
      <SNIP>
        #
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-7-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf tools: Load eBPF object into kernel · 1e5e3ee8
      Wang Nan committed
      This patch utilizes bpf_object__load() provided by libbpf to load all
      objects into kernel.
      
      Committer notes:
      
      Testing it:
      
      When using an incorrect kernel version number, i.e., having this in your
      eBPF proggie:
      
        int _version __attribute__((section("version"), used)) = 0x40100;
      
      For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event
      parsing time, to provide a better error report to the user:
      
        # perf record --event /tmp/foo.o sleep 1
        libbpf: load bpf program failed: Invalid argument
        libbpf: -- BEGIN DUMP LOG ---
        libbpf:
      
        libbpf: -- END LOG --
        libbpf: failed to load program 'fork=_do_fork'
        libbpf: failed to load object '/tmp/foo.o'
        event syntax error: '/tmp/foo.o'
                             \___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
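
      The version values used in this test follow the kernel's usual version-code packing, where 4.3.0 becomes 0x40300 and 4.1.0 becomes 0x40100. A small sketch of that arithmetic follows; the function names are hypothetical, and the equality check is only an illustration of the mismatch seen above, not the kernel's actual code.

```c
#include <stdint.h>

/* Pack major.minor.patch the same way the kernel's KERNEL_VERSION() does. */
uint32_t kernel_version_code_sketch(uint32_t major, uint32_t minor,
				    uint32_t patch)
{
	return (major << 16) + (minor << 8) + patch;
}

/* Assumed check: the program's "version" section must equal the running code. */
int version_matches_sketch(uint32_t prog_version, uint32_t running_version)
{
	return prog_version == running_version;
}
```

      Under this sketch, a program built with 0x40100 does not match a 4.3.0 kernel, while one built with 0x40300 does.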
      
      If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
      kernel, the whole process goes through:
      
        # perf record --event /tmp/foo.o -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.202 MB perf.data ]
        # perf evlist -v
        /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
        sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
        inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
        exclude_guest: 1, mmap2: 1, comm_exec: 1
        #
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 28 Oct, 2015 10 commits
    • W
      perf tools: Create probe points for BPF programs · aa3abf30
      Wang Nan committed
      This patch introduces bpf__{un,}probe() functions to enable callers to
      create kprobe points based on the section names of a BPF program. It
      parses the section names in the program and creates corresponding
      'struct perf_probe_event' structures. The parse_perf_probe_command()
      function does the main parsing work. The resulting 'struct
      perf_probe_event' is stored in the program's private data for later use.
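
      As a rough illustration of how a section name such as "fork=_do_fork" maps to an event name and a probe target, here is a minimal splitter. It is only a sketch: perf hands the section name to parse_perf_probe_command(), which understands the full 'perf probe' syntax, whereas the hypothetical split_section_name_sketch() below handles just the simplest "event=target" form.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split "event=target" into its two halves.
 * Returns 0 on success, -1 if there is no '=', a half is empty,
 * or an output buffer is too small.
 */
int split_section_name_sketch(const char *sec,
			      char *event, size_t event_len,
			      char *target, size_t target_len)
{
	const char *eq = strchr(sec, '=');
	size_t n;

	if (!eq || eq == sec || eq[1] == '\0')
		return -1;

	n = (size_t)(eq - sec);
	if (n >= event_len || strlen(eq + 1) >= target_len)
		return -1;

	memcpy(event, sec, n);
	event[n] = '\0';
	strcpy(target, eq + 1);	/* length was checked above */
	return 0;
}
```

      Sections without an '=' (for example "license" or "version") are rejected by this sketch, which loosely corresponds to them not describing probe points.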
      
      By utilizing the new probing API, this patch creates probe points during
      event parsing.
      
      To ensure that probe points are removed correctly, an atexit hook is
      registered so that bpf__clear() is called, and the probe points are
      cleared, even if perf quits through exit(). Note that bpf__clear() must
      be registered before bpf__probe() is called, so that a failure in
      bpf__probe() still triggers bpf__clear() to remove the probe points
      that were already created.
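
      The ordering requirement above (register the cleanup hook first, then probe) can be sketched with the standard atexit() call. All function names and the probe counter below are hypothetical placeholders; the failure is simulated and no real kprobes are touched.

```c
#include <stdlib.h>

static int nr_probes_sketch;		/* probes currently "installed" */
static int cleanup_registered_sketch;	/* atexit hook already set up? */

/* Cleanup hook: remove whatever probes exist. Safe to call repeatedly. */
void clear_probes_sketch(void)
{
	nr_probes_sketch = 0;
}

/* Register the cleanup hook exactly once. */
int register_cleanup_sketch(void)
{
	if (cleanup_registered_sketch)
		return 0;
	if (atexit(clear_probes_sketch) != 0)
		return -1;
	cleanup_registered_sketch = 1;
	return 0;
}

/*
 * Install 'wanted' probes, failing at index 'fail_at' (use -1 for no failure).
 * The hook is registered before any probe is created, so probes created
 * before a failure are still removed when the process exits.
 */
int add_probes_sketch(int wanted, int fail_at)
{
	if (register_cleanup_sketch())
		return -1;

	for (int i = 0; i < wanted; i++) {
		if (i == fail_at)
			return -1;	/* earlier probes remain until cleanup */
		nr_probes_sketch++;
	}
	return 0;
}

int probes_pending_sketch(void)
{
	return nr_probes_sketch;
}
```

      If the hook were registered only after a successful probe step, the two probes left behind by the simulated failure would never be cleaned up.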
      
      strerror style error reporting scaffold is created by this patch.
      bpf__strerror_probe() is the first error reporting function in
      bpf-loader.c.
      
      Committer note:
      
      Trying it:
      
      To build a test eBPF object file:
      
      I am testing using a script I built from the 'perf test -v LLVM' output:
      
        $ cat ~/bin/hello-ebpf
        export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
        export WORKING_DIR=/lib/modules/4.2.0/build
        export CLANG_SOURCE=-
        export CLANG_OPTIONS=-xc
      
        OBJ=/tmp/foo.o
        rm -f $OBJ
        echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
        clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ
      
       ---
      
      First asking to put a probe in a function not present in the kernel
      (misses the initial _):
      
        $ perf record --event /tmp/foo.o sleep 1
        Probe point 'do_fork' not found.
        event syntax error: '/tmp/foo.o'
                             \___ You need to check probing points in BPF file
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        $
      
       ---
      
      Now, with "__attribute__((section("fork=_do_fork"), used))":
      
       $ grep _do_fork /proc/kallsyms
       ffffffff81099ab0 T _do_fork
       $ perf record --event /tmp/foo.o sleep 1
       Failed to open kprobe_events: Permission denied
       event syntax error: '/tmp/foo.o'
                            \___ Permission denied
      
       ---
      
      Cool, we need to provide some better hints, "kprobe_events" is too low
      level, one doesn't strictly need to know the precise details of how
      these things are put in place, so something that shows the command
      needed to fix the permissions would be more helpful.
      
      Let's try as root instead:
      
        # perf record --event /tmp/foo.o sleep 1
        Lowering default frequency rate to 1000.
        Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        # perf evlist
        /tmp/foo.o
        [root@felicio ~]# perf evlist -v
        /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
        sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
        inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
        sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
      
       ---
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf tools: Enable passing bpf object file to --event · 84c86ca1
      Wang Nan committed
      By introducing new rules in tools/perf/util/parse-events.[ly], this
      patch enables 'perf record --event bpf_file.o' to select events by an
      eBPF object file. It calls parse_events_load_bpf() to load that file,
      which uses bpf__prepare_load() and finally calls bpf_object__open() for
      the object files.
      
      After applying this patch, commands like:
      
       # perf record --event foo.o sleep
      
      become possible.
      
      However, at this point it cannot link anything useful onto the evsel
      list, because probe point creation and BPF program attaching have not
      been implemented yet. Until real events can be extracted, this patch
      links a dummy evsel to avoid a 'perf report' error caused by an empty
      evsel list. The dummy event related code will be removed when the
      probing and extracting code is ready.
      
      Committer notes:
      
      Using it:
      
        $ ls -la foo.o
        ls: cannot access foo.o: No such file or directory
        $ perf record --event foo.o sleep
        libbpf: failed to open foo.o: No such file or directory
        event syntax error: 'foo.o'
                             \___ BPF object file 'foo.o' is invalid
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        $
      
        $ file /tmp/build/perf/perf.o
        /tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
        $ perf record --event /tmp/build/perf/perf.o sleep
        libbpf: /tmp/build/perf/perf.o is not an eBPF object file
        event syntax error: '/tmp/build/perf/perf.o'
                             \___ BPF object file '/tmp/build/perf/perf.o' is invalid
      
        (add -v to see detail)
        Run 'perf list' for a list of valid events
      
         Usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
        $
      
        $ file /tmp/foo.o
        /tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
        $ perf record --event /tmp/foo.o sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        $ perf evlist
        /tmp/foo.o
        $ perf evlist  -v
        /tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        $
      
      So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.
      
        $ perf report --stdio
        Error:
        The perf.data file has no samples!
        # To display the perf.data header info, please use --header/--header-only options.
        #
        $
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf ebpf: Add the libbpf glue · 69d262a9
      Wang Nan committed
      This patch introduces the 'bpf-loader.[ch]' files, which will be the
      interface between perf and libbpf. bpf__prepare_load() resides in
      bpf-loader.c. Following patches will enrich these two files.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf symbols: Fix endless loop in dso__split_kallsyms_for_kcore · 443f8c75
      Jiri Olsa committed
      Currently we split symbols based on map comparison, but symbols are stored
      within dso objects, and maps can point to the same dso object (kernel maps).

      Hence we could end up modifying the rbtree we are currently iterating over
      and corrupting it. It's easily reproduced on s390x by running:
      
        $ perf record -a -- sleep 3
        $ perf buildid-list -i perf.data --with-hits
      
      The fix is to compare dso objects instead.
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • W
      perf tools: Enable pre-event inherit setting by config terms · 374ce938
      Wang Nan committed
      This patch allows 'perf record' to set an event's attr.inherit bit via
      config terms like:
      
        # perf record -e cycles/no-inherit/ ...
        # perf record -e cycles/inherit/ ...
      
      So the user can control the inherit bit for each event separately.
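
      The interaction between the global setting and a per-event term can be summarized in a few lines. The enum and function names are invented for this sketch; it only captures the precedence suggested by the examples in this message (a per-event term wins over the global default), not the actual evsel configuration code.

```c
#include <stdbool.h>

/* Per-event config term: absent, "inherit", or "no-inherit". */
enum inherit_term_sketch {
	INHERIT_TERM_UNSET,
	INHERIT_TERM_ON,
	INHERIT_TERM_OFF,
};

/*
 * Compute the effective attr.inherit value for one event.
 * 'global_no_inherit' models the global -i / --no-inherit option.
 */
bool effective_inherit_sketch(bool global_no_inherit,
			      enum inherit_term_sketch term)
{
	switch (term) {
	case INHERIT_TERM_ON:
		return true;
	case INHERIT_TERM_OFF:
		return false;
	default:
		return !global_no_inherit;
	}
}
```

      For example, 'cycles/inherit/' combined with a global --no-inherit still yields inherit: 1 for cycles, while a plain 'instructions' event follows the global setting.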
      
      In the following example, a.out fork()s in main and then does some complex
      CPU intensive computations in both of its children.
      
      Basic result with and without inherit:
      
        # perf record -e cycles -e instructions ./a.out
        [ perf record: Woken up 9 times to write data ]
        [ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
        # perf report --stdio
        # ...
        # Samples: 23K of event 'cycles'
        # Event count (approx.): 23641752891
        ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30428312415
      
        # perf record -i -e cycles -e instructions ./a.out
        [ perf record: Woken up 5 times to write data ]
        [ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
        ...
        # Samples: 12K of event 'cycles'
        # Event count (approx.): 11699501775
        ...
        # Samples: 12K of event 'instructions'
        # Event count (approx.): 15058023559
      
      Disable inherit for one event when it is enabled globally:
      
        # perf record -e cycles/no-inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
        ...
        # Samples: 12K of event 'cycles/no-inherit/'
        # Event count (approx.): 11895759282
       ...
        # Samples: 24K of event 'instructions'
        # Event count (approx.): 30668000441
      
      Enable inherit for one event when it is disabled globally:
      
        # perf record -i -e cycles/inherit/ -e instructions ./a.out
        [ perf record: Woken up 7 times to write data ]
        [ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
        ...
        # Samples: 23K of event 'cycles/inherit/'
        # Event count (approx.): 23285400229
        ...
        # Samples: 11K of event 'instructions'
        # Event count (approx.): 14969050259
      
      Committer note:
      
      One can check if the bit was set, in addition to seeing the result in
      the perf.data file size as above by doing one of:
      
        # perf record -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      So, the inherit bit was set in both, now, if we disable it globally using
      --no-inherit:
      
        # perf record --no-inherit -e cycles -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
        # perf evlist -v
        cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
      
      No inherit bit set. Now, disabling it globally and setting it just on the cycles event:
      
        # perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
        # perf evlist -v
        cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
        instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
        #
      
      We can also see it by using a more verbose level of debug messages in
      the tool that sets up the perf_event_attr, 'perf record' in this case:
      
        [root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          config                           0x1
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          freq                             1
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8
      
      <SNIP>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
      [ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • D
      perf symbols: we can now read separate debug-info files based on a build ID · 5baecbcd
      Dima Kogan committed
      Recent GDB (at least on a vanilla Debian box) looks for debug information in
      
        /usr/lib/debug/.build-id/nn/nnnnnnn
      
      where nn/nnnnnn is the build-id of the stripped ELF binary. This is
      documented here:
      
        https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
      
      This was not working in perf because we didn't read the build id until
      AFTER we searched for the separate debug information file. This patch
      reads the build ID and THEN does the search.
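
      For illustration, the lookup path can be derived from the raw build ID bytes as sketched below. The function name is hypothetical, and the ".debug" suffix follows the GDB documentation linked above rather than anything stated in this message; the first byte becomes the directory name and the remaining bytes the file name.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "/usr/lib/debug/.build-id/nn/nnnn...nnnn.debug" into 'buf'.
 * Returns 0 on success, -1 if the ID is too short or 'buf' is too small.
 */
int build_id_debug_path_sketch(const unsigned char *id, size_t len,
			       char *buf, size_t buflen)
{
	static const char prefix[] = "/usr/lib/debug/.build-id/";
	size_t need;
	int off;

	if (len < 2)
		return -1;

	/* prefix + "nn/" + two hex digits per remaining byte + ".debug" + NUL */
	need = (sizeof(prefix) - 1) + 3 + (len - 1) * 2 + sizeof(".debug");
	if (buflen < need)
		return -1;

	off = snprintf(buf, buflen, "%s%02x/", prefix, id[0]);
	for (size_t i = 1; i < len; i++)
		off += snprintf(buf + off, buflen - (size_t)off, "%02x", id[i]);
	snprintf(buf + off, buflen - (size_t)off, ".debug");
	return 0;
}
```

      The path can only be built once the build ID is known, which is why the ID has to be read before the search for the separate debug file.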
      Signed-off-by: Dima Kogan <dima@secretsauce.net>
      Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • D
      perf symbols: Fix type error when reading a build-id · f2f30968
      Dima Kogan committed
      This was benign, but wrong. The build-id should live in a char[], not a char*[]
      Signed-off-by: Dima Kogan <dima@secretsauce.net>
      Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf tools: Search for more options when passing args to -h · f4efcce3
      Arnaldo Carvalho de Melo committed
      Recently 'perf <tool> -h' was made aware of arguments and would show
      just the help for the arguments specified, but that required a strict
      form, i.e.:
      
        $ perf -h --tui
      
      worked, but:
      
        $ perf -h tui
      
      didn't.
      
      Make it support both cases and also look at the option help when neither
      matches, so that the following examples work:
      
        $ perf report -h interface
      
         Usage: perf report [<options>]
      
          --gtk    Use the GTK2 interface
          --stdio  Use the stdio interface
          --tui    Use the TUI interface
      
        $ perf report -h stack
      
         Usage: perf report [<options>]
      
          -g, --call-graph <print_type,threshold[,print_limit],order,
                            sort_key[,branch]>
            Display call graph (stack chain/backtrace):
      
              print_type:  call graph printing style (graph|flat|fractal|none)
              threshold:   minimum call graph inclusion threshold (<percent>)
              print_limit: maximum number of call graph entry (<number>)
              order:       call graph order (caller|callee)
              sort_key:    call graph sort key (function|address)
              branch:      include last branch info to call graph (branch)
      
            Default: graph,0.5,caller,function
              --max-stack <n>   Set the maximum stack depth when parsing the
                                callchain, anything beyond the specified depth
                                will be ignored. Default: 127
        $
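
      The matching rule described above (accept the option name with or without the leading dashes, and fall back to searching the help text) might look roughly like this. The struct and function are hypothetical and much simpler than perf's option parser; short options are ignored and the help search here is case-sensitive.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, pared-down option descriptor. */
struct option_sketch {
	const char *long_name;	/* e.g. "tui" */
	const char *help;	/* e.g. "Use the TUI interface" */
};

/* Does 'arg' (as given after -h) select this option? */
bool option_matches_sketch(const struct option_sketch *opt, const char *arg)
{
	/* Accept both "--tui" and "tui". */
	if (strncmp(arg, "--", 2) == 0)
		arg += 2;
	if (*arg == '\0')
		return false;

	if (opt->long_name && strcmp(opt->long_name, arg) == 0)
		return true;

	/* Neither form matched the name: look inside the help text. */
	return opt->help && strstr(opt->help, arg) != NULL;
}
```

      With this rule, "interface" selects the --gtk, --stdio and --tui entries through their help text, and "stack" selects --max-stack because its help mentions the stack depth.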
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Chandler Carruth <chandlerc@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf cpu_map: Add cpu_map__empty_new function · 2322f573
      Jiri Olsa committed
      Add the cpu_map__empty_new() interface to create an empty cpumap of a
      given size. The cpumap entries are initialized to -1.
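
      A minimal sketch of such an allocator is shown below, assuming a map stored as a count followed by a flexible array of CPU numbers. The struct layout and the `_sketch` names are assumptions made for this example and are not taken from the patch (the real cpu_map also carries a reference count, for instance).

```c
#include <stdlib.h>

/* Hypothetical layout: entry count followed by the CPU numbers. */
struct cpu_map_sketch {
	int nr;
	int map[];
};

/* Allocate a map with 'nr' entries, each initialized to -1 ("no CPU"). */
struct cpu_map_sketch *cpu_map_empty_new_sketch(int nr)
{
	struct cpu_map_sketch *cpus;

	if (nr < 0)
		return NULL;

	cpus = malloc(sizeof(*cpus) + sizeof(int) * (size_t)nr);
	if (cpus) {
		cpus->nr = nr;
		for (int i = 0; i < nr; i++)
			cpus->map[i] = -1;
	}
	return cpus;
}
```

      The caller owns the returned memory and releases it with free() in this sketch.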
      
      It'll be used for caching cpu_map in following patches.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf evsel: Move id_offset out of struct perf_evsel union member · af339981
      Jiri Olsa committed
      Because the 'perf stat record' patches will use the id_offset member
      together with the priv pointer.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Kan Liang <kan.liang@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>