1. 07 8月, 2015 2 次提交
    • W
      tracing, perf: Implement BPF programs attached to uprobes · 04a22fae
      Wang Nan 提交于
      By copying BPF related operation to uprobe processing path, this patch
      allow users attach BPF programs to uprobes like what they are already
      doing on kprobes.
      
      After this patch, users are allowed to use PERF_EVENT_IOC_SET_BPF on a
      uprobe perf event. Which make it possible to profile user space programs
      and kernel events together using BPF.
      
      Because of this patch, CONFIG_BPF_EVENTS should be selected by
      CONFIG_UPROBE_EVENT to ensure trace_call_bpf() is compiled even if
      KPROBE_EVENT is not set.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Acked-by: NAlexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1435716878-189507-3-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      04a22fae
    • W
      bpf: Use correct #ifdef controller for trace_call_bpf() · 098d2164
      Wang Nan 提交于
      Commit e1abf2cc ("bpf: Fix the build on
      BPF_SYSCALL=y && !CONFIG_TRACING kernels, make it more configurable")
      updated the building condition of bpf_trace.o from CONFIG_BPF_SYSCALL
      to CONFIG_BPF_EVENTS, but the corresponding #ifdef controller in
      trace_events.h for trace_call_bpf() was not changed. Which, in theory,
      is incorrect.
      
      With current Kconfigs, we can create a .config with CONFIG_BPF_SYSCALL=y
      and CONFIG_BPF_EVENTS=n by unselecting CONFIG_KPROBE_EVENT and
      selecting CONFIG_BPF_SYSCALL. With these options, trace_call_bpf() will
      be defined as an extern function, but if anyone calls it a symbol missing
      error will be triggered since bpf_trace.o was not built.
      
      This patch changes the #ifdef controller for trace_call_bpf() from
      CONFIG_BPF_SYSCALL to CONFIG_BPF_EVENTS. I'll show its correctness:
      
      Before this patch:
      
         BPF_SYSCALL   BPF_EVENTS   trace_call_bpf   bpf_trace.o
         y             y           normal           compiled
         n             n           inline           not compiled
         y             n           normal           not compiled (incorrect)
         n             y          impossible (BPF_EVENTS depends on BPF_SYSCALL)
      
      After this patch:
      
         BPF_SYSCALL   BPF_EVENTS   trace_call_bpf   bpf_trace.o
         y             y           normal           compiled
         n             n           inline           not compiled
         y             n           inline           not compiled (fixed)
         n             y          impossible (BPF_EVENTS depends on BPF_SYSCALL)
      
      So this patch doesn't break anything. QED.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1435716878-189507-2-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      098d2164
  2. 06 8月, 2015 7 次提交
  3. 05 8月, 2015 6 次提交
    • K
      perf tools: Per-event time support · 32067712
      Kan Liang 提交于
      This patchkit adds the ability to turn off time stamps per event.
      
      One usaful case for partial time is to work with per-event callgraph to
      enable "PEBS threshold > 1" (https://lkml.org/lkml/2015/5/10/196), which
      can significantly reduce the sampling overhead.
      
      The event samples with time stamps off will not be ordered.
      Signed-off-by: NKan Liang <kan.liang@intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1438677022-34296-2-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      32067712
    • A
      perf trace: Use vfs_getname syscall arg beautifier in more syscalls · 34221118
      Arnaldo Carvalho de Melo 提交于
      Those were covered and tested in this cset:
      
       access, chdir, chmod, chown, chroot, creat, getxattr,
       inotify_add_watch, lchown, lgetxattr, listxattr,
       lsetxattr, mkdir, mkdirat, mknod, rmdir, faccessat,
       newfstatat, openat, readlink, readlinkat, removexattr,
       setxattr, statfs, swapon, swapoff, truncate, unlinkat,
       utime, utimes, utimensat.
      
      E.g.:
      
        # trace -e statfs,access,mkdir mkdir /tmp/bla
         0.285 (0.020 ms): mkdir/2799 access(filename: /etc/ld.so.preload, mode: R         ) = -1 ENOENT No such file or directory
         1.070 (0.032 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc930) = 0
         1.087 (0.013 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc820) = 0
         1.189 (0.014 ms): mkdir/2799 access(filename: /etc/selinux/config                 ) = 0
         1.905 (0.610 ms): mkdir/2799 mkdir(pathname: /tmp/bla, mode: 511                  ) = 0
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-wbqtnlktquun3wtpjdz3okul@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      
        and an empty message aborts the commit.
      34221118
    • A
      perf trace: Deref sys_enter pointer args with contents from probe:vfs_getname · f994592d
      Arnaldo Carvalho de Melo 提交于
      To work like strace and dereference syscall pointer args we need to
      insert probes (or tracepoints) right after we copy those bytes from
      userspace.
      
      Since we're formatting the syscall args at raw_syscalls:sys_enter time,
      we need to have a formatter that just stores the position where, later,
      when we get the probe:vfs_getname, we can insert the pointer contents.
      
      Now, if a probe:vfs_getname with this format is in place:
      
       # perf probe -l
        probe:vfs_getname (on getname_flags:72@/home/git/linux/fs/namei.c with pathname)
      
      That was, in this case, put in place with:
      
       # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
       Added new event:
        probe:vfs_getname    (on getname_flags:72 with pathname=filename:string)
      
       You can now use it in all perf tools, such as:
      
      	perf record -e probe:vfs_getname -aR sleep 1
       #
      
      Then 'perf trace' will notice that and do the pointer -> contents
      expansion:
      
       # trace -e open touch /tmp/bla
        0.165 (0.010 ms): touch/17752 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
        0.195 (0.011 ms): touch/17752 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
        0.512 (0.012 ms): touch/17752 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
        0.582 (0.012 ms): touch/17752 open(filename: /tmp/bla, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: 438) = 3
       #
      
      Roughly equivalent to strace's output:
      
       # strace -rT -e open touch /tmp/bla
        0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000039>
        0.000317 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000102>
        0.001461 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.000072>
        0.000405 open("/tmp/bla", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3 <0.000055>
        0.000641 +++ exited with 0 +++
       #
      
      Now we need to either look for at all syscalls that are marked as
      pointers and have some well known names ("filename", "pathname", etc)
      and set the arg formatter to the one used for the "open" syscall in this
      patch.
      
      This implementation works for syscalls with just a string being copied
      from userspace, for matching syscalls with more than one string being
      copied via the same probe/trace point (vfs_getname) we need to extend
      the vfs_getname probe spec to include the pointer too, but there are
      some problems with that in 'perf probe' or the kernel kprobes code, need
      to investigate before considering supporting multiple strings per
      syscall.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-xvuwx6nuj8cf389kf9s2ue2s@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f994592d
    • A
      perf trace: Use a constant for the syscall formatting buffer · e4d44e83
      Arnaldo Carvalho de Melo 提交于
      We were using it as a magic number, 1024, fix that.
      
      Eventually we need to stop doing it per line, and do it per
      arg, traversing the args at output time, to avoid the memmove()
      calls that will be used in the next cset to replace pointers
      present at raw_syscalls:sys_enter time with its contents that
      appear at probe:vfs_getname time, before raw_syscalls:sys_exit
      time.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4sz3wid39egay1pp8qmbur4u@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e4d44e83
    • A
      perf trace: Remember if the vfs_getname tracepoint/kprobe is in place · 08c98776
      Arnaldo Carvalho de Melo 提交于
      So that we can later decide if we will store where to expand the
      pathname once we are handling vfs_getname or if we should instead
      just go on and straight away print the pointer.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ytxk5s5jpc50wahffmlxgxuw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      08c98776
    • A
      perf trace: Do not show syscall tracepoint filter in the --no-syscalls case · 2e5e5f87
      Arnaldo Carvalho de Melo 提交于
      We were accessing trace->syscalls.events members even when that struct
      wasn't initialized, i.e. --no-syscalls was specified on the command
      line, fix it to show that, still in debug mode, when we have an event
      qualifier list, i.e. when we actually are doing subset syscall tracing.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: 19867b61 ("perf trace: Use event filters for the event qualifier list")
      Link: http://lkml.kernel.org/n/tip-7980ym6vujgh3yiai0cqzc88@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2e5e5f87
  4. 04 8月, 2015 25 次提交