1. 23 3月, 2016 1 次提交
  2. 29 2月, 2016 1 次提交
  3. 27 2月, 2016 2 次提交
    • W
      perf trace: Print content of bpf-output event · 1d6c9407
      Wang Nan 提交于
      With this patch the contend of BPF output event is printed by
      'perf trace'. For example:
      
       # ./perf trace -a --ev bpf-output/no-inherit,name=evt/ \
                         --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                         usleep 100000
        ...
          1.787 ( 0.004 ms): usleep/3832 nanosleep(rqtp: 0x7ffc78b18980                                        ) ...
          1.787 (         ): evt:Raise a BPF event!..)
          1.788 (         ): perf_bpf_probe:func_begin:(ffffffff810e97d0))
        ...
        101.866 (87.038 ms): gmain/1654 poll(ufds: 0x7f57a80008c0, nfds: 2, timeout_msecs: 1000               ) ...
        101.866 (         ): evt:Raise a BPF event!..)
        101.867 (         ): perf_bpf_probe:func_end:(ffffffff810e97d0 <- ffffffff81796173))
        101.869 (100.087 ms): usleep/3832  ... [continued]: nanosleep()) = 0
        ...
      
       (There is an extra ')' at the end of several lines. However, it is
        another problem, unrelated to this commit.)
      
      Where test_bpf_trace.c is:
      
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
              unsigned int type;
              unsigned int key_size;
              unsigned int value_size;
              unsigned int max_entries;
        };
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
              (void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
              (void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
              (void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
              (void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
              .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
              .key_size = sizeof(int),
              .value_size = sizeof(u32),
              .max_entries = __NR_CPUS__,
        };
      
        static inline int __attribute__((always_inline))
        func(void *ctx, int type)
        {
      	char output_str[] = "Raise a BPF event!";
      	char err_str[] = "BAD %d\n";
      	int err;
      
              err = perf_event_output(ctx, &channel, get_smp_processor_id(),
      			        &output_str, sizeof(output_str));
      	if (err)
      		trace_printk(err_str, sizeof(err_str), err);
              return 1;
        }
        SEC("func_begin=sys_nanosleep")
        int func_begin(void *ctx) {return func(ctx, 1);}
        SEC("func_end=sys_nanosleep%return")
        int func_end(void *ctx) { return func(ctx, 2);}
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-8-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1d6c9407
    • W
      perf trace: Call bpf__apply_obj_config in 'perf trace' · ba504235
      Wang Nan 提交于
      Without this patch BPF map configuration is not applied.
      
      Command like this:
       # ./perf trace --ev bpf-output/no-inherit,name=evt/ \
                      --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                      usleep 100000
      
      Load BPF files without error, but since map:channel.event=evt is not
      applied, bpf-output event not work.
      
      This patch allows 'perf trace' load and run BPF scripts.
      Signed-off-by: NWang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-7-git-send-email-wangnan0@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ba504235
  4. 18 12月, 2015 1 次提交
  5. 29 10月, 2015 1 次提交
  6. 15 9月, 2015 1 次提交
    • J
      perf evsel: Propagate error info from tp_format · 8dd2a131
      Jiri Olsa 提交于
      Propagate error info from tp_format via ERR_PTR to get it all the way
      down to the parse-event.c tracepoint adding routines. Following
      functions now return pointer with encoded error:
      
        - tp_format
        - trace_event__tp_format
        - perf_evsel__newtp_idx
        - perf_evsel__newtp
      
      This affects several other places in perf, that cannot use pointer check
      anymore, but must utilize the err.h interface, when getting error
      information from above functions list.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Raphael Beamonte <raphael.beamonte@gmail.com>
      Link: http://lkml.kernel.org/r/1441615087-13886-5-git-send-email-jolsa@kernel.org
      [ Add two missing ERR_PTR() and one IS_ERR() ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8dd2a131
  7. 04 9月, 2015 1 次提交
  8. 29 8月, 2015 1 次提交
  9. 15 8月, 2015 1 次提交
  10. 12 8月, 2015 4 次提交
  11. 06 8月, 2015 2 次提交
    • M
      perf trace: Add total time column to summary. · 834fd46d
      Milian Wolff 提交于
      It is cumbersome to manually calculate the total time spent in a given
      syscall by multiplying the average value with the number of calls.
      
      Instead, we now do this directly inside perf trace.
      
      Note that this is also done by 'strace', which even adds a column with
      relative numbers - something we could do in the future.
      
      Example:
      
        perf trace -s find /some/folder > /dev/null
      
         Summary of events:
      
         find (19976), 700123 events, 100.0%, 0.000 msec
      
           syscall            calls    total       min       avg       max      stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
           --------------- -------- --------- --------- --------- ---------     ------
           read                   4     0.006     0.001     0.002     0.003     27.42%
           write               8046     9.617     0.001     0.001     0.035      0.56%
           open               34196    40.384     0.001     0.001     0.071      0.30%
           close              68375    57.104     0.001     0.001     0.076      0.25%
           stat                   4     0.004     0.001     0.001     0.001      3.14%
           fstat              34189    27.518     0.001     0.001     0.060      0.34%
           mmap                  13     0.029     0.001     0.002     0.003     10.74%
           mprotect               6     0.018     0.002     0.003     0.005     17.04%
           munmap                 3     0.014     0.003     0.005     0.006     24.87%
           brk                   87     0.490     0.001     0.006     0.016      6.50%
           ioctl                  3     0.004     0.001     0.001     0.003     36.39%
           access                 1     0.004     0.004     0.004     0.004      0.00%
           uname                  1     0.001     0.001     0.001     0.001      0.00%
           getdents           68393   143.600     0.001     0.002     0.187      0.95%
           fchdir             68371    56.980     0.001     0.001     0.111      0.39%
           arch_prctl             1     0.001     0.001     0.001     0.001      0.00%
           openat             34184    41.737     0.001     0.001     0.102      0.41%
           newfstatat         34184    41.180     0.001     0.001     0.064      0.34%
      Signed-off-by: NMilian Wolff <milian.wolff@kdab.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      LPU-Reference: 1438853069-5902-1-git-send-email-milian.wolff@kdab.com
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      834fd46d
    • M
      perf trace: Write to stderr by default · 007d66a0
      Milian Wolff 提交于
      Without this patch, it is cumbersome to read the trace output but
      ignoring the normal, potentially verbose, output of the debuggee.  One
      common example is doing something like the following:
      
       perf trace -s find /tmp > /dev/null
      
      Without this patch, the trace summary will be lost. Now, it will still
      be printed at the end. This behavior is also applied by strace.
      
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: David Ahern <dsahern@gmail.com>
      Link: http://lkml.kernel.org/n/tip-tqnks6y2cnvm5f9g2dsfr7zl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      007d66a0
  12. 05 8月, 2015 5 次提交
    • A
      perf trace: Use vfs_getname syscall arg beautifier in more syscalls · 34221118
      Arnaldo Carvalho de Melo 提交于
      Those were covered and tested in this cset:
      
       access, chdir, chmod, chown, chroot, creat, getxattr,
       inotify_add_watch, lchown, lgetxattr, listxattr,
       lsetxattr, mkdir, mkdirat, mknod, rmdir, faccessat,
       newfstatat, openat, readlink, readlinkat, removexattr,
       setxattr, statfs, swapon, swapoff, truncate, unlinkat,
       utime, utimes, utimensat.
      
      E.g.:
      
        # trace -e statfs,access,mkdir mkdir /tmp/bla
         0.285 (0.020 ms): mkdir/2799 access(filename: /etc/ld.so.preload, mode: R         ) = -1 ENOENT No such file or directory
         1.070 (0.032 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc930) = 0
         1.087 (0.013 ms): mkdir/2799 statfs(pathname: /sys/fs/selinux, buf: 0x7ffeafbdc820) = 0
         1.189 (0.014 ms): mkdir/2799 access(filename: /etc/selinux/config                 ) = 0
         1.905 (0.610 ms): mkdir/2799 mkdir(pathname: /tmp/bla, mode: 511                  ) = 0
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-wbqtnlktquun3wtpjdz3okul@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      
        and an empty message aborts the commit.
      34221118
    • A
      perf trace: Deref sys_enter pointer args with contents from probe:vfs_getname · f994592d
      Arnaldo Carvalho de Melo 提交于
      To work like strace and dereference syscall pointer args we need to
      insert probes (or tracepoints) right after we copy those bytes from
      userspace.
      
      Since we're formatting the syscall args at raw_syscalls:sys_enter time,
      we need to have a formatter that just stores the position where, later,
      when we get the probe:vfs_getname, we can insert the pointer contents.
      
      Now, if a probe:vfs_getname with this format is in place:
      
       # perf probe -l
        probe:vfs_getname (on getname_flags:72@/home/git/linux/fs/namei.c with pathname)
      
      That was, in this case, put in place with:
      
       # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
       Added new event:
        probe:vfs_getname    (on getname_flags:72 with pathname=filename:string)
      
       You can now use it in all perf tools, such as:
      
      	perf record -e probe:vfs_getname -aR sleep 1
       #
      
      Then 'perf trace' will notice that and do the pointer -> contents
      expansion:
      
       # trace -e open touch /tmp/bla
        0.165 (0.010 ms): touch/17752 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
        0.195 (0.011 ms): touch/17752 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
        0.512 (0.012 ms): touch/17752 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
        0.582 (0.012 ms): touch/17752 open(filename: /tmp/bla, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: 438) = 3
       #
      
      Roughly equivalent to strace's output:
      
       # strace -rT -e open touch /tmp/bla
        0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000039>
        0.000317 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000102>
        0.001461 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.000072>
        0.000405 open("/tmp/bla", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3 <0.000055>
        0.000641 +++ exited with 0 +++
       #
      
      Now we need to either look for at all syscalls that are marked as
      pointers and have some well known names ("filename", "pathname", etc)
      and set the arg formatter to the one used for the "open" syscall in this
      patch.
      
      This implementation works for syscalls with just a string being copied
      from userspace, for matching syscalls with more than one string being
      copied via the same probe/trace point (vfs_getname) we need to extend
      the vfs_getname probe spec to include the pointer too, but there are
      some problems with that in 'perf probe' or the kernel kprobes code, need
      to investigate before considering supporting multiple strings per
      syscall.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-xvuwx6nuj8cf389kf9s2ue2s@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f994592d
    • A
      perf trace: Use a constant for the syscall formatting buffer · e4d44e83
      Arnaldo Carvalho de Melo 提交于
      We were using it as a magic number, 1024, fix that.
      
      Eventually we need to stop doing it per line, and do it per
      arg, traversing the args at output time, to avoid the memmove()
      calls that will be used in the next cset to replace pointers
      present at raw_syscalls:sys_enter time with its contents that
      appear at probe:vfs_getname time, before raw_syscalls:sys_exit
      time.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-4sz3wid39egay1pp8qmbur4u@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e4d44e83
    • A
      perf trace: Remember if the vfs_getname tracepoint/kprobe is in place · 08c98776
      Arnaldo Carvalho de Melo 提交于
      So that we can later decide if we will store where to expand the
      pathname once we are handling vfs_getname or if we should instead
      just go on and straight away print the pointer.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ytxk5s5jpc50wahffmlxgxuw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      08c98776
    • A
      perf trace: Do not show syscall tracepoint filter in the --no-syscalls case · 2e5e5f87
      Arnaldo Carvalho de Melo 提交于
      We were accessing trace->syscalls.events members even when that struct
      wasn't initialized, i.e. --no-syscalls was specified on the command
      line, fix it to show that, still in debug mode, when we have an event
      qualifier list, i.e. when we actually are doing subset syscall tracing.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: 19867b61 ("perf trace: Use event filters for the event qualifier list")
      Link: http://lkml.kernel.org/n/tip-7980ym6vujgh3yiai0cqzc88@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2e5e5f87
  13. 29 7月, 2015 1 次提交
    • A
      perf python: Remove dependency on 'machine' methods · 959c2199
      Arnaldo Carvalho de Melo 提交于
      The python binding still doesn't provide symbol resolving facilities,
      but the recent addition of the trace_event__register_resolver() function
      made it add as a dependency the machine__resolve_kernel_addr() method,
      that in turn drags all the symbol resolving code.
      
      The problem:
      
        [root@zoo ~]# perf test -v python
        17: Try 'import perf' in python, checking link problems      :
        --- start ---
        test child forked, pid 6853
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.so: undefined symbol: machine__resolve_kernel_addr
        test child finished with -1
        ---- end ----
        Try 'import perf' in python, checking link problems: FAILED!
        [root@zoo ~]#
      
      Fix it by requiring this function to receive the resolver as a
      parameter, just like pevent_register_function_resolver(), i.e. do
      not explicitely refer to an object file not included in
      tools/perf/util/python-ext-sources.
      
        [root@zoo ~]# perf test python
        17: Try 'import perf' in python, checking link problems      : Ok
        [root@zoo ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Fixes: c3168b0d ("perf symbols: Provide libtraceevent callback to resolve kernel symbols")
      Link: http://lkml.kernel.org/n/tip-vxlhh95v2em9zdbgj3jm7xi5@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      959c2199
  14. 24 7月, 2015 1 次提交
  15. 21 7月, 2015 1 次提交
    • A
      perf trace: Support 'strace' syscall event groups · 005438a8
      Arnaldo Carvalho de Melo 提交于
      I.e.:
      
        $ cat ~/share/perf-core/strace/groups/file
        access
        chmod
        creat
        execve
        faccessat
        getcwd
        lstat
        mkdir
        open
        openat
        quotactl
        readlink
        rename
        rmdir
        stat
        statfs
        symlink
        unlink
        $
      
      Then, on a quiet desktop, try running this and then moving your mouse to
      see the deluge of mouse related activity:
      
        # perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
        Added new event:
          probe:vfs_getname    (on getname_flags:72 with pathname=filename:string)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe:vfs_getname -aR sleep 1
        #
        # trace --ev probe:vfs_getname --filter-pids 2232 -e file
         0.042 (0.042 ms): mousetweaks/2235 open(filename: 0x14e3910, mode: 438                                   ) ...
         0.042 (        ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/cursors/xterm")
         0.100 (0.100 ms): mousetweaks/2235  ... [continued]: open()) = -1 ENOENT No such file or directory
         0.142 (0.018 ms): mousetweaks/2235 open(filename: 0x14c3c10, mode: 438                                   ) ...
         0.142 (        ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/index.theme")
         0.192 (0.069 ms): mousetweaks/2235  ... [continued]: open()) = -1 ENOENT No such file or directory
         0.230 (0.017 ms): mousetweaks/2235 open(filename: 0x14c3c10, mode: 438                                   ) ...
         0.230 (        ): probe:vfs_getname:(ffffffff812230bc) pathname="/usr/share/icons/Adwaita/cursors/xterm")
         0.253 (0.041 ms): mousetweaks/2235  ... [continued]: open()) = 14
         0.459 (0.008 ms): mousetweaks/2235 open(filename: 0x14e3910, mode: 438                                   ) ...
         0.459 (        ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/cursors/left_side")
         0.468 (0.017 ms): mousetweaks/2235  ... [continued]: open()) = -1 ENOENT No such file or directory
      
      Need to combine that raw_syscalls:sys_enter(open) + probe:vfs_getname +
      raw_syscalls:sys_exit(open) sequence...
      
      Now, if you're bored, please write some more syscall groups, like the ones
      in 'strace' and send it our way :-)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Milian Wolff <mail@milianw.de>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-a42xklu59lcbxp7bbnic74a8@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      005438a8
  16. 20 7月, 2015 1 次提交
  17. 06 7月, 2015 4 次提交
    • A
      perf trace: Use event filters for the event qualifier list · 19867b61
      Arnaldo Carvalho de Melo 提交于
      We use raw_syscalls:sys_{enter,exit} events to show the syscalls, but were
      using a rather lazy/inneficient way to implement our 'strace -e' equivalent:
      filter out after reading the events in the ring buffer.
      
      Deflect more work to the kernel by appending a filter expression for that,
      that, together with the pid list, that is always present, if only to filter the
      tracer itself, reduces pressure on the ring buffer and otherwise use
      infrastructure already in place in the kernel to do early filtering.
      
      If we use it with -v we can see the filter passed to the kernel,
      for instance, for this contrieved case:
      
        # trace -v -e \!open,close,write,poll,recvfrom,select,recvmsg,writev,sendmsg,read,futex,epoll_wait,ioctl,eventfd --filter-pids 2189,2566,1398,2692,4475,4532
      <SNIP>
        (common_pid != 2514 && common_pid != 1398 && common_pid != 2189 && common_pid != 2566 && common_pid != 2692 && common_pid != 4475 && common_pid != 4532) && (id != 3 && id != 232 && id != 284 && id != 202 && id != 16 && id != 2 && id != 7 && id != 0 && id != 45 && id != 47 && id != 23 && id != 46 && id != 1 && id != 20)
           0.011 (0.011 ms): caribou/2295 eventfd2(flags: CLOEXEC|NONBLOCK) = 18
          16.946 (0.019 ms): caribou/2295 eventfd2(flags: CLOEXEC|NONBLOCK) = 18
          38.598 (0.167 ms): chronyd/794 socket(family: INET, type: DGRAM ) = 4
          38.603 (0.002 ms): chronyd/794 fcntl(fd: 4<socket:[239307]>, cmd: GETFD) = 0
          38.605 (0.001 ms): chronyd/794 fcntl(fd: 4<socket:[239307]>, cmd: SETFD, arg: 1) = 0
      ^C
       #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ti2tg18atproqpguc2moinp6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      19867b61
    • A
      perf evlist: Make perf_evlist__set_filter use perf_evsel__set_filter · 94ad89bc
      Arnaldo Carvalho de Melo 提交于
      Instead of calling perf_evsel__apply_filter straight away, so that
      we can, in the next patches, expand the filter with more conditions
      before actually calling the ioctl to pass the end result filter to
      the kernel.
      
      Now we need to call perf_evlist__apply_filters() after the filter
      is completely setup, i.e. do the ioctl calls.
      
      The perf_evlist__apply_filters() method was already in place, because
      that is the model for the other tools that receives filters in the
      command line: go on setting then in the evsel->filter and only at
      the end, after parsing the whole command line, apply them.
      
      We get, as a bonus, a more expressive message that states which
      event, if any, failed to have the filter applied to, with an
      error message stating what happened.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-f429pgz75ryz7tpe6v74etre@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      94ad89bc
    • A
      perf trace: Store the syscall ids for the event qualifiers in a table · 8b3ce757
      Arnaldo Carvalho de Melo 提交于
      That we will use to set a filter on raw_syscalls:sys_{enter,exit}
      events.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2acxrcxyu7tlolrfilpty38y@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8b3ce757
    • A
      perf trace: Remember what are the syscalls tracepoint evsels · c27366f0
      Arnaldo Carvalho de Melo 提交于
      We will need to set filters on then.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-u8hpgjpf3w8o1prnnjnwegwf@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c27366f0
  18. 26 6月, 2015 1 次提交
    • A
      perf trace: Validate syscall list passed via -e argument · d0cc439b
      Arnaldo Carvalho de Melo 提交于
      The 'trace' tool was accepting any names passed and just looking if
      syscalls returned via the raw_syscalls:* tracepoints were in that list,
      leading to it accepting perf events and then never finding any, as those
      are not valid syscall names, confusing users.
      
      Fix it by checking each entry in the list using audit_name_to_syscall,
      telling the user which entries are invalid and suggesting where to look
      for valid syscall names.
      
      E.g:
      
        [root@zoo ~]# trace -e open,foo,bar,close,baz
        Error: Invalid syscall bar, baz, foo
        Hint:	 try 'perf list syscalls:sys_enter_*'
        Hint:	 and: 'man syscalls'
        [root@zoo ~]#
      Reported-by: NFlavio Leitner <fbl@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-4g1i3m1z6fzsrznn2umi02wa@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d0cc439b
  19. 24 6月, 2015 1 次提交
  20. 20 6月, 2015 1 次提交
  21. 12 6月, 2015 1 次提交
    • A
      trace: Beautify perf_event_open syscall · a1c2552d
      Arnaldo Carvalho de Melo 提交于
      Syswide tracing and then running 'stat' and 'trace':
      
       $ perf trace -e perf_event_open
       1034.649 (0.019 ms): perf/6133 perf_event_open(attr_uptr: 0x36f0360, pid: 16134, cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = -1 EINVAL Invalid argument
       1034.670 (0.008 ms): perf/6133 perf_event_open(attr_uptr: 0x36f0360, pid: 16134, cpu: -1, group_fd: -1) = -1 EINVAL Invalid argument
       1034.681 (0.007 ms): perf/6133 perf_event_open(attr_uptr: 0x36f0360, pid: 16134, cpu: -1, group_fd: -1) = -1 EINVAL Invalid argument
       1034.692 (0.007 ms): perf/6133 perf_event_open(attr_uptr: 0x36f0360, pid: 16134, cpu: -1, group_fd: -1) = -1 EINVAL Invalid argument
       9986.983 (0.014 ms): trace/6139 perf_event_open(attr_uptr: 0x7ffd9c629320, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
       9987.026 (0.016 ms): trace/6139 perf_event_open(attr_uptr: 0x37c7e70, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
       9987.041 (0.008 ms): trace/6139 perf_event_open(attr_uptr: 0x37c7e70, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
       9987.489 (0.092 ms): trace/6139 perf_event_open(attr_uptr: 0x3795ee0, pid: 16140, group_fd: -1, flags: FD_CLOEXEC) = 3
       9987.536 (0.044 ms): trace/6139 perf_event_open(attr_uptr: 0x3795ee0, pid: 16140, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 4
       9987.580 (0.041 ms): trace/6139 perf_event_open(attr_uptr: 0x3795ee0, pid: 16140, cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 5
       9987.620 (0.037 ms): trace/6139 perf_event_open(attr_uptr: 0x3795ee0, pid: 16140, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 7
       9987.659 (0.035 ms): trace/6139 perf_event_open(attr_uptr: 0x37975d0, pid: 16140, group_fd: -1, flags: FD_CLOEXEC) = 8
       9987.692 (0.031 ms): trace/6139 perf_event_open(attr_uptr: 0x37975d0, pid: 16140, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 9
       9987.727 (0.032 ms): trace/6139 perf_event_open(attr_uptr: 0x37975d0, pid: 16140, cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 10
       9987.761 (0.031 ms): trace/6139 perf_event_open(attr_uptr: 0x37975d0, pid: 16140, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 11
      
      Need to intercept perf_copy_attr() with a kprobe or with eBPF...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-njb105hab2i3t5dexym9lskl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a1c2552d
  22. 15 5月, 2015 1 次提交
  23. 12 5月, 2015 1 次提交
  24. 09 5月, 2015 1 次提交
    • A
      perf machine: Protect the machine->threads with a rwlock · b91fc39f
      Arnaldo Carvalho de Melo 提交于
      In addition to using refcounts for the struct thread lifetime
      management, we need to protect access to machine->threads from
      concurrent access.
      
      That happens in 'perf top', where a thread processes events, inserting
      and deleting entries from that rb_tree while another thread decays
      hist_entries, that end up dropping references and ultimately deleting
      threads from the rb_tree and releasing its resources when no further
      hist_entry (or other data structures, like in 'perf sched') references
      it.
      
      So the rule is the same for refcounts + protected trees in the kernel,
      get the tree lock, find object, bump the refcount, drop the tree lock,
      return, use object, drop the refcount if no more use of it is needed,
      keep it if storing it in some other data structure, drop when releasing
      that data structure.
      
      I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
      "perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
      
      The addr_location__put() one is because as we return references to
      several data structures, we may end up adding more reference counting
      for the other data structures and then we'll drop it at
      addr_location__put() time.
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b91fc39f
  25. 29 4月, 2015 2 次提交
  26. 24 4月, 2015 2 次提交
    • A
      perf trace: Disable events and drain events when forked workload ends · 02ac5421
      Arnaldo Carvalho de Melo 提交于
      We were not checking in the inner event processing loop if the forked workload
      had finished, which, on a busy system, may make it take a long time trying to
      drain events, entering a seemingly neverending loop, waiting for the system to
      get idle enough to make it drain the buffers.
      
      Fix it by disabling the events when 'done' is true, in the inner loop, to start
      draining what is in the buffers.
      
      Now:
      
      [root@ssdandy ~]# time trace --filter-pids 14003 -a sleep 1 | tail
        996.748 ( 0.002 ms): sh/30296 rt_sigprocmask(how: SETMASK, nset: 0x7ffc83418160, sigsetsize: 8) = 0
        996.751 ( 0.002 ms): sh/30296 rt_sigprocmask(how: BLOCK, nset: 0x7ffc834181f0, oset: 0x7ffc83418270, sigsetsize: 8) = 0
        996.755 ( 0.002 ms): sh/30296 rt_sigaction(sig: INT, act: 0x7ffc83417f50, oact: 0x7ffc83417ff0, sigsetsize: 8) = 0
       1004.543 ( 0.362 ms): tail/30198  ... [continued]: read()) = 4096
       1004.548 ( 7.791 ms): sh/30296 wait4(upid: -1, stat_addr: 0x7ffc834181a0) ...
       1004.975 ( 0.427 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
       1005.390 ( 0.410 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096
       1005.743 ( 0.348 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
       1006.197 ( 0.449 ms): tail/30198 read(buf: 0x765410, count: 8192) = 4096
       1006.492 ( 0.290 ms): tail/30198 read(buf: 0x7633f0, count: 8192) = 4096
      
      real	0m1.219s
      user	0m0.704s
      sys	0m0.331s
      [root@ssdandy ~]#
      Reported-by: NMichael Petlan <mpetlan@redhat.com>
      Suggested-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-p6kpn1b26qcbe47pufpw0tex@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      02ac5421
    • A
      perf trace: Enable events when doing system wide tracing and starting a workload · cb24d01d
      Arnaldo Carvalho de Melo 提交于
       commit f7aa222f
       Author: Arnaldo Carvalho de Melo <acme@redhat.com>
       Date:   Tue Feb 3 13:25:39 2015 -0300
      
          perf trace: No need to enable evsels for workload started from perf
      
      The assumption was that whenever a workload is specified, the
      attr.enable_on_exec evsel flag would be set, but that is not happening
      when perf_record_opts.system_wide is set, for instance
      
      That resulted in both perf_evlist__enable() and attr.enable_on_exec
      being not called/set, which made the events to remain disabled while the
      workload runs, producing no output.
      
      Fix it,  by calling perf_evlist__enable() in the 'trace' tool
      when forking and not targetting a workload started from trace
      
      v2: Test against !target__none(), as suggested by Namhyung Kim, that is
      what is used in perf_evsel__config() when deciding if the
      attr.enable_on_exec flag to be set. More work is needed to cover other
      cases such as opts->initial_delay.
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-27z7169pvfxgj8upic636syv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cb24d01d