1. 05 9月, 2018 1 次提交
  2. 04 9月, 2018 1 次提交
  3. 31 8月, 2018 10 次提交
    • B
      perf trace beauty: Alias 'umount' to 'umount2' · 3de3e8bb
      Benjamin Peterson 提交于
      Before:
      
        # perf trace -e *mount* umount /dev/mapper/fedora-home /s
          11.576 ( 0.004 ms) umount/3138 umount2(arg0: 94501956754656, arg1: 0, arg2: 1, arg3: 140051050083104, arg4: 4, arg5: 94501956755136) = -1 EINVAL Invalid argument
        #
      
      After:
      
        # perf trace -e *mount* umount /s
           0.000 ( 9.241 ms): umount/5251 umount2(name: 0x55f74a986480) = 0
      Signed-off-by: NBenjamin Peterson <benjamin@python.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180828035344.31500-1-benjamin@python.org
      [ split from a larger patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3de3e8bb
    • A
      perf trace: Show comm and tid for tracepoint events · c4191e55
      Arnaldo Carvalho de Melo 提交于
      So that all events have that info, improving reading by having
      information better aligned, etc.
      
      Before:
      
        # echo 1 > /proc/sys/vm/drop_caches
        # perf trace -e block:*,ext4:*,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c
             0.000 (         ): #include <stdio.h>
      
        int syscall_enter(openat)(void *args)
        {
        	puts("Hello, world\n");
        	return 0;
        }
      
        license(GPL);
        cat/2731 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
             0.025 (         ): syscalls:sys_exit_openat:0x3
             0.063 ( 0.022 ms): cat/2731 close(fd: 3) = 0
             0.110 (         ): cat/2731 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
             0.123 (         ): syscalls:sys_exit_openat:0x3
             0.243 ( 0.008 ms): cat/2731 close(fd: 3) = 0
             0.485 (         ): cat/2731 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
             0.500 (         ): syscalls:sys_exit_open:0x3
             0.531 ( 0.017 ms): cat/2731 close(fd: 3) = 0
             0.587 (         ): cat/2731 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c)
             0.601 (         ): syscalls:sys_exit_openat:0x3
             0.631 (         ): ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0
             0.639 (         ): ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10
             0.654 (         ): block:block_bio_queue:253,2 R 42213208 + 8 [cat]
             0.663 (         ): block:block_bio_remap:8,0 R 58206040 + 8 <- (253,2) 42213208
             0.671 (         ): block:block_bio_remap:8,0 R 175570776 + 8 <- (8,6) 58206040
             0.678 (         ): block:block_bio_queue:8,0 R 175570776 + 8 [cat]
             0.692 (         ): block:block_getrq:8,0 R 175570776 + 8 [cat]
             0.700 (         ): block:block_plug:[cat]
             0.708 (         ): block:block_rq_insert:8,0 R 4096 () 175570776 + 8 [cat]
             0.713 (         ): block:block_unplug:[cat] 1
             0.716 (         ): block:block_rq_issue:8,0 R 4096 () 175570776 + 8 [cat]
             0.949 ( 0.007 ms): cat/2731 close(fd: 3) = 0
             0.969 ( 0.006 ms): cat/2731 close(fd: 1) = 0
             0.982 ( 0.006 ms): cat/2731 close(fd: 2) = 0
        #
      
      After:
      
        # echo 1 > /proc/sys/vm/drop_caches
        # perf trace -e block:*,ext4:*,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c
             0.000 (         ): cat/1380 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)#include <stdio.h>
      
        int syscall_enter(openat)(void *args)
        {
        	puts("Hello, world\n");
        	return 0;
        }
      
        license(GPL);
      
             0.024 (         ): cat/1380 syscalls:sys_exit_openat:0x3
             0.063 ( 0.024 ms): cat/1380 close(fd: 3) = 0
             0.114 (         ): cat/1380 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
             0.127 (         ): cat/1380 syscalls:sys_exit_openat:0x3
             0.247 ( 0.009 ms): cat/1380 close(fd: 3) = 0
             0.484 (         ): cat/1380 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
             0.499 (         ): cat/1380 syscalls:sys_exit_open:0x3
             0.613 ( 0.010 ms): cat/1380 close(fd: 3) = 0
             0.662 (         ): cat/1380 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c)
             0.678 (         ): cat/1380 syscalls:sys_exit_openat:0x3
             0.712 (         ): cat/1380 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0
             0.721 (         ): cat/1380 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10
             0.734 (         ): cat/1380 block:block_bio_queue:253,2 R 42213208 + 8 [cat]
             0.745 (         ): cat/1380 block:block_bio_remap:8,0 R 58206040 + 8 <- (253,2) 42213208
             0.754 (         ): cat/1380 block:block_bio_remap:8,0 R 175570776 + 8 <- (8,6) 58206040
             0.761 (         ): cat/1380 block:block_bio_queue:8,0 R 175570776 + 8 [cat]
             0.780 (         ): cat/1380 block:block_getrq:8,0 R 175570776 + 8 [cat]
             0.791 (         ): cat/1380 block:block_plug:[cat]
             0.802 (         ): cat/1380 block:block_rq_insert:8,0 R 4096 () 175570776 + 8 [cat]
             0.806 (         ): cat/1380 block:block_unplug:[cat] 1
             0.810 (         ): cat/1380 block:block_rq_issue:8,0 R 4096 () 175570776 + 8 [cat]
             1.005 ( 0.011 ms): cat/1380 close(fd: 3) = 0
             1.031 ( 0.008 ms): cat/1380 close(fd: 1) = 0
             1.048 ( 0.008 ms): cat/1380 close(fd: 2) = 0
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-us1mwsupxffs4jlm3uqm5dvj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c4191e55
    • A
      perf trace augmented_syscalls: Augment sendto's 'addr' arg · 6ebb6862
      Arnaldo Carvalho de Melo 提交于
      Its a 'struct sockaddr' pointer, augment it with the same beautifier as
      for 'connect' and 'bind', that all receive from userspace that pointer.
      
      Doing it in the other direction remains to be done, hooking at the
      syscalls:sys_exit_{accept4?,recvmsg} tracepoints somehow.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-k2eu68lsphnm2fthc32gq76c@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6ebb6862
    • A
      perf trace augmented_syscalls: Augment bind's 'myaddr' sockaddr arg · 02ef2884
      Arnaldo Carvalho de Melo 提交于
      One more, to reuse the augmented_sockaddr_syscall_enter() macro
      introduced from the augmentation of connect's sockaddr arg, also to get
      a subset of the struct arg augmentations done using the manual method,
      before switching to something automatic, using tracefs's format file or,
      even better, BTF containing the syscall args structs.
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c
           0.000 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: NETLINK }, addrlen: 12)
           1.752 sshd/11479 bind(fd: 3<socket:[170336]>, umyaddr: { .family: INET, port: 22, addr: 0.0.0.0 }, addrlen: 16)
           1.924 sshd/11479 bind(fd: 4<socket:[170338]>, umyaddr: { .family: INET6, port: 22, addr: :: }, addrlen: 28)
        ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-a2drqpahpmc7uwb3n3gj2plu@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      02ef2884
    • A
      perf trace augmented_syscalls: Augment connect's 'sockaddr' arg · d5a7e661
      Arnaldo Carvalho de Melo 提交于
      As the first example of augmenting something other than a 'filename',
      augment the 'struct sockaddr' argument for the 'connect' syscall:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c ssh -6 fedorapeople.org
           0.000 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           0.042 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           1.329 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           1.362 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           1.458 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           1.478 ssh/29669 connect(fd: 3, uservaddr: { .family: LOCAL, path: /var/run/nscd/socket }, addrlen: 110)
           1.683 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET, port: 53, addr: 192.168.43.1 }, addrlen: 16)
           4.710 ssh/29669 connect(fd: 3<socket:[125942]>, uservaddr: { .family: INET6, port: 22, addr: 2610:28:3090:3001:5054:ff:fea7:9474 }, addrlen: 28)
        root@fedorapeople.org: Permission denied (publickey).
        #
      
      This is still just augmenting the syscalls:sys_enter_connect part, later
      we'll wire this up to augment the enter+exit combo, like in the
      tradicional 'perf trace' and 'strace' outputs.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-s7l541cbiqb22ifio6z7dpf6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d5a7e661
    • A
      perf trace: Use the augmented filename, expanding syscall enter pointers · 75d1e306
      Arnaldo Carvalho de Melo 提交于
      This is the final touch in showing how a syscall argument beautifier can
      access the augmented args put in place by the
      tools/perf/examples/bpf/augmented_syscalls.c eBPF script, right after
      the regular raw syscall args, i.e. the up to 6 long integer values in
      the syscall interface.
      
      With this we are able to show the 'openat' syscall arg, now with up to
      64 bytes, but in time this will be configurable, just like with the
      'strace -s strsize' argument, from 'strace''s man page:
      
        -s strsize  Specify the maximum string size to print (the default is 32).
      
      This actually is the maximum string to _collect_ and store in the ring
      buffer, not just print.
      
      Before:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC)
           0.017 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x6626eda8, flags: CLOEXEC) = 3
           0.049 (         ): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC)
           0.051 ( 0.007 ms): cat/9658 openat(dfd: CWD, filename: 0x66476ce0, flags: CLOEXEC) = 3
           0.377 (         ): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b)
           0.379 ( 0.005 ms): cat/9658 openat(dfd: CWD, filename: 0x1e8f806b) = 3
        #
      
      After:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): cat/11966 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
           0.006 ( 0.006 ms): cat/11966 openat(dfd: CWD, filename: 0x4bfdcda8, flags: CLOEXEC) = 3
           0.034 (         ): cat/11966 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
           0.036 ( 0.008 ms): cat/11966 openat(dfd: CWD, filename: 0x4c1e4ce0, flags: CLOEXEC) = 3
           0.375 (         ): cat/11966 openat(dfd: CWD, filename: /etc/passwd)
           0.377 ( 0.005 ms): cat/11966 openat(dfd: CWD, filename: 0xe87906b) = 3
        #
      
      This cset should show all the aspects of establishing a protocol between
      an eBPF syscall arg augmenter program, tools/perf/examples/bpf/augmented_syscalls.c and
      a 'perf trace' beautifier, the one associated with all 'char *' point
      syscall args with names that can heuristically be associated with
      filenames.
      
      Now to wire up 'open' to show a second syscall using this scheme, all we
      have to do now is to change tools/perf/examples/bpf/augmented_syscalls.c,
      as 'perf trace' will notice that the perf_sample.raw_size is more than
      what is expected for a particular syscall payload as defined by its
      tracefs format file and will then use the augmented payload in the
      'filename' syscall arg beautifier.
      
      The same protocol will be used for structs such as 'struct sockaddr *',
      'struct pollfd', etc, with additions for handling arrays.
      
      This will all be done under the hood when 'perf trace' realizes the
      system has the necessary components, and also can be done by providing
      a precompiled augmented_syscalls.c eBPF ELF object.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-gj9kqb61wo7m3shtpzercbcr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      75d1e306
    • A
      perf trace: Show comm/tid for augmented_syscalls · c96f4edc
      Arnaldo Carvalho de Melo 提交于
      To get us a bit more like the sys_enter + sys_exit combo:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC)
           0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3
           0.051 (         ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC)
           0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3
           0.539 (         ): openat(dfd: CWD, filename: 0xca71506b)
           0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3
        #
      
      After:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC)
           0.007 ( 0.005 ms): cat/4919 openat(dfd: CWD, filename: 0xc8358da8, flags: CLOEXEC) = 3
           0.032 (         ): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC)
           0.033 ( 0.006 ms): cat/4919 openat(dfd: CWD, filename: 0xc8560ce0, flags: CLOEXEC) = 3
           0.301 (         ): cat/4919 openat(dfd: CWD, filename: 0x91fa306b)
           0.304 ( 0.004 ms): cat/4919 openat(dfd: CWD, filename: 0x91fa306b) = 3
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-6w8ytyo5y655a1hsyfpfily6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c96f4edc
    • A
      perf trace: Extract the comm/tid printing for syscall enter · 6dcbd212
      Arnaldo Carvalho de Melo 提交于
      Will be used with augmented syscalls, where we haven't transitioned
      completely to combining sys_enter_FOO with sys_exit_FOO, so we'll go
      as far as having it similar to the end result, strace like, as possible.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-canomaoiybkswwnhj69u9ae4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6dcbd212
    • A
      perf trace: Print the syscall name for augmented_syscalls · 1cdf618f
      Arnaldo Carvalho de Melo 提交于
      Since we copy all the payload for raw_syscalls:sys_enter plus add
      expanded pointers, we can use the syscall id to get its name, etc:
      
        # grep 'field:.* id' /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
      	field:long id;	offset:8;	size:8;	signed:1;
        #
      
      Before:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): __augmented_syscalls__:dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC
           0.006 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xec9f9da8, flags: CLOEXEC) = 3
           0.041 (         ): __augmented_syscalls__:dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC
           0.042 ( 0.007 ms): cat/2395 openat(dfd: CWD, filename: 0xecc01ce0, flags: CLOEXEC) = 3
           0.376 (         ): __augmented_syscalls__:dfd: CWD, filename: 0xac0a806b
           0.379 ( 0.006 ms): cat/2395 openat(dfd: CWD, filename: 0xac0a806b) = 3
        #
      
      After:
      
        # perf trace -e tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC)
           0.009 ( 0.009 ms): cat/3619 openat(dfd: CWD, filename: 0x31b6dda8, flags: CLOEXEC) = 3
           0.051 (         ): openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC)
           0.054 ( 0.010 ms): cat/3619 openat(dfd: CWD, filename: 0x31d75ce0, flags: CLOEXEC) = 3
           0.539 (         ): openat(dfd: CWD, filename: 0xca71506b)
           0.543 ( 0.115 ms): cat/3619 openat(dfd: CWD, filename: 0xca71506b) = 3
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-epz6y9i0eavmerc5ha98t7gn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1cdf618f
    • A
      perf trace: Pass augmented args to the arg formatters when available · 7a983a0f
      Arnaldo Carvalho de Melo 提交于
      If the tracepoint payload is bigger than what a syscall expected from
      what is in its format file in tracefs, then that will be used as
      augmented args, i.e. the expansion of syscall arg pointers, with things
      like a filename, structs, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-bsbqx7xi2ot4q9bf570f7tqs@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7a983a0f
  4. 09 8月, 2018 4 次提交
    • A
      perf trace: Wire up the augmented syscalls with the syscalls:sys_enter_FOO beautifier · 88cf7084
      Arnaldo Carvalho de Melo 提交于
      We just check that the evsel is the one we associated with the
      bpf-output event associated with the "__augmented_syscalls__" eBPF map,
      to show that the formatting is done properly:
      
        # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): __augmented_syscalls__:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC
           0.006 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC
           0.007 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC                 ) = 3
           0.029 (         ): __augmented_syscalls__:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC
           0.030 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC
           0.031 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC                 ) = 3
           0.249 (         ): __augmented_syscalls__:dfd: CWD, filename: 0xc3700d6
           0.250 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xc3700d6
           0.252 ( 0.003 ms): cat/11486 openat(dfd: CWD, filename: 0xc3700d6                                  ) = 3
        #
      
      Now we just need to get the full blown enter/exit handlers to check if the
      evsel being processed is the augmented_syscalls one to go pick the pointer
      payloads from the end of the payload.
      
      We also need to state somehow what is the layout for multi pointer arg syscalls.
      
      Also handy would be to have a BTF file with the struct definitions used in
      syscalls, compact, generated at kernel built time and available for use in eBPF
      programs.
      
      Till we get there we can go on doing some manual coupling of the most relevant
      syscalls with some hand built beautifiers.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-r6ba5izrml82nwfmwcp7jpkm@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      88cf7084
    • A
      perf trace: Setup the augmented syscalls bpf-output event fields · d3d1c4bd
      Arnaldo Carvalho de Melo 提交于
      The payload that is put in place by the eBPF script attached to
      syscalls:sys_enter_openat (and other syscalls with pointers, in the
      future) can be consumed by the existing sys_enter beautifiers if
      evsel->priv is setup with a struct syscall_tp with struct tp_fields for
      the 'syscall_id' and 'args' fields expected by the beautifiers, this
      patch does just that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-xfjyog8oveg2fjys9r1yy1es@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d3d1c4bd
    • A
      perf bpf: Make bpf__setup_output_event() return the bpf-output event · 78e890ea
      Arnaldo Carvalho de Melo 提交于
      We're calling it to setup that event, and we'll need it later to decide
      if the bpf-output event we're handling is the one setup for a specific
      purpose, return it using ERR_PTR, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zhachv7il2n1lopt9aonwhu7@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      78e890ea
    • A
      perf trace: Handle "bpf-output" events associated with "__augmented_syscalls__" BPF map · e0b6d2ef
      Arnaldo Carvalho de Melo 提交于
      Add an example BPF script that writes syscalls:sys_enter_openat raw
      tracepoint payloads augmented with the first 64 bytes of the "filename"
      syscall pointer arg.
      
      Then catch it and print it just like with things written to the
      "__bpf_stdout__" map associated with a PERF_COUNT_SW_BPF_OUTPUT software
      event, by just letting the default tracepoint handler in 'perf trace',
      trace__event_handler(), to use bpf_output__fprintf(trace, sample), just
      like it does with all other PERF_COUNT_SW_BPF_OUTPUT events, i.e. just
      do a dump on the payload, so that we can check if what is being printed
      has at least the first 64 bytes of the "filename" arg:
      
      The augmented_syscalls.c eBPF script:
      
        # cat tools/perf/examples/bpf/augmented_syscalls.c
        // SPDX-License-Identifier: GPL-2.0
      
        #include <stdio.h>
      
        struct bpf_map SEC("maps") __augmented_syscalls__ = {
             .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
             .key_size = sizeof(int),
             .value_size = sizeof(u32),
             .max_entries = __NR_CPUS__,
        };
      
        struct syscall_enter_openat_args {
      	unsigned long long common_tp_fields;
      	long		   syscall_nr;
      	long		   dfd;
      	char		   *filename_ptr;
      	long		   flags;
      	long		   mode;
        };
      
        struct augmented_enter_openat_args {
      	struct syscall_enter_openat_args args;
      	char				 filename[64];
        };
      
        int syscall_enter(openat)(struct syscall_enter_openat_args *args)
        {
      	struct augmented_enter_openat_args augmented_args;
      
      	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
      	probe_read_str(&augmented_args.filename, sizeof(augmented_args.filename), args->filename_ptr);
      	perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU,
      			  &augmented_args, sizeof(augmented_args));
      	return 1;
        }
      
        license(GPL);
        #
      
      So it will just prepare a raw_syscalls:sys_enter payload for the
      "openat" syscall.
      
      This will eventually be done for all syscalls with pointer args,
      globally or just when the user asks, using some spec, which args of
      which syscalls it wants "expanded" this way, we'll probably start with
      just all the syscalls that have char * pointers with familiar names, the
      ones we already handle with the probe:vfs_getname kprobe if it is in
      place hooking the kernel getname_flags() function used to copy from user
      the paths.
      
      Running it we get:
      
        # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): __augmented_syscalls__:X?.C......................`\..................../etc/ld.so.cache..#......,....ao.k...............k......1.".........
           0.006 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC
           0.008 ( 0.005 ms): cat/31292 openat(dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC                 ) = 3
           0.036 (         ): __augmented_syscalls__:X?.C.......................\..................../lib64/libc.so.6......... .\....#........?.......=.C..../.".........
           0.037 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC
           0.039 ( 0.007 ms): cat/31292 openat(dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC                 ) = 3
           0.323 (         ): __augmented_syscalls__:X?.C.....................P....................../etc/passwd......>.C....@................>.C.....,....ao.>.C........
           0.325 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xe8be50d6
           0.327 ( 0.004 ms): cat/31292 openat(dfd: CWD, filename: 0xe8be50d6                                 ) = 3
        #
      
      We need to go on optimizing this to avoid seding trash or zeroes in the
      pointer content payload, using the return from bpf_probe_read_str(), but
      to keep things simple at this stage and make incremental progress, lets
      leave it at that for now.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-g360n1zbj6bkbk6q0qo11c28@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e0b6d2ef
  5. 03 8月, 2018 5 次提交
  6. 02 8月, 2018 2 次提交
    • A
      perf trace: Associate vfs_getname()'ed pathname with fd returned from 'openat' · 6a648b53
      Arnaldo Carvalho de Melo 提交于
      When the vfs_getname() wannabe tracepoint is in place:
      
        # perf probe -l
          probe:vfs_getname    (on getname_flags:73@acme/git/linux/fs/namei.c with pathname)
        #
      
      'perf trace' will use it to get the pathname when it is copied from
      userspace to the kernel, right after syscalls:sys_enter_open, copied
      in the 'probe:vfs_getname', stash it somewhere and then, at
      syscalls:sys_exit_open time, if the 'open' return is not -1, i.e. a
      successfull open syscall, associate that pathname to this return, i.e.
      the fd.
      
      We were not doing this for the 'openat' syscall, which would cause 'perf
      trace' to fallback to using /proc to get the fd, change it so that we
      use what we got from probe:vfs_getname, reducing the 'openat'
      beautification process cost, ditching the syscalls performed to read
      procfs state and avoiding some possible races in the process.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-xnp44ao3bkb6ejeczxfnjwsh@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6a648b53
    • A
      perf trace: Do not require --no-syscalls to suppress strace like output · b912885a
      Arnaldo Carvalho de Melo 提交于
      So far the --syscalls option was the default, requiring explicit
      --no-syscalls when wanting to process just some other event, invert that
      and assume it only when no other event was specified, allowing its
      explicit enablement when wanting to see all syscalls together with some
      other event:
      
      E.g:
      
      The existing default is maintained for a single workload:
      
        # perf trace sleep 1
      <SNIP>
           0.264 ( 0.003 ms): sleep/12762 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7f62cbf04000
           0.271 ( 0.001 ms): sleep/12762 close(fd: 3) = 0
           0.295 (1000.130 ms): sleep/12762 nanosleep(rqtp: 0x7ffd15194fd0) = 0
        1000.469 ( 0.006 ms): sleep/12762 close(fd: 1) = 0
        1000.480 ( 0.004 ms): sleep/12762 close(fd: 2) = 0
        1000.502 (         ): sleep/12762 exit_group()
        #
      
      For a pid:
      
        # pidof ssh
        7826 3961 3226 2628 2493
        # perf trace -p 3961
               ? (         ):  ... [continued]: select()) = 1
           0.023 ( 0.005 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce870               ) = 0
           0.036 ( 0.009 ms): read(fd: 5</dev/pts/7>, buf: 0x7ffcc8fca7b0, count: 16384             ) = 3
           0.060 ( 0.004 ms): getpid(                                                               ) = 3961 (ssh)
           0.079 ( 0.004 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce8e0               ) = 0
           0.088 ( 0.003 ms): clock_gettime(which_clock: BOOTTIME, tp: 0x7ffcc8fce7c0               ) = 0
      <SNIP>
      
      For system wide, threads, cgroups, user, etc when no event is specified,
      the existing behaviour is maintained, i.e. --syscalls is selected.
      
      When some event is specified, then --no-syscalls doesn't need to be
      specified:
      
        # perf trace -e tcp:tcp_probe ssh localhost
           0.000 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=53 snd_nxt=0xb67ce8f7 snd_una=0xb67ce8f7 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43690
           0.010 tcp:tcp_probe:src=[::1]:39074 dest=[::1]:22 mark=0 length=32 snd_nxt=0xa8f9ef38 snd_una=0xa8f9ef23 snd_cwnd=10 ssthresh=2147483647 snd_wnd=43690 srtt=31 rcv_wnd=43776
           4.525 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=1240 snd_nxt=0xb67ce90c snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=43776
           7.242 tcp:tcp_probe:src=[::1]:22 dest=[::1]:39074 mark=0 length=80 snd_nxt=0xb67ced44 snd_una=0xb67ce90c snd_cwnd=10 ssthresh=2147483647 snd_wnd=43776 srtt=18 rcv_wnd=174720
        The authenticity of host 'localhost (::1)' can't be established.
        ECDSA key fingerprint is SHA256:TKZS58923458203490asekfjaklskljmkjfgPMBfHzY.
        ECDSA key fingerprint is MD5:d8:29:54:40:71:fa:b8:44:89:52:64:8a:35:42:d0:e8.
        Are you sure you want to continue connecting (yes/no)?
      ^C
        #
      
      To get the previous behaviour just use --syscalls and get all syscalls formatted
      strace like + the specified extra events:
      
        # trace -e sched:*switch --syscalls sleep 1
        <SNIP>
           0.160 ( 0.003 ms): sleep/12877 mprotect(start: 0x7fdfe2361000, len: 4096, prot: READ) = 0
           0.164 ( 0.009 ms): sleep/12877 munmap(addr: 0x7fdfe2345000, len: 113155) = 0
           0.211 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce68e000
           0.212 ( 0.002 ms): sleep/12877 brk(brk: 0x55d3ce6af000) = 0x55d3ce6af000
           0.215 ( 0.001 ms): sleep/12877 brk() = 0x55d3ce6af000
           0.219 ( 0.004 ms): sleep/12877 open(filename: 0xe1f07c00, flags: CLOEXEC) = 3
           0.225 ( 0.001 ms): sleep/12877 fstat(fd: 3, statbuf: 0x7fdfe2138aa0) = 0
           0.227 ( 0.003 ms): sleep/12877 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7fdfdb1b8000
           0.234 ( 0.001 ms): sleep/12877 close(fd: 3) = 0
           0.257 (         ): sleep/12877 nanosleep(rqtp: 0x7fffb36b6020) ...
           0.260 (         ): sched:sched_switch:prev_comm=sleep prev_pid=12877 prev_prio=120 prev_state=D ==> next_comm=swapper/3 next_pid=0 next_prio=120
           0.257 (1000.134 ms): sleep/12877  ... [continued]: nanosleep()) = 0
        1000.428 ( 0.006 ms): sleep/12877 close(fd: 1) = 0
        1000.440 ( 0.004 ms): sleep/12877 close(fd: 2) = 0
        1000.461 (         ): sleep/12877 exit_group()
        #
      
      When specifiying just some syscalls, the behaviour doesn't change, i.e.:
      
        # trace -e nanosleep -e sched:*switch sleep 1
           0.000 (         ): sleep/14974 nanosleep(rqtp: 0x7ffc344ba9c0                                        ) ...
           0.007 (         ): sched:sched_switch:prev_comm=sleep prev_pid=14974 prev_prio=120 prev_state=D ==> next_comm=swapper/2 next_pid=0 next_prio=120
           0.000 (1000.139 ms): sleep/14974  ... [continued]: nanosleep()) = 0
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-om2fulll97ytnxv40ler8jkf@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b912885a
  7. 31 7月, 2018 2 次提交
    • A
      perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args · 162d3edb
      Arnaldo Carvalho de Melo 提交于
      For instance:
      
        $ trace -e socket* ssh sandy
           0.000 ( 0.031 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           0.052 ( 0.015 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.568 ( 0.020 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.603 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.699 ( 0.014 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.724 ( 0.012 ms): ssh/19919 socket(family: LOCAL, type: STREAM|CLOEXEC|NONBLOCK                   ) = 3
           1.804 ( 0.020 ms): ssh/19919 socket(family: INET, type: STREAM, protocol: TCP                      ) = 3
          17.549 ( 0.098 ms): ssh/19919 socket(family: LOCAL, type: STREAM                                    ) = 4
        acme@sandy's password:
      
      Just like with other syscall args, the common bits are supressed so that
      the output is more compact, i.e. we use "TCP" instead of "IPPROTO_TCP",
      but we can make this show the original constant names if we like it by
      using some command line knob or ~/.perfconfig "[trace]" section
      variable.
      
      Also needed is to make perf's event parser accept things like:
      
        $ perf trace -e socket*/protocol=TCP/
      
      By using both the tracefs event 'format' files and these tables built
      from the kernel sources.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-l39jz1vnyda0b6jsufuc8bz7@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      162d3edb
    • A
      perf trace beauty: Do not print NULL strarray entries · bc972ada
      Arnaldo Carvalho de Melo 提交于
      We may have string tables where not all slots have values, in those
      cases its better to print the numeric value, for instance:
      
      In the table below we would show "protocol: (null)" for
      
            socket_ipproto[3]
      
      Where it would be better to show "protocol: 3".
      
            $ tools/perf/trace/beauty/socket_ipproto.sh
            static const char *socket_ipproto[] = {
                  [0] = "IP",
                  [103] = "PIM",
                  [108] = "COMP",
                  [12] = "PUP",
                  [132] = "SCTP",
                  [136] = "UDPLITE",
                  [137] = "MPLS",
                  [17] = "UDP",
                  [1] = "ICMP",
                  [22] = "IDP",
                  [255] = "RAW",
                  [29] = "TP",
                  [2] = "IGMP",
                  [33] = "DCCP",
                  [41] = "IPV6",
                  [46] = "RSVP",
                  [47] = "GRE",
                  [4] = "IPIP",
                  [50] = "ESP",
                  [51] = "AH",
                  [6] = "TCP",
                  [8] = "EGP",
                  [92] = "MTP",
                  [94] = "BEETPH",
                  [98] = "ENCAP",
            };
            $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-7djfak94eb3b9ltr79cpn3ti@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      bc972ada
  8. 05 6月, 2018 1 次提交
  9. 18 5月, 2018 1 次提交
  10. 27 4月, 2018 2 次提交
  11. 04 4月, 2018 1 次提交
  12. 02 4月, 2018 1 次提交
    • A
      perf trace: Show only failing syscalls · 0a6545bd
      Arnaldo Carvalho de Melo 提交于
      For instance:
      
        # perf probe "vfs_getname=getname_flags:72 pathname=result->name:string"
        Added new event:
          probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe:vfs_getname -aR sleep 1
      
        # perf trace --failure sleep 1
           0.043 ( 0.010 ms): sleep/10978 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
      
      For reference, here are all the syscalls in this case:
      
        # perf trace sleep 1
               ? (         ): sleep/10976  ... [continued]: execve()) = 0
             0.027 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
             0.044 ( 0.010 ms): sleep/10976 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
             0.057 ( 0.006 ms): sleep/10976 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
             0.064 ( 0.002 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b370) = 0
             0.067 ( 0.003 ms): sleep/10976 mmap(len: 111457, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec8615000
             0.071 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
             0.080 ( 0.007 ms): sleep/10976 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
             0.088 ( 0.002 ms): sleep/10976 read(fd: 3, buf: 0x7fffac22b538, count: 832) = 832
             0.092 ( 0.001 ms): sleep/10976 fstat(fd: 3, statbuf: 0x7fffac22b3d0) = 0
             0.094 ( 0.002 ms): sleep/10976 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7feec8613000
             0.099 ( 0.004 ms): sleep/10976 mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3) = 0x7feec8057000
             0.104 ( 0.007 ms): sleep/10976 mprotect(start: 0x7feec8203000, len: 2097152) = 0
             0.112 ( 0.005 ms): sleep/10976 mmap(addr: 0x7feec8403000, len: 24576, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 1753088) = 0x7feec8403000
             0.120 ( 0.003 ms): sleep/10976 mmap(addr: 0x7feec8409000, len: 14976, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS|FIXED) = 0x7feec8409000
             0.128 ( 0.001 ms): sleep/10976 close(fd: 3) = 0
             0.139 ( 0.001 ms): sleep/10976 arch_prctl(option: 4098, arg2: 140663540761856) = 0
             0.186 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8403000, len: 16384, prot: READ) = 0
             0.204 ( 0.003 ms): sleep/10976 mprotect(start: 0x55bdc0ec3000, len: 4096, prot: READ) = 0
             0.209 ( 0.004 ms): sleep/10976 mprotect(start: 0x7feec8631000, len: 4096, prot: READ) = 0
             0.214 ( 0.010 ms): sleep/10976 munmap(addr: 0x7feec8615000, len: 111457) = 0
             0.269 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d04000
             0.271 ( 0.002 ms): sleep/10976 brk(brk: 0x55bdc2d25000) = 0x55bdc2d25000
             0.274 ( 0.001 ms): sleep/10976 brk() = 0x55bdc2d25000
             0.278 ( 0.007 ms): sleep/10976 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
             0.288 ( 0.001 ms): sleep/10976 fstat(fd: 3</usr/lib/locale/locale-archive>, statbuf: 0x7feec8408aa0) = 0
             0.290 ( 0.003 ms): sleep/10976 mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3) = 0x7feec1488000
             0.297 ( 0.001 ms): sleep/10976 close(fd: 3</usr/lib/locale/locale-archive>) = 0
             0.325 (1000.193 ms): sleep/10976 nanosleep(rqtp: 0x7fffac22c0b0) = 0
          1000.560 ( 0.006 ms): sleep/10976 close(fd: 1) = 0
          1000.573 ( 0.005 ms): sleep/10976 close(fd: 2) = 0
          1000.596 (         ): sleep/10976 exit_group()
        #
      
      And can be done systemwide, etc, with backtraces:
      
        # perf trace --max-stack=16 --failure sleep 1
           0.048 ( 0.015 ms): sleep/11092 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT No such file or directory
                                             __access (inlined)
                                             dl_main (/usr/lib64/ld-2.26.so)
        #
      
      Or for some specific syscalls:
      
        # perf trace --max-stack=16 -e openat --failure cat /tmp/rien
        cat: /tmp/rien: No such file or directory
             0.251 ( 0.012 ms): cat/11106 openat(dfd: CWD, filename: /tmp/rien) = -1 ENOENT No such file or directory
                                               __libc_open64 (inlined)
                                               main (/usr/bin/cat)
                                               __libc_start_main (/usr/lib64/libc-2.26.so)
                                               _start (/usr/bin/cat)
        #
      
      Look for inotify* syscalls that fail, system wide, for 2 seconds, with backtraces:
      
        # perf trace -a --max-stack=16 --failure -e inotify* sleep 2
         819.165 ( 0.058 ms): gmain/1724 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: /home/acme/~, mask: 16789454) = -1 ENOENT No such file or directory
                                             __GI_inotify_add_watch (inlined)
                                             _ik_watch (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             _ip_start_watching (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             im_scan_missing (/usr/lib64/libgio-2.0.so.0.5400.3)
                                             g_timeout_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_dispatch (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_iterate.isra.23 (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_main_context_iteration (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             glib_worker_main (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             g_thread_proxy (/usr/lib64/libglib-2.0.so.0.5400.3)
                                             start_thread (/usr/lib64/libpthread-2.26.so)
                                             __GI___clone (inlined)
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-8f7d3mngaxvi7tlzloz3n7cs@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      0a6545bd
  13. 08 3月, 2018 3 次提交
  14. 07 3月, 2018 1 次提交
    • A
      perf trace: Support setting cgroups as targets · 9ea42ba4
      Arnaldo Carvalho de Melo 提交于
      One can set a cgroup as a default cgroup to be used by all events or
      set cgroups with the 'perf stat' and 'perf record' behaviour, i.e.
      '-G A' will be the cgroup for events defined so far in the command line.
      
      Here in my main machine, with a kvm instance running a rhel6 guinea pig
      I have:
      
        # ls -la /sys/fs/cgroup/perf_event/ | grep drw
        drwxr-xr-x. 14 root root 360 Mar  6 12:04 ..
        drwxr-xr-x.  3 root root   0 Mar  6 15:05 machine.slice
        #
      
      So I can go ahead and use that cgroup hierarchy, say lets see what
      syscalls are being emitted by threads in that 'machine.slice' hierarchy
      that are taking more than 100ms:
      
        # perf trace --duration 100 -G machine.slice
           0.188 (249.850 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
         250.274 (249.743 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
         500.224 (249.755 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
         750.097 (249.934 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        1000.244 (249.780 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        1250.197 (249.796 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        1500.124 (249.859 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
        1750.076 (172.900 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
         902.570 (1021.116 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
        1923.825 (305.133 ms): qemu-system-x8/23667 ppoll(ufds: 0x558151e03180, nfds: 74, tsp: 0x7ffc00cd0900, sigsetsize: 8) = 1
        2000.172 (229.002 ms): CPU 0/KVM/23744 ioctl(fd: 16<anon_inode:kvm-vcpu:0>, cmd: KVM_RUN) = 0
      ^C  #
      
      If we look inside that cgroup hierarchy we get:
      
        # ls -la /sys/fs/cgroup/perf_event/machine.slice/ | grep drw
        drwxr-xr-x. 3 root root 0 Mar  6 15:05 .
        drwxr-xr-x. 2 root root 0 Mar  6 16:16 machine-qemu\x2d2\x2drhel6.sandy.scope
        #
      
      There is just one, but lets say there were more and we would want to see
      5 seconds worth of syscall summary for the threads in that cgroup:
      
        # perf trace --summary -G machine.slice/machine-qemu\\x2d2\\x2drhel6.sandy.scope/ -a sleep 5
      
         Summary of events:
      
           qemu-system-x86 (23667), 143858 events, 24.2%
      
           syscall            calls    total       min       avg       max      stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
           --------------- -------- --------- --------- --------- ---------     ------
           ppoll              28492  4348.631     0.000     0.153    11.616      1.05%
           futex              19661   140.801     0.001     0.007     2.993      3.20%
           read               18440    68.084     0.001     0.004     1.653      4.33%
           ioctl               5387    24.768     0.002     0.005     0.134      1.62%
      
           CPU 0/KVM (23744), 449455 events, 75.8%
      
           syscall            calls    total       min       avg       max      stddev
                                     (msec)    (msec)    (msec)    (msec)        (%)
           --------------- -------- --------- --------- --------- ---------     ------
           ioctl             148364  3401.812     0.000     0.023    11.801      1.15%
           futex              36131   404.127     0.001     0.011     7.377      2.63%
           writev             29452   339.688     0.003     0.012     1.740      1.36%
           write              11315    45.992     0.001     0.004     0.105      1.10%
      
        #
      
      See the documentation about how to set more than one cgroup for
      different events in the same command line.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-t126jh4occqvu0xdqlcjygex@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ea42ba4
  15. 05 3月, 2018 1 次提交
  16. 06 2月, 2018 1 次提交
    • R
      perf trace: Fix call-graph output · 3a9e9a47
      Ravi Bangoria 提交于
      Recently, Arnaldo fixed global vs event specific --max-stack usage with
      commit bd3dda9a ("perf trace: Allow overriding global --max-stack
      per event"). This commit is having a regression when we don't use
      --max-stack at all with perf trace. Ex,
      
        $ ./perf trace record -g ls
        $ ./perf trace -i perf.data
           0.076 ( 0.002 ms): ls/9109 brk(
           0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R
           0.209 ( 0.031 ms): ls/9109 open(filename: 0x9f998978, flags: CLOEXEC
      
      This is missing call-traces.
      After patch:
      
        $ ./perf trace -i perf.data
           0.076 ( 0.002 ms): ls/9109 brk(
                                      do_syscall_trace_leave ([kernel.kallsyms])
                                      [0] ([unknown])
                                      syscall_exit_work ([kernel.kallsyms])
                                      brk (/usr/lib64/ld-2.17.so)
                                      _dl_sysdep_start (/usr/lib64/ld-2.17.so)
                                      _dl_start_final (/usr/lib64/ld-2.17.so)
                                      _dl_start (/usr/lib64/ld-2.17.so)
                                      _start (/usr/lib64/ld-2.17.so)
           0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R
                                      do_syscall_trace_leave ([kernel.kallsyms])
                                      [0] ([unknown])
      Signed-off-by: NRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: bd3dda9a ("perf trace: Allow overriding global --max-stack per event")
      Link: http://lkml.kernel.org/r/20180130053053.13214-3-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3a9e9a47
  17. 25 1月, 2018 3 次提交