1. 29 Dec, 2018 1 commit
  2. 21 Dec, 2018 2 commits
    • A
      perf trace: Do not hardcode the size of the tracepoint common_ fields · b9b6a2ea
      Arnaldo Carvalho de Melo committed
      We shouldn't hardcode the size of the tracepoint common_ fields, use the
      offset of the 'id'/'__syscallnr' field in the sys_enter event instead.
      
      This caused the augmented syscalls code to fail on a particular build of a
      PREEMPT_RT_FULL kernel where these extra 'common_migrate_disable' and
      'common_padding' fields were before the syscall id one:
      
        # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
        name: sys_enter
        ID: 22
        format:
      	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
      	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
      	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
      	field:int common_pid;	offset:4;	size:4;	signed:1;
      	field:unsigned short common_migrate_disable;	offset:8;	size:2;	signed:0;
      	field:unsigned short common_padding;	offset:10;	size:2;	signed:0;
      
      	field:long id;	offset:16;	size:8;	signed:1;
      	field:unsigned long args[6];	offset:24;	size:48;	signed:0;
      
        print fmt: "NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)", REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3], REC->args[4], REC->args[5]
        #
      
      All those 'common_' prefixed fields are zeroed when they hit a BPF tracepoint
      hook, so we had better just discard them, i.e. somehow pass the BPF program an
      offset from the start of the ctx and adjust, in the 'perf trace' handlers, the
      syscall arg offsets obtained from tracefs.
      
      Till then, fix it the quick way and add this to augmented_raw_syscalls.c to
      get it to work on such kernels:
      
        diff --git a/tools/perf/examples/bpf/augmented_raw_syscalls.c b/tools/perf/examples/bpf/augmented_raw_syscalls.c
        index 53c233370fae..1f746f931e13 100644
        --- a/tools/perf/examples/bpf/augmented_raw_syscalls.c
        +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c
        @@ -38,12 +38,14 @@ struct bpf_map SEC("maps") syscalls = {
      
         struct syscall_enter_args {
                unsigned long long common_tp_fields;
        +       long               rt_common_tp_fields;
                long               syscall_nr;
                unsigned long      args[6];
         };
      
         struct syscall_exit_args {
                unsigned long long common_tp_fields;
        +       long               rt_common_tp_fields;
                long               syscall_nr;
                long               ret;
         };
      
      Just to check that this was the case. Fix it properly later, for now remove the
      hardcoding of the offset in the 'perf trace' side and document the situation
      with this patch.
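As a rough illustration of looking the offset up instead of hardcoding it, the sketch below scans the text of a tracefs 'format' file for a named field and returns its 'offset:' value. The helper name format_field_offset() is hypothetical and is not the function 'perf trace' uses; it only assumes the line layout shown in the listing above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): return the 'offset:' value of the
 * field called 'name' in the text of a tracefs 'format' file, or -1 if the
 * field is not present.
 */
static int format_field_offset(const char *format, const char *name)
{
	size_t name_len = strlen(name);
	const char *line = format;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		const char *next = eol ? eol + 1 : line + len;
		char buf[256];
		char *decl, *semi, *tok, *off;
		int offset;

		/* Work on a private, NUL-terminated copy of the current line. */
		if (len >= sizeof(buf))
			len = sizeof(buf) - 1;
		memcpy(buf, line, len);
		buf[len] = '\0';
		line = next;

		decl = strstr(buf, "field:");
		if (!decl)
			continue;
		semi = strchr(decl, ';');
		if (!semi)
			continue;
		*semi = '\0';

		/* The field name is the last space-separated token of the declaration. */
		tok = strrchr(decl, ' ');
		if (!tok)
			continue;
		tok++;
		if (strncmp(tok, name, name_len) != 0 ||
		    (tok[name_len] != '\0' && tok[name_len] != '['))
			continue;

		off = strstr(semi + 1, "offset:");
		if (off && sscanf(off, "offset:%d", &offset) == 1)
			return offset;
	}
	return -1;
}
```

Fed the PREEMPT_RT_FULL format shown above, a lookup of "id" would yield 16 rather than the 8 that a hardcoded sizeof(long) for the common fields assumes.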
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2pqavrktqkliu5b9nzouio21@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Check if the raw_syscalls:sys_{enter,exit} are setup before setting tp filter · f76214f9
      Arnaldo Carvalho de Melo committed
      While updating 'perf trace' on a machine with an old precompiled
      augmented_raw_syscalls.o that didn't set up the syscall map, the new 'perf
      trace' codebase notices the augmented_raw_syscalls.o eBPF event and decides
      to use it instead of the old raw_syscalls:sys_{enter,exit} method. But
      then, because we don't have the syscall map, it tries to set the tracepoint
      filter on the sys_{enter,exit} evsels, which are NULL, and segfaults.
      
      Make the code more robust by checking if those tracepoints have
      their respective evsels in place before trying to set the tp filter.
      
      With this we still get everything to work, just not setting up the
      syscall filters, which is better than a segfault. Now to update the
      precompiled augmented_raw_syscalls.o and continue development :-)
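A minimal sketch of that guard follows. The struct and function names (evsel_sketch, syscall_tp_sketch, set_syscall_tp_filter) are stand-ins invented for illustration and do not mirror perf's real evsel API; the only point is that each evsel is checked for NULL before a filter is applied to it.

```c
#include <stddef.h>

/* Hypothetical stand-in for perf's evsel, reduced to what the sketch needs. */
struct evsel_sketch {
	const char *name;
	const char *filter;
};

/* Hypothetical holder for the two raw_syscalls tracepoint evsels. */
struct syscall_tp_sketch {
	struct evsel_sketch *sys_enter; /* may be NULL when the eBPF object is used */
	struct evsel_sketch *sys_exit;  /* may be NULL when the eBPF object is used */
};

static int evsel_sketch__set_filter(struct evsel_sketch *evsel, const char *filter)
{
	evsel->filter = filter;
	return 0;
}

/*
 * Apply 'filter' only to the evsels that exist. Returns the number of evsels
 * that received the filter, or a negative value if setting one failed.
 */
static int set_syscall_tp_filter(struct syscall_tp_sketch *tp, const char *filter)
{
	int nr_set = 0;

	if (tp->sys_enter) {
		if (evsel_sketch__set_filter(tp->sys_enter, filter) < 0)
			return -1;
		nr_set++;
	}
	if (tp->sys_exit) {
		if (evsel_sketch__set_filter(tp->sys_exit, filter) < 0)
			return -1;
		nr_set++;
	}
	return nr_set;
}
```

With both pointers NULL the function simply returns 0, which corresponds to the "everything works, just without syscall filters" behaviour described above.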
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-3ft5rjdl05wgz2pwpx2z8btu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 19 Dec, 2018 11 commits
    • A
      perf beauty mmap: Print mmap's 'offset' arg in hexadecimal · a6631340
      Arnaldo Carvalho de Melo committed
      Also to make it match 'strace' output, for regression testing.
      
      Both now produce this output, when 'perf trace' uses a .perfconfig
      asking for the strace like output:
      
        mmap(0x7faf66e6a000, 1363968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7faf66e6a000
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-27qhouo1kaac2iyl85nfnsf5@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace beauty: Beautify arch_prctl()'s arguments · fb7068e7
      Arnaldo Carvalho de Melo committed
      So far, AFAIK, this is available only on x86, so the code was put in
      place with x86 prefixes. On arches where it is not available it will
      just not be called, so no further mechanisms are needed at this
      time.
      
      Later, when other arches wire this up, we'll just look at the uname
      (live sessions) or perf_env data in the perf.data header to auto-wire
      the right beautifier.
      
      With this the output is the same as produced by 'strace' when used with
      the following ~/.perfconfig:
      
        # cat ~/.perfconfig
        [llvm]
      	dump-obj = true
        [trace]
      	  add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
      	  show_zeros = yes
      	  show_duration = no
      	  no_inherit = yes
      	  show_timestamp = no
      	  show_arg_names = no
      	  args_alignment = -40
      	  show_prefix = yes
        #
      
      And, on fedora 29, since the string tables are generated from the kernel
      sources, we don't know about 0x3001, just like strace:
      
        --- /tmp/strace 2018-12-17 11:22:08.707586721 -0300
        +++ /tmp/trace  2018-12-18 11:11:32.037512729 -0300
        @@ -1,49 +1,49 @@
        -arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument)
        +arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe4eb93ae0) = -1 EINVAL (Invalid argument)
        -arch_prctl(ARCH_SET_FS, 0x7faf6700f540) = 0
        +arch_prctl(ARCH_SET_FS, 0x7fb507364540) = 0
      
      That seems to be related to the CET/Shadow Stack feature, which
      userland in Fedora 29 (glibc 2.28) is querying the kernel about;
      0x3001 seems to be ARCH_CET_STATUS. I'll check the situation and test
      with a fedora 29 kernel to see if the other codes are used.
      
      A diff that ignores the different pointers for different runs needs to
      be put in place in the upcoming regression tests comparing 'perf trace's
      output to strace's.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-73a9prs8ktkrt97trtdmdjs8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: When showing string prefixes show prefix + ??? for unknown entries · 9614b8d6
      Arnaldo Carvalho de Melo committed
      To match 'strace' output, like in:
      
        arch_prctl(0x3001 /* ARCH_??? */, 0x7ffc8a92dc80) = -1 EINVAL (Invalid argument)
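The fallback can be sketched roughly as below. The strarray_sketch type and its formatter are illustrative stand-ins, not perf's actual strarray code, and printing only the raw number when prefixes are off is an assumption of this sketch. The arch_prctl() codes in the table are the x86 values 0x1001 to 0x1004.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified string table: entries[i] names value (offset + i). */
struct strarray_sketch {
	int offset;
	int nr_entries;
	const char *prefix;   /* common prefix chopped off the entries, e.g. "ARCH_" */
	const char **entries;
};

static const char *arch_prctl_entries[] = {
	"SET_GS", "SET_FS", "GET_FS", "GET_GS", /* 0x1001 .. 0x1004 */
};

static const struct strarray_sketch arch_prctl_codes = {
	.offset     = 0x1001,
	.nr_entries = 4,
	.prefix     = "ARCH_",
	.entries    = arch_prctl_entries,
};

/*
 * Format 'val' using the table. Known values print as the entry, optionally
 * with the prefix. Unknown values print as a number and, when prefixes are
 * shown, with a "PREFIX???" comment appended.
 */
static int strarray_sketch__snprintf(const struct strarray_sketch *sa, char *bf,
				     size_t size, bool show_prefix, int val)
{
	int idx = val - sa->offset;

	if (idx < 0 || idx >= sa->nr_entries || sa->entries[idx] == NULL) {
		if (show_prefix)
			return snprintf(bf, size, "%#x /* %s??? */",
					(unsigned int)val, sa->prefix);
		return snprintf(bf, size, "%#x", (unsigned int)val);
	}

	return snprintf(bf, size, "%s%s", show_prefix ? sa->prefix : "",
			sa->entries[idx]);
}
```

Formatting 0x3001 with prefixes enabled gives the same "0x3001 /* ARCH_??? */" text as in the line above, while 0x1002 gives "ARCH_SET_FS".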
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-kx59j2dk5l1x04ou57mt99ck@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Move strarrays to beauty.h for further reuse · 1f2d085e
      Arnaldo Carvalho de Melo committed
      We'll use it in the upcoming arch_prctl() 'code' arg beautifier.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-6e4tj2fjen8qa73gy4u49vav@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Show NULL when syscall pointer args are 0 · ce05539f
      Arnaldo Carvalho de Melo committed
      Matching strace's output format. The 'format' file for the syscall
      tracepoints has an indication of whether the arg is a pointer, with some
      exceptions like 'mmap', which has its first arg as an 'unsigned long', so
      use a heuristic based on the argument name, i.e. if it contains the 'addr'
      substring, format it with the pointer formatter.
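The heuristic might look something like the sketch below. Both function names are hypothetical, and treating a '*' in the type string as the "pointer indication" from the format file is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical check: an arg is formatted as a pointer when its type is
 * declared with a '*' (assumed form of the format file's indication) or,
 * as a fallback, when its name contains "addr", as with mmap's first arg.
 */
static bool arg_looks_like_pointer(const char *type, const char *name)
{
	return strchr(type, '*') != NULL || strstr(name, "addr") != NULL;
}

/* Pointer formatter: 0 is shown as NULL, anything else in hexadecimal. */
static int format_pointer_arg(char *bf, size_t size, unsigned long val)
{
	if (val == 0)
		return snprintf(bf, size, "NULL");
	return snprintf(bf, size, "%#lx", val);
}
```

For mmap, whose first arg is declared as 'unsigned long addr', the name check is what selects the pointer formatter, so a zero address is shown as NULL.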
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ddghemr8qrm6i0sb8awznbze@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Enclose the errno strings with () · 2c83dfae
      Arnaldo Carvalho de Melo committed
      To match strace, now both emit the same line for calls like:
      
       access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-krxl6klsqc9qyktoaxyih942@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Add alignment spaces after the closing parens · 4b8a240e
      Arnaldo Carvalho de Melo committed
      To use strace's style, making it easier to compare the output of
      'perf trace' with that of 'strace' in upcoming regression tests.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-mw6peotz4n84rga0fk78buff@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Allow asking for not suppressing common string prefixes · c65c83ff
      Arnaldo Carvalho de Melo committed
      So far we've been suppressing common stuff such as "MAP_" in the mmap
      flags, showing "SHARED" instead of "MAP_SHARED". Allow for those
      prefixes (and a few suffixes) to be shown:
      
        # trace -e *map,open*,*seek sleep 1
        openat("/etc/ld.so.cache", CLOEXEC) = 3
        mmap(0, 109093, READ, PRIVATE, 3, 0) = 0x7ff61c695000
        openat("/lib64/libc.so.6", CLOEXEC) = 3
        lseek(3, 792, SET) = 792
        mmap(0, 8192, READ|WRITE, PRIVATE|ANONYMOUS) = 0x7ff61c693000
        lseek(3, 792, SET) = 792
        lseek(3, 864, SET) = 864
        mmap(0, 1857568, READ, PRIVATE|DENYWRITE, 3, 0) = 0x7ff61c4cd000
        mmap(0x7ff61c4ef000, 1363968, EXEC|READ, PRIVATE|FIXED|DENYWRITE, 3, 139264) = 0x7ff61c4ef000
        mmap(0x7ff61c63c000, 311296, READ, PRIVATE|FIXED|DENYWRITE, 3, 1503232) = 0x7ff61c63c000
        mmap(0x7ff61c689000, 24576, READ|WRITE, PRIVATE|FIXED|DENYWRITE, 3, 1814528) = 0x7ff61c689000
        mmap(0x7ff61c68f000, 14368, READ|WRITE, PRIVATE|FIXED|ANONYMOUS) = 0x7ff61c68f000
        munmap(0x7ff61c695000, 109093) = 0
        openat("/usr/lib/locale/locale-archive", CLOEXEC) = 3
        mmap(0, 217749968, READ, PRIVATE, 3, 0) = 0x7ff60f523000
        #
        # vim ~/.perfconfig
        #
        # perf config
        llvm.dump-obj=true
        trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        trace.show_zeros=yes
        trace.show_duration=no
        trace.no_inherit=yes
        trace.show_timestamp=no
        trace.show_arg_names=no
        trace.args_alignment=0
        trace.string_quote="
        trace.show_prefix=yes
        #
        #
        # trace -e *map,open*,*seek sleep 1
        openat(AT_FDCWD, "/etc/ld.so.cache", O_CLOEXEC) = 3
        mmap(0, 109093, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7ebbe59000
        openat(AT_FDCWD, "/lib64/libc.so.6", O_CLOEXEC) = 3
        lseek(3, 792, SEEK_SET) = 792
        mmap(0, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS) = 0x7f7ebbe57000
        lseek(3, 792, SEEK_SET) = 792
        lseek(3, 864, SEEK_SET) = 864
        mmap(0, 1857568, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f7ebbc91000
        mmap(0x7f7ebbcb3000, 1363968, PROT_EXEC|PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 139264) = 0x7f7ebbcb3000
        mmap(0x7f7ebbe00000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1503232) = 0x7f7ebbe00000
        mmap(0x7f7ebbe4d000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 1814528) = 0x7f7ebbe4d000
        mmap(0x7f7ebbe53000, 14368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS) = 0x7f7ebbe53000
        munmap(0x7f7ebbe59000, 109093) = 0
        openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_CLOEXEC) = 3
        mmap(0, 217749968, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f7eaece7000
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-mtn1i4rjowjl72trtnbmvjd4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Add a prefix member to the strarray class · 2e3d7fac
      Arnaldo Carvalho de Melo committed
      So that the user, in an upcoming patch, can select printing it to get
      the full string as used in the source code, not one with a common prefix
      chopped off so as to make the output more compact.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zypczc88gzbmeqx7b372s138@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Enclose strings with double quotes · 721f5326
      Arnaldo Carvalho de Melo committed
      To match 'strace' output, helping with upcoming regression tests
      comparing both outputs.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-jab52t1dcuh6vlztqle9g7u9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf trace: Make the alignment of the syscall args be configurable · 9ed45d59
      Arnaldo Carvalho de Melo committed
      Since the start, 'perf trace' has aligned the parens enclosing the list of
      syscall args so that the syscall results line up. Allow this to be
      configurable, keeping the default of 70. Using:
      
        # perf config
        llvm.dump-obj=true
        trace.add_events=/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        trace.show_zeros=yes
        trace.show_duration=no
        trace.no_inherit=yes
        trace.show_timestamp=no
        trace.show_arg_names=no
        trace.args_alignment=0
        # trace -e open*,close,*sleep sleep 1
        openat(CWD, /etc/ld.so.cache, CLOEXEC) = 3
        close(3) = 0
        openat(CWD, /lib64/libc.so.6, CLOEXEC) = 3
        close(3) = 0
        openat(CWD, /usr/lib/locale/locale-archive, CLOEXEC) = 3
        close(3) = 0
        nanosleep(0x7ffc00de66f0, 0) = 0
        close(1) = 0
        close(2) = 0
        #
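One way to picture the effect of trace.args_alignment is the sketch below, which pads the "name(args)" text to a given column before the result. The function format_syscall_line() is hypothetical, and the exact padding rule in 'perf trace' may differ. Negative values are not modelled here and are simply treated as 0.

```c
#include <stdio.h>

/*
 * Hypothetical formatter: left-justify the call text in a field of
 * 'alignment' columns, then append " = result". An alignment of 0 means
 * no padding at all, as in the output above.
 */
static int format_syscall_line(char *bf, size_t size, const char *call,
			       const char *result, int alignment)
{
	if (alignment < 0)
		alignment = 0; /* negative settings are outside this sketch */

	return snprintf(bf, size, "%-*s = %s", alignment, call, result);
}
```

With an alignment of 0, "close(3)" and "0" format as "close(3) = 0". With the default of 70 the " = 0" part would start at column 70 for every call shorter than that.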
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-r8cbhoz1lr5npq9tutpvoigr@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 18 Dec, 2018 21 commits
  5. 21 Nov, 2018 4 commits
  6. 03 Nov, 2018 1 commit
    • A
      perf trace: Fix setting of augmented payload when using eBPF + raw_syscalls · cd26ea6d
      Arnaldo Carvalho de Melo committed
      For now with BPF raw_augmented we hook into raw_syscalls:sys_enter and
      there we get all 6 syscall args plus the tracepoint common fields
      (sizeof(long)) and the syscall_nr (another long). So we check if that is
      the case and, if so, don't look for the augmented payload after
      sc->args_size, but always after the full raw_syscalls:sys_enter payload,
      whose size is fixed.
      
      We'll revisit this later to pass sc->args_size to the BPF augmenter (now
      tools/perf/examples/bpf/augmented_raw_syscalls.c), so that it copies only
      what we need for each syscall, like what happens when we use
      syscalls:sys_enter_NAME, so that we reduce the kernel/userspace traffic
      to just what is needed for each syscall.
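The "always after the full payload" rule can be sketched as follows. The struct layout is the one used by augmented_raw_syscalls.c before any extra common fields are added; the helper augmented_payload() is a hypothetical name for illustration, not the function in 'perf trace'.

```c
#include <stddef.h>

/* Fixed raw_syscalls:sys_enter payload: common fields, syscall nr, 6 args. */
struct syscall_enter_args {
	unsigned long long common_tp_fields;
	long               syscall_nr;
	unsigned long      args[6];
};

/*
 * Hypothetical helper: given a sample's raw data, return a pointer to whatever
 * follows the fixed sys_enter payload and store its size in '*payload_size'.
 * Returns NULL when the sample carries nothing beyond the fixed payload.
 */
static const void *augmented_payload(const void *raw_data, size_t raw_size,
				     size_t *payload_size)
{
	const size_t fixed = sizeof(struct syscall_enter_args);

	if (raw_size <= fixed) {
		*payload_size = 0;
		return NULL;
	}

	*payload_size = raw_size - fixed;
	return (const char *)raw_data + fixed;
}
```

The per-syscall sc->args_size plays no part here, which is the point of the fix: the start of the augmented data depends only on the fixed size of the raw payload.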
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-nlslrg8apxdsobt4pwl3n7ur@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>