1. 06 5月, 2020 7 次提交
  2. 12 2月, 2020 1 次提交
    • A
      perf trace: Resolve prctl's 'option' arg strings to numbers · d7a07b29
      Arnaldo Carvalho de Melo 提交于
        # perf trace -e syscalls:sys_enter_prctl --filter="option==SET_NAME"
           0.000 Socket Thread/3860 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7fc50b9733e8)
           0.053 SSL Cert #78/3860 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7fc50b9733e8)
      ^C  #
      
      If one uses '-v' with 'perf trace', we can see the filter it puts in
      place:
      
        New filter for syscalls:sys_enter_prctl: (option==0xf) && (common_pid != 3859 && common_pid != 2757)
      
      We still need to allow using plain '-e prctl' and have this turn into
      creating a 'syscalls:sys_enter_prctl' event so that the filter can be
      applied only to it as right now '-e prctl' ends up using the
      'raw_syscalls:sys_enter/sys_exit'.
      
      The end goal is to have something like:
      
        # perf trace -e prctl/option==SET_NAME/
      
      And have that use tracepoint filters or eBPF ones.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mike Christie <mchristi@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d7a07b29
  3. 19 11月, 2019 1 次提交
    • I
      perf parse: Report initial event parsing error · a910e466
      Ian Rogers 提交于
      Record the first event parsing error and report. Implementing feedback
      from Jiri Olsa:
      
        https://lkml.org/lkml/2019/10/28/680
      
      An example error is:
      
        $ tools/perf/perf stat -e c/c/
        WARNING: multiple event parsing errors
        event syntax error: 'c/c/'
                               \___ unknown term
      
        valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore
      
      Initial error:
      
        event syntax error: 'c/c/'
                            \___ Cannot find PMU `c'. Missing kernel support?
        Run 'perf list' for a list of valid events
      
         Usage: perf stat [<options>] [<command>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      Signed-off-by: NIan Rogers <irogers@google.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Allison Randal <allison@lohutok.net>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a910e466
  4. 20 10月, 2019 6 次提交
    • A
      perf trace: Use STUL_STRARRAY_FLAGS with mmap · 27198a89
      Arnaldo Carvalho de Melo 提交于
      The 'mmap' syscall has special needs so it doesn't use
      SCA_STRARRAY_FLAGS, see its implementation in
      syscall_arg__scnprintf_mmap_flags(), related to special handling of
      MAP_ANONYMOUS, so set ->parm to the strarray__mmap_flags and hook up
      with strarray__strtoul_flags manually, now we can filter by those or-ed
      string expressions:
      
        # perf trace -e syscalls:sys_enter_mmap sleep 1
           0.000 syscalls:sys_enter_mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0)
           0.026 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)
           0.036 syscalls:sys_enter_mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0)
           0.046 syscalls:sys_enter_mmap(addr: 0x7fae003d9000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000)
           0.052 syscalls:sys_enter_mmap(addr: 0x7fae00526000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000)
           0.055 syscalls:sys_enter_mmap(addr: 0x7fae00573000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000)
           0.062 syscalls:sys_enter_mmap(addr: 0x7fae00579000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS)
           0.253 syscalls:sys_enter_mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0)
        #
      
        # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE" sleep 1
           0.000 syscalls:sys_enter_mmap(addr: 0x7f6ab3dcb000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000)
           0.010 syscalls:sys_enter_mmap(addr: 0x7f6ab3f18000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000)
           0.014 syscalls:sys_enter_mmap(addr: 0x7f6ab3f65000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000)
        # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|ANONYMOUS" sleep 1
           0.000 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS)
        #
      
        # perf trace -v -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|ANONYMOUS" sleep 1 |& grep "New filter"
        New filter for syscalls:sys_enter_mmap: flags==0x22
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-czw754b7m9rp9ibq2f6be2o1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      27198a89
    • A
      perf trace: Wire up strarray__strtoul_flags() · e0712baa
      Arnaldo Carvalho de Melo 提交于
      Now anything that uses STRARRAY_FLAGS, like the 'fsmount' syscall will
      support mapping or-ed strings back to a value that can be used in a
      filter.
      
      In some cases, where STRARRAY_FLAGS isn't used but instead the scnprintf
      is a special one because of specific needs, like for mmap, then one has
      to set the ->pars to the strarray. See the next cset.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-r2lpqo7dfsrhi4ll0npsb3u7@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e0712baa
    • A
      libbeauty: Introduce strarray__strtoul_flags() · 154c978d
      Arnaldo Carvalho de Melo 提交于
      Counterpart of strarray__scnprintf_flags(), i.e. from a expression like:
      
         # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE"
      
      I.e. that "flags==PRIVATE|FIXED|DENYWRITE", turn that into
      
         # perf trace -e syscalls:sys_enter_mmap --filter=0x812
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-8xst3zrqqogax7fmfzwymvbl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      154c978d
    • A
      perf trace: Use strtoul for the fcntl 'cmd' argument · 82c38338
      Arnaldo Carvalho de Melo 提交于
      Since its values are in two ranges of values we ended up codifying it
      using a 'struct strarrays', so now hook it up with STUL_STRARRAYS so
      that we can do:
      
        # perf trace -e syscalls:*enter_fcntl --filter=cmd==SETLK||cmd==SETLKW
           0.000 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4dee0)
           1.523 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90)
           1.629 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90)
           2.711 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de70)
        ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-mob96wyzri4r3rvyigqfjv0a@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      82c38338
    • A
      libbeauty: Introduce syscall_arg__strtoul_strarrays() · 1a8a90b8
      Arnaldo Carvalho de Melo 提交于
      To allow going from string to integer for 'struct strarrays'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-b1ia3xzcy72hv0u4m168fcd0@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1a8a90b8
    • A
      perf trace: Pass a syscall_arg to syscall_arg_fmt->strtoul() · 9afec87e
      Arnaldo Carvalho de Melo 提交于
      With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray'
      pointer to be used, just like with syscall_arg_fmt->scnprintf() for the
      other direction (number -> string).
      
      With this all the strarrays that are associated with syscalls can be
      used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be
      possible as well to use with the strace-like shorter form, with just the
      syscall names, i.e. something like:
      
         -e lseek/whence==END/
      
      For now we have to use the longer form:
      
          # perf trace -e syscalls:sys_enter_lseek
             0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
             0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
             0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
          5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
          5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
          5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
         10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
         10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
         10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
        ^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR"
             0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET)
             0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET)
             0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
             0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET)
             0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET)
        ^C#
      
      And for those curious about what are those lseek(DSO, offset, SET), well, its the loader:
      
        # perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR"
           0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
                                             __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                             _dl_map_object (/usr/lib64/ld-2.29.so)
           0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET)
                                             __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                             _dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
                                             _dl_map_object (/usr/lib64/ld-2.29.so)
           0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
                                             __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                             _dl_map_object (/usr/lib64/ld-2.29.so)
           0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET)
                                             __libc_lseek64 (/usr/lib64/ld-2.29.so)
                                             _dl_map_object_from_fd (/usr/lib64/ld-2.29.so)
                                             _dl_map_object (/usr/lib64/ld-2.29.so)
        ^C#
      
      :-)
      
      With this we can use strings in strarrays in filters, which allows us to
      reuse all these that are in place for syscalls:
      
        $ find tools/perf/trace/beauty/ -name "*.c" | xargs grep -w DEFINE_STRARRAY
        tools/perf/trace/beauty/fcntl.c:	static DEFINE_STRARRAY(fcntl_setlease, "F_");
        tools/perf/trace/beauty/mmap.c:       static DEFINE_STRARRAY(mmap_flags, "MAP_");
        tools/perf/trace/beauty/mmap.c:       static DEFINE_STRARRAY(madvise_advices, "MADV_");
        tools/perf/trace/beauty/sync_file_range.c:       static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_");
        tools/perf/trace/beauty/socket.c:	static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_");
        tools/perf/trace/beauty/mount_flags.c:	static DEFINE_STRARRAY(mount_flags, "MS_");
        tools/perf/trace/beauty/pkey_alloc.c:	static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_");
        tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_");
        tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR");
        tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_");
        tools/perf/trace/beauty/prctl.c:	static DEFINE_STRARRAY(prctl_options, "PR_");
        tools/perf/trace/beauty/prctl.c:	static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_");
        tools/perf/trace/beauty/fspick.c:       static DEFINE_STRARRAY(fspick_flags, "FSPICK_");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(ioctl_tty_cmd, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(drm_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(kvm_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(perf_ioctl_cmds, "");
        tools/perf/trace/beauty/ioctl.c:	static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, "");
        tools/perf/trace/beauty/fsmount.c:       static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_");
        tools/perf/trace/beauty/renameat.c:       static DEFINE_STRARRAY(rename_flags, "RENAME_");
        tools/perf/trace/beauty/kcmp.c:	static DEFINE_STRARRAY(kcmp_types, "KCMP_");
        tools/perf/trace/beauty/move_mount.c:       static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_");
        $
      
      Well, some, as the mmap flags are like:
      
        $ tools/perf/trace/beauty/mmap_flags.sh
        static const char *mmap_flags[] = {
        	[ilog2(0x40) + 1] = "32BIT",
        	[ilog2(0x01) + 1] = "SHARED",
        	[ilog2(0x02) + 1] = "PRIVATE",
        	[ilog2(0x10) + 1] = "FIXED",
        	[ilog2(0x20) + 1] = "ANONYMOUS",
        	[ilog2(0x008000) + 1] = "POPULATE",
        	[ilog2(0x010000) + 1] = "NONBLOCK",
        	[ilog2(0x020000) + 1] = "STACK",
        	[ilog2(0x040000) + 1] = "HUGETLB",
        	[ilog2(0x080000) + 1] = "SYNC",
        	[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
        	[ilog2(0x0100) + 1] = "GROWSDOWN",
        	[ilog2(0x0800) + 1] = "DENYWRITE",
        	[ilog2(0x1000) + 1] = "EXECUTABLE",
        	[ilog2(0x2000) + 1] = "LOCKED",
        	[ilog2(0x4000) + 1] = "NORESERVE",
        };
        $
      
      So we'll need a strarray__strtoul_flags() that will break donw the flags
      into tokens separated by '|' before doing the lookup and then go on
      reconstructing the value from, say:
      
            # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE"
      
      into:
      
            # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2|0x10|0x0800"
      
      and finally into:
      
            # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812"
      
      That is what we see if we don't use the augmented view obtained from:
      
        # perf trace -e mmap
        <SNIP>
        211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000
        <SNIP>
      
      But plain use tracefs:
      
              procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-c6mgkjt8ujnc263eld5tb7q3@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9afec87e
  5. 18 10月, 2019 7 次提交
    • A
      perf trace: Honour --max-events in processing syscalls:sys_enter_* · db25bf98
      Arnaldo Carvalho de Melo 提交于
      We were doing this only at the sys_exit syscall tracepoint, as for
      strace-like we count the pair of sys_enter and sys_exit as one event,
      but when asking specifically for a the syscalls:sys_enter_NAME
      tracepoint we need to count each of those as an event.
      
      I.e. things like:
      
        # perf trace --max-events=4 -e syscalls:sys_enter_lseek
           0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR)
           0.034 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR)
           0.051 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR)
        2307.900 sshd/30800 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libsystemd.so.0.25.0>, offset: 9032, whence: SET)
        #
      
      Were going on forever, since we only had sys_enter events.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-0ob1dky1a9ijlfrfhxyl40wr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      db25bf98
    • A
      libbeauty: Introduce syscall_arg__strtoul_strarray() · d066da97
      Arnaldo Carvalho de Melo 提交于
      To go from strarrays strings to its indexes.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-wta0qvo207z27huib2c4ijxq@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d066da97
    • A
      perf trace: Initialize evsel_trace->fmt for syscalls:sys_enter_* tracepoints · 362222f8
      Arnaldo Carvalho de Melo 提交于
      From the syscall_fmts->arg entries for formatting strace-like syscalls.
      
      This is when resolving the string "whence" on a filter expression for
      the syscalls:sys_enter_lseek:
      
        Breakpoint 3, perf_evsel__syscall_arg_fmt (evsel=0xc91ed0, arg=0x7fffffff7cd0 "whence") at builtin-trace.c:3626
        3626	{
        (gdb) n
        3628		struct syscall_arg_fmt *fmt = __evsel__syscall_arg_fmt(evsel);
        (gdb) n
        3630		if (evsel->tp_format == NULL || fmt == NULL)
        (gdb) n
        3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
        (gdb) n
        3634			if (strcmp(field->name, arg) == 0)
        (gdb) p field->name
        $3 = 0xc945e0 "__syscall_nr"
        (gdb) n
        3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
        (gdb) p *fmt
        $4 = {scnprintf = 0x0, strtoul = 0x0, mask_val = 0x0, parm = 0x0, name = 0x0, nr_entries = 0, show_zero = false}
        (gdb) n
        3634			if (strcmp(field->name, arg) == 0)
        (gdb) p field->name
        $5 = 0xc94690 "fd"
        (gdb) n
        3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
        (gdb) n
        3634			if (strcmp(field->name, arg) == 0)
        (gdb) n
        3633		for (field = evsel->tp_format->format.fields; field; field = field->next, ++fmt)
        (gdb) n
        3634			if (strcmp(field->name, arg) == 0)
        (gdb) p *fmt
        $9 = {scnprintf = 0x489be2 <syscall_arg__scnprintf_strarray>, strtoul = 0x0, mask_val = 0x0, parm = 0xa2da80 <strarray.whences>, name = 0x0,
          nr_entries = 0, show_zero = false}
        (gdb) p field->name
        $10 = 0xc947b0 "whence"
        (gdb) p fmt->parm
        $11 = (void *) 0xa2da80 <strarray.whences>
        (gdb) p *(struct strarray *)fmt->parm
        $12 = {offset = 0, nr_entries = 5, prefix = 0x724d37 "SEEK_", entries = 0xa2da40 <whences>}
        (gdb) p (struct strarray *)fmt->parm)->entries
        Junk after end of expression.
        (gdb) p ((struct strarray *)fmt->parm)->entries
        $13 = (const char **) 0xa2da40 <whences>
        (gdb) p ((struct strarray *)fmt->parm)->entries[0]
        $14 = 0x724d21 "SET"
        (gdb) p ((struct strarray *)fmt->parm)->entries[1]
        $15 = 0x724d25 "CUR"
        (gdb) p ((struct strarray *)fmt->parm)->entries[2]
        $16 = 0x724d29 "END"
        (gdb) p ((struct strarray *)fmt->parm)->entries[2]
        $17 = 0x724d29 "END"
        (gdb) p ((struct strarray *)fmt->parm)->entries[3]
        $18 = 0x724d2d "DATA"
        (gdb) p ((struct strarray *)fmt->parm)->entries[4]
        $19 = 0x724d32 "HOLE"
        (gdb)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-lc8h9jgvbnboe0g7ic8tra1y@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      362222f8
    • A
      perf trace: Introduce 'struct evsel__trace' for evsel->priv needs · 2b00bb62
      Arnaldo Carvalho de Melo 提交于
      For syscalls we need to cache the 'syscall_id' and 'ret' field offsets
      but as well have a pointer to the syscall_fmt_arg array for the fields,
      so that we can expand strings in filter expressions, so introduce
      a 'struct evsel_trace' to have in evsel->priv that allows for that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-hx8ukasuws5sz6rsar73cocv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2b00bb62
    • A
      perf trace: Hide evsel->access further, simplify code · 8b913df5
      Arnaldo Carvalho de Melo 提交于
      Next step will be to have a 'struct evsel_trace' to allow for handling
      the syscalls tracepoints via the strace-like code while reusing parts of
      that code with the other tracepoints, where we don't have things like
      the 'syscall_nr' or 'ret' ((raw_)?syscalls:sys_{enter,exit}(_SYSCALL)?)
      args that we want to cache offsets and have been using evsel->priv for
      that, while for the other tracepoints we'll have just an array of
      'struct syscall_arg_fmt' (i.e. ->scnprint() for number->string and
      ->strtoul() string->number conversions and other state those functions
      need).
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-fre21jbyoqxmmquxcho7oa0x@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8b913df5
    • A
      perf trace: Introduce accessors to trace specific evsel->priv · fecd9907
      Arnaldo Carvalho de Melo 提交于
      We're using evsel->priv in syscalls:sys_{enter,exit}_SYSCALL and in
      raw_syscalls:sys_{enter,exit} to cache the offset of the common fields,
      the multiplexor id/syscall_id in the sys_enter case and syscall_id + ret
      for sys_exit.
      
      And for the rest of the tracepoints we use it to have a syscall_arg_fmt
      array to have scnprintf/strtoul for tracepoint args.
      
      So we better clearly mark them with accessors so that we can move to
      having a 'struct evsel_trace' struct for all 'perf trace' specific
      evsel->priv usage.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-dcoyxfslg7atz821tz9aupjh@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fecd9907
    • A
      perf trace: Show error message when not finding a field used in a filter expression · 3cdc8db9
      Arnaldo Carvalho de Melo 提交于
      It was there, but as pr_debug(), make it pr_err() so that we can see it
      without -v:
      
        # trace -e syscalls:*lseek --filter="whenc==SET" sleep 1
        "whenc" not found in "syscalls:sys_enter_lseek", can't set filter "whenc==SET"
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-ly4rgm1bto8uwc2itpaixjob@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3cdc8db9
  6. 16 10月, 2019 5 次提交
    • A
      perf trace: Hook the 'vec' tracepoint argument with the x86 IRQ vectors scnprintf/strtoul · df604bfd
      Arnaldo Carvalho de Melo 提交于
      Ended up only being useful when filtering multiple irq_vectors
      tracepoints, as we end up having a tracepoint for each of the entries,
      i.e.:
      
      This will always come with the "RESCHEDULE_VECTOR" in the 'vector' arg:
      
        # perf trace --max-events 8 -e irq_vectors:reschedule*
           0.000 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
           0.004 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
           0.553 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
           0.556 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
           1.182 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
           1.185 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
           1.203 :29052/29052 irq_vectors:reschedule_entry(vector: RESCHEDULE)
           1.206 :29052/29052 irq_vectors:reschedule_exit(vector: RESCHEDULE)
        #
      
      While filtering that value will produce nothing:
      
        # perf trace --max-events 8 -e irq_vectors:reschedule* --filter="vector != RESCHEDULE"
        ^C#
      
      Maybe it'll be useful for those other tracepoints:
      
        # perf list irq_vectors:vector_*
      
        List of pre-defined events (to be used in -e):
      
          irq_vectors:vector_activate                        [Tracepoint event]
          irq_vectors:vector_alloc                           [Tracepoint event]
          irq_vectors:vector_alloc_managed                   [Tracepoint event]
          irq_vectors:vector_clear                           [Tracepoint event]
          irq_vectors:vector_config                          [Tracepoint event]
          irq_vectors:vector_deactivate                      [Tracepoint event]
          irq_vectors:vector_free_moved                      [Tracepoint event]
          irq_vectors:vector_reserve                         [Tracepoint event]
          irq_vectors:vector_reserve_managed                 [Tracepoint event]
          irq_vectors:vector_setup                           [Tracepoint event]
          irq_vectors:vector_teardown                        [Tracepoint event]
          irq_vectors:vector_update                          [Tracepoint event]
        #
      
      But since we have it done, keep it.
      
      This at least served to teach me that all those irq vectors have a entry
      and an exit tracepoint that I can then use just like with
      raw_syscalls:sys_{enter,exit}, i.e. pair them, use just a
      trace__irq_vectors_entry() + trace__irq_vectors_exit() and use the
      'vector' arg as I use the 'syscall id' one for syscalls.
      
      Then the default for 'perf trace' will include irq_vectors in addition
      to syscalls.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-wer4cwbbqub3o7sa8h1j3uzb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      df604bfd
    • A
      libbeauty: Add a strarray__scnprintf_suffix() method · 97c2a780
      Arnaldo Carvalho de Melo 提交于
      In some cases, like with x86 IRQ vectors, the common part in names is at
      the end, so a suffix, add a scnprintf function for that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-agxbj6es2ke3rehwt4gkdw23@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      97c2a780
    • A
      perf trace: Support tracepoint dynamic char arrays · c5e006cd
      Arnaldo Carvalho de Melo 提交于
      Things like:
      
        # grep __data_loc /sys/kernel/debug/tracing/events/sched/sched_process_exec/format
      	field:__data_loc char[] filename;	offset:8;	size:4;	signed:1;
        #
      
      That, at that offset (8) and with that size(8) have an integer that
      contains the real length and offset for the contents of that array.
      
      Now this works:
      
        # perf trace --max-events 1 -e sched:*exec -a
           0.000 sed/19441 sched:sched_process_exec(filename: "/usr/bin/sync", pid: 19441 (sync), old_pid: 19441 (sync))
        #
      
      As when using the libtraceevent based beautifier:
      
        # perf trace --libtraceevent --max-events 1 -e sched:*exec -a
           0.000 sync/19463 sched:sched_process_exec(filename=/usr/bin/sync pid=19463 old_pid=19463)
        #
      
      I.e. that 'filename' is implemented as a dynamic char array.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-950p0m842fe6n7sxsdwqj5i2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c5e006cd
    • A
      perf trace: Filter own pid to avoid a feedback look in 'perf trace record -a' · 7fbfe22c
      Arnaldo Carvalho de Melo 提交于
      When doing a system wide 'perf trace record' we need, just like in 'perf
      trace' live mode, to filter out perf trace's own pid, so set up a
      tracepoint filter for the raw_syscalls tracepoints right after adding
      them to the argv array that is set up to then call cmd_record().
      Reported-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-uysx5w8f2y5ndoln5cq370tv@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7fbfe22c
    • A
      perf trace: Introduce --errno-summary · b88b14db
      Arnaldo Carvalho de Melo 提交于
      To be used with -S or -s, using just this new option implies -s,
      examples:
      
        # perf trace --errno-summary sleep 1
      
         Summary of events:
      
         sleep (10793), 80 events, 93.0%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           nanosleep              1      0  1000.427  1000.427  1000.427  1000.427      0.00%
           mmap                   8      0     0.026     0.002     0.003     0.005      9.18%
           close                  5      0     0.018     0.001     0.004     0.009     48.97%
           mprotect               4      0     0.017     0.003     0.004     0.006     16.49%
           openat                 3      0     0.012     0.003     0.004     0.005      9.41%
           munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
           brk                    4      0     0.005     0.001     0.001     0.002     22.77%
           read                   4      0     0.005     0.001     0.001     0.002     22.33%
           access                 1      1     0.004     0.004     0.004     0.004      0.00%
        				ENOENT: 1
           fstat                  3      0     0.004     0.001     0.001     0.002     17.18%
           lseek                  3      0     0.003     0.001     0.001     0.001     11.62%
           arch_prctl             2      1     0.002     0.001     0.001     0.001      3.32%
        				EINVAL: 1
           execve                 1      0     0.000     0.000     0.000     0.000      0.00%
      
        #
      
      Works as well together with --failure and -S, i.e. collect the stats and
      show just the syscalls that failed:
      
        # perf trace --failure -S --errno-summary sleep 1
             0.032 arch_prctl(option: 0x3001, arg2: 0x7fffdb11b580) = -1 EINVAL (Invalid argument)
             0.045 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
      
         Summary of events:
      
         sleep (10806), 80 events, 93.0%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           nanosleep              1      0  1000.094  1000.094  1000.094  1000.094      0.00%
           mmap                   8      0     0.026     0.002     0.003     0.005      9.06%
           close                  5      0     0.018     0.001     0.004     0.010     49.58%
           mprotect               4      0     0.017     0.003     0.004     0.006     17.56%
           openat                 3      0     0.014     0.004     0.005     0.006     12.29%
           munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
           brk                    4      0     0.005     0.001     0.001     0.002     22.75%
           read                   4      0     0.005     0.001     0.001     0.002     17.19%
           access                 1      1     0.005     0.005     0.005     0.005      0.00%
        				ENOENT: 1
           fstat                  3      0     0.004     0.001     0.001     0.002     21.66%
           lseek                  3      0     0.003     0.001     0.001     0.001     11.71%
           arch_prctl             2      1     0.002     0.001     0.001     0.001      2.66%
        				EINVAL: 1
           execve                 1      0     0.000     0.000     0.000     0.000      0.00%
      
        #
      Suggested-by: NAndi Kleen <ak@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-l0mjwczkpouov7lss5zn8d9h@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b88b14db
  7. 15 10月, 2019 1 次提交
    • A
      perf trace: Add syscall failure stats to -s/--summary and -S/--with-summary · 8eded45f
      Arnaldo Carvalho de Melo 提交于
      Just like strace has:
      
        # trace -s sleep 1
      
        Summary of events:
      
        sleep (32370), 80 events, 93.0%
      
          syscall            calls  errors  total       min       avg       max       stddev
                                            (msec)    (msec)    (msec)    (msec)        (%)
          --------------- --------  ------ -------- --------- --------- ---------     ------
          nanosleep              1      0  1000.402  1000.402  1000.402  1000.402      0.00%
          mmap                   8      0     0.023     0.002     0.003     0.004      8.49%
          close                  5      0     0.015     0.001     0.003     0.009     51.39%
          mprotect               4      0     0.014     0.002     0.003     0.005     16.95%
          openat                 3      0     0.013     0.003     0.004     0.005     14.29%
          munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
          read                   4      0     0.005     0.001     0.001     0.002     16.83%
          brk                    4      0     0.004     0.001     0.001     0.002     20.82%
          access                 1      1     0.004     0.004     0.004     0.004      0.00%
          fstat                  3      0     0.003     0.001     0.001     0.001     12.17%
          lseek                  3      0     0.003     0.001     0.001     0.001     11.45%
          arch_prctl             2      1     0.002     0.001     0.001     0.001      2.30%
          execve                 1      0     0.000     0.000     0.000     0.000      0.00%
      
        #
      
        # perf trace -S sleep 1
               ?  ... [continued]: execve())             = 0
           0.028 brk(brk: NULL)                          = 0x559f5bd96000
           0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument)
           0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory)
           0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
           0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0)   = 0
           0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000
           0.066 close(fd: 3)                            = 0
           0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3
           0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832
           0.088 lseek(fd: 3, offset: 792, whence: SET)  = 792
           0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68
           0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0)   = 0
           0.095 mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f3aedfc2000
           0.101 lseek(fd: 3, offset: 792, whence: SET)  = 792
           0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68
           0.105 lseek(fd: 3, offset: 864, whence: SET)  = 864
           0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32
           0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000
           0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0
           0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000
           0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000
           0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000
           0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f3aedfbe000
           0.147 close(fd: 3)                            = 0
           0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0
           0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0
           0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0
           0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0
           0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0
           0.300 brk(brk: NULL)                          = 0x559f5bd96000
           0.302 brk(brk: 0x559f5bdb7000)                = 0x559f5bdb7000
           0.305 brk(brk: NULL)                          = 0x559f5bdb7000
           0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3
           0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0)   = 0
           0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000
           0.325 close(fd: 3)                            = 0
           0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0
        1000.622 close(fd: 1)                            = 0
        1000.641 close(fd: 2)                            = 0
        1000.664 exit_group(error_code: 0)               = ?
      
         Summary of events:
      
         sleep (722), 80 events, 93.0%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           nanosleep              1      0  1000.194  1000.194  1000.194  1000.194      0.00%
           mmap                   8      0     0.025     0.002     0.003     0.005     10.17%
           close                  5      0     0.018     0.001     0.004     0.010     50.18%
           mprotect               4      0     0.016     0.003     0.004     0.006     16.81%
           openat                 3      0     0.011     0.003     0.004     0.004      6.57%
           munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
           brk                    4      0     0.005     0.001     0.001     0.002     20.72%
           read                   4      0     0.005     0.001     0.001     0.002     16.71%
           access                 1      1     0.005     0.005     0.005     0.005      0.00%
           fstat                  3      0     0.004     0.001     0.001     0.002     14.82%
           lseek                  3      0     0.003     0.001     0.001     0.001     11.66%
           arch_prctl             2      1     0.002     0.001     0.001     0.001      3.59%
           execve                 1      0     0.000     0.000     0.000     0.000      0.00%
      
        #
      
      Works for system wide, e.g. for 1ms:
      
        # perf trace -s -a sleep 0.001
      
         Summary of events:
      
         sleep (768), 94 events, 37.9%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           nanosleep              1      0     1.133     1.133     1.133     1.133      0.00%
           execve                 7      6     0.351     0.003     0.050     0.316     88.53%
           mmap                   8      0     0.024     0.002     0.003     0.004      8.86%
           mprotect               4      0     0.017     0.003     0.004     0.006     16.02%
           openat                 3      0     0.013     0.004     0.004     0.005      8.34%
           munmap                 1      0     0.010     0.010     0.010     0.010      0.00%
           brk                    4      0     0.007     0.001     0.002     0.002     10.99%
           close                  5      0     0.005     0.001     0.001     0.002     11.69%
           read                   5      0     0.005     0.000     0.001     0.002     30.53%
           access                 1      1     0.004     0.004     0.004     0.004      0.00%
           fstat                  3      0     0.004     0.001     0.001     0.002     10.74%
           lseek                  3      0     0.003     0.001     0.001     0.001     10.20%
           arch_prctl             2      1     0.002     0.001     0.001     0.001      3.34%
      
         Web Content (21258), 46 events, 18.5%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           recvmsg               12     12     0.015     0.001     0.001     0.002      8.50%
           futex                  2      0     0.008     0.003     0.004     0.005     27.08%
           poll                   6      0     0.006     0.000     0.001     0.002     22.14%
           read                   2      0     0.006     0.002     0.003     0.003     26.08%
           write                  1      0     0.002     0.002     0.002     0.002      0.00%
      
         Web Content (4365), 36 events, 14.5%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           recvmsg               10     10     0.015     0.001     0.002     0.003     11.83%
           poll                   5      0     0.006     0.000     0.001     0.002     28.44%
           futex                  2      0     0.005     0.001     0.003     0.004     48.29%
           read                   1      0     0.003     0.003     0.003     0.003      0.00%
      
         Timer (21275), 14 events, 5.6%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           futex                  6      1     0.240     0.000     0.040     0.149     64.58%
           write                  1      0     0.008     0.008     0.008     0.008      0.00%
      
         Timer (4383), 14 events, 5.6%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           futex                  6      2     0.186     0.000     0.031     0.181     96.45%
           write                  1      0     0.010     0.010     0.010     0.010      0.00%
      
         Web Content (20354), 28 events, 11.3%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           recvmsg                8      8     0.010     0.001     0.001     0.002     15.24%
           poll                   4      0     0.004     0.000     0.001     0.002     35.68%
           futex                  1      0     0.003     0.003     0.003     0.003      0.00%
           read                   1      0     0.003     0.003     0.003     0.003      0.00%
      
         Timer (20371), 10 events, 4.0%
      
           syscall            calls  errors  total       min       avg       max       stddev
                                             (msec)    (msec)    (msec)    (msec)        (%)
           --------------- --------  ------ -------- --------- --------- ---------     ------
           futex                  4      1     0.077     0.000     0.019     0.075     95.46%
           write                  1      0     0.005     0.005     0.005     0.005      0.00%
      
        [root@quaco ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-k7kh2muo5oeg56yx446hnw9v@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8eded45f
  8. 10 10月, 2019 8 次提交
    • J
      libperf: Adopt perf_mmap__read_event() from tools/perf · 151ed5d7
      Jiri Olsa 提交于
      Move perf_mmap__read_event() from tools/perf to libperf and export it in
      the perf/mmap.h header.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191007125344.14268-13-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      151ed5d7
    • J
      libperf: Adopt perf_mmap__read_done() from tools/perf · 32fdc2ca
      Jiri Olsa 提交于
      Move perf_mmap__read_init() from tools/perf to libperf and export it in
      the perf/mmap.h header.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191007125344.14268-12-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      32fdc2ca
    • J
      libperf: Adopt perf_mmap__read_init() from tools/perf · 7c4d4182
      Jiri Olsa 提交于
      Move perf_mmap__read_init() from tools/perf to libperf and export it in
      perf/mmap.h header.
      
      And add pr_debug2()/pr_debug3() macros support, because the code is
      using them.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191007125344.14268-11-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7c4d4182
    • J
      libperf: Adopt perf_mmap__consume() function from tools/perf · 7728fa0c
      Jiri Olsa 提交于
      Move perf_mmap__consume() vrom tools/perf to libperf and export it in
      the perf/mmap.h header.
      
      Move also the needed helpers perf_mmap__write_tail(),
      perf_mmap__read_head() and perf_mmap__empty().
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191007125344.14268-10-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7728fa0c
    • A
      perf beauty: Introduce strtoul() for x86 MSRs · 728db198
      Arnaldo Carvalho de Melo 提交于
      Continuing from the previous cset comment, now that filter expression
      works:
      
        # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750
           0.000 Timer/5033 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
           0.009 Timer/5033 msr:write_msr(msr: LSTAR, val: -1398800368)
           0.010 Timer/5033 msr:write_msr(msr: TSC_AUX, val: 4)
           0.050 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
          45.661 gnome-terminal/12595 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
          45.672 gnome-terminal/12595 msr:write_msr(msr: LSTAR, val: -1398800368)
          45.675 gnome-terminal/12595 msr:write_msr(msr: TSC_AUX, val: 3)
          54.852 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         130.508 Timer/4050 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
         130.527 Timer/4050 msr:write_msr(msr: LSTAR, val: -1398800368)
         130.531 Timer/4050 msr:write_msr(msr: TSC_AUX, val: 3)
         140.924 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         164.738 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         603.578 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         620.809 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         690.115 JS Watchdog/4259 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
         690.136 JS Watchdog/4259 msr:write_msr(msr: LSTAR, val: -1398800368)
         690.141 JS Watchdog/4259 msr:write_msr(msr: TSC_AUX, val: 3)
         690.186 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
         759.016 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
      ^C[root@quaco ~]#
      
      Or look at the first 3 write_msr events for that IA32_TSC_DEADLINE to learn why
      it happens so often:
      
        # perf trace --max-events=3 --max-stack=8 -e msr:* --filter="msr==IA32_TSC_DEADLINE" --filter-pids 3750
           0.000 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296732550862)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             lapic_next_deadline ([kernel.kallsyms])
                                             clockevents_program_event ([kernel.kallsyms])
                                             hrtimer_interrupt ([kernel.kallsyms])
                                             smp_apic_timer_interrupt ([kernel.kallsyms])
                                             apic_timer_interrupt ([kernel.kallsyms])
                                             cpuidle_enter_state ([kernel.kallsyms])
          32.646 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296800134158)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             lapic_next_deadline ([kernel.kallsyms])
                                             clockevents_program_event ([kernel.kallsyms])
                                             hrtimer_start_range_ns ([kernel.kallsyms])
                                             tick_nohz_restart_sched_tick ([kernel.kallsyms])
                                             tick_nohz_idle_exit ([kernel.kallsyms])
                                             do_idle ([kernel.kallsyms])
          32.802 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19297507436922)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             lapic_next_deadline ([kernel.kallsyms])
                                             clockevents_program_event ([kernel.kallsyms])
                                             hrtimer_try_to_cancel ([kernel.kallsyms])
                                             hrtimer_cancel ([kernel.kallsyms])
                                             tick_nohz_restart_sched_tick ([kernel.kallsyms])
                                             tick_nohz_idle_exit ([kernel.kallsyms])
        #
      
      And if some of the strings can't be found:
      
        # trace -e msr:* --filter="msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750
        "SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION" not found for "msr" in "msr:read_msr", can't set filter "(msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 28131 && common_pid != 3750)"
        #
      
      Next step is to automatically wire up the pre-existing strarrays, which there
      are quite a few.
      
      The strtoul() methods will be further enhanced to allow for looking at other
      arguments in a syscall/tracepoint, just like going from integer to string
      (scnprintf methods), so that those "val" lines for the msr tracepoints can be
      properly formatted or even resolved into some string.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-4qaai5iqjgefd11k4ddm7qg8@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      728db198
    • A
      perf trace: Expand strings in filters to integers · 90df0249
      Arnaldo Carvalho de Melo 提交于
      So that one can try things like:
      
        # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750
      
      That, at this point in the patchset, without any strtoul in place for
      tracepoint arguments, will result in:
      
        No resolver (strtoul) for "msr" in "msr:read_msr", can't set filter "(msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 25407 && common_pid != 3750)"
        #
      
      See you in the next cset!
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-dx5j70fv2rgkeezd1cb3hv2p@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      90df0249
    • A
      perf trace: Introduce a strtoul() method for 'struct strarrays' · d0a3a104
      Arnaldo Carvalho de Melo 提交于
      And also for 'struct strarray', since its needed to implement
      strarrays__strtoul(). This just traverses the entries and when finding a
      match, returns (offset + index), i.e. the value associated with the
      searched string.
      
      E.g. "EFER" (MSR_EFER) returns:
      
        # grep -w EFER -B2 /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
        #define x86_64_specific_MSRs_offset 0xc0000080
        static const char *x86_64_specific_MSRs[] = {
      	[0xc0000080 - x86_64_specific_MSRs_offset] = "EFER",
        #
      
        0xc0000080
      
      This will be auto-attached to 'struct syscall_arg_fmt' entries
      associated with strarrays as soon as we add a ->strarray and ->strarrays
      to 'struct syscall_arg_fmt'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-r2hpaahf8lishyb1owko9vs1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d0a3a104
    • A
      perf trace: Add a strtoul() method to 'struct syscall_arg_fmt' · 3f41b778
      Arnaldo Carvalho de Melo 提交于
      This will go from a string to a number, so that filter expressions can
      be constructed with strings and then, before applying the tracepoint
      filters (or eBPF, in the future) we can map those strings to numbers.
      
      The first one will be for 'msr' tracepoint arguments, but real quickly
      we will be able to reuse all strarrays for that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-wgqq48agcgr95b8dmn6fygtr@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3f41b778
  9. 09 10月, 2019 3 次提交
    • A
      perf trace: Introduce --filter for tracepoint events · d4097f19
      Arnaldo Carvalho de Melo 提交于
      Similar to what is in 'perf record', works just like there:
      
        # perf trace -e msr:*
         328.297 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.302 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.306 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.317 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.322 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.327 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.331 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.336 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888)
         328.340 :0/0 ^Cmsr:write_msr(msr: FS_BASE, val: 140240388381888)
        #
      
      So, for a system wide trace session looking at the write_msr tracepoint
      we see a flood of MSR_FS_BASE, we need to get the number for that:
      
        # grep FS_BASE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
      	[0xc0000100 - x86_64_specific_MSRs_offset] = "FS_BASE",
        #
      
      And then use it in a filter:
      
        # perf trace -e msr:* --filter="msr!=0xc0000100"
        <SNIP>
         942.177 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068232)
         942.199 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3057135655252)
         942.203 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068222)
         942.231 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056998373022)
         942.241 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068236)
        <SNIP>
        #
      
      Ok, lets filter that too, too noisy:
      
        # grep TSC_DEADLINE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
      	[0x000006E0] = "IA32_TSC_DEADLINE",
        #
      
        # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" -a sleep 0.1
           0.000 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
           0.066 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
           0.070 CPU 0/KVM/4895 msr:write_msr(msr: 0x830, val: 34359740667)
           0.099 CPU 0/KVM/4895 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199021993472)
           0.100 CPU 0/KVM/4895 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
           0.101 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
           0.109 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
           1.000 :0/0 msr:write_msr(msr: 0x830, val: 17179871485)
          18.893 :0/0 msr:write_msr(msr: 0x83f, val: 246)
          28.810 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
          40.117 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          40.127 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR)
          40.139 CPU 0/KVM/4895 msr:write_msr(msr: LSTAR, val: -2130661312)
          40.141 CPU 0/KVM/4895 msr:write_msr(msr: SYSCALL_MASK, val: 14080)
          40.142 CPU 0/KVM/4895 msr:write_msr(msr: TSC_AUX)
          40.144 CPU 0/KVM/4895 msr:write_msr(msr: KERNEL_GS_BASE)
          40.147 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL)
          40.148 CPU 0/KVM/4895 msr:write_msr(msr: IA32_FLUSH_CMD, val: 1)
          40.151 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
        ^C
        #
      
      One can combine that with filtering pids as well:
      
        # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" --filter-pids 4895 -a sleep 0.09
           0.000 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
           0.291 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
           0.294 gnome-terminal/2790 msr:write_msr(msr: LSTAR, val: -1935671280)
           0.295 gnome-terminal/2790 msr:write_msr(msr: TSC_AUX, val: 6)
          10.940 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          15.943 gnome-shell/2096 msr:write_msr(msr: 0x830, val: 4294969597)
          16.975 :0/0 msr:write_msr(msr: 0x830, val: 4294969597)
          19.560 :0/0 msr:write_msr(msr: 0x83f, val: 246)
          25.162 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
          25.807 JS Watchdog/3635 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          25.820 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
          25.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          26.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          29.942 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          45.313 :0/0 msr:write_msr(msr: 0x83f, val: 246)
          56.945 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          60.946 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597)
          74.096 JS Watchdog/8971 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
          74.130 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL)
          79.673 :0/0 msr:write_msr(msr: 0x83f, val: 246)
          79.947 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 17179871485)
        #
      
      Or for just a pid, with callchains:
      
        # grep SYSCALL_MAS /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c
      	[0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK",
        # perf trace -e msr:* --filter="msr==0xc0000084" --pid 2790 --call-graph=dwarf
      
           0.000 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             kvm_on_user_return ([kvm])
                                             fire_user_return_notifiers ([kernel.kallsyms])
                                             exit_to_usermode_loop ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             entry_SYSCALL_64 ([kernel.kallsyms])
                                             __GI___poll (inlined)
        9299.073 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             kvm_on_user_return ([kvm])
                                             fire_user_return_notifiers ([kernel.kallsyms])
                                             exit_to_usermode_loop ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             entry_SYSCALL_64 ([kernel.kallsyms])
                                             __GI___poll (inlined)
        9348.374 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             kvm_on_user_return ([kvm])
                                             fire_user_return_notifiers ([kernel.kallsyms])
                                             exit_to_usermode_loop ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             entry_SYSCALL_64 ([kernel.kallsyms])
                                             __GI___poll (inlined)
        <SNIP>
        #
      
      Ok, just another form of KVM to emit MSRs :-)
      
      Next step: elliminate those greps by getting the filter expression,
      looking for arg names, then for the arrays associated with it to do a
      reverse lookup.
      
      Also allow those filters to be associated with strace-like syscall
      names.
      
      After that: augment the 'val' arg for 'msr:write_msr' based on the first
      arg, 'msr'.
      
      Then, do that with eBPF too, not just with tracepoint filters.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-95bfe5d4tzy5f66bx49d05rj@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d4097f19
    • A
      perf trace: Associate the "msr" tracepoint arg name with x86_MSR__scnprintf() · c330ef28
      Arnaldo Carvalho de Melo 提交于
      So that we can go from:
      
        # perf trace -e msr:write_msr --max-stack=16 sleep 1
             0.000 sleep/6740 msr:write_msr(msr: 3221225728, val: 139636317451648)
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_trace_write_msr ([kernel.kallsyms])
                                               do_arch_prctl_64 ([kernel.kallsyms])
                                               __x64_sys_arch_prctl ([kernel.kallsyms])
                                               do_syscall_64 ([kernel.kallsyms])
                                               entry_SYSCALL_64 ([kernel.kallsyms])
                                               init_tls (/usr/lib64/ld-2.29.so)
                                               dl_main (/usr/lib64/ld-2.29.so)
                                               _dl_sysdep_start (/usr/lib64/ld-2.29.so)
                                               _dl_start (/usr/lib64/ld-2.29.so)
        #
      
      To:
      
        # perf trace -e msr:write_msr --max-stack=16 sleep 1
           0.000 sleep/8519 msr:write_msr(msr: FS_BASE, val: 139878031705472)
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_trace_write_msr ([kernel.kallsyms])
                                             do_arch_prctl_64 ([kernel.kallsyms])
                                             __x64_sys_arch_prctl ([kernel.kallsyms])
                                             do_syscall_64 ([kernel.kallsyms])
                                             entry_SYSCALL_64 ([kernel.kallsyms])
                                             init_tls (/usr/lib64/ld-2.29.so)
                                             dl_main (/usr/lib64/ld-2.29.so)
                                             _dl_sysdep_start (/usr/lib64/ld-2.29.so)
                                             _dl_start (/usr/lib64/ld-2.29.so)
        #
      
      This, in reverse, will allow for symbolic system call/tracepoint
      filtering.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-q1q4unmqja5ex7dy0kb5cjaa@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c330ef28
    • A
      perf trace: Allow associating scnprintf routines with well known arg names · 5d88099b
      Arnaldo Carvalho de Melo 提交于
      For instance 'msr' appears in several tracepoints, so we can associate
      it with a single scnprintf() routine auto-generated from kernel headers,
      as will be done in followup patches.
      
      Start with an empty array of associations.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-89ptht6s5fez82lykuwq1eyb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5d88099b
  10. 07 10月, 2019 1 次提交
    • A
      perf trace: Allow choosing how to augment the tracepoint arguments · f11b2803
      Arnaldo Carvalho de Melo 提交于
      So far we used the libtraceevent printing routines when showing
      tracepoint arguments, but since 'perf trace' has a lot of beautifiers
      for syscall arguments, and since some of those can be used to augment
      tracepoint arguments, add a routine to make use of those beautifiers
      and allow the user to choose which one to use.
      
      The default now is to use the same beautifiers used for the strace-like
      sys_enter+sys_exit lines, but the user can choose the libtraceevent ones
      by either using the:
      
          perf trace --libtraceevent_print
      
      command line option, or by setting:
      
        # cat ~/.perfconfig
        [trace]
      	tracepoint_beautifiers = libtraceevent
      
      For instance, here are some examples:
      
        # perf trace -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
             0.000 sched:sched_wakeup(comm: "perf", pid: 5273 (perf), prio: 120, success: 1, target_cpu: 6)
             0.621 nanosleep(rqtp: 0x7ffdd06d1140, rmtp: NULL) ...
             0.628 sched:sched_switch(prev_comm: "sleep", prev_pid: 5273 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/6", next_pid: 0, next_prio: 120)
          1000.879 sched:sched_wakeup(comm: "sleep", pid: 5273 (sleep), prio: 120, success: 1, target_cpu: 6)
             0.621  ... [continued]: nanosleep())          = 0
          1001.026 exit_group(error_code: 0)               = ?
          1001.216 sched:sched_process_exit(comm: "sleep", pid: 5273 (sleep), prio: 120)
        #
      
      And then using libtraceevent, as before:
      
        # perf trace --libtraceevent_print -e sched:*switch,*sleep,sched:*wakeup,exit*,sched:*exit sleep 1
             0.000 sched:sched_wakeup(comm=perf pid=5288 prio=120 target_cpu=001)
             0.739 nanosleep(rqtp: 0x7ffeba6c2f40, rmtp: NULL) ...
             0.747 sched:sched_switch(prev_comm=sleep prev_pid=5288 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120)
          1000.902 sched:sched_wakeup(comm=sleep pid=5288 prio=120 target_cpu=001)
             0.739  ... [continued]: nanosleep())          = 0
          1001.012 exit_group(error_code: 0)               = ?
        #
      
      The new default allocates an array of 'struct syscall_arg_fmt' for the
      tracepoint arguments and, just like with syscall arguments, tries to
      find suitable syscall_arg__scnprintf_NAME() routines to augment those
      tracepoint arguments based on their type (as in the tracefs "format"
      file), or even in their name + type, for instance arguntents with names
      ending in "fd" with type "int" get the fd scnprintf beautifier attached,
      etc.
      
      Soon this will take advantage of the kernel BTF information to augment
      enumerations based on the tracefs "format" type info.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-o8qdluotkcb3b1x2gjqrejcl@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f11b2803