1. 02 4月, 2019 13 次提交
    • A
      tools build: Implement libzstd feature check, LIBZSTD_DIR and NO_LIBZSTD defines · 3b1c5d96
      Alexey Budankov 提交于
      Implement libzstd feature check, NO_LIBZSTD and LIBZSTD_DIR defines to
      override Zstd library sources or disable the feature from the command
      line:
      
        $ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all
        $ make -C tools/perf NO_LIBZSTD=1 clean all
      
      Auto detection feature status is reported just before compilation
      starts.  If your system has some version of the zstd library
      preinstalled then the build system finds and uses it during the build.
      
      If you still prefer to compile with some other version of zstd library
      you have capability to refer the compilation to that version using
      LIBZSTD_DIR define.
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/9b4cd8b0-10a3-1f1e-8d6b-5922a7ca216b@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3b1c5d96
    • T
      perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep" · 69769ce1
      Tzvetomir Stoyanov 提交于
      The member "pevent" of the struct tep_event is renamed to "tep". This
      makes the struct consistent with the chosen naming convention:
      
        tep (trace event parser), instead of the old pevent.
      Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-3-tstoyanov@vmware.com
      Link: http://lkml.kernel.org/r/20190401164344.627724996@goodmis.orgSigned-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      69769ce1
    • T
      tools tools, tools lib traceevent: Make traceevent APIs more consistent · 55c34ae0
      Tzvetomir Stoyanov 提交于
      Rename some traceevent APIs for consistency:
      
      tep_pid_is_registered() to tep_is_pid_registered()
      tep_file_bigendian() to tep_is_file_bigendian()
      
        to make the names and return values consistent with other tep_is_... APIs
      
      tep_data_lat_fmt() to tep_data_latency_format()
      
        to make the name more descriptive
      
      tep_host_bigendian() to tep_is_bigendian()
      tep_set_host_bigendian() to tep_set_local_bigendian()
      tep_is_host_bigendian() to tep_is_local_bigendian()
      
        "host" can be confused with VMs, and "local" is about the local
        machine. All tep_is_..._bigendian(struct tep_handle *tep) APIs return
        the saved data in the tep handle, while tep_is_bigendian() returns
        the running machine's endianness.
      
      All tep_is_... functions are modified to return bool value, instead of int.
      Signed-off-by: NTzvetomir Stoyanov <tstoyanov@vmware.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/20190327141946.4353-2-tstoyanov@vmware.com
      Link: http://lkml.kernel.org/r/20190401164344.288624897@goodmis.org
      [ Removed some extra parenthesis around return statements ]
      Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      55c34ae0
    • A
      perf list: Output tool events · 5e0861ba
      Andi Kleen 提交于
      Add support in 'perf list' to output tool internal events, currently
      only 'duration_time'.
      
      Committer testing:
      
        $ perf list dur*
      
        List of pre-defined events (to be used in -e):
      
          duration_time                                      [Tool event]
      
        Metric Groups:
      
        $ perf list sw
      
        List of pre-defined events (to be used in -e):
      
          alignment-faults                                   [Software event]
          bpf-output                                         [Software event]
          context-switches OR cs                             [Software event]
          cpu-clock                                          [Software event]
          cpu-migrations OR migrations                       [Software event]
          dummy                                              [Software event]
          emulation-faults                                   [Software event]
          major-faults                                       [Software event]
          minor-faults                                       [Software event]
          page-faults OR faults                              [Software event]
          task-clock                                         [Software event]
      
          duration_time                                      [Tool event]
      
        $ perf list | grep duration
          duration_time                                      [Tool event]
               [L1D miss outstandings duration in cycles]
                page walk duration are excluded in Skylake]
                load. EPT page walk duration are excluded in Skylake]
                page walk duration are excluded in Skylake]
                store. EPT page walk duration are excluded in Skylake]
                (instruction fetch) request. EPT page walk duration are excluded in
                instruction fetch request. EPT page walk duration are excluded in
        $
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190326221823.11518-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      5e0861ba
    • A
      perf evsel: Support printing evsel name for 'duration_time' · 3371f389
      Andi Kleen 提交于
      Implement printing the correct name for duration_time
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190326221823.11518-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3371f389
    • A
      perf stat: Implement duration_time as a proper event · f0fbb114
      Andi Kleen 提交于
      The perf metric expression use 'duration_time' internally to normalize
      events.  Normal 'perf stat' without -x also prints the duration time.
      But when using -x, the interval is not output anywhere, which is
      inconvenient for any post processing which often wants to normalize
      values to time.
      
      So implement 'duration_time' as a proper perf event that can be
      specified explicitely with -e.
      
      The previous implementation of 'duration_time' only worked for metric
      processing. This adds the concept of a tool event that is handled by the
      tool. On the kernel level it is still mapped to the dummy software
      event, but the values are not read anymore, but instead computed by the
      tool.
      
      Add proper plumbing to handle this in the event parser, and display it
      in 'perf stat'. We don't want 'duration_time' to be added up, so it's
      only printed for the first CPU.
      
      % perf stat -e duration_time,cycles true
      
       Performance counter stats for 'true':
      
                 555,476 ns   duration_time
                 771,958      cycles
      
             0.000555476 seconds time elapsed
      
             0.000644000 seconds user
             0.000000000 seconds sys
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190326221823.11518-3-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f0fbb114
    • A
      perf stat: Revert checks for duration_time · c2b3c170
      Andi Kleen 提交于
      This reverts e864c5ca ("perf stat: Hide internal duration_time
      counter") but doing it manually since the code has now moved to a
      different file.
      
      The next patch will properly implement duration_time as a full event, so
      no need to hide it anymore.
      Signed-off-by: NAndi Kleen <ak@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190326221823.11518-2-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c2b3c170
    • T
      perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITES · 7fcfa9a2
      Thomas Richter 提交于
      Command
      
        # perf list --long-desc pmu
      
      lists the long description of the available counters. For counter
      named L1D_RO_EXCL_WRITES on machine types 3906 and 3907 the long
      description contains the counter number 'Counter:128 Name:'
      prefix. This is wrong.
      
      The fix changes the description text and removes this prefix.
      
      Output before:
      
        [root@m35lp76 perf]# ./perf list --long-desc pmu
         ...
         L1D_ONDRAWER_L4_SOURCED_WRITES
          [A directory write to the Level-1 Data cache directory where the
           returned cache line was sourced from On-Drawer Level-4 cache]
      
         L1D_RO_EXCL_WRITES
          [Counter:128 Name:L1D_RO_EXCL_WRITES A directory write to the Level-1
           Data cache where the line was originally in a Read-Only state in the
           cache but has been updated to be in the Exclusive state that allows
           stores to the cache line]
      
         ...
      
      Output after:
      
        [root@m35lp76 perf]# ./perf list --long-desc pmu
         ...
         L1D_ONDRAWER_L4_SOURCED_WRITES
          [A directory write to the Level-1 Data cache directory where the
           returned cache line was sourced from On-Drawer Level-4 cache]
      
         L1D_RO_EXCL_WRITES
          [L1D_RO_EXCL_WRITES A directory write to the Level-1
           Data cache where the line was originally in a Read-Only state in the
           cache but has been updated to be in the Exclusive state that allows
           stores to the cache line]
      
         ...
      Signed-off-by: NThomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Fixes: 109d59b9 ("perf vendor events s390: Add JSON files for IBM z14")
      Link: http://lkml.kernel.org/r/20190329133337.60255-1-tmricht@linux.ibm.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7fcfa9a2
    • A
      perf tools: Add header defining used namespace struct to event.h · 514c5403
      Arnaldo Carvalho de Melo 提交于
      When adding the 'struct namespaces_event' to event.h, referencing the
      'struct perf_ns_link_info' type, we forgot to add the header where it is
      defined, getting that definition only by sheer luck.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: f3b3614a ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
      Link: https://lkml.kernel.org/n/tip-qkrld0v7boc9uabjbd8csxux@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      514c5403
    • A
      perf trace beauty renameat: No need to include linux/fs.h · b64f1cc6
      Arnaldo Carvalho de Melo 提交于
      There is no use for what is in that file, as everything is
      built by the tools/perf/trace/beauty/rename_flags.sh script from
      the copied kernel headers, the end result being:
      
        $ cat /tmp/build/perf/trace/beauty/generated/rename_flags_array.c
        static const char *rename_flags[] = {
      	[0 + 1] = "NOREPLACE",
      	[1 + 1] = "EXCHANGE",
      	[2 + 1] = "WHITEOUT",
        };
        $
      
      I.e. no use of any defines from uapi/linux/fs.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-lgugmfa8z4bpw5zsbuoitllb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b64f1cc6
    • A
      perf augmented_raw_syscalls: Use a PERCPU_ARRAY map to copy more string bytes · 59f3bd78
      Arnaldo Carvalho de Melo 提交于
      The previous method, copying to the BPF stack limited us in how many
      bytes we could copy from strings, use a PERCPU_ARRAY map like devised by
      the sysdig guys[1] to copy more bytes:
      
      Before:
      
        # trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"`
        touch: cannot touch 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa': File name too long
        openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOCK|O_WRONLY, S_IRUGO|S_IWUGO) = -1 ENAMETOOLONG (File name too long)
        openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
        <SNIP some openat calls>
        #
      
      After:
      
        [root@quaco acme]# trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"`
        <STRIP what is the same as in the 'before' part>
        openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOC) = -1 ENAMETOOLONG (File name too long)
        <STRIP what is the same as in the 'before' part>
      
      If we leave something like 'perf trace -e string' to trace all syscalls
      with a string, and then do some 'perf top', to get some annotation for
      the augmented_raw_syscalls.o BPF program we get:
      
             │     → callq  *ffffffffc45576d1                                                                                                          ▒
             │                augmented_args->filename.size = probe_read_str(&augmented_args->filename.value,                                          ▒
        0.05 │       mov    %eax,0x40(%r13)
      
      Looking with pahole, expanding types, asking for hex offsets and sizes,
      and use of BTF type information to see what is at that 0x40 offset from
      %r13:
      
        # pahole -F btf -C augmented_args_filename --expand_types --hex /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        struct augmented_args_filename {
      	struct syscall_enter_args {
      		long long unsigned int common_tp_fields;                                 /*     0   0x8 */
      		long int           syscall_nr;                                           /*   0x8   0x8 */
      		long unsigned int  args[6];                                              /*  0x10  0x30 */
      	} args; /*     0  0x40 */
      	/* --- cacheline 1 boundary (64 bytes) --- */
      	struct augmented_filename {
      		unsigned int       size;                                                 /*  0x40   0x4 */
      		int                reserved;                                             /*  0x44   0x4 */
      		char               value[4096];                                          /*  0x48 0x1000 */
      	} filename; /*  0x40 0x1008 */
      
      	/* size: 4168, cachelines: 66, members: 2 */
      	/* last cacheline: 8 bytes */
        };
        #
      
      Then looking if PATH_MAX leaves some signature in the tests:
      
             │                if (augmented_args->filename.size < sizeof(augmented_args->filename.value)) {                                            ▒
             │       cmp    $0xfff,%rdi
      
      0xfff == 4095
      sizeof(augmented_args->filename.value) == PATH_MAX == 4096
      
      [1] https://sysdig.com/blog/the-art-of-writing-ebpf-programs-a-primer/
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Daniel Borkmann <borkmann@iogearbox.net>
      Cc: Gianluca Borello <g.borello@gmail.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      cc: Martin Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-76gce2d2ghzq537ubwhjkone@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      59f3bd78
    • A
      perf augmented_raw_syscalls: Copy strings from all syscalls with 1st or 2nd string arg · c52a82f7
      Arnaldo Carvalho de Melo 提交于
      Gets the augmented_raw_syscalls a bit more useful as-is, add a comment
      stating that the intent is to have all this in a map populated by
      userspace via the 'syscalls' BPF map, that right now has only a flag
      stating if the syscall is filtered or not.
      
      With it:
      
        # grep -B1 augmented_raw ~/.perfconfig
        [trace]
      	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
        #
        # perf trace -e string
        weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10)  = 0
        gnome-shell/1943 openat(AT_FDCWD, "/proc/self/stat", O_RDONLY) = 81
        weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10)  = 0
        gmain/2475 inotify_add_watch(20<anon_inode:inotify>, "/home/acme/.config/firewall", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/cache/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/xmls", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/xmls", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/home/acme/.local/share/app-info/yaml", 16789454) = -1 ENOENT (No such file or directory)
        gmain/1121 inotify_add_watch(12<anon_inode:inotify>, "/etc/NetworkManager/VPN", 16789454) = -1 ENOENT (No such file or directory)
        weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10)  = 0
        gmain/2050 inotify_add_watch(8<anon_inode:inotify>, "/home/acme/~", 16789454) = -1 ENOENT (No such file or directory)
        gmain/2521 inotify_add_watch(6<anon_inode:inotify>, "/var/lib/fwupd/remotes.d/lvfs-testing", 16789454) = -1 ENOENT (No such file or directory)
        weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10)  = 0
        DOM Worker/22714  ... [continued]: openat())             = 257
        FS Broker 3982/3990 openat(AT_FDCWD, "/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY) = 187
        DOMCacheThread/16652 mkdir("/home/acme/.mozilla/firefox/ina67tev.default/storage/default/https+++web.whatsapp.com/cache/morgue/192", S_IRUGO|S_IXUGO|S_IWUSR) = -1 EEXIST (File exists)
        ^C#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-a1hxffoy8t43e0wq6bzhp23u@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c52a82f7
    • A
      perf trace: Add 'string' event alias to select syscalls with string args · 2b64b2ed
      Arnaldo Carvalho de Melo 提交于
      Will be used in conjunction with the change to augmented_raw_syscalls.c
      in the next cset that adds all syscalls with a first or second arg
      string.
      
      With just what we have in the syscall tracepoints we get:
      
        # perf trace -e string ls > /dev/null
               ? (         ): ls/22382  ... [continued]: execve())                                           = 0
           0.043 ( 0.004 ms): ls/22382 access(filename: 0x51ad420, mode: R)                                  = -1 ENOENT (No such file or directory)
           0.051 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51aa8b3, flags: RDONLY|CLOEXEC)          = 3
           0.071 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51b4d00, flags: RDONLY|CLOEXEC)          = 3
           0.138 ( 0.009 ms): ls/22382 openat(dfd: CWD, filename: 0x51684d0, flags: RDONLY|CLOEXEC)          = 3
           0.192 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51689c0, flags: RDONLY|CLOEXEC)          = 3
           0.255 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x5168eb0, flags: RDONLY|CLOEXEC)          = 3
           0.342 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x51693a0, flags: RDONLY|CLOEXEC)          = 3
           0.380 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x5169950, flags: RDONLY|CLOEXEC)          = 3
           0.670 ( 0.011 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75b70)                      = 0
           0.683 ( 0.005 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75a60)                      = 0
           0.725 ( 0.004 ms): ls/22382 access(filename: 0x515c7ab)                                           = 0
           0.744 ( 0.005 ms): ls/22382 openat(dfd: CWD, filename: 0x50fba20, flags: RDONLY|CLOEXEC)          = 3
           0.793 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x9e3e8390, flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3
           0.921 ( 0.006 ms): ls/22382 openat(dfd: CWD, filename: 0x50f7d90)                                 = 3
        #
      
      If we put the vfs_getname probe point in place:
      
        # perf probe 'vfs_getname=getname_flags:73 pathname=result->name:string'
        Added new events:
          probe:vfs_getname    (on getname_flags:73 with pathname=result->name:string)
          probe:vfs_getname_1  (on getname_flags:73 with pathname=result->name:string)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe:vfs_getname_1 -aR sleep 1
      
        # perf trace -e string ls > /dev/null
               ? (         ): ls/22440  ... [continued]: execve())                                           = 0
           0.048 ( 0.008 ms): ls/22440 access(filename: /etc/ld.so.preload, mode: R)                         = -1 ENOENT (No such file or directory)
           0.061 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: RDONLY|CLOEXEC)   = 3
           0.092 ( 0.008 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libselinux.so.1, flags: RDONLY|CLOEXEC) = 3
           0.165 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: RDONLY|CLOEXEC) = 3
           0.216 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: RDONLY|CLOEXEC)   = 3
           0.282 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpcre2-8.so.0, flags: RDONLY|CLOEXEC) = 3
           0.340 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: RDONLY|CLOEXEC)  = 3
           0.383 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: RDONLY|CLOEXEC) = 3
           0.697 ( 0.021 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc9010)                = 0
           0.720 ( 0.007 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc8f00)                = 0
           0.757 ( 0.007 ms): ls/22440 access(filename: /etc/selinux/config)                                 = 0
           0.779 ( 0.009 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: RDONLY|CLOEXEC) = 3
           0.830 ( 0.006 ms): ls/22440 openat(dfd: CWD, filename: ., flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3
           0.958 ( 0.010 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib64/gconv/gconv-modules.cache)      = 3
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-6fh1myvn7ulf4xwq9iz3o776@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2b64b2ed
  2. 29 3月, 2019 9 次提交
    • K
      perf pmu: Fix parser error for uncore event alias · e94d6b7f
      Kan Liang 提交于
      Perf fails to parse uncore event alias, for example:
      
        # perf stat -e unc_m_clockticks -a --no-merge sleep 1
        event syntax error: 'unc_m_clockticks'
                             \___ parser error
      
      Current code assumes that the event alias is from one specific PMU.
      
      To find the PMU, perf strcmps the PMU name of event alias with the real
      PMU name on the system.
      
      However, the uncore event alias may be from multiple PMUs with common
      prefix. The PMU name of uncore event alias is the common prefix.
      
      For example, UNC_M_CLOCKTICKS is clock event for iMC, which include 6
      PMUs with the same prefix "uncore_imc" on a skylake server.
      
      The real PMU names on the system for iMC are uncore_imc_0 ...
      uncore_imc_5.
      
      The strncmp is used to only check the common prefix for uncore event
      alias.
      
      With the patch:
      
        # perf stat -e unc_m_clockticks -a --no-merge sleep 1
        Performance counter stats for 'system wide':
      
             723,594,722      unc_m_clockticks [uncore_imc_5]
             724,001,954      unc_m_clockticks [uncore_imc_3]
             724,042,655      unc_m_clockticks [uncore_imc_1]
             724,161,001      unc_m_clockticks [uncore_imc_4]
             724,293,713      unc_m_clockticks [uncore_imc_2]
             724,340,901      unc_m_clockticks [uncore_imc_0]
      
             1.002090060 seconds time elapsed
      Signed-off-by: NKan Liang <kan.liang@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: stable@vger.kernel.org
      Fixes: ea1fa48c ("perf stat: Handle different PMU names with common prefix")
      Link: http://lkml.kernel.org/r/1552672814-156173-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e94d6b7f
    • A
      perf scripts python: exported-sql-viewer.py: Fix python3 support · 606bd60a
      Adrian Hunter 提交于
      Unlike python2, python3 strings are not compatible with byte strings.
      That results in disassembly not working for the branches reports. Fixup
      those places overlooked in the port to python3.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: beda0e72 ("perf script python: Add Python3 support to exported-sql-viewer.py")
      Link: http://lkml.kernel.org/r/20190327072826.19168-3-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      606bd60a
    • A
      perf scripts python: exported-sql-viewer.py: Fix never-ending loop · 8453c936
      Adrian Hunter 提交于
      pyside version 1 fails to handle python3 large integers in some cases,
      resulting in Qt getting into a never-ending loop. This affects:
      	samples Table
      	samples_view Table
      	All branches Report
      	Selected branches Report
      
      Add workarounds for those cases.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Fixes: beda0e72 ("perf script python: Add Python3 support to exported-sql-viewer.py")
      Link: http://lkml.kernel.org/r/20190327072826.19168-2-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8453c936
    • W
      perf machine: Update kernel map address and re-order properly · 977c7a6d
      Wei Li 提交于
      Since commit 1fb87b8e ("perf machine: Don't search for active kernel
      start in __machine__create_kernel_maps"), the __machine__create_kernel_maps()
      just create a map what start and end are both zero. Though the address will be
      updated later, the order of map in the rbtree may be incorrect.
      
      The commit ee05d217 ("perf machine: Set main kernel end address properly")
      fixed the logic in machine__create_kernel_maps(), but it's still wrong in
      function machine__process_kernel_mmap_event().
      
      To reproduce this issue, we need an environment which the module address
      is before the kernel text segment. I tested it on an aarch64 machine with
      kernel 4.19.25:
      
        [root@localhost hulk]# grep _stext /proc/kallsyms
        ffff000008081000 T _stext
        [root@localhost hulk]# grep _etext /proc/kallsyms
        ffff000009780000 R _etext
        [root@localhost hulk]# tail /proc/modules
        hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000
        nvme_core 126976 7 nvme, Live 0xffff0000018b6000
        mdio 20480 1 ixgbe, Live 0xffff0000018ab000
        hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000
        hns_mdio 20480 2 - Live 0xffff000001822000
        hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000
        dm_mirror 40960 0 - Live 0xffff000001804000
        dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000
        dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000
        dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000
        [root@localhost hulk]#
      
      Before fix:
      
        [root@localhost bin]# perf record sleep 3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
        [root@localhost bin]# perf buildid-list -i perf.data
        4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso]
        [root@localhost bin]# perf buildid-list -i perf.data -H
        0000000000000000000000000000000000000000 /proc/kcore
        [root@localhost bin]#
      
      After fix:
      
        [root@localhost tools]# ./perf/perf record sleep 3
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ]
        [root@localhost tools]# ./perf/perf buildid-list -i perf.data
        28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms]
        106c14ce6e4acea3453e484dc604d66666f08a2f [vdso]
        [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H
        28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore
      Signed-off-by: NWei Li <liwei391@huawei.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Li Bin <huawei.libin@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      977c7a6d
    • A
      tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd · 8142bd82
      Arnaldo Carvalho de Melo 提交于
      To pick up the changes introduced in the following csets:
      
        2b188cc1 ("Add io_uring IO interface")
        edafccee ("io_uring: add support for pre-mapped user IO buffers")
        3eb39f47 ("signal: add pidfd_send_signal() syscall")
      
      This makes 'perf trace' to become aware of these new syscalls, so that
      one can use them like 'perf trace -e ui_uring*,*signal' to do a system
      wide strace-like session looking at those syscalls, for instance.
      
      For example:
      
        # perf trace -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla
      
         Summary of events:
      
         io_uring-cp (383), 1208866 events, 100.0%
      
           syscall         calls   total    min     avg     max   stddev
                                   (msec) (msec)  (msec)  (msec)     (%)
           -------------- ------ -------- ------ ------- -------  ------
           io_uring_enter 605780 2955.615  0.000   0.005  33.804   1.94%
           openat              4  459.446  0.004 114.861 459.435 100.00%
           munmap              4    0.073  0.009   0.018   0.042  44.03%
           mmap               10    0.054  0.002   0.005   0.026  43.24%
           brk                28    0.038  0.001   0.001   0.003   7.51%
           io_uring_setup      1    0.030  0.030   0.030   0.030   0.00%
           mprotect            4    0.014  0.002   0.004   0.005  14.32%
           close               5    0.012  0.001   0.002   0.004  28.87%
           fstat               3    0.006  0.001   0.002   0.003  35.83%
           read                4    0.004  0.001   0.001   0.002  13.58%
           access              1    0.003  0.003   0.003   0.003   0.00%
           lseek               3    0.002  0.001   0.001   0.001   9.00%
           arch_prctl          2    0.002  0.001   0.001   0.001   0.69%
           execve              1    0.000  0.000   0.000   0.000   0.00%
        #
        # perf trace -e io_uring* -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla
      
         Summary of events:
      
         io_uring-cp (390), 1191250 events, 100.0%
      
           syscall         calls   total    min    avg    max  stddev
                                   (msec) (msec) (msec) (msec)    (%)
           -------------- ------ -------- ------ ------ ------ ------
           io_uring_enter 597093 2706.060  0.001  0.005 14.761  1.10%
           io_uring_setup      1    0.038  0.038  0.038  0.038  0.00%
        #
      
      More work needed to make the tools/perf/examples/bpf/augmented_raw_syscalls.c
      BPF program to copy the 'struct io_uring_params' arguments to perf's ring
      buffer so that 'perf trace' can use the BTF info put in place by pahole's
      conversion of the kernel DWARF and then auto-beautify those arguments.
      
      This patch produces the expected change in the generated syscalls table
      for x86_64:
      
        --- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before	2019-03-26 13:37:46.679057774 -0300
        +++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c	2019-03-26 13:38:12.755990383 -0300
        @@ -334,5 +334,9 @@ static const char *syscalltbl_x86_64[] =
         	[332] = "statx",
         	[333] = "io_pgetevents",
         	[334] = "rseq",
        +	[424] = "pidfd_send_signal",
        +	[425] = "io_uring_setup",
        +	[426] = "io_uring_enter",
        +	[427] = "io_uring_register",
         };
        -#define SYSCALLTBL_x86_64_MAX_ID 334
        +#define SYSCALLTBL_x86_64_MAX_ID 427
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Christian Brauner <christian@brauner.io>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/n/tip-p0ars3otuc52x5iznf21shhw@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8142bd82
    • A
      tools headers uapi: Sync asm-generic/mman-common.h and linux/mman.h · be709d48
      Arnaldo Carvalho de Melo 提交于
      To deal with the move of some defines from asm-generic/mmap-common.h to
      linux/mman.h done in:
      
        746c9398 ("arch: move common mmap flags to linux/mman.h")
      
      The generated mmap_flags array stays the same:
      
        $ tools/perf/trace/beauty/mmap_flags.sh
        static const char *mmap_flags[] = {
      	[ilog2(0x40) + 1] = "32BIT",
      	[ilog2(0x01) + 1] = "SHARED",
      	[ilog2(0x02) + 1] = "PRIVATE",
      	[ilog2(0x10) + 1] = "FIXED",
      	[ilog2(0x20) + 1] = "ANONYMOUS",
      	[ilog2(0x100000) + 1] = "FIXED_NOREPLACE",
      	[ilog2(0x0100) + 1] = "GROWSDOWN",
      	[ilog2(0x0800) + 1] = "DENYWRITE",
      	[ilog2(0x1000) + 1] = "EXECUTABLE",
      	[ilog2(0x2000) + 1] = "LOCKED",
      	[ilog2(0x4000) + 1] = "NORESERVE",
      	[ilog2(0x8000) + 1] = "POPULATE",
      	[ilog2(0x10000) + 1] = "NONBLOCK",
      	[ilog2(0x20000) + 1] = "STACK",
      	[ilog2(0x40000) + 1] = "HUGETLB",
      	[ilog2(0x80000) + 1] = "SYNC",
        };
        $
      
      And to have the system's sys/mman.h find the definition of MAP_SHARED
      and MAP_PRIVATE, make sure they are defined in the tools/ mman-common.h
      in a way that keeps it the same as the kernel's, need for keeping the
      Android's NDK cross build working.
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
        diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h
        Warning: Kernel ABI header at 'tools/include/uapi/linux/mman.h' differs from latest version at 'include/uapi/linux/mman.h'
        diff -u tools/include/uapi/linux/mman.h include/uapi/linux/mman.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-h80ycpc6pedg9s5z2rwpy6ws@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      be709d48
    • J
      perf evsel: Fix max perf_event_attr.precise_ip detection · 4e8a5c15
      Jiri Olsa 提交于
      After a discussion with Andi, move the perf_event_attr.precise_ip
      detection for maximum precise config (via :P modifier or for default
      cycles event) to perf_evsel__open().
      
      The current detection in perf_event_attr__set_max_precise_ip() is
      tricky, because precise_ip config is specific for given event and it
      currently checks only hw cycles.
      
      We now check for valid precise_ip value right after failing
      sys_perf_event_open() for specific event, before any of the
      perf_event_attr fallback code gets executed.
      
      This way we get the proper config in perf_event_attr together with
      allowed precise_ip settings.
      
      We can see that code activity with -vv, like:
      
        $ perf record -vv ls
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          ...
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
        ------------------------------------------------------------
        sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8
        sys_perf_event_open failed, error -95
        decreasing precise_ip by one (2)
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          ...
          precise_ip                       2
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
        ------------------------------------------------------------
        sys_perf_event_open: pid 9926  cpu 0  group_fd -1  flags 0x8 = 4
        ...
      Suggested-by: NAndi Kleen <ak@linux.intel.com>
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/n/tip-dkvxxbeg7lu74155d4jhlmc9@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      4e8a5c15
    • A
      perf intel-pt: Fix TSC slip · f3b4e06b
      Adrian Hunter 提交于
      A TSC packet can slip past MTC packets so that the timestamp appears to
      go backwards. One estimate is that can be up to about 40 CPU cycles,
      which is certainly less than 0x1000 TSC ticks, but accept slippage an
      order of magnitude more to be on the safe side.
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: 79b58424 ("perf tools: Add Intel PT support for decoding MTC packets")
      Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f3b4e06b
    • S
      perf cs-etm: Add missing case value · c8fa7a80
      Solomon Tan 提交于
      The following error was thrown when compiling `tools/perf` using OpenCSD
      v0.11.1. This patch fixes said error.
      
          CC       util/intel-pt-decoder/intel-pt-log.o
          CC       util/cs-etm-decoder/cs-etm-decoder.o
        util/cs-etm-decoder/cs-etm-decoder.c: In function
        ‘cs_etm_decoder__buffer_range’:
        util/cs-etm-decoder/cs-etm-decoder.c:370:2: error: enumeration value
        ‘OCSD_INSTR_WFI_WFE’ not handled in switch [-Werror=switch-enum]
          switch (elem->last_i_type) {
          ^~~~~~
          CC       util/intel-pt-decoder/intel-pt-decoder.o
        cc1: all warnings being treated as errors
      
      Because `OCSD_INSTR_WFI_WFE` case was added only in v0.11.0, the minimum
      required OpenCSD library version for this patch is no longer v0.10.0.
      Signed-off-by: NSolomon Tan <solomonbobstoner@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: Suzuki K Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20190322052255.GA4809@w-OptiPlex-7050Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      c8fa7a80
  3. 21 3月, 2019 6 次提交
    • S
      perf bpf: Show more BPF program info in print_bpf_prog_info() · f8dfeae0
      Song Liu 提交于
      This patch enables showing bpf program name, address, and size in the
      header.
      
      Before the patch:
      
        perf report --header-only
        ...
        # bpf_prog_info of id 9
        # bpf_prog_info of id 10
        # bpf_prog_info of id 13
      
      After the patch:
      
        # bpf_prog_info 9: bpf_prog_7be49e3934a125ba addr 0xffffffffa0024947 size 229
        # bpf_prog_info 10: bpf_prog_2a142ef67aaad174 addr 0xffffffffa007c94d size 229
        # bpf_prog_info 13: bpf_prog_47368425825d7384_task__task_newt addr 0xffffffffa0251137 size 369
      
      Committer notes:
      
      Fix the fallback definition when HAVE_LIBBPF_SUPPORT is not defined,
      i.e. add the missing 'static inline' and add the __maybe_unused to the
      args. Also add stdio.h since we now use FILE * in bpf-event.h.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190319165454.1298742-3-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f8dfeae0
    • S
      perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog() · fc462ac7
      Song Liu 提交于
      Extract logic to create program names to synthesize_bpf_prog_name(), so
      that it can be reused in header.c:print_bpf_prog_info().
      
      This commit doesn't change the behavior.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190319165454.1298742-2-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      fc462ac7
    • S
      perf tools: Save bpf_prog_info and BTF of new BPF programs · d56354dc
      Song Liu 提交于
      To fully annotate BPF programs with source code mapping, 4 different
      information are needed:
      
          1) PERF_RECORD_KSYMBOL
          2) PERF_RECORD_BPF_EVENT
          3) bpf_prog_info
          4) btf
      
      This patch handles 3) and 4) for BPF programs loaded after 'perf
      record|top'.
      
      For timely process of these information, a dedicated event is added to
      the side band evlist.
      
      When PERF_RECORD_BPF_EVENT is received via the side band event, the
      polling thread gathers 3) and 4) vis sys_bpf and store them in perf_env.
      
      This information is saved to perf.data at the end of 'perf record'.
      
      Committer testing:
      
      The 'wakeup_watermark' member in 'struct perf_event_attr' is inside a
      unnamed union, so can't be used in a struct designated initialization
      with older gccs, get it out of that, isolating as 'attr.wakeup_watermark
      = 1;' to work with all gcc versions.
      
      We also need to add '--no-bpf-event' to the 'perf record'
      perf_event_attr tests in 'perf test', as the way that that test goes is
      to intercept the events being setup and looking if they match the fields
      described in the control files, since now it finds first the side band
      event used to catch the PERF_RECORD_BPF_EVENT, they all fail.
      
      With these issues fixed:
      
      Same scenario as for testing BPF programs loaded before 'perf record' or
      'perf top' starts, only start the BPF programs after 'perf record|top',
      so that its information get collected by the sideband threads, the rest
      works as for the programs loaded before start monitoring.
      
      Add missing 'inline' to the bpf_event__add_sb_event() when
      HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
      binutils devel files installed.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d56354dc
    • S
      perf evlist: Introduce side band thread · 657ee553
      Song Liu 提交于
      This patch introduces side band thread that captures extended
      information for events like PERF_RECORD_BPF_EVENT.
      
      This new thread uses its own evlist that uses ring buffer with very low
      watermark for lower latency.
      
      To use side band thread, we need to:
      
      1. add side band event(s) by calling perf_evlist__add_sb_event();
      2. calls perf_evlist__start_sb_thread();
      3. at the end of perf run, perf_evlist__stop_sb_thread().
      
      In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT.
      
      Committer notes:
      
      Add fix by Jiri Olsa for when te sb_tread can't get started and then at
      the end the stop_sb_thread() segfaults when joining the (non-existing)
      thread.
      
      That can happen when running 'perf top' or 'perf record' as a normal
      user, for instance.
      
      Further checks need to be done on top of this to more graciously handle
      these possible failure scenarios.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      657ee553
    • S
      perf annotate: Enable annotation of BPF programs · 6987561c
      Song Liu 提交于
      In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
      a new function symbol__disassemble_bpf(), where annotation line
      information is filled based on the bpf_prog_info and btf data saved in
      given perf_env.
      
      symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
      programs.
      
      Committer testing:
      
      After fixing this:
      
        -               u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
        +               u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);
      
      Detected when crossbuilding to a 32-bit arch.
      
      And making all this dependent on HAVE_LIBBFD_SUPPORT and
      HAVE_LIBBPF_SUPPORT:
      
      1) Have a BPF program running, one that has BTF info, etc, I used
         the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
         by 'perf trace'.
      
        # grep -B1 augmented_raw ~/.perfconfig
        [trace]
      	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
        #
        # perf trace -e *mmsg
        dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
        NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
      
      2) Then do a 'perf record' system wide for a while:
      
        # perf record -a
        ^C[ perf record: Woken up 68 times to write data ]
        [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
        #
      
      3) Check that we captured BPF and BTF info in the perf.data file:
      
        # perf report --header-only | grep 'b[pt]f'
        # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
        # bpf_prog_info of id 13
        # bpf_prog_info of id 14
        # bpf_prog_info of id 15
        # bpf_prog_info of id 16
        # bpf_prog_info of id 17
        # bpf_prog_info of id 18
        # bpf_prog_info of id 21
        # bpf_prog_info of id 22
        # bpf_prog_info of id 41
        # bpf_prog_info of id 42
        # btf info of id 2
        #
      
      4) Check which programs got recorded:
      
         # perf report | grep bpf_prog | head
           0.16%  exe              bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.14%  exe              bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.08%  fuse-overlayfs   bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.07%  fuse-overlayfs   bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.01%  clang-4.0        bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.01%  clang-4.0        bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  clang            bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
           0.00%  runc             bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  clang            bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
           0.00%  sh               bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
        #
      
        This was with the default --sort order for 'perf report', which is:
      
          --sort comm,dso,symbol
      
        If we just look for the symbol, for instance:
      
         # perf report --sort symbol | grep bpf_prog | head
           0.26%  [k] bpf_prog_819967866022f1e1_sys_enter                -      -
           0.24%  [k] bpf_prog_c1bd85c092d6e4aa_sys_exit                 -      -
         #
      
        or the DSO:
      
         # perf report --sort dso | grep bpf_prog | head
           0.26%  bpf_prog_819967866022f1e1_sys_enter
           0.24%  bpf_prog_c1bd85c092d6e4aa_sys_exit
        #
      
      We'll see the two BPF programs that augmented_raw_syscalls.o puts in
      place,  one attached to the raw_syscalls:sys_enter and another to the
      raw_syscalls:sys_exit tracepoints, as expected.
      
      Now we can finally do, from the command line, annotation for one of
      those two symbols, with the original BPF program source coude intermixed
      with the disassembled JITed code:
      
        # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
      
        Samples: 950  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
        bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
        Percent      int sys_enter(struct syscall_enter_args *args)
         53.41         push   %rbp
      
          0.63         mov    %rsp,%rbp
          0.31         sub    $0x170,%rsp
          1.93         sub    $0x28,%rbp
          7.02         mov    %rbx,0x0(%rbp)
          3.20         mov    %r13,0x8(%rbp)
          1.07         mov    %r14,0x10(%rbp)
          0.61         mov    %r15,0x18(%rbp)
          0.11         xor    %eax,%eax
          1.29         mov    %rax,0x20(%rbp)
          0.11         mov    %rdi,%rbx
                     	return bpf_get_current_pid_tgid();
          2.02       → callq  *ffffffffda6776d9
          2.76         mov    %eax,-0x148(%rbp)
                       mov    %rbp,%rsi
                     int sys_enter(struct syscall_enter_args *args)
                       add    $0xfffffffffffffeb8,%rsi
                     	return bpf_map_lookup_elem(pids, &pid) != NULL;
                       movabs $0xffff975ac2607800,%rdi
      
          1.26       → callq  *ffffffffda6789e9
                       cmp    $0x0,%rax
          2.43       → je     0
                       add    $0x38,%rax
          0.21         xor    %r13d,%r13d
                     	if (pid_filter__has(&pids_filtered, getpid()))
          0.81         cmp    $0x0,%rax
                     → jne    0
                       mov    %rbp,%rdi
                     	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
          2.22         add    $0xfffffffffffffeb8,%rdi
          0.11         mov    $0x40,%esi
          0.32         mov    %rbx,%rdx
          2.74       → callq  *ffffffffda658409
                     	syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
          0.22         mov    %rbp,%rsi
          1.69         add    $0xfffffffffffffec0,%rsi
                     	syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
                       movabs $0xffff975bfcd36000,%rdi
      
                       add    $0xd0,%rdi
          0.21         mov    0x0(%rsi),%eax
          0.93         cmp    $0x200,%rax
                     → jae    0
          0.10         shl    $0x3,%rax
      
          0.11         add    %rdi,%rax
          0.11       → jmp    0
                       xor    %eax,%eax
                     	if (syscall == NULL || !syscall->enabled)
          1.07         cmp    $0x0,%rax
                     → je     0
                     	if (syscall == NULL || !syscall->enabled)
          6.57         movzbq 0x0(%rax),%rdi
      
                     	if (syscall == NULL || !syscall->enabled)
                       cmp    $0x0,%rdi
          0.95       → je     0
                       mov    $0x40,%r8d
                     	switch (augmented_args.args.syscall_nr) {
                       mov    -0x140(%rbp),%rdi
                     	switch (augmented_args.args.syscall_nr) {
                       cmp    $0x2,%rdi
                     → je     0
                       cmp    $0x101,%rdi
                     → je     0
                       cmp    $0x15,%rdi
                     → jne    0
                     	case SYS_OPEN:	 filename_arg = (const void *)args->args[0];
                       mov    0x10(%rbx),%rdx
                     → jmp    0
                     	case SYS_OPENAT: filename_arg = (const void *)args->args[1];
                       mov    0x18(%rbx),%rdx
                     	if (filename_arg != NULL) {
                       cmp    $0x0,%rdx
                     → je     0
                       xor    %edi,%edi
                     		augmented_args.filename.reserved = 0;
                       mov    %edi,-0x104(%rbp)
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %rbp,%rdi
                       add    $0xffffffffffffff00,%rdi
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    $0x100,%esi
                     → callq  *ffffffffda658499
                       mov    $0x148,%r8d
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %eax,-0x108(%rbp)
                     		augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                       mov    %rax,%rdi
                       shl    $0x20,%rdi
      
                       shr    $0x20,%rdi
      
                     		if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
                       cmp    $0xff,%rdi
                     → ja     0
                     			len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
                       add    $0x48,%rax
                     			len &= sizeof(augmented_args.filename.value) - 1;
                       and    $0xff,%rax
                       mov    %rax,%r8
                       mov    %rbp,%rcx
                     	return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
                       add    $0xfffffffffffffeb8,%rcx
                       mov    %rbx,%rdi
                       movabs $0xffff975fbd72d800,%rsi
      
                       mov    $0xffffffff,%edx
                     → callq  *ffffffffda658ad9
                       mov    %rax,%r13
                     }
                       mov    %r13,%rax
          0.72         mov    0x0(%rbp),%rbx
                       mov    0x8(%rbp),%r13
          1.16         mov    0x10(%rbp),%r14
          0.10         mov    0x18(%rbp),%r15
          0.42         add    $0x28,%rbp
          0.54         leaveq
          0.54       ← retq
        #
      
      Please see 'man perf-config' to see how to control what should be seen,
      via ~/.perfconfig [annotate] section, for instance, one can suppress the
      source code and see just the disassembly, etc.
      
      Alternatively, use the TUI bu just using 'perf annotate', press
      '/bpf_prog' to see the bpf symbols, press enter and do the interactive
      annotation, which allows for dumping to a file after selecting the
      the various output tunables, for instance, the above without source code
      intermixed, plus showing all the instruction offsets:
      
        # perf annotate bpf_prog_819967866022f1e1_sys_enter
      
      Then press: 's' to hide the source code + 'O' twice to show all
      instruction offsets, then 'P' to print to the
      bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
      
        # cat bpf_prog_819967866022f1e1_sys_enter.annotation
        bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
        Event: cycles:ppp
      
         53.41    0:   push   %rbp
      
          0.63    1:   mov    %rsp,%rbp
          0.31    4:   sub    $0x170,%rsp
          1.93    b:   sub    $0x28,%rbp
          7.02    f:   mov    %rbx,0x0(%rbp)
          3.20   13:   mov    %r13,0x8(%rbp)
          1.07   17:   mov    %r14,0x10(%rbp)
          0.61   1b:   mov    %r15,0x18(%rbp)
          0.11   1f:   xor    %eax,%eax
          1.29   21:   mov    %rax,0x20(%rbp)
          0.11   25:   mov    %rdi,%rbx
          2.02   28: → callq  *ffffffffda6776d9
          2.76   2d:   mov    %eax,-0x148(%rbp)
                 33:   mov    %rbp,%rsi
                 36:   add    $0xfffffffffffffeb8,%rsi
                 3d:   movabs $0xffff975ac2607800,%rdi
      
          1.26   47: → callq  *ffffffffda6789e9
                 4c:   cmp    $0x0,%rax
          2.43   50: → je     0
                 52:   add    $0x38,%rax
          0.21   56:   xor    %r13d,%r13d
          0.81   59:   cmp    $0x0,%rax
                 5d: → jne    0
                 63:   mov    %rbp,%rdi
          2.22   66:   add    $0xfffffffffffffeb8,%rdi
          0.11   6d:   mov    $0x40,%esi
          0.32   72:   mov    %rbx,%rdx
          2.74   75: → callq  *ffffffffda658409
          0.22   7a:   mov    %rbp,%rsi
          1.69   7d:   add    $0xfffffffffffffec0,%rsi
                 84:   movabs $0xffff975bfcd36000,%rdi
      
                 8e:   add    $0xd0,%rdi
          0.21   95:   mov    0x0(%rsi),%eax
          0.93   98:   cmp    $0x200,%rax
                 9f: → jae    0
          0.10   a1:   shl    $0x3,%rax
      
          0.11   a5:   add    %rdi,%rax
          0.11   a8: → jmp    0
                 aa:   xor    %eax,%eax
          1.07   ac:   cmp    $0x0,%rax
                 b0: → je     0
          6.57   b6:   movzbq 0x0(%rax),%rdi
      
                 bb:   cmp    $0x0,%rdi
          0.95   bf: → je     0
                 c5:   mov    $0x40,%r8d
                 cb:   mov    -0x140(%rbp),%rdi
                 d2:   cmp    $0x2,%rdi
                 d6: → je     0
                 d8:   cmp    $0x101,%rdi
                 df: → je     0
                 e1:   cmp    $0x15,%rdi
                 e5: → jne    0
                 e7:   mov    0x10(%rbx),%rdx
                 eb: → jmp    0
                 ed:   mov    0x18(%rbx),%rdx
                 f1:   cmp    $0x0,%rdx
                 f5: → je     0
                 f7:   xor    %edi,%edi
                 f9:   mov    %edi,-0x104(%rbp)
                 ff:   mov    %rbp,%rdi
                102:   add    $0xffffffffffffff00,%rdi
                109:   mov    $0x100,%esi
                10e: → callq  *ffffffffda658499
                113:   mov    $0x148,%r8d
                119:   mov    %eax,-0x108(%rbp)
                11f:   mov    %rax,%rdi
                122:   shl    $0x20,%rdi
      
                126:   shr    $0x20,%rdi
      
                12a:   cmp    $0xff,%rdi
                131: → ja     0
                133:   add    $0x48,%rax
                137:   and    $0xff,%rax
                13d:   mov    %rax,%r8
                140:   mov    %rbp,%rcx
                143:   add    $0xfffffffffffffeb8,%rcx
                14a:   mov    %rbx,%rdi
                14d:   movabs $0xffff975fbd72d800,%rsi
      
                157:   mov    $0xffffffff,%edx
                15c: → callq  *ffffffffda658ad9
                161:   mov    %rax,%r13
                164:   mov    %r13,%rax
          0.72  167:   mov    0x0(%rbp),%rbx
                16b:   mov    0x8(%rbp),%r13
          1.16  16f:   mov    0x10(%rbp),%r14
          0.10  173:   mov    0x18(%rbp),%r15
          0.42  177:   add    $0x28,%rbp
          0.54  17b:   leaveq
          0.54  17c: ← retq
      
      Another cool way to test all this is to symple use 'perf top' look for
      those symbols, go there and press enter, annotate it live :-)
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6987561c
    • S
      perf build: Check what binutils's 'disassembler()' signature to use · 8a1b1718
      Song Liu 提交于
      Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils
      repo, which changed the disassembler() function signature, so we must
      use the feature test introduced in fb982666 ("tools/bpftool: fix
      bpftool build with bintutils >= 2.9") to deal with that.
      
      Committer testing:
      
      After adding the missing function call to test-all.c, and:
      
        FEATURE_CHECK_LDFLAGS-disassembler-four-args = -bfd -lopcodes
      
      And the fallbacks for cases where we need -liberty and sometimes -lz to
      tools/perf/Makefile.config, we get:
      
        $ make -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j8' parallel build
      
        Auto-detecting system features:
        ...                         dwarf: [ on  ]
        ...            dwarf_getlocations: [ on  ]
        ...                         glibc: [ on  ]
        ...                          gtk2: [ on  ]
        ...                      libaudit: [ on  ]
        ...                        libbfd: [ on  ]
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ on  ]
        ...                       libperl: [ on  ]
        ...                     libpython: [ on  ]
        ...                      libslang: [ on  ]
        ...                     libcrypto: [ on  ]
        ...                     libunwind: [ on  ]
        ...            libdw-dwarf-unwind: [ on  ]
        ...                          zlib: [ on  ]
        ...                          lzma: [ on  ]
        ...                     get_cpuid: [ on  ]
        ...                           bpf: [ on  ]
        ...                        libaio: [ on  ]
        ...        disassembler-four-args: [ on  ]
          CC       /tmp/build/perf/jvmti/libjvmti.o
          CC       /tmp/build/perf/builtin-bench.o
        <SNIP>
        $
        $
      
      The feature detection test-all.bin gets successfully built and linked:
      
        $ ls -la /tmp/build/perf/feature/test-all.bin
        -rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin
        $ nm /tmp/build/perf/feature/test-all.bin  | grep -w disassembler
        0000000000061f90 T disassembler
        $
      
      Time to move on to the patches that make use of this disassembler()
      routine in binutils's libopcodes.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
      [ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8a1b1718
  4. 20 3月, 2019 12 次提交
    • S
      perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation · 3ca3877a
      Song Liu 提交于
      This patch adds processing of PERF_BPF_EVENT_PROG_LOAD, which sets
      proper DSO type/id/etc of memory regions mapped to BPF programs to
      DSO_BINARY_TYPE__BPF_PROG_INFO.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-14-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3ca3877a
    • S
      perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO · 9b86d04d
      Song Liu 提交于
      Introduce a new dso type DSO_BINARY_TYPE__BPF_PROG_INFO for BPF programs. In
      symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso will call into a new
      function symbol__disassemble_bpf() in an upcoming patch, where annotation line
      information is filled based bpf_prog_info and btf saved in given perf_env.
      
      Committer notes:
      
      Removed the unnamed union with 'bpf_prog' and 'cache' in 'struct dso',
      to fix this bug when exiting 'perf top':
      
        # perf top
        perf: Segmentation fault
        -------- backtrace --------
        perf[0x5a785a]
        /lib64/libc.so.6(+0x385bf)[0x7fd68443c5bf]
        perf(rb_first+0x2b)[0x4d6eeb]
        perf(dso__delete+0xb7)[0x4dffb7]
        perf[0x4f9e37]
        perf(perf_session__delete+0x64)[0x504df4]
        perf(cmd_top+0x1957)[0x454467]
        perf[0x4aad18]
        perf(main+0x61c)[0x42ec7c]
        /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd684428412]
        perf(_start+0x2d)[0x42eead]
        #
        # addr2line -fe ~/bin/perf 0x4dffb7
        dso_cache__free
        /home/acme/git/perf/tools/perf/util/dso.c:713
      
      That is trying to access the dso->data.cache, and that is not used with
      BPF programs, so we end up accessing what is in bpf_prog.first_member,
      b00m.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
      [ split from a larger patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9b86d04d
    • S
      perf feature detection: Add -lopcodes to feature-libbfd · 31be9478
      Song Liu 提交于
      Both libbfd and libopcodes are distributed with binutil-dev/devel. When
      libbfd is present, it is OK to assume that libopcodes also present. This
      has been a safe assumption for bpftool.
      
      This patch adds -lopcodes to perf/Makefile.config. libopcodes will be
      used in the next commit for BPF annotation.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-12-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      31be9478
    • S
      perf top: Add option --no-bpf-event · ee7a112f
      Song Liu 提交于
      This patch adds option --no-bpf-event to 'perf top', which is the same
      as the option of 'perf record'.
      
      The following patches will use this option.
      
      Committer testing:
      
        # perf top -vv 2> /tmp/perf_event_attr.out
        # cat  /tmp/perf_event_attr.out
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
          bpf_event                        1
        ------------------------------------------------------------
        #
      
      After this patch:
      
        # perf top --no-bpf-event -vv 2> /tmp/perf_event_attr.out
        # cat  /tmp/perf_event_attr.out
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          mmap                             1
          comm                             1
          freq                             1
          task                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
          mmap2                            1
          comm_exec                        1
          ksymbol                          1
        ------------------------------------------------------------
        #
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-11-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ee7a112f
    • S
      perf bpf: Save BTF information as headers to perf.data · a70a1123
      Song Liu 提交于
      This patch enables 'perf record' to save BTF information as headers to
      perf.data.
      
      A new header type HEADER_BPF_BTF is introduced for this data.
      
      Committer testing:
      
      As root, being on the kernel sources top level directory, run:
      
          # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg
      
      Just to compile and load a BPF program that attaches to the
      raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
      in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).
      
      Make sure you have a recent enough clang, say version 9, to get the
      BTF ELF sections needed for this testing:
      
        # clang --version | head -1
        clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
        # readelf -SW tools/perf/examples/bpf/augmented_raw_syscalls.o | grep BTF
          [22] .BTF              PROGBITS        0000000000000000 000ede 000b0e 00      0   0  1
          [23] .BTF.ext          PROGBITS        0000000000000000 0019ec 0002a0 00      0   0  1
          [24] .rel.BTF.ext      REL             0000000000000000 002fa8 000270 10     30  23  8
      
      Then do a systemwide perf record session for a few seconds:
      
        # perf record -a sleep 2s
      
      Then look at:
      
        # perf report --header-only | grep b[pt]f
        # event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
        # bpf_prog_info of id 13
        # bpf_prog_info of id 14
        # bpf_prog_info of id 15
        # bpf_prog_info of id 16
        # bpf_prog_info of id 17
        # bpf_prog_info of id 18
        # bpf_prog_info of id 21
        # bpf_prog_info of id 22
        # bpf_prog_info of id 51
        # bpf_prog_info of id 52
        # btf info of id 8
        #
      
      We need to show more info about these BPF and BTF entries , but that can
      be done later.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a70a1123
    • S
      perf bpf: Save BTF in a rbtree in perf_env · 3792cb2f
      Song Liu 提交于
      BTF contains information necessary to annotate BPF programs. This patch
      saves BTF for BPF programs loaded in the system.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-9-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3792cb2f
    • S
      perf bpf: Save bpf_prog_info information as headers to perf.data · 606f972b
      Song Liu 提交于
      This patch enables perf-record to save bpf_prog_info information as
      headers to perf.data. A new header type HEADER_BPF_PROG_INFO is
      introduced for this data.
      
      Committer testing:
      
      As root, being on the kernel sources top level directory, run:
      
        # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg
      
      Just to compile and load a BPF program that attaches to the
      raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
      in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).
      
      Then do a systemwide perf record session for a few seconds:
      
        # perf record -a sleep 2s
      
      Then look at:
      
        # perf report --header-only | grep -i bpf
        # bpf_prog_info of id 13
        # bpf_prog_info of id 14
        # bpf_prog_info of id 15
        # bpf_prog_info of id 16
        # bpf_prog_info of id 17
        # bpf_prog_info of id 18
        # bpf_prog_info of id 21
        # bpf_prog_info of id 22
        # bpf_prog_info of id 208
        # bpf_prog_info of id 209
        #
      
      We need to show more info about these programs, like bpftool does for
      the ones running on the system, i.e. 'perf record/perf report' become a
      way of saving the BPF state in a machine to then analyse on another,
      together with all the other information that is already saved in the
      perf.data header:
      
        # perf report --header-only
        # ========
        # captured on    : Tue Mar 12 11:42:13 2019
        # header version : 1
        # data offset    : 296
        # data size      : 16294184
        # feat offset    : 16294480
        # hostname : quaco
        # os release : 5.0.0+
        # perf version : 5.0.gd783c8
        # arch : x86_64
        # nrcpus online : 8
        # nrcpus avail : 8
        # cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
        # cpuid : GenuineIntel,6,142,10
        # total memory : 24555720 kB
        # cmdline : /home/acme/bin/perf (deleted) record -a
        # event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 31901264, 3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
        # CPU_TOPOLOGY info available, use -I to display
        # NUMA_TOPOLOGY info available, use -I to display
        # pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14
        # CACHE info available, use -I to display
        # time of first sample : 116392.441701
        # time of last sample : 116400.932584
        # sample duration :   8490.883 ms
        # MEM_TOPOLOGY info available, use -I to display
        # bpf_prog_info of id 13
        # bpf_prog_info of id 14
        # bpf_prog_info of id 15
        # bpf_prog_info of id 16
        # bpf_prog_info of id 17
        # bpf_prog_info of id 18
        # bpf_prog_info of id 21
        # bpf_prog_info of id 22
        # bpf_prog_info of id 208
        # bpf_prog_info of id 209
        # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT
        # ========
        #
      
      Committer notes:
      
      We can't use the libbpf unconditionally, as the build may have been with
      NO_LIBBPF, when we end up with linking errors, so provide dummy
      {process,write}_bpf_prog_info() wrapped by HAVE_LIBBPF_SUPPORT for that
      case.
      
      Printing are not affected by this, so can continue as is.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      606f972b
    • S
      perf bpf: Save bpf_prog_info in a rbtree in perf_env · e4378f0c
      Song Liu 提交于
      bpf_prog_info contains information necessary to annotate bpf programs.
      
      This patch saves bpf_prog_info for bpf programs loaded in the system.
      
      Some big picture of the next few patches:
      
      To fully annotate BPF programs with source code mapping, 4 different
      informations are needed:
      
          1) PERF_RECORD_KSYMBOL
          2) PERF_RECORD_BPF_EVENT
          3) bpf_prog_info
          4) btf
      
      Before this set, 1) and 2) in the list are already saved to perf.data
      file. For BPF programs that are already loaded before perf run, 1) and 2)
      are synthesized by perf_event__synthesize_bpf_events(). For short living
      BPF programs, 1) and 2) are generated by kernel.
      
      This set handles 3) and 4) from the list. Again, it is necessary to handle
      existing BPF program and short living program separately.
      
      This patch handles 3) for exising BPF programs while synthesizing 1) and
      2) in perf_event__synthesize_bpf_events(). These data are stored in
      perf_env. The next patch saves these data from perf_env to perf.data as
      headers.
      
      Similarly, the two patches after the next saves 4) of existing BPF
      programs to perf_env and perf.data.
      
      Another patch later will handle 3) and 4) for short living BPF programs
      by monitoring 1) and 2) in a dedicate thread.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubraving@fb.com
      [ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e4378f0c
    • S
      perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool · e5416950
      Song Liu 提交于
      This patch changes the arguments of perf_event__synthesize_bpf_events()
      to include perf_session* instead of perf_tool*. perf_session will be
      used in the next patch.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: kernel-team@fb.com
      Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e5416950
    • S
      perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear() · a742258a
      Song Liu 提交于
      With bpf_program__get_prog_info_linear, we can simplify the logic that
      synthesizes bpf events.
      
      This patch doesn't change the behavior of the code.
      
      Commiter notes:
      
      Needed this (for all four variables), suggested by Song, to overcome
      build failure on debian experimental cross building to MIPS 32-bit:
      
        -               u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags);
        +               u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
      
        util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
        util/bpf-event.c:143:35: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
           u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags);
                                           ^
        util/bpf-event.c:144:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
           __u32 *prog_lens = (__u32 *)(info->jited_func_lens);
                              ^
        util/bpf-event.c:145:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
           __u64 *prog_addrs = (__u64 *)(info->jited_ksyms);
                               ^
        util/bpf-event.c:146:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
           void *func_infos = (void *)(info->func_info);
                              ^
        cc1: all warnings being treated as errors
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: kernel-team@fb.com
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-5-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      a742258a
    • S
      perf record: Replace option --bpf-event with --no-bpf-event · 71184c6a
      Song Liu 提交于
      Currently, monitoring of BPF programs through bpf_event is off by
      default for 'perf record'.
      
      To turn it on, the user need to use option "--bpf-event".  As BPF gets
      wider adoption in different subsystems, this option becomes
      inconvenient.
      
      This patch makes bpf_event on by default, and adds option "--no-bpf-event"
      to turn it off. Since option --bpf-event is not released yet, it is safe
      to remove it.
      Signed-off-by: NSong Liu <songliubraving@fb.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: kernel-team@fb.com
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubraving@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      71184c6a
    • C
      perf tests: Fix a memory leak in test__perf_evsel__tp_sched_test() · d982b331
      Changbin Du 提交于
        =================================================================
        ==20875==ERROR: LeakSanitizer: detected memory leaks
      
        Direct leak of 1160 byte(s) in 1 object(s) allocated from:
            #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
            #1 0x55bd50005599 in zalloc util/util.h:23
            #2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
            #3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
            #4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
            #5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
            #6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
            #7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
            #8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
            #9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
            #10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
            #11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
            #12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
            #13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
      
        Indirect leak of 19 byte(s) in 1 object(s) allocated from:
            #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
            #1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)
      Signed-off-by: NChangbin Du <changbin.du@gmail.com>
      Reviewed-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Fixes: 6a6cd11d ("perf test: Add test for the sched tracepoint format fields")
      Link: http://lkml.kernel.org/r/20190316080556.3075-17-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d982b331