1. 19 Nov, 2019 14 commits
    • M
      perf probe: Support DW_AT_const_value constant value · 66f69b21
      Masami Hiramatsu committed
      Support variables described by DW_AT_const_value, rather than by a
      location, when assigning probe arguments.
      Note that this requires ftrace support for immediate values.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406476012.24476.16096289871757175775.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Support multiprobe event · 72363540
      Masami Hiramatsu committed
      Support multiprobe events when the event is based on a function and
      line number and the kernel supports them. In this case, perf probe
      creates the first probe with an event and tries to append the following
      probes to that event, since those probes must be on the same source
      code line.
      
      Before this patch:
      
        # perf probe -a vfs_read:18
        Added new events:
          probe:vfs_read_L18   (on vfs_read:18)
          probe:vfs_read_L18_1 (on vfs_read:18)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:vfs_read_L18_1 -aR sleep 1
      
        #
      
      After this patch (on a kernel that supports multiprobe events):
      
        # perf probe -a vfs_read:18
        Added new events:
          probe:vfs_read_L18   (on vfs_read:18)
          probe:vfs_read_L18   (on vfs_read:18)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:vfs_read_L18 -aR sleep 1
      
        #
      
      Committer testing:
      
      On a kernel that doesn't support multiprobe events, after this patch:
      
        # uname -a
        Linux quaco 5.3.8-200.fc30.x86_64 #1 SMP Tue Oct 29 14:46:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
        # grep append /sys/kernel/debug/tracing/README
        	    be modified by appending '.descending' or '.ascending' to a
        	    can be modified by appending any of the following modifiers
        #
        # perf probe -a vfs_read:18
        Added new events:
          probe:vfs_read_L18   (on vfs_read:18)
          probe:vfs_read_L18_1 (on vfs_read:18)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:vfs_read_L18_1 -aR sleep 1
      
        # perf probe -l
          probe:vfs_read_L18   (on vfs_read:18@fs/read_write.c)
          probe:vfs_read_L18_1 (on vfs_read:18@fs/read_write.c)
        #
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406475010.24476.586290752591512351.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Generate event name with line number · 15354d54
      Masami Hiramatsu committed
      Generate the event name from the function name and line number as
      <function>_L<line_number>. Note that this applies only to a new event
      that is defined by a line number within a function (except for line 0).
      
      If there is already another event on the same line, you have to use the
      "-f" option. In that case, the new event gets a "_1" suffix.
      
      For example:
        # perf probe -a kernel_read:2
        Added new event:
          probe:kernel_read_L2 (on kernel_read:2)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:kernel_read_L2 -aR sleep 1
      
      But if the line number is omitted or is 0, the event name has no
      line-number suffix.
      
        # perf probe -a kernel_read:0
        Added new event:
          probe:kernel_read (on kernel_read)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:kernel_read -aR sleep 1
      
        probe:kernel_read    (on kernel_read@linux-5.0.0/fs/read_write.c)
        probe:kernel_read_L2 (on kernel_read:2@linux-5.0.0/fs/read_write.c)
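      
      The naming rule described above can be sketched roughly as follows. This
      is an illustration only, not the perf source: make_event_name() is a
      hypothetical helper, and the handling of the "_1" suffix for duplicate
      events is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from perf): build an event name from a
 * function name and a line number relative to the function.
 * Line 0 keeps the bare function name; any other line appends "_L<line>".
 * Returns the snprintf() result, i.e. the length the full name needs.
 */
static int make_event_name(char *buf, size_t size, const char *func, int line)
{
	assert(buf != NULL && func != NULL);

	if (line == 0)
		return snprintf(buf, size, "%s", func);
	return snprintf(buf, size, "%s_L%d", func, line);
}
```

      Calling make_event_name(buf, sizeof(buf), "kernel_read", 2) fills buf
      with "kernel_read_L2", matching the event name in the example above.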
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406474026.24476.2828897745502059569.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Do not show non representive lines by perf-probe -L · 499144c8
      Masami Hiramatsu committed
      Since perf probe -L shows non-representative lines, it can mislead
      users about where they can put probes. Stop showing such
      non-representative lines so that users can see which lines they can
      probe.
      
        # perf probe -L kernel_read
        <kernel_read@/build/linux-pvZVvI/linux-5.0.0/fs/read_write.c:0>
              0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
                 {
              2         mm_segment_t old_fs;
                        ssize_t result;
      
                        old_fs = get_fs();
              6         set_fs(get_ds());
                        /* The cast to a user pointer is valid due to the set_fs() */
              8         result = vfs_read(file, (void __user *)buf, count, pos);
              9         set_fs(old_fs);
             10         return result;
                 }
                 EXPORT_SYMBOL(kernel_read);
      
      Committer testing:
      
      Before:
      
        # perf probe -L kernel_read
        <kernel_read@/usr/src/debug/kernel-5.3.fc30/linux-5.3.8-200.fc30.x86_64/fs/read_write.c:0>
              0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
              1  {
              2         mm_segment_t old_fs;
              3         ssize_t result;
      
              5         old_fs = get_fs();
              6         set_fs(KERNEL_DS);
                        /* The cast to a user pointer is valid due to the set_fs() */
              8         result = vfs_read(file, (void __user *)buf, count, pos);
              9         set_fs(old_fs);
             10         return result;
                 }
                 EXPORT_SYMBOL(kernel_read);
        #
      
      See lines 1, 3 and 5? They shouldn't be there. After this patch:
      
        # perf probe -L kernel_read
        <kernel_read@/usr/src/debug/kernel-5.3.fc30/linux-5.3.8-200.fc30.x86_64/fs/read_write.c:0>
              0  ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
                 {
              2         mm_segment_t old_fs;
                        ssize_t result;
      
                        old_fs = get_fs();
              6         set_fs(KERNEL_DS);
                        /* The cast to a user pointer is valid due to the set_fs() */
              8         result = vfs_read(file, (void __user *)buf, count, pos);
              9         set_fs(old_fs);
             10         return result;
                 }
                 EXPORT_SYMBOL(kernel_read);
        #
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406473064.24476.2913278267727587314.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Verify given line is a representive line · 1ae5d88a
      Masami Hiramatsu committed
      Verify that the probe line given by the user is a representative line
      (one that doesn't share its address with other lines, or that has the
      lowest line number among the lines sharing the same address), and if
      not, show which line is the representative one.
      
      Without this fix, a user can put a probe on a line that is not a
      representative line. But since it is not a representative line, perf
      probe -l shows the representative line number instead of the line number
      the user gave, e.g. (probe put on kernel_read:3, but listed as
      kernel_read:2):
      
        # perf probe -a kernel_read:3
        Added new event:
          probe:kernel_read    (on kernel_read:3)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:kernel_read -aR sleep 1
      
        # perf probe -l
          probe:kernel_read    (on kernel_read:2@linux-5.0.0/fs/read_write.c)
      
      With this fix, perf probe doesn't allow the user to put a probe on a
      non-representative line, and tells the user which line is the
      representative one.
      
        # perf probe -a kernel_read:3
        This line is sharing the addrees with other lines.
        Please try to probe at kernel_read:2 instead.
          Error: Failed to add events.
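      
      The notion of a representative line can be illustrated with a small
      sketch. It assumes a flat, hypothetical table of (address, line) pairs
      and ignores the statement flag that the real DWARF line table also
      carries; struct line_entry and representative_line() are illustrative
      names, not perf functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line-table row: one source line mapped to one address. */
struct line_entry {
	unsigned long addr;
	int line;
};

/*
 * Return the representative line for 'line': the lowest line number among
 * all rows sharing the address of the first row that matches 'line'.
 * Returns -1 if 'line' does not appear in the table.
 */
static int representative_line(const struct line_entry *tbl, size_t n, int line)
{
	size_t i, j;

	assert(tbl != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int best;

		if (tbl[i].line != line)
			continue;

		best = line;
		for (j = 0; j < n; j++)
			if (tbl[j].addr == tbl[i].addr && tbl[j].line < best)
				best = tbl[j].line;
		return best;
	}
	return -1;
}
```

      If lines 2 and 3 share one address in such a table, a request for line 3
      yields 2, which corresponds to the "Please try to probe at
      kernel_read:2 instead" hint shown above.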
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406472071.24476.14915451439785001021.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • M
      perf probe: Show correct statement line number by perf probe -l · 57f95bf5
      Masami Hiramatsu committed
      dwarf_getsrc_die() can return a line that is neither a statement nor
      the lowest line number among the lines that share the same address.
      
      This can lead perf probe --list to show an incorrect line number for
      the probed address.
      
      To fix this, introduce cu_getsrc_die(), which returns only a statement
      line that has the lowest line number (we call it the representative
      line for an address), and use it in cu_find_lineinfo().
      
      Also, if the given address is the entry address of a real function,
      cu_find_lineinfo() returns the line number where the function is
      declared instead of the start line number of the function body.
      
      For example, without this change perf probe -l shows an incorrect line,
      as below:
      
        # perf probe -a kernel_read:2
        Added new event:
          probe:kernel_read    (on kernel_read:2)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe:kernel_read -aR sleep 1
      
        # perf probe -l
          probe:kernel_read    (on kernel_read:1@linux-5.0.0/fs/read_write.c)
      
      With this fix, it shows the correct line number, as below:
      
        # perf probe -l
          probe:kernel_read    (on kernel_read:2@linux-5.0.0/fs/read_write.c)
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Link: http://lore.kernel.org/lkml/157406471067.24476.17463149618465494448.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      x86/insn: Add some Intel instructions to the opcode map · b980be18
      Adrian Hunter committed
      Add to the opcode map the following instructions:
              cldemote
              tpause
              umonitor
              umwait
              movdiri
              movdir64b
              enqcmd
              enqcmds
              encls
              enclu
              enclv
              pconfig
              wbnoinvd
      
      For information about the instructions, refer to the Intel SDM May 2019
      (325462-070US) and the Intel Architecture Instruction Set Extensions
      May 2019 (319433-037).
      
      The instruction decoding can be tested using the perf tools'
      "x86 instruction decoder - new instructions" test as follows:
      
        $ perf test -v "new " 2>&1 | grep -i cldemote
        Decoded ok: 0f 1c 00                    cldemote (%eax)
        Decoded ok: 0f 1c 05 78 56 34 12        cldemote 0x12345678
        Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%eax,%ecx,8)
        Decoded ok: 0f 1c 00                    cldemote (%rax)
        Decoded ok: 41 0f 1c 00                 cldemote (%r8)
        Decoded ok: 0f 1c 04 25 78 56 34 12     cldemote 0x12345678
        Decoded ok: 0f 1c 84 c8 78 56 34 12     cldemote 0x12345678(%rax,%rcx,8)
        Decoded ok: 41 0f 1c 84 c8 78 56 34 12  cldemote 0x12345678(%r8,%rcx,8)
        $ perf test -v "new " 2>&1 | grep -i tpause
        Decoded ok: 66 0f ae f3                 tpause %ebx
        Decoded ok: 66 0f ae f3                 tpause %ebx
        Decoded ok: 66 41 0f ae f0              tpause %r8d
        $ perf test -v "new " 2>&1 | grep -i umonitor
        Decoded ok: 67 f3 0f ae f0              umonitor %ax
        Decoded ok: f3 0f ae f0                 umonitor %eax
        Decoded ok: 67 f3 0f ae f0              umonitor %eax
        Decoded ok: f3 0f ae f0                 umonitor %rax
        Decoded ok: 67 f3 41 0f ae f0           umonitor %r8d
        $ perf test -v "new " 2>&1 | grep -i umwait
        Decoded ok: f2 0f ae f0                 umwait %eax
        Decoded ok: f2 0f ae f0                 umwait %eax
        Decoded ok: f2 41 0f ae f0              umwait %r8d
        $ perf test -v "new " 2>&1 | grep -i movdiri
        Decoded ok: 0f 38 f9 03                 movdiri %eax,(%ebx)
        Decoded ok: 0f 38 f9 88 78 56 34 12     movdiri %ecx,0x12345678(%eax)
        Decoded ok: 48 0f 38 f9 03              movdiri %rax,(%rbx)
        Decoded ok: 48 0f 38 f9 88 78 56 34 12  movdiri %rcx,0x12345678(%rax)
        $ perf test -v "new " 2>&1 | grep -i movdir64b
        Decoded ok: 66 0f 38 f8 18              movdir64b (%eax),%ebx
        Decoded ok: 66 0f 38 f8 88 78 56 34 12  movdir64b 0x12345678(%eax),%ecx
        Decoded ok: 67 66 0f 38 f8 1c           movdir64b (%si),%bx
        Decoded ok: 67 66 0f 38 f8 8c 34 12     movdir64b 0x1234(%si),%cx
        Decoded ok: 66 0f 38 f8 18              movdir64b (%rax),%rbx
        Decoded ok: 66 0f 38 f8 88 78 56 34 12  movdir64b 0x12345678(%rax),%rcx
        Decoded ok: 67 66 0f 38 f8 18           movdir64b (%eax),%ebx
        Decoded ok: 67 66 0f 38 f8 88 78 56 34 12       movdir64b 0x12345678(%eax),%ecx
        $ perf test -v "new " 2>&1 | grep -i enqcmd
        Decoded ok: f2 0f 38 f8 18              enqcmd (%eax),%ebx
        Decoded ok: f2 0f 38 f8 88 78 56 34 12  enqcmd 0x12345678(%eax),%ecx
        Decoded ok: 67 f2 0f 38 f8 1c           enqcmd (%si),%bx
        Decoded ok: 67 f2 0f 38 f8 8c 34 12     enqcmd 0x1234(%si),%cx
        Decoded ok: f3 0f 38 f8 18              enqcmds (%eax),%ebx
        Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%eax),%ecx
        Decoded ok: 67 f3 0f 38 f8 1c           enqcmds (%si),%bx
        Decoded ok: 67 f3 0f 38 f8 8c 34 12     enqcmds 0x1234(%si),%cx
        Decoded ok: f2 0f 38 f8 18              enqcmd (%rax),%rbx
        Decoded ok: f2 0f 38 f8 88 78 56 34 12  enqcmd 0x12345678(%rax),%rcx
        Decoded ok: 67 f2 0f 38 f8 18           enqcmd (%eax),%ebx
        Decoded ok: 67 f2 0f 38 f8 88 78 56 34 12       enqcmd 0x12345678(%eax),%ecx
        Decoded ok: f3 0f 38 f8 18              enqcmds (%rax),%rbx
        Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%rax),%rcx
        Decoded ok: 67 f3 0f 38 f8 18           enqcmds (%eax),%ebx
        Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
        $ perf test -v "new " 2>&1 | grep -i enqcmds
        Decoded ok: f3 0f 38 f8 18              enqcmds (%eax),%ebx
        Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%eax),%ecx
        Decoded ok: 67 f3 0f 38 f8 1c           enqcmds (%si),%bx
        Decoded ok: 67 f3 0f 38 f8 8c 34 12     enqcmds 0x1234(%si),%cx
        Decoded ok: f3 0f 38 f8 18              enqcmds (%rax),%rbx
        Decoded ok: f3 0f 38 f8 88 78 56 34 12  enqcmds 0x12345678(%rax),%rcx
        Decoded ok: 67 f3 0f 38 f8 18           enqcmds (%eax),%ebx
        Decoded ok: 67 f3 0f 38 f8 88 78 56 34 12       enqcmds 0x12345678(%eax),%ecx
        $ perf test -v "new " 2>&1 | grep -i encls
        Decoded ok: 0f 01 cf                    encls
        Decoded ok: 0f 01 cf                    encls
        $ perf test -v "new " 2>&1 | grep -i enclu
        Decoded ok: 0f 01 d7                    enclu
        Decoded ok: 0f 01 d7                    enclu
        $ perf test -v "new " 2>&1 | grep -i enclv
        Decoded ok: 0f 01 c0                    enclv
        Decoded ok: 0f 01 c0                    enclv
        $ perf test -v "new " 2>&1 | grep -i pconfig
        Decoded ok: 0f 01 c5                    pconfig
        Decoded ok: 0f 01 c5                    pconfig
        $ perf test -v "new " 2>&1 | grep -i wbnoinvd
        Decoded ok: f3 0f 09                    wbnoinvd
        Decoded ok: f3 0f 09                    wbnoinvd
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20191115135447.6519-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      x86/insn: perf tools: Add some instructions to the new instructions test · 1e5f0154
      Adrian Hunter committed
      Add to the "x86 instruction decoder - new instructions" test the following
      instructions:
      	cldemote
      	tpause
      	umonitor
      	umwait
      	movdiri
      	movdir64b
      	enqcmd
      	enqcmds
      	encls
      	enclu
      	enclv
      	pconfig
      	wbnoinvd
      
      For information about the instructions, refer to the Intel SDM May 2019
      (325462-070US) and the Intel Architecture Instruction Set Extensions
      May 2019 (319433-037).
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20191115135447.6519-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf map: Move seldom used ->flags field to second cacheline · 7624e694
      Arnaldo Carvalho de Melo committed
      So we start with:
      
        $ pahole -C map ~/bin/perf
        struct map {
        	union {
        		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
        		struct list_head node;                   /*     0    16 */
        	} __attribute__((__aligned__(8)));                                               /*     0    24 */
        	u64                        start;                /*    24     8 */
        	u64                        end;                  /*    32     8 */
        	_Bool                      erange_warned:1;      /*    40: 0  1 */
        	_Bool                      priv:1;               /*    40: 1  1 */
      
        	/* XXX 6 bits hole, try to pack */
        	/* XXX 3 bytes hole, try to pack */
      
        	u32                        prot;                 /*    44     4 */
        	u32                        flags;                /*    48     4 */
      
        	/* XXX 4 bytes hole, try to pack */
      
        	u64                        pgoff;                /*    56     8 */
        	/* --- cacheline 1 boundary (64 bytes) --- */
        	u64                        reloc;                /*    64     8 */
        	u32                        maj;                  /*    72     4 */
        	u32                        min;                  /*    76     4 */
        	u64                        ino;                  /*    80     8 */
        	u64                        ino_generation;       /*    88     8 */
        	u64                        (*map_ip)(struct map *, u64); /*    96     8 */
        	u64                        (*unmap_ip)(struct map *, u64); /*   104     8 */
        	struct dso *               dso;                  /*   112     8 */
        	refcount_t                 refcnt;               /*   120     4 */
      
        	/* size: 128, cachelines: 2, members: 17 */
        	/* sum members: 116, holes: 2, sum holes: 7 */
        	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
        	/* padding: 4 */
        	/* forced alignments: 1 */
        } __attribute__((__aligned__(8)));
        $
      
      Since 'flags' is seldom used (only when printing details about the map
      or with the "cacheline" sort order), we can move it to the second
      cacheline, which allows combining it with 'refcnt', which is only four
      bytes:
      
        $ pahole -C map ~/bin/perf
        struct map {
        	union {
        		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
        		struct list_head node;                   /*     0    16 */
        	} __attribute__((__aligned__(8)));                                               /*     0    24 */
        	u64                        start;                /*    24     8 */
        	u64                        end;                  /*    32     8 */
        	_Bool                      erange_warned:1;      /*    40: 0  1 */
        	_Bool                      priv:1;               /*    40: 1  1 */
      
        	/* XXX 6 bits hole, try to pack */
        	/* XXX 3 bytes hole, try to pack */
      
        	u32                        prot;                 /*    44     4 */
        	u64                        pgoff;                /*    48     8 */
        	u64                        reloc;                /*    56     8 */
        	/* --- cacheline 1 boundary (64 bytes) --- */
        	u32                        maj;                  /*    64     4 */
        	u32                        min;                  /*    68     4 */
        	u64                        ino;                  /*    72     8 */
        	u64                        ino_generation;       /*    80     8 */
        	u64                        (*map_ip)(struct map *, u64); /*    88     8 */
        	u64                        (*unmap_ip)(struct map *, u64); /*    96     8 */
        	struct dso *               dso;                  /*   104     8 */
        	refcount_t                 refcnt;               /*   112     4 */
        	u32                        flags;                /*   116     4 */
      
        	/* size: 120, cachelines: 2, members: 17 */
        	/* sum members: 116, holes: 1, sum holes: 3 */
        	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
        	/* forced alignments: 1 */
        	/* last cacheline: 56 bytes */
        } __attribute__((__aligned__(8)));
        $
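      
      The effect of pairing two four-byte members can be reproduced with a
      much smaller sketch. The two structs below are simplified stand-ins, not
      the real struct map, and the offsets in the comments assume a typical
      LP64 ABI in which uint64_t is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in: 'flags' sits alone before an 8-byte member. */
struct layout_before {
	uint64_t start;   /* offset 0 */
	uint8_t  bits;    /* offset 8, followed by a 3-byte hole */
	uint32_t prot;    /* offset 12 */
	uint32_t flags;   /* offset 16, followed by a 4-byte hole */
	uint64_t pgoff;   /* offset 24 */
	uint32_t refcnt;  /* offset 32, followed by 4 bytes of padding */
};

/* Same members, with 'flags' moved next to 'refcnt' at the end. */
struct layout_after {
	uint64_t start;   /* offset 0 */
	uint8_t  bits;    /* offset 8, followed by a 3-byte hole */
	uint32_t prot;    /* offset 12 */
	uint64_t pgoff;   /* offset 16 */
	uint32_t refcnt;  /* offset 24 */
	uint32_t flags;   /* offset 28, fills what used to be padding */
};

/* Reordering must never make the struct larger. */
static_assert(sizeof(struct layout_after) <= sizeof(struct layout_before),
	      "moving flags next to refcnt should not grow the struct");
```

      Under the stated ABI assumption the first layout occupies 40 bytes and
      the second 32, the same kind of saving pahole reports above for the
      real struct (128 to 120 bytes).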
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-2cdw3zlw1mkamaf7nqtdlxfi@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf map: Use bitmap for booleans · dbc984c9
      Arnaldo Carvalho de Melo committed
      map->priv and map->erange_warned are seldom used: the former only in
      tests/vmlinux-kallsyms.c, the latter only when
      hist_entry__inc_addr_samples() returns -ERANGE in 'perf top'. These are
      really rare occasions, so make them a bool bitfield.
      
      This will open up space for other members on the first cacheline.
      
        $ pahole -C map ~/bin/perf
        struct map {
        	union {
        		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
        		struct list_head node;                   /*     0    16 */
        	} __attribute__((__aligned__(8)));                                               /*     0    24 */
        	u64                        start;                /*    24     8 */
        	u64                        end;                  /*    32     8 */
        	_Bool                      erange_warned:1;      /*    40: 0  1 */
        	_Bool                      priv:1;               /*    40: 1  1 */
      
        	/* XXX 6 bits hole, try to pack */
        	/* XXX 3 bytes hole, try to pack */
      
        	u32                        prot;                 /*    44     4 */
        	u32                        flags;                /*    48     4 */
      
        	/* XXX 4 bytes hole, try to pack */
      
        	u64                        pgoff;                /*    56     8 */
        	/* --- cacheline 1 boundary (64 bytes) --- */
        	u64                        reloc;                /*    64     8 */
        	u32                        maj;                  /*    72     4 */
        	u32                        min;                  /*    76     4 */
        	u64                        ino;                  /*    80     8 */
        	u64                        ino_generation;       /*    88     8 */
        	u64                        (*map_ip)(struct map *, u64); /*    96     8 */
        	u64                        (*unmap_ip)(struct map *, u64); /*   104     8 */
        	struct dso *               dso;                  /*   112     8 */
        	refcount_t                 refcnt;               /*   120     4 */
      
        	/* size: 128, cachelines: 2, members: 17 */
        	/* sum members: 116, holes: 2, sum holes: 7 */
        	/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
        	/* padding: 4 */
        	/* forced alignments: 1 */
        } __attribute__((__aligned__(8)));
        $
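      
      As a rough illustration of why a bool bitfield frees up space, compare
      two plain bools with two one-bit bitfield members. The struct names are
      hypothetical; only the member names come from the commit above, and the
      exact sizes are compiler dependent.

```c
#include <assert.h>
#include <stdbool.h>

/* Two ordinary bools: each one takes its own storage unit. */
struct flags_plain {
	bool erange_warned;
	bool priv;
};

/* The same two flags packed as one-bit bitfield members. */
struct flags_bitfield {
	bool erange_warned:1;
	bool priv:1;
};

/* The bitfield variant is expected to be no larger than the plain one. */
static_assert(sizeof(struct flags_bitfield) <= sizeof(struct flags_plain),
	      "bitfield flags should not take more space than plain bools");
```

      The members are still read and written like ordinary bools, e.g.
      f.priv = true, so callers do not need to change.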
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-g5545pcq4ff0wr17tfb1piqt@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      libtraceevent: Fix parsing of event %o and %X argument types · 10f64581
      Konstantin Khlebnikov committed
      Add missing "%o" and "%X". Ext4 events use "%o" for printing i_mode.
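      
      For reference, this is what the two conversions produce with the
      standard C printf family. The sketch does not show the libtraceevent
      parser itself, and format_octal() and format_hex_upper() are
      hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format 'value' in octal, as "%o" does (e.g. for a file mode). */
static int format_octal(char *buf, size_t size, unsigned int value)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%o", value);
}

/* Format 'value' in upper-case hexadecimal, as "%X" does. */
static int format_hex_upper(char *buf, size_t size, unsigned int value)
{
	assert(buf != NULL);
	return snprintf(buf, size, "%X", value);
}
```

      A regular file with mode 0100644 formats as "100644" with "%o", and the
      value 255 formats as "FF" with "%X".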
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
      Link: http://lore.kernel.org/lkml/157338066113.6548.11461421296091086041.stgit@buzz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf callchain: Fix segfault in thread__resolve_callchain_sample() · aceb9826
      Adrian Hunter committed
      Do not dereference 'chain' when it is NULL.
      
        $ perf record -e intel_pt//u -e branch-misses:u uname
        $ perf report --itrace=l --branch-history
        perf: Segmentation fault
      
      Fixes: e9024d51 ("perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20191114142538.4097-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf map_groups: Auto sort maps by name, if needed · a7c2b572
      Arnaldo Carvalho de Melo committed
      There are still lots of lookups by name, even if only when loading
      vmlinux. Until that code is studied to figure out if it's possible to do
      away with those map lookups by name, provide a way to sort the maps by
      name using libc's qsort/bsearch.
      
      Doing it at the first lookup defers the sorting a bit and, as the code
      stands now, it is never done for user maps, just for the kernel ones.
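      
      The qsort/bsearch approach can be sketched on its own as follows. This
      is a minimal sketch under simplifying assumptions, not the perf code:
      struct named_map and the helper functions are hypothetical, and an
      array of pointers stands in for the maps_by_name array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a map entry; only the name matters here. */
struct named_map {
	const char *name;
};

/* qsort() comparator: both arguments point at array elements. */
static int cmp_maps_by_name(const void *a, const void *b)
{
	const struct named_map *const *ma = a;
	const struct named_map *const *mb = b;

	return strcmp((*ma)->name, (*mb)->name);
}

/* bsearch() comparator: 'key' is the name, 'elem' points at an element. */
static int cmp_name_to_map(const void *key, const void *elem)
{
	const struct named_map *const *m = elem;

	return strcmp(key, (*m)->name);
}

/* Sort the pointer array once so that later lookups can use bsearch(). */
static void sort_maps_by_name(struct named_map **maps, size_t nr)
{
	qsort(maps, nr, sizeof(*maps), cmp_maps_by_name);
}

/* Binary search in the sorted array; returns NULL when nothing matches. */
static struct named_map *find_map_by_name(struct named_map **sorted,
					   size_t nr, const char *name)
{
	struct named_map **mapp;

	assert(name != NULL);
	mapp = bsearch(name, sorted, nr, sizeof(*mapp), cmp_name_to_map);
	return mapp ? *mapp : NULL;
}
```

      Sorting costs O(n log n) once, and each lookup afterwards is O(log n),
      which pays off when many lookups by name follow, as when loading
      vmlinux.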
      
        # perf probe -l
        # perf probe -x ~/bin/perf -L __map_groups__find_by_name
        <__map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
              0  static struct map *__map_groups__find_by_name(struct map_groups *mg, const char *name)
              1  {
                        struct map **mapp;
      
              4         if (mg->maps_by_name == NULL &&
              5             map__groups__sort_by_name_from_rbtree(mg))
              6                 return NULL;
      
              8         mapp = bsearch(name, mg->maps_by_name, mg->nr_maps, sizeof(*mapp), map__strcmp_name);
              9         if (mapp)
             10                 return *mapp;
             11         return NULL;
             12  }
      
                 struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
                 {
      
        # perf probe -x ~/bin/perf 'found=__map_groups__find_by_name:10 name:string'
        Added new event:
          probe_perf:found     (on __map_groups__find_by_name:10 in /home/acme/bin/perf with name:string)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_perf:found -aR sleep 1
      
        #
        # perf probe -x ~/bin/perf -L map_groups__find_by_name
        <map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
              0  struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
              1  {
              2         struct maps *maps = &mg->maps;
                        struct map *map;
      
              5         down_read(&maps->lock);
      
              7         if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
              8                 map = mg->last_search_by_name;
              9                 goto out_unlock;
                        }
                        /*
                         * If we have mg->maps_by_name, then the name isn't in the rbtree,
                         * as mg->maps_by_name mirrors the rbtree when lookups by name are
                         * made.
                         */
             16         map = __map_groups__find_by_name(mg, name);
             17         if (map || mg->maps_by_name != NULL)
             18                 goto out_unlock;
      
                        /* Fallback to traversing the rbtree... */
             21         maps__for_each_entry(maps, map)
             22                 if (strcmp(map->dso->short_name, name) == 0) {
             23                         mg->last_search_by_name = map;
             24                         goto out_unlock;
                                }
      
             27         map = NULL;
      
                 out_unlock:
             30         up_read(&maps->lock);
             31         return map;
             32  }
      
                 int dso__load_vmlinux(struct dso *dso, struct map *map,
                                      const char *vmlinux, bool vmlinux_allocated)
      
        # perf probe -x ~/bin/perf 'fallback=map_groups__find_by_name:21 name:string'
        Added new events:
          probe_perf:fallback  (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
          probe_perf:fallback_1 (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_perf:fallback_1 -aR sleep 1
      
        #
        # perf probe -l
          probe_perf:fallback  (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
          probe_perf:fallback_1 (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
          probe_perf:found     (on __map_groups__find_by_name:10@util/symbol.c in /home/acme/bin/perf with name_string)
        #
        # perf stat -e probe_perf:*
      
      Now run 'perf top' in another terminal and then, after a while, stop 'perf stat':
      
      Furthermore, if we ask for interval printing, we can see that that is done just
      at the start of the workload:
      
        # perf stat -I1000 -e probe_perf:*
        #           time             counts unit events
             1.000319513                  0      probe_perf:found
             1.000319513                  0      probe_perf:fallback_1
             1.000319513                  0      probe_perf:fallback
             2.001868092             23,251      probe_perf:found
             2.001868092                  0      probe_perf:fallback_1
             2.001868092                  0      probe_perf:fallback
             3.002901597                  0      probe_perf:found
             3.002901597                  0      probe_perf:fallback_1
             3.002901597                  0      probe_perf:fallback
             4.003358591                  0      probe_perf:found
             4.003358591                  0      probe_perf:fallback_1
             4.003358591                  0      probe_perf:fallback
        ^C
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-c5lmbyr14x448rcfii7y6t3k@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7c2b572
    • A
      perf machine: No need to check if kernel module maps pre-exist · a94ab91a
      Arnaldo Carvalho de Melo committed
      We're only populating maps for kernel modules either from perf.data file
      PERF_RECORD_MMAP records or when parsing /proc/modules, so there is no
      need to first look if we already have those module maps in the list,
      that would mean the kernel has duplicate entries.
      
      So ditch one use of looking up maps by name.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-gnzjg2hhuz6jnrw91m35059y@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a94ab91a
  2. 18 Nov, 2019 4 commits
    • A
      perf record: No need to process the synthesized MMAP events twice · 6e0a9b3d
      Arnaldo Carvalho de Melo committed
      At the end of a 'perf record' session, by default, we'll process all
      samples and populate the threads, maps, etc so as to find out which of
      the DSOs got samples, to reduce the size of the build-id table we'll
      add to the perf.data headers.
      
      But we don't need to process the PERF_RECORD_MMAP events synthesized
      for the kernel modules, as we have those already via
      perf_session__create_kernel_maps(), so add mmap/mmap2 handlers that
      first look at event->header.misc to see if the event is for a user map,
      bailing out if not.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-mofoxvcx2dryppcw3o689jdd@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6e0a9b3d
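The filtering described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the patch: the `sketch_*` names and the simplified header struct are hypothetical, and comparing the masked cpumode against `PERF_RECORD_MISC_USER` is an assumption about how "is for a user map" could be tested. Only the `PERF_RECORD_MISC_*` values are taken from the perf_event UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the perf_event UAPI header (linux/perf_event.h). */
#define PERF_RECORD_MISC_CPUMODE_MASK 7
#define PERF_RECORD_MISC_KERNEL       1
#define PERF_RECORD_MISC_USER         2

/* Simplified, hypothetical stand-in for the event header; only 'misc' matters here. */
struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Counts how many events reached the expensive path (illustration only). */
static int sketch_forwarded;

/* Hypothetical stand-in for the regular, more expensive mmap processing. */
static int sketch_process_mmap(const struct sketch_event_header *hdr)
{
	(void)hdr;
	sketch_forwarded++;
	return 0;
}

/* True when the cpumode bits in 'misc' say the mapping belongs to user space. */
static bool sketch_is_user_map(const struct sketch_event_header *hdr)
{
	return (hdr->misc & PERF_RECORD_MISC_CPUMODE_MASK) == PERF_RECORD_MISC_USER;
}

/* Handler that bails out early for anything that is not a user map. */
static int sketch_build_id_mmap_handler(const struct sketch_event_header *hdr)
{
	if (!sketch_is_user_map(hdr))
		return 0; /* kernel/module maps are assumed to be set up elsewhere */
	return sketch_process_mmap(hdr);
}
```

Calling `sketch_build_id_mmap_handler()` with a header whose `misc` is `PERF_RECORD_MISC_KERNEL` returns without touching `sketch_forwarded`, while a `PERF_RECORD_MISC_USER` header is passed on.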
    • A
      perf map: No need to adjust the long name of modules · f068435d
      Arnaldo Carvalho de Melo committed
      At some point in the past we needed to make sure we would get the long
      name of modules and not just what we get from /proc/modules, but that
      need, as described in the cset that introduced the adjustment function,
      no longer seems to exist:
      
      Fixes: c03d5184 ("perf machine: Adjust dso->long_name for offline module")
      
      Without using the buildid-cache:
      
        # lsmod | grep trusted
        # insmod trusted.ko
        # lsmod | grep trusted
        trusted                24576  0
        # strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
        openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
        openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
        openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
          probe:key_seal       (on key_seal in trusted)
        # perf probe -l
          probe:key_seal       (on key_seal in trusted)
        #
      
      No attempt at opening '[trusted]'.
      
      Now using the build-id cache:
      
        # rmmod trusted
        # perf buildid-cache --add ./trusted.ko
        # insmod trusted.ko
        # strace -e open,openat perf probe -m ./trusted.ko key_seal |& grep trusted
        openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 4
        openat(AT_FDCWD, "/sys/module/trusted/notes/.note.gnu.build-id", O_RDONLY) = 7
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/.debug/root/trusted.ko/dd3d355d567394d540f527e093e0f64b95879584/probes", O_RDWR|O_CREAT, 0644) = 3
        openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/usr/lib/debug/root/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/.debug/trusted.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, ".debug/trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "trusted.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 4
        openat(AT_FDCWD, "/root/trusted.ko", O_RDONLY) = 3
        #
      
      Again, no attempt at reading '[trusted]'.
      
      Finally, adding a probe to that function and then using:
      
      [root@quaco ~]# perf trace -e probe_perf:*/max-stack=16/ --max-events=2
           0.000 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
                                             dso__adjust_kmod_long_name (/home/acme/bin/perf)
                                             machine__process_kernel_mmap_event (/home/acme/bin/perf)
                                             machine__process_mmap_event (/home/acme/bin/perf)
                                             perf_event__process_mmap (/home/acme/bin/perf)
                                             machines__deliver_event (/home/acme/bin/perf)
                                             perf_session__deliver_event (/home/acme/bin/perf)
                                             perf_session__process_event (/home/acme/bin/perf)
                                             process_simple (/home/acme/bin/perf)
                                             reader__process_events (/home/acme/bin/perf)
                                             __perf_session__process_events (/home/acme/bin/perf)
                                             perf_session__process_events (/home/acme/bin/perf)
                                             process_buildids (/home/acme/bin/perf)
                                             record__finish_output (/home/acme/bin/perf)
                                             __cmd_record (/home/acme/bin/perf)
                                             cmd_record (/home/acme/bin/perf)
                                             run_builtin (/home/acme/bin/perf)
           0.055 perf/13456 probe_perf:dso__adjust_kmod_long_name(__probe_ip: 5492263)
                                             dso__adjust_kmod_long_name (/home/acme/bin/perf)
                                             machine__process_kernel_mmap_event (/home/acme/bin/perf)
                                             machine__process_mmap_event (/home/acme/bin/perf)
                                             perf_event__process_mmap (/home/acme/bin/perf)
                                             machines__deliver_event (/home/acme/bin/perf)
                                             perf_session__deliver_event (/home/acme/bin/perf)
                                             perf_session__process_event (/home/acme/bin/perf)
                                             process_simple (/home/acme/bin/perf)
                                             reader__process_events (/home/acme/bin/perf)
                                             __perf_session__process_events (/home/acme/bin/perf)
                                             perf_session__process_events (/home/acme/bin/perf)
                                             process_buildids (/home/acme/bin/perf)
                                             record__finish_output (/home/acme/bin/perf)
                                             __cmd_record (/home/acme/bin/perf)
                                             cmd_record (/home/acme/bin/perf)
                                             run_builtin (/home/acme/bin/perf)
        #
      
      This was the only path I could find, using the perf tools, that reaches this
      function as of November 2019. If we then put a probe on the line where the
      actual setting of dso->long_name is done:
      
        # perf trace -e probe_perf:*
        ^C[root@quaco ~]
        # perf stat -e probe_perf:*  -I 2000
             2.000404265                  0      probe_perf:dso__adjust_kmod_long_name
             4.001142200                  0      probe_perf:dso__adjust_kmod_long_name
             6.001704120                  0      probe_perf:dso__adjust_kmod_long_name
             8.002398316                  0      probe_perf:dso__adjust_kmod_long_name
            10.002984010                  0      probe_perf:dso__adjust_kmod_long_name
            12.003597851                  0      probe_perf:dso__adjust_kmod_long_name
            14.004113303                  0      probe_perf:dso__adjust_kmod_long_name
            16.004582773                  0      probe_perf:dso__adjust_kmod_long_name
            18.005176373                  0      probe_perf:dso__adjust_kmod_long_name
            20.005801605                  0      probe_perf:dso__adjust_kmod_long_name
            22.006467540                  0      probe_perf:dso__adjust_kmod_long_name
        ^C    23.683261941                  0      probe_perf:dso__adjust_kmod_long_name
      
        #
      
      It's not being used at all.
      
      To further test this I used kvm.ko as the offline module, i.e. removed
      it from the buildid-cache by nuking it completely (rm -rf ~/.debug),
      moved it out of the normal kernel distro path, removed the modules, stopped
      the kvm guest, and then installed it manually, etc.
      
        # rmmod kvm-intel
        # rmmod kvm
        # lsmod | grep kvm
        # modprobe kvm-intel
        modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
        modprobe: ERROR: ctx=0x55d3b1722260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
        modprobe: ERROR: could not insert 'kvm_intel': Unknown symbol in module, or unknown parameter (see dmesg)
        # insmod ./kvm.ko
        # modprobe kvm-intel
        modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
        modprobe: ERROR: ctx=0x562f34026260 path=/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm/kvm.ko.xz error=No such file or directory
        # lsmod | grep kvm
        kvm_intel             299008  0
        kvm                   765952  1 kvm_intel
        irqbypass              16384  1 kvm
        #
        # perf probe -x ~/bin/perf machine__findnew_module_map:12 mname=m.name:string filename=filename:string 'dso_long_name=map->dso->long_name:string' 'dso_name=map->dso->name:string'
        # perf probe -l
          probe_perf:machine__findnew_module_map (on machine__findnew_module_map:12@util/machine.c in /home/acme/bin/perf with mname filename dso_long_name dso_name)
        # perf record
        ^C[ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 3.416 MB perf.data (33956 samples) ]
        # perf trace -e probe_perf:machine*
        <SNIP>
             6.322 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[salsa20_generic]", filename: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_long_name: "/lib/modules/5.3.8-200.fc30.x86_64/kernel/crypto/salsa20_generic.ko.xz", dso_name: "[salsa20_generic]")
             6.375 perf/23099 probe_perf:machine__findnew_module_map(__probe_ip: 5492493, mname: "[kvm]", filename: "[kvm]", dso_long_name: "[kvm]", dso_name: "[kvm]")
        <SNIP>
      
      The filename doesn't come with the path, no point in trying to set the dso->long_name.
      
        [root@quaco ~]# strace -e open,openat perf probe -m ./kvm.ko kvm_apic_local_deliver |& egrep 'open.*kvm'
        openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 4
        openat(AT_FDCWD, "/sys/module/kvm/notes/.note.gnu.build-id", O_RDONLY) = 4
        openat(AT_FDCWD, "/lib/modules/5.3.8-200.fc30.x86_64/kernel/arch/x86/kvm", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 7
        openat(AT_FDCWD, "/sys/module/kvm_intel/notes/.note.gnu.build-id", O_RDONLY) = 8
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/.debug/root/kvm.ko/5955f426cb93f03f30f3e876814be2db80ab0b55/probes", O_RDWR|O_CREAT, 0644) = 3
        openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/usr/lib/debug/root/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/.debug/kvm.ko", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, ".debug/kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "kvm.ko.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 4
        openat(AT_FDCWD, "/root/kvm.ko", O_RDONLY) = 3
        [root@quaco ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-jlfew3lyb24d58egrp0o72o2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f068435d
    • A
      perf map_groups: Add a front end cache for map lookups by name · 1ae14516
      Arnaldo Carvalho de Melo committed
      Let's see if it helps:
      
      First look at the probeable lines for the function that does lookups by
      name in a map_groups struct:
      
        # perf probe -x ~/bin/perf -L map_groups__find_by_name
        <map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
              0  struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
              1  {
              2         struct maps *maps = &mg->maps;
                        struct map *map;
      
              5         down_read(&maps->lock);
      
              7         if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
              8                 map = mg->last_search_by_name;
              9                 goto out_unlock;
                        }
      
             12         maps__for_each_entry(maps, map)
             13                 if (strcmp(map->dso->short_name, name) == 0) {
             14                         mg->last_search_by_name = map;
             15                         goto out_unlock;
                                }
      
             18         map = NULL;
      
                 out_unlock:
             21         up_read(&maps->lock);
             22         return map;
             23  }
      
                 int dso__load_vmlinux(struct dso *dso, struct map *map,
                                      const char *vmlinux, bool vmlinux_allocated)
      
        #
      
      Now add a probe to the place where we reuse the last search:
      
        # perf probe -x ~/bin/perf map_groups__find_by_name:8
        Added new event:
          probe_perf:map_groups__find_by_name (on map_groups__find_by_name:8 in /home/acme/bin/perf)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_perf:map_groups__find_by_name -aR sleep 1
      
        #
      
      Now let's do a system wide 'perf stat' counting those events:
      
        # perf stat -e probe_perf:*
      
      Leave it running and let's do a 'perf top', then, after a while, stop the
      'perf stat':
      
        # perf stat -e probe_perf:*
        ^C
         Performance counter stats for 'system wide':
      
                     3,603      probe_perf:map_groups__find_by_name
      
              44.565253139 seconds time elapsed
        #
      
      yeah, good to have.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-tcz37g3nxv3tvxw3q90vga3p@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1ae14516
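The front end cache from the commit above can be sketched in isolation. This is a minimal sketch under stated assumptions: the `sketch_*` types are hypothetical, a singly linked list stands in for the rbtree traversal done by `maps__for_each_entry()`, the `down_read()`/`up_read()` locking is left out, and the two counters exist only to make the cache effect visible.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified map: just a name and a list link. */
struct sketch_map {
	const char *short_name;
	struct sketch_map *next;
};

struct sketch_map_groups {
	struct sketch_map *maps;                /* list head (stands in for the rbtree) */
	struct sketch_map *last_search_by_name; /* front end cache: last successful lookup */
	unsigned long cache_hits;               /* lookups answered by the cache */
	unsigned long traversals;               /* lookups that had to walk the list */
};

static struct sketch_map *sketch_find_by_name(struct sketch_map_groups *mg, const char *name)
{
	struct sketch_map *map;

	/* Fast path: the same name as the previous successful search. */
	if (mg->last_search_by_name &&
	    strcmp(mg->last_search_by_name->short_name, name) == 0) {
		mg->cache_hits++;
		return mg->last_search_by_name;
	}

	/* Slow path: linear walk, remembering the hit for next time. */
	mg->traversals++;
	for (map = mg->maps; map != NULL; map = map->next) {
		if (strcmp(map->short_name, name) == 0) {
			mg->last_search_by_name = map;
			return map;
		}
	}

	return NULL; /* a miss leaves the cached entry untouched */
}
```

Two consecutive lookups of the same name cost one traversal and one cache hit, which is the situation the probe on line 8 counts in the 'perf stat' run above.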
    • A
      perf maps: Do not use an rbtree to sort by map name · c5c584d2
      Arnaldo Carvalho de Melo committed
      This is only used for the kernel maps, shave 24 bytes out of 'struct map'
      and just traverse the existing per ip rbtree to look for maps by name,
      use a front end cache to reuse the last search if it's the same name.
      
      After this 'struct map' is down to just two cachelines:
      
        $ pahole -C map ~/bin/perf
        struct map {
        	union {
        		struct rb_node rb_node __attribute__((__aligned__(8))); /*     0    24 */
        		struct list_head node;                   /*     0    16 */
        	} __attribute__((__aligned__(8)));                                               /*     0    24 */
        	u64                        start;                /*    24     8 */
        	u64                        end;                  /*    32     8 */
        	_Bool                      erange_warned;        /*    40     1 */
      
        	/* XXX 3 bytes hole, try to pack */
      
        	u32                        priv;                 /*    44     4 */
        	u32                        prot;                 /*    48     4 */
        	u32                        flags;                /*    52     4 */
        	u64                        pgoff;                /*    56     8 */
        	/* --- cacheline 1 boundary (64 bytes) --- */
        	u64                        reloc;                /*    64     8 */
        	u32                        maj;                  /*    72     4 */
        	u32                        min;                  /*    76     4 */
        	u64                        ino;                  /*    80     8 */
        	u64                        ino_generation;       /*    88     8 */
        	u64                        (*map_ip)(struct map *, u64); /*    96     8 */
        	u64                        (*unmap_ip)(struct map *, u64); /*   104     8 */
        	struct dso *               dso;                  /*   112     8 */
        	refcount_t                 refcnt;               /*   120     4 */
      
        	/* size: 128, cachelines: 2, members: 17 */
        	/* sum members: 121, holes: 1, sum holes: 3 */
        	/* padding: 4 */
        	/* forced alignments: 1 */
        } __attribute__((__aligned__(8)));
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-bvr8fqfgzxtgnhnwt5sssx5g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5c584d2
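The size arithmetic in the pahole output above can be checked with a few lines of C. The helper names are hypothetical, the 64-byte cacheline comes from the "cacheline 1 boundary (64 bytes)" marker in the output, and the "before" size of 128 + 24 = 152 bytes is an assumption derived from the "shave 24 bytes" wording, not a figure stated in the commit.

```c
#include <stddef.h>

/* Cacheline size as reported by the pahole output above. */
#define SKETCH_CACHELINE_BYTES 64

/* Total struct size from pahole's summary lines: members + holes + tail padding. */
static size_t sketch_struct_size(size_t sum_members, size_t sum_holes, size_t padding)
{
	return sum_members + sum_holes + padding;
}

/* Number of cachelines a struct of the given size spans (rounded up). */
static size_t sketch_cachelines(size_t struct_size)
{
	return (struct_size + SKETCH_CACHELINE_BYTES - 1) / SKETCH_CACHELINE_BYTES;
}
```

With the reported figures, `sketch_struct_size(121, 3, 4)` gives 128 bytes, which spans two cachelines. If the struct was 24 bytes larger before the change, as the message suggests, it would have spilled into a third.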
  3. 14 Nov, 2019 1 commit
  4. 13 Nov, 2019 4 commits
    • A
      perf scripts python: exported-sql-viewer.py: Fix use of TRUE with SQLite · af833988
      Adrian Hunter committed
      Prior to version 3.23 SQLite does not support TRUE or FALSE, so always
      use 1 and 0 for SQLite.
      
      Fixes: 26c11206 ("perf scripts python: exported-sql-viewer.py: Use new 'has_calls' column")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v5.3+
      Link: http://lore.kernel.org/lkml/20191113120206.26957-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      af833988
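The idea behind the fix above, always spelling booleans as 1 and 0 when the backend is SQLite, can be sketched as below. The script itself is Python; this is only a C illustration with hypothetical `sketch_*` helpers, and the WHERE clause is made up for the example. Only the `has_calls` column name comes from the Fixes: tag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Boolean literal to use in generated SQL: SQLite before 3.23 lacks TRUE/FALSE. */
static const char *sketch_sql_true(bool is_sqlite)
{
	return is_sqlite ? "1" : "TRUE";
}

static const char *sketch_sql_false(bool is_sqlite)
{
	return is_sqlite ? "0" : "FALSE";
}

/* Formats an illustrative filter on the 'has_calls' column into buf. */
static int sketch_has_calls_clause(char *buf, size_t len, bool is_sqlite)
{
	return snprintf(buf, len, "WHERE has_calls = %s", sketch_sql_true(is_sqlite));
}
```

Choosing the literal in one place keeps the rest of the query-building code identical for both backends.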
    • J
      perf vendor events power9: Fix commas so PMU event files are valid JSON · da3ef7f6
      James Clark committed
      No functional change.
      
      Remove extra commas in the power9 JSON files so that the files
      can be parsed and validated by other utilities such as Python
      that fail to parse invalid JSON.
      
      Before:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/powerpc/power9/cache.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x300
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/floating-point.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x141
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/frontend.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x250
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/marked.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x301
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/memory.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x300
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/other.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x308
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/pipeline.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x4D0
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/pmc.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x200
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power9/translation.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x1E"
                             (right here) ------^
        JSON is invalid
        $
      
      After:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/powerpc/power9/cache.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/floating-point.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/frontend.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/marked.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/memory.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/other.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/pipeline.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/pmc.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power9/translation.json
        JSON is valid
        $
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kevin Mooney <kevin.mooney@arm.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: nd@arm.com
      Link: http://lore.kernel.org/lkml/20191112160342.26470-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      da3ef7f6
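The kind of defect reported above (`[   {,     "EventCode"`) is a comma that has nothing before or after it. A minimal detector for that one pattern can be sketched as follows. It is not a JSON validator and not how json_verify works; the function name is hypothetical, and it only flags commas that directly follow `{`, `[` or another comma, or directly precede `}` or `]`, ignoring whitespace and string contents.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns the byte offset of the first stray comma in 's', or -1 if none
 * is found. Commas inside string literals are ignored.
 */
static long sketch_find_stray_comma(const char *s)
{
	bool in_string = false;
	char prev = '\0';     /* last significant character seen outside strings */
	long last_comma = -1; /* offset of the most recent comma outside strings */

	for (size_t i = 0; s[i] != '\0'; i++) {
		char c = s[i];

		if (in_string) {
			if (c == '\\' && s[i + 1] != '\0')
				i++;               /* skip the escaped character */
			else if (c == '"')
				in_string = false; /* closing quote */
			continue;
		}

		if (isspace((unsigned char)c))
			continue;

		if (c == ',') {
			/* Leading comma: nothing precedes it in this object/array. */
			if (prev == '{' || prev == '[' || prev == ',' || prev == '\0')
				return (long)i;
			last_comma = (long)i;
		} else if ((c == '}' || c == ']') && prev == ',') {
			/* Trailing comma: nothing follows it before the close. */
			return last_comma;
		}

		if (c == '"')
			in_string = true;
		prev = c;
	}

	return -1;
}
```

For the input `[ {, "EventCode": "0x300" } ]` the function reports offset 3, the position of the comma right after the opening brace. The same text without that comma yields -1.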
    • J
      perf vendor events power8: Fix commas so PMU event files are valid JSON · 835e5bd9
      James Clark committed
      No functional change.
      
      Remove extra commas in the power8 JSON files so that the files
      can be parsed and validated by other utilities such as Python
      that fail to parse invalid JSON.
      
      Committer testing:
      
      Before:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/powerpc/power8/cache.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x4c0
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/floating-point.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x200
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/frontend.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x250
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/marked.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x351
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/memory.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x100
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/other.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x1f0
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/pipeline.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x100
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/pmc.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x200
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/powerpc/power8/translation.json
        parse error: invalid object key (must be a string)
                                                [   {,     "EventCode": "0x4c0
                             (right here) ------^
        JSON is invalid
        $
      
      After:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/powerpc/power8/cache.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/floating-point.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/frontend.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/marked.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/memory.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/other.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/pipeline.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/pmc.json
        JSON is valid
        tools/perf/pmu-events/arch/powerpc/power8/translation.json
        JSON is valid
        $
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kevin Mooney <kevin.mooney@arm.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: nd@arm.com
      Link: http://lore.kernel.org/lkml/20191112160342.26470-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      835e5bd9
    • J
      perf vendor events arm64: Fix commas so PMU event files are valid JSON · a44e4f3a
      James Clark committed
      No functional change.
      
      Add missing commas and remove extra ones in the arm64 JSON files so
      that the files can be parsed and validated by other utilities, such as
      Python, that refuse to parse invalid JSON.
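
      As a rough illustration of why strict parsers reject such files, the
      sketch below feeds a few made-up fragments to Python's standard json
      module. The fragments and the check_json helper are hypothetical and are
      not taken from the real event files; the exact error text depends on
      the Python version, so it is printed rather than assumed.

```python
import json


def check_json(text):
    """Return (True, None) if text is strict JSON, else (False, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# Hypothetical fragments, written only to show the two kinds of comma errors.
trailing_comma = '[ { "EventCode": "0x100", }, ]'
missing_comma = '[ { "EventCode": "0x100" } { "EventCode": "0x200" } ]'
fixed = '[ { "EventCode": "0x100" }, { "EventCode": "0x200" } ]'

for name, text in [("trailing comma", trailing_comma),
                   ("missing comma", missing_comma),
                   ("fixed", fixed)]:
    ok, message = check_json(text)
    print(name, "->", "JSON is valid" if ok else f"JSON is invalid ({message})")
```

      A loop over the file names printed by diffstat could call the same
      helper on each file's contents instead of piping them to json_verify.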
      
      Committer testing:
      
      Before:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/arm64/ampere/emag/branch.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/bus.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/cache.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/clock.json
        parse error: unallowed token at this point in JSON text
                                                [     {         "PublicDescrip
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/exception.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/instruction.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/intrinsic.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/memory.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/ampere/emag/pipeline.json
        parse error: unallowed token at this point in JSON text
                                                [     {         "PublicDescrip
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/branch.json
        parse error: invalid object key (must be a string)
                                                [   {     "ArchStdEvent":  "BR
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/bus.json
        parse error: invalid object key (must be a string)
                                                [   {         "ArchStdEvent":
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/other.json
        parse error: invalid object key (must be a string)
                                                [   {         "ArchStdEvent":
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a57-a72/core-imp-def.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/armv8-recommended.json
        parse error: after array element, I expect ',' or ']'
                                                [     {         "PublicDescrip
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/cavium/thunderx2/core-imp-def.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/core-imp-def.json
        parse error: invalid object key (must be a string)
                                                [     {         "ArchStdEvent"
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-ddrc.json
        parse error: invalid object key (must be a string)
                                                [    { 	    "EventCode": "0x00
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-hha.json
        parse error: invalid object key (must be a string)
                                                [    { 	    "EventCode": "0x00
                             (right here) ------^
        JSON is invalid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-l3c.json
        parse error: invalid object key (must be a string)
                                                [    { 	    "EventCode": "0x00
                             (right here) ------^
        JSON is invalid
        $
      
      After:
      
        $ diffstat -l -p1 /wb/1.patch | while read filename ; do echo $filename ; cat $filename | json_verify ; done
        tools/perf/pmu-events/arch/arm64/ampere/emag/branch.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/bus.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/cache.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/clock.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/exception.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/instruction.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/intrinsic.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/memory.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/ampere/emag/pipeline.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/branch.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/bus.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a53/other.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/arm/cortex-a57-a72/core-imp-def.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/armv8-recommended.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/cavium/thunderx2/core-imp-def.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/core-imp-def.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-ddrc.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-hha.json
        JSON is valid
        tools/perf/pmu-events/arch/arm64/hisilicon/hip08/uncore-l3c.json
        JSON is valid
        $
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kevin Mooney <kevin.mooney@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: nd@arm.com
      Link: http://lore.kernel.org/lkml/20191112160342.26470-1-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a44e4f3a
  5. 12 Nov, 2019 15 commits
  6. 07 Nov, 2019 2 commits
    • J
      perf report: Sort by sampled cycles percent per block for tui · 7fa46cbf
      Jin Yao committed
      The previous patch implemented a new option, "--total-cycles", but only
      stdio mode was supported.
      
      This patch adds support for tui mode and for '--percent-limit'.
      
      For example,
      
       perf record -b ./div
       perf report --total-cycles --percent-limit 1
      
       # Samples: 2753248 of event 'cycles'
       Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
                26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
                15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
                 5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
                 4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
                 4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
                 3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
                 3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
                 3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
                 2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
                 2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
                 2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
                 2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
                 2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
                 2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
                 1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
                 1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
                 1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
      
      --------------------------------------------------
      
       v7:
       ---
       1. Since we already use use_browser in report__browse_block_hists
          to support stdio mode, now we also add support for tui.
      
       2. Move block tui browser code from ui/browsers/hists.c
          to block-info.c.
      
       v6:
       ---
       Create report__tui_browse_block_hists in block-info.c
       (the code is moved from builtin-report.c).
      
       v5:
       ---
       Fix a crash issue when running perf report without
       '--total-cycles'. The issue is because the internal flag
       is renamed from 'total_cycles' to 'total_cycles_mode' in
       previous patch but this patch still uses 'total_cycles'
       to check if the '--total-cycles' option is enabled, which
       causes the code to be inconsistent.
      
       v4:
       ---
       Since the block collection was moved out of printing in the
       previous patch, this patch is updated accordingly for
       tui support.
      
       v3:
       ---
       Minor change since the function name is changed:
       block_total_cycles_percent -> block_info__total_cycles_percent
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-8-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7fa46cbf
    • J
      perf report: Support --percent-limit for --total-cycles · 0b49f836
      Jin Yao committed
      The previous patch added support for the '--total-cycles' option.
      It is also useful to show only the entries above a threshold percent.
      
      This patch enables '--percent-limit' to hide entries below that
      percent.
      
      For example:
      
       perf report --total-cycles --stdio --percent-limit 1
      
       # To display the perf.data header info, please use --header/--header-only options.
       #
       #
       # Total Lost Samples: 0
       #
       # Samples: 2M of event 'cycles'
       # Event count (approx.): 2753248
       #
       # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                              [Program Block Range]         Shared Object
       # ...............  ..............  ...........  ..........  .................................................................  ....................
       #
                  26.04%            2.8M        0.40%          18                                             [div.c:42 -> div.c:39]                   div
                  15.17%            1.2M        0.16%           7                                 [random_r.c:357 -> random_r.c:380]          libc-2.27.so
                   5.11%          402.0K        0.04%           2                                             [div.c:27 -> div.c:28]                   div
                   4.87%          381.6K        0.04%           2                                     [random.c:288 -> random.c:291]          libc-2.27.so
                   4.53%          381.0K        0.04%           2                                             [div.c:40 -> div.c:40]                   div
                   3.85%          300.9K        0.02%           1                                             [div.c:22 -> div.c:25]                   div
                   3.08%          241.1K        0.02%           1                                           [rand.c:26 -> rand.c:27]          libc-2.27.so
                   3.06%          240.0K        0.02%           1                                     [random.c:291 -> random.c:291]          libc-2.27.so
                   2.78%          215.7K        0.02%           1                                     [random.c:298 -> random.c:298]          libc-2.27.so
                   2.52%          198.3K        0.02%           1                                     [random.c:293 -> random.c:293]          libc-2.27.so
                   2.36%          184.8K        0.02%           1                                           [rand.c:28 -> rand.c:28]          libc-2.27.so
                   2.33%          180.5K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
                   2.28%          176.7K        0.02%           1                                     [random.c:295 -> random.c:295]          libc-2.27.so
                   2.20%          168.8K        0.02%           1                                         [rand@plt+0 -> rand@plt+0]                   div
                   1.98%          158.2K        0.02%           1                                 [random_r.c:388 -> random_r.c:388]          libc-2.27.so
                   1.57%          123.3K        0.02%           1                                             [div.c:42 -> div.c:44]                   div
                   1.44%          116.0K        0.42%          19                                 [random_r.c:357 -> random_r.c:394]          libc-2.27.so
      
      Committer testing:
      
      From the second example onwards, slightly edited for brevity:
      
        # perf report --total-cycles --percent-limit 2 --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 6M of event 'cycles'
        # Event count (approx.): 6299936
        #
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
        # ...............  ..............  ...........  ..........  ......................................................................  ....................
        #
                    2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
        #
        # (Tip: Create an archive with symtabs to analyse on other machine: perf archive)
        #
        # perf report --total-cycles --percent-limit 1 --stdio
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
                    2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
                    1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
        #
        # perf report --total-cycles --percent-limit 0.7 --stdio
        # Sampled Cycles%  Sampled Cycles  Avg Cycles%  Avg Cycles                                                   [Program Block Range]         Shared Object
                    2.17%            1.7M        0.08%         607                                        [compiler.h:199 -> common.c:221]      [kernel.vmlinux]
                    1.75%            1.3M        8.34%       65.5K    [memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]          libc-2.29.so
                    0.72%          544.5K        0.03%         230                                      [entry_64.S:657 -> entry_64.S:662]      [kernel.vmlinux]
        #
      
      -------------------------------------------
      
      It only shows the entries whose 'Sampled Cycles%' is above 1%.
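
      The filtering can be pictured with a small sketch. This is not the
      perf implementation; apply_percent_limit is a hypothetical helper, the
      rows are the three 'Sampled Cycles%' values from the committer testing
      output above, and treating an entry exactly at the limit as shown is an
      assumption.

```python
def apply_percent_limit(rows, limit):
    """Keep only rows whose sampled-cycles percent reaches the limit.

    rows is a list of (percent, block_range) tuples; the ">=" comparison
    is an assumption about how a value exactly at the limit is handled.
    """
    return [row for row in rows if row[0] >= limit]


# Percent values and block ranges copied from the committer testing output.
rows = [
    (2.17, "[compiler.h:199 -> common.c:221]"),
    (1.75, "[memset-vec-unaligned-erms.S:147 -> memset-vec-unaligned-erms.S:151]"),
    (0.72, "[entry_64.S:657 -> entry_64.S:662]"),
]

for limit in (2, 1, 0.7):
    kept = apply_percent_limit(rows, limit)
    print(f"--percent-limit {limit}: {len(kept)} entries shown")
```

      With limits of 2, 1 and 0.7 the sketch keeps one, two and three rows,
      matching the three reports shown in the committer testing.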
      
       v7:
       ---
       No functional change. Only fix the conflict issue because
       previous patches are changed.
      
       v6:
       ---
       No functional change. Only fix the conflict issue because
       previous patches are changed.
      
       v5:
       ---
       No functional change. Only fix the conflict issue because
       previous patches are changed.
      
       v4:
       ---
       No functional change. Only fix the build issue because
       previous patches are changed.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20191107074719.26139-7-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b49f836