1. 05 Jun, 2019 1 commit
  2. 31 May, 2019 4 commits
  3. 28 May, 2019 8 commits
    • T
      perf record: Fix s390 missing module symbol and warning for non-root users · 6738028d
      Thomas Richter authored
      Commands 'perf record' and 'perf report' on a system without kernel
      debuginfo packages use /proc/kallsyms and /proc/modules to find
      addresses for kernel and module symbols. On x86 this works for root and
      non-root users.
      
      On s390, when invoked as a non-root user, many of the following warnings
      are shown and module symbols are missing:
      
          proc/{kallsyms,modules} inconsistency while looking for
              "[sha1_s390]" module!
      
      Command 'perf record' creates a list of module start addresses by
      parsing the output of /proc/modules and creates a PERF_RECORD_MMAP
      record for the kernel and each module. The following function call
      sequence is executed:
      
        machine__create_kernel_maps
          machine__create_modules
            modules__parse
              machine__create_module --> for each line in /proc/modules
                arch__fix_module_text_start
      
      Function arch__fix_module_text_start() is s390 specific. It opens
      file /sys/module/<name>/sections/.text to extract the module's .text
      section start address. On s390 the module loader prepends a header
      before the first section, whereas on x86 the module's text section
      address is identical to the module's load address.
      
      However, module section files are readable by root only. For non-root
      users the read operation fails and machine__create_module() returns an
      error. Command 'perf record' then does not generate any PERF_RECORD_MMAP
      record for loaded modules. Later, command 'perf report' complains about
      missing module maps.
      
      To fix this, function arch__fix_module_text_start() always returns
      success. For root users there is no change; for non-root users
      the module's load address is used as the module's text start address
      (the prepended header then counts as part of the text section).
      
      This enables non-root users to use module symbols and avoids the
      warning when perf report is executed.
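
      The fallback described above can be sketched as follows. This is an
      illustrative Python model, not the perf C code: fix_module_text_start(),
      read_text_addr and sysfs_reader() are hypothetical names, and only the
      /sys/module/<name>/sections/.text path comes from the commit message.

```python
from typing import Callable, Optional


def fix_module_text_start(
    load_addr: int,
    module_name: str,
    read_text_addr: Callable[[str], Optional[int]],
) -> int:
    """Return the module's .text start, or the load address as a fallback.

    read_text_addr is a hypothetical helper standing in for reading
    /sys/module/<name>/sections/.text; it returns None when the file
    cannot be read (for example, for a non-root user).
    """
    text_addr = read_text_addr(module_name)
    if text_addr is None:
        # Non-root case: keep the load address, so the prepended header
        # is counted as part of the text section.
        return load_addr
    return text_addr


def sysfs_reader(module_name: str) -> Optional[int]:
    """Read the .text address from sysfs; return None if unreadable."""
    path = f"/sys/module/{module_name}/sections/.text"
    try:
        with open(path) as f:
            return int(f.read().strip(), 16)
    except (OSError, ValueError):
        return None
```

      Passing the reader in as a parameter keeps the sketch testable without
      root access; fix_module_text_start(addr, name, sysfs_reader) would use
      the real sysfs file.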
      
      Output before:
      
        [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
        0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
      
      Output after:
      
        [tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
        0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
        0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz
        0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz
        0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf machine: Read also the end of the kernel · ed9adb20
      Jiri Olsa authored
      We mark the end of the kernel based on the first module, but that range
      could cover some bpf program maps. Read the _etext symbol, if it's
      present, to get a precise kernel map end.
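
      A minimal sketch of that lookup, assuming /proc/kallsyms-style lines of
      the form "<hex address> <type> <name>". The function name
      kernel_map_end() and the addresses in the usage note are hypothetical;
      the real change lives in perf's C code.

```python
from typing import Iterable


def kernel_map_end(kallsyms_lines: Iterable[str], first_module_start: int) -> int:
    """Prefer the _etext address as the kernel map end.

    Falls back to the start of the first module when _etext is absent,
    which is the older, less precise behaviour described above.
    """
    for line in kallsyms_lines:
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "_etext":
            return int(fields[0], 16)
    return first_module_start
```

      For example, kernel_map_end(open("/proc/kallsyms"), 0xffffffffc0000000)
      would return the _etext address when the symbol is listed.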
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20190508132010.14512-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms · 93f678b9
      Arnaldo Carvalho de Melo authored
      There is no need to search for aliases of the symbol that marks the end
      of the kernel text segment. The following patch will make such symbols
      not be found when searching in the kallsyms maps, which would cause this
      test to fail.
      
      So, as a prep patch to avoid breaking bisection, ignore such symbols.
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lkml.kernel.org/n/tip-qfwuih8cvmk9doh7k5k244eq@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf session: Add missing swap ops for namespace events · acd244b8
      Namhyung Kim authored
      Needed in case the data was recorded on a different architecture.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Fixes: f3b3614a ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
      Link: http://lkml.kernel.org/r/20190522053250.207156-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • N
      perf namespace: Protect reading thread's namespace · 6584140b
      Namhyung Kim authored
      The current code does not appear to hold the namespace lock in
      thread__namespaces(). Without it, a reader can see inconsistent results.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Link: http://lkml.kernel.org/r/20190522053250.207156-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      tools include UAPI: Update copy of files related to new fspick, fsmount,... · fba29f18
      Arnaldo Carvalho de Melo authored
      tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
      
      Copy the headers changed by these csets:
      
        d8076bdb ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]")
        9c8ad7a2 ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")
        cf3cba4a ("vfs: syscall: Add fspick() to select a superblock for reconfiguration")
        93766fbd ("vfs: syscall: Add fsmount() to create a mount for a superblock")
        ecdab150 ("vfs: syscall: Add fsconfig() for configuring and managing a context")
        24dcb3d9 ("vfs: syscall: Add fsopen() to prepare for superblock creation")
        2db154b3 ("vfs: syscall: Add move_mount(2) to move mounts around")
        a07b2000 ("vfs: syscall: Add open_tree(2) to reference or clone a mount")
      
      We need to create tables for all the flags arguments in the new syscalls,
      in followup patches.
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
        diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-knpqr1u2ffvz6641056z2mwu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • V
      perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel · f95d050c
      Vitaly Chikunov authored
      When a host system has kernel headers that are newer than those of the
      kernel being compiled, mksyscalltbl fails with errors such as:
      
        <stdin>: In function 'main':
        <stdin>:271:44: error: '__NR_kexec_file_load' undeclared (first use in this function)
        <stdin>:271:44: note: each undeclared identifier is reported only once for each function it appears in
        <stdin>:272:46: error: '__NR_pidfd_send_signal' undeclared (first use in this function)
        <stdin>:273:43: error: '__NR_io_uring_setup' undeclared (first use in this function)
        <stdin>:274:43: error: '__NR_io_uring_enter' undeclared (first use in this function)
        <stdin>:275:46: error: '__NR_io_uring_register' undeclared (first use in this function)
        tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: line 48: /tmp/create-table-xvUQdD: Permission denied
      
      mksyscalltbl is compiled with the default host includes, but run with
      the includes of the kernel tree being compiled, causing some syscall
      numbers to be undeclared.
      
      Committer testing:
      
      Before this patch there were no build problems in my cross build
      environment, but these new syscalls were missing from the syscalls.c
      generated from the unistd.h file, which is a bug. This patch fixes it:
      
      perfbuilder@6e20056ed532:/git/perf$ tail /tmp/build/perf/arch/arm64/include/generated/asm/syscalls.c
      	[292] = "io_pgetevents",
      	[293] = "rseq",
      	[294] = "kexec_file_load",
      	[424] = "pidfd_send_signal",
      	[425] = "io_uring_setup",
      	[426] = "io_uring_enter",
      	[427] = "io_uring_register",
      	[428] = "syscalls",
      };
      perfbuilder@6e20056ed532:/git/perf$ strings /tmp/build/perf/perf | egrep '^(io_uring_|pidfd_|kexec_file)'
      kexec_file_load
      pidfd_send_signal
      io_uring_setup
      io_uring_enter
      io_uring_register
      perfbuilder@6e20056ed532:/git/perf$
      $
      
      Well, there is that last "syscalls" thing, but that looks like some
      other bug.
      Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Michael Petlan <mpetlan@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20190521030203.1447-1-vt@altlinux.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • S
      perf data: Fix 'strncat may truncate' build failure with recent gcc · 97acec7d
      Shawn Landden authored
      This strncat() is safe because the buffer was allocated with zalloc(),
      however gcc doesn't know that. Since the string always has 4 non-null
      bytes, just use memcpy() here.
      
          CC       /home/shawn/linux/tools/perf/util/data-convert-bt.o
        In file included from /usr/include/string.h:494,
                         from /home/shawn/linux/tools/lib/traceevent/event-parse.h:27,
                         from util/data-convert-bt.c:22:
        In function ‘strncat’,
            inlined from ‘string_set_value’ at util/data-convert-bt.c:274:4:
        /usr/include/powerpc64le-linux-gnu/bits/string_fortified.h:136:10: error: ‘__builtin_strncat’ output may be truncated copying 4 bytes from a string of length 4 [-Werror=stringop-truncation]
          136 |   return __builtin___strncat_chk (__dest, __src, __len, __bos (__dest));
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      Signed-off-by: Shawn Landden <shawn@git.icu>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      LPU-Reference: 20190518183238.10954-1-shawn@git.icu
      Link: https://lkml.kernel.org/n/tip-289f1jice17ta7tr3tstm9jm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 21 May, 2019 1 commit
  5. 17 May, 2019 9 commits
    • J
      perf stat: Support 'percore' event qualifier · 4fc4d8df
      Jin Yao authored
      With this patch, we can use the 'percore' event qualifier in perf-stat.
      
        root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
          1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
          1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
          1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
          1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
          1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)
      
      In this example, we count the event 'ref-cycles' per-core and per-CPU in
      one perf stat command-line. From the output, we can see:
      
        S0-C0 = CPU0 + CPU4
        S0-C1 = CPU1 + CPU5
        S0-C2 = CPU2 + CPU6
        S0-C3 = CPU3 + CPU7
      
      So the result is as expected (tiny differences are ignored).
      
      Note that the 'percore' event qualifier must be used with option '-A'.
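
      The per-core aggregation shown above (for example S0-C0 = CPU0 + CPU4)
      can be modelled with a short Python sketch. The function name
      sum_per_core() and the mapping argument are hypothetical and only
      illustrate the summation; this is not how perf stat is implemented.

```python
from collections import defaultdict
from typing import Dict


def sum_per_core(cpu_counts: Dict[str, int], cpu_to_core: Dict[str, str]) -> Dict[str, int]:
    """Sum per-CPU (hardware thread) counts into per-core totals."""
    totals: Dict[str, int] = defaultdict(int)
    for cpu, count in cpu_counts.items():
        totals[cpu_to_core[cpu]] += count
    return dict(totals)


# Topology taken from the example output: CPUn and CPUn+4 share core n.
TOPOLOGY = {f"CPU{n}": f"S0-C{n % 4}" for n in range(8)}
```

      Calling sum_per_core(counts, TOPOLOGY) with the eight per-CPU values
      yields four per-core totals keyed S0-C0 through S0-C3.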
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf stat: Factor out aggregate counts printing · 40480a81
      Jin Yao authored
      Move the aggregate counts printing to a new function
      print_counter_aggrdata, which will be used in following patches.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • J
      perf tools: Add a 'percore' event qualifier · 064b4e82
      Jin Yao authored
      Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
      that sums up the event counts for both hardware threads in a core.
      
      We can already do this with --per-core, but it's often useful to do
      this together with other metrics that are collected per hardware thread.
      So we need to support this per-core counting on an event level.
      
      This can be implemented entirely in the user tool; no kernel support is
      needed.
      
       v4:
       ---
       1. Add Arnaldo's patch which updates the documentation for
          this new qualifier.
       2. Rebase to latest perf/core branch
      
       v3:
       ---
       Simplify the code according to Jiri's comments.
       Before:
         "return term->val.percore ? true : false;"
       Now:
         "return term->val.percore;"
      
       v2:
       ---
       Change the qualifier name from 'coresum' to 'percore' according to
       comments from Jiri and Andi.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • T
      perf docs: Add description for stderr · 6cf62656
      Thomas Richter authored
      'perf report' displays recorded data on the screen and emits warnings
      and debug messages in the status line (last one on screen).
      
      perf also supports the possibility to write all debug messages to stderr
      (instead of writing them to the status line).
      
      This is achieved with the following command:
      
        # ./perf --debug stderr=1 report -vvvvv -i ~/fast.data 2>/tmp/2
        # ll /tmp/2
        -rw-rw-r-- 1 tmricht tmricht 5420835 May  7 13:46 /tmp/2
        #
      
      The usage of variable stderr=1 is not documented, so add it to the perf
      man page.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20190513080220.91966-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf intel-pt: Fix sample timestamp wrt non-taken branches · 1b6599a9
      Adrian Hunter authored
      The sample timestamp is updated to ensure that the timestamp represents
      the time of the sample and not a branch that the decoder is still
      walking towards. The sample timestamp is updated when the decoder
      returns, but the decoder does not return for non-taken branches. Update
      the sample timestamp in that case as well.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf intel-pt: Fix improved sample timestamp · 61b6e08d
      Adrian Hunter authored
      The decoder uses its current timestamp in samples. Usually that is a
      timestamp that has already passed, but in some cases it is a timestamp
      for a branch that the decoder is walking towards, and consequently
      hasn't reached.
      
      The intel_pt_sample_time() function decides which is which, but was not
      handling TNT packets exactly correctly.
      
      In the case of TNT, the timestamp applies to the first branch, so the
      decoder must first walk to that branch.
      
      That means intel_pt_sample_time() should return true for TNT, and this
      patch makes that change. However, if the first branch is a non-taken
      branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false
      for subsequent taken branches in the same TNT packet.
      
      To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to
      distinguish the cases.
      
      Note that commit 3f04d98e ("perf intel-pt: Improve sample
      timestamp") was also a stable fix and appears, for example, in v4.4
      stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample
      timestamp").
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v4.4+
      Fixes: 3f04d98e ("perf intel-pt: Improve sample timestamp")
      Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • A
      perf intel-pt: Fix instructions sampling rate · 7ba8fa20
      Adrian Hunter authored
      The timestamp used to determine whether an instruction sample is made is
      an estimate based on the number of instructions since the last known
      timestamp. A consequence is that it might go backwards, which results in
      extra samples. Change it so that a sample is only made when the
      timestamp goes forwards.
      
      Note this does not affect a sampling period of 0 or sampling periods
      specified as a count of instructions.
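
      The rule "only sample when the timestamp goes forwards" can be
      illustrated with a simplified Python model. It is a sketch under stated
      assumptions, not the Intel PT decoder logic: select_samples() and its
      arguments are hypothetical, and timestamps are plain integers.

```python
from typing import List, Sequence


def select_samples(estimates: Sequence[int], period: int) -> List[int]:
    """Return indices of the estimated timestamps that produce a sample.

    A sample is made only once the estimate has advanced at least one
    period past the previous sample, so an estimate that moves backwards
    can never trigger an extra sample.
    """
    picked: List[int] = []
    last_sample_time = None
    for index, timestamp in enumerate(estimates):
        if last_sample_time is None or timestamp >= last_sample_time + period:
            picked.append(index)
            last_sample_time = timestamp
    return picked
```

      With estimates [0, 10, 8, 9, 20] and a period of 10, the entries 8 and 9
      lie behind the previous sample time and are skipped.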
      
      Example:
      
       Before:
      
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         10 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          8 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          6 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:         14 instructions:u:      7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:          4 instructions:u:      7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16423 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222734:      12731 instructions:u:      7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      
       After:
       $ perf script --itrace=i10us
       ls 13812 [003] 2167315.222583:       3270 instructions:u:      7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222667:      30902 instructions:u:      7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so)
       ls 13812 [003] 2167315.222728:      16479 instructions:u:      7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so)
       ...
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org
      Fixes: f4aa0819 ("perf tools: Add Intel PT decoder")
      Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf regs x86: Add X86 specific arch__intr_reg_mask() · 6466ec14
      Kan Liang authored
      XMM registers can be collected on Icelake and later platforms.
      
      Add a specific arch__intr_reg_mask(), which creates an event to check
      whether the kernel and hardware can collect XMM registers.
      
      Test on Skylake, which doesn't support XMM register collection. Nothing
      has changed.
      
         #perf record -I?
         available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
         R10 R11 R12 R13 R14 R15
      
         Usage: perf record [<options>] [<command>]
          or: perf record [<options>] -- <command> [<options>]
      
          -I, --intr-regs[=<any register>]
                                sample selected machine registers on
         interrupt, use '-I?' to list register names
      
         #perf record -I
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]
      
         #perf evlist -v
         cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
         IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
         inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
         sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
         1, bpf_event: 1, sample_regs_intr: 0xff0fff
      
      Test on Icelake, which supports XMM register collection.
      
         #perf record -I?
         available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
         R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
         XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
      
         Usage: perf record [<options>] [<command>]
          or: perf record [<options>] -- <command> [<options>]
      
          -I, --intr-regs[=<any register>]
                                sample selected machine registers on
         interrupt, use '-I?' to list register names
      
         #perf record -I
         [ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]
      
         #perf evlist -v
         cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
         IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
         inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
         sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
         1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff
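
      The two sample_regs_intr values above can be decoded into register
      names with a small Python sketch. The bit layout used here is an
      assumption based on the x86 perf register numbering (general-purpose
      registers in bits 0-23, with DS, ES, FS and GS in bits 12-15, and each
      XMM register occupying two bits starting at bit 32); decode_regs_mask()
      is a hypothetical helper, not part of perf.

```python
from typing import List

# Assumed bit order for the general-purpose registers (bits 0..23).
GP_REGS = [
    "AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP",
    "IP", "FLAGS", "CS", "SS", "DS", "ES", "FS", "GS",
] + [f"R{n}" for n in range(8, 16)]

XMM_BASE_BIT = 32   # assumed position of XMM0
XMM_COUNT = 16      # XMM0..XMM15, two mask bits each


def decode_regs_mask(mask: int) -> List[str]:
    """Translate a register mask into the list of register names it selects."""
    names = [name for bit, name in enumerate(GP_REGS) if (mask >> bit) & 1]
    for i in range(XMM_COUNT):
        # An XMM register is 128 bits wide, so it takes two 64-bit slots.
        if ((mask >> (XMM_BASE_BIT + 2 * i)) & 0b11) == 0b11:
            names.append(f"XMM{i}")
    return names
```

      Under these assumptions, decode_regs_mask(0xff0fff) lists the twenty
      registers reported on Skylake, and the Icelake mask additionally yields
      XMM0 through XMM15.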
      
      Committer notes:
      
      Don't set attr.sample_period as a named struct init, as it is part of an
      unnamed union in 'struct perf_event_attr', and doing so breaks the build
      on older gcc versions, such as:
      
        gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
        gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)
      
        arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
        arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
        cc1: warnings being treated as errors
        arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
        arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>')
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      [ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • K
      perf parse-regs: Add generic support for arch__intr/user_reg_mask() · af785e75
      Kan Liang authored
      The register mask for use with intr or user may differ on some
      platforms, e.g. Icelake.
      
      Add weak functions arch__intr_reg_mask() and arch__user_reg_mask() to
      return the intr and user register masks respectively.
      
      Check the mask before printing or comparing the register name.
      
      Generic code always returns PERF_REGS_MASK. No functional change.
      Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-2-git-send-email-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 16 May, 2019 17 commits
    • K
      perf parse-regs: Split parse_regs · aeea9062
      Kan Liang authored
      The available registers for --intr-regs and --user-regs may be
      different, e.g. XMM registers.
      
      Split parse_regs into two dedicated functions for --intr-regs and
      --user-regs respectively.
      
      Modify the warning message. "--user-regs=?" should be used to show
      the available registers for --user-regs.
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/1557865174-56264-1-git-send-email-kan.liang@linux.intel.com
      [ Changed docs as suggested by Ravi and agreed by Kan ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • F
      perf vendor events arm64: Add Cortex-A57 and Cortex-A72 events · 7025fdbe
      Florian Fainelli authored
      The Cortex-A57 and Cortex-A72 both support all ARMv8 recommended events
      up to the RC_ST_SPEC (0x91) event with the exception of:
      
      - L1D_CACHE_REFILL_INNER (0x44)
      - L1D_CACHE_REFILL_OUTER (0x45)
      - L1D_TLB_RD (0x4E)
      - L1D_TLB_WR (0x4F)
      - L2D_TLB_REFILL_RD (0x5C)
      - L2D_TLB_REFILL_WR (0x5D)
      - L2D_TLB_RD (0x5E)
      - L2D_TLB_WR (0x5F)
      - STREX_SPEC (0x6F)
      
      Create an appropriate JSON file for mapping those events and update the
      mapfile.csv for matching the Cortex-A57 and Cortex-A72 MIDR to that
      file.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean V Kelley <seanvk.dev@oregontracks.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging)
      Link: http://lkml.kernel.org/r/20190513202522.9050-4-f.fainelli@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • F
      perf vendor events arm64: Map Brahma-B53 CPUID to cortex-a53 events · 93fe8f1e
      Florian Fainelli authored
      Broadcom's Brahma-B53 CPUs support the same type of events that the
      Cortex-A53 supports. Recognize its CPUID and map it to the cortex-a53
      events.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean V Kelley <seanvk.dev@oregontracks.org>
      Cc: bcm-kernel-feedback-list@broadcom.com
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-arm-kernel@lists.infradead.org (moderated list
      Link: http://lkml.kernel.org/r/20190513202522.9050-3-f.fainelli@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      93fe8f1e
    • F
      perf vendor events arm64: Remove [[:xdigit:]] wildcard · ae833a61
      Florian Fainelli committed
      ARM64's implementation of get_cpuidr_str() masks out the revision bits
      [3:0] while reading the CPU identifier, so there is no need for the
      [[:xdigit:]] wildcard.
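As a rough illustration of why the wildcard became redundant, the sketch below clears bits [3:0] of a CPU identifier so that every revision of a part produces the same string. The function name and the sample identifier are assumptions made for this example; they are not taken from the perf sources.

```python
REVISION_MASK = 0xF  # bits [3:0] of the identifier hold the revision


def mask_revision(cpuid: int) -> str:
    """Return the identifier with the revision bits cleared, as a 16-digit hex string."""
    return f"0x{cpuid & ~REVISION_MASK:016x}"


# Two hypothetical identifiers that differ only in their revision bits
# end up as the same string, so an exact match in the map file is enough.
print(mask_revision(0x410FD034))
print(mask_revision(0x410FD030))
```

With the low four bits always zero, a map entry can end in a literal `0` instead of a hex-digit wildcard.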
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean V Kelley <seanvk.dev@oregontracks.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org (moderated list:arm pmu profiling and debugging)
      Link: http://lkml.kernel.org/r/20190513202522.9050-2-f.fainelli@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ae833a61
    • Z
      perf jevents: Remove unused variable · 8e8f515d
      Zenghui Yu committed
      Address gcc warning:
      
        pmu-events/jevents.c: In function ‘save_arch_std_events’:
        pmu-events/jevents.c:417:15: warning: unused variable ‘sb’ [-Wunused-variable]
          struct stat *sb = data;
                       ^~
      Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: wanghaibin.wang@huawei.com
      Link: http://lkml.kernel.org/r/1557919169-23972-1-git-send-email-yuzenghui@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8e8f515d
    • A
      perf test zstd: Fixup verbose mode output · d94cfbab
      Arnaldo Carvalho de Melo committed
      The shell tests should not redirect useful output to /dev/null, as that
      is done automatically by 'perf test' in non verbose mode, so remove that
      from the zstd comp/decomp test, fixing up verbose mode.
      
      Before:
      
        $ perf test zstd
        68: Zstd perf.data compression/decompression              : Ok
        $ perf test -v zstd
        68: Zstd perf.data compression/decompression              :
        --- start ---
        test child forked, pid 11956
            -z, --compression-level[=<n>]
        Collecting compressed record file:
        Checking compressed events stats:
        test child finished with 0
        ---- end ----
        Zstd perf.data compression/decompression: Ok
        $
      
      Now:
      
        $ perf test zstd
        68: Zstd perf.data compression/decompression              : Ok
        $ perf test -v zstd
        68: Zstd perf.data compression/decompression              :
        --- start ---
        test child forked, pid 12695
        Collecting compressed record file:
        0+500 records in
        72+1 records out
        37361 bytes (37 kB, 36 KiB) copied, 9.83796 s, 3.8 kB/s
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB /tmp/perf.data.rzq, compressed (original 0.004 MB, ratio is 3.679) ]
        Checking compressed events stats:
        # compressed : Zstd, level = 1, ratio = 4
              COMPRESSED events:          3
        test child finished with 0
        ---- end ----
        Zstd perf.data compression/decompression: Ok
        $
      
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/n/tip-tp96618ds42zic94nlh0msz3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d94cfbab
    • A
      perf tests: Implement Zstd comp/decomp integration test · bdc35cbc
      Alexey Budankov committed
      Introduce a basic integration test for Zstd based record
      compression/decompression using 'perf record' and 'perf report'.
      
      Committer notes:
      
      Slightly reduce the frequency (from 25 kHz to 5 kHz) and the number of
      /dev/null records read (from 1000 to 500), bringing the run time more in
      line with the time existing 'perf test' entries take to run.
      
      With that in place:
      
        $ time perf test zstd
        68: Zstd perf.data compression/decompression              : Ok
      
        real	0m10.376s
        user	0m0.105s
        sys	0m0.440s
        $ grep "model name" /proc/cpuinfo  | head -1
        model name	: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
        $
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/dc007ae4-104a-2b7c-316e-275929025f0d@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bdc35cbc
    • A
      perf inject: Enable COMPRESSED record decompression · 371a3378
      Alexey Budankov committed
      Initialize the decompression part of the Zstd-based API so that
      COMPRESSED records are decompressed into the resulting output data file.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/c27d7500-ecdd-3569-cab5-8f70bbed5ea4@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      371a3378
    • A
      perf report: Implement perf.data record decompression · cb62c6f1
      Alexey Budankov committed
      zstd_init(, comp_level = 0) initializes only the decompression part of
      the API, which now consists of the zstd_decompress_stream() function.
      
      The perf.data PERF_RECORD_COMPRESSED records are decompressed using the
      zstd_decompress_stream() function into a linked list of mmaped memory
      regions of comp_mmap_len size (struct decomp).
      
      After decompression of one COMPRESSED record its content is iterated and
      fetched for usual processing. The mmaped memory regions with
      decompressed events are kept in the linked list till the tool process
      termination.
      
      When dumping raw records (e.g., perf report -D --header) file offsets of
      events from compressed records are printed as zero.
      
      Committer notes:
      
      Since we now have support for processing PERF_RECORD_COMPRESSED, we see
      none of them in raw form, as we did in the committer notes of the
      previous patch: they were decompressed into the usual
      PERF_RECORD_{FORK,MMAP,COMM,etc} records. We only see the stats for
      those PERF_RECORD_COMPRESSED events, and since I used the file generated
      in the committer notes for the previous patch, there they are, 2
      compressed records:
      
        $ perf report --header-only | grep cmdline
        # cmdline : /home/acme/bin/perf record -z2 sleep 1
        $ perf report -D | grep COMPRESS
              COMPRESSED events:          2
              COMPRESSED events:          0
        $ perf report --stdio
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 15  of event 'cycles:u'
        # Event count (approx.): 962227
        #
        # Overhead  Command  Shared Object     Symbol
        # ........  .......  ................  ...........................
        #
            46.99%  sleep    libc-2.28.so      [.] _dl_addr
            29.24%  sleep    [unknown]         [k] 0xffffffffaea00a67
            16.45%  sleep    libc-2.28.so      [.] __GI__IO_un_link.part.1
             5.92%  sleep    ld-2.28.so        [.] _dl_setup_hash
             1.40%  sleep    libc-2.28.so      [.] __nanosleep
             0.00%  sleep    [unknown]         [k] 0xffffffffaea00163
      
        #
        # (Tip: To see callchains in a more compact form: perf report -g folded)
        #
        $
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cb62c6f1
    • A
      perf record: Implement -z,--compression_level[=<n>] option · 504c1ad1
      Alexey Budankov committed
      Implement the -z,--compression_level[=<n>] option, which enables
      run-time compression of the content of the mmaped kernel data buffers
      during perf record collection. The default value is 1 (fastest
      compression).
      
      Compression overhead has been measured for serial and AIO streaming when
      profiling matrix multiplication workload:
      
        |----|-------------------------------|-------------------------------|
        |    | SERIAL                        | AIO-1                         |
        |----|--------|----------|-----------|--------|----------|-----------|
        | -z | OVH(x) | ratio(x) | size(MiB) | OVH(x) | ratio(x) | size(MiB) |
        |----|--------|----------|-----------|--------|----------|-----------|
        | 0  | 1.00   | 1.000    | 179.424   | 1.00   | 1.000    | 187.527   |
        | 1  | 1.04   | 8.427    | 181.148   | 1.01   | 8.474    | 188.562   |
        | 2  | 1.07   | 8.055    | 186.953   | 1.03   | 7.912    | 191.773   |
        | 3  | 1.04   | 8.283    | 181.908   | 1.03   | 8.220    | 191.078   |
        | 5  | 1.09   | 8.101    | 187.705   | 1.05   | 7.780    | 190.065   |
        | 8  | 1.05   | 9.217    | 179.191   | 1.12   | 6.111    | 193.024   |
        |----|--------|----------|-----------|--------|----------|-----------|
      
      OVH = (Execution time with -z N) / (Execution time with -z 0)
      
      ratio - compression ratio
      size  - amount of data that was compressed, in MiB
      
      	size ~= trace size x ratio
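
The two relations above can be checked with a short calculation. The sketch below is a minimal illustration in Python; the function names and the timing values are hypothetical, while the size and ratio are taken from the `-z 1` SERIAL row of the table (size 181,148 MiB and ratio 8,427, read with decimal commas).

```python
def overhead(time_with_z: float, time_without_z: float) -> float:
    """OVH = (execution time with -z N) / (execution time with -z 0)."""
    return time_with_z / time_without_z


def trace_size(size_mib: float, ratio: float) -> float:
    """Rearranges size ~= trace size x ratio to estimate the trace size in MiB."""
    return size_mib / ratio


# Hypothetical timings: 10.4 s with compression against 10.0 s without.
print(round(overhead(10.4, 10.0), 2))

# Estimated on-disk trace size for the -z 1 SERIAL row of the table.
print(round(trace_size(181.148, 8.427), 3))
```

Read this way, roughly 181 MiB of raw data at a ratio of about 8.4 corresponds to a trace of about 21.5 MiB.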
      
      Committer notes:
      
      Testing it I noticed that it failed to disable build id processing when
      compression is enabled. As we'd have to uncompress everything to look
      for the PERF_RECORD_{MMAP,SAMPLE,etc} records to figure out which build
      ids to read from DSOs, we'd better disable build id processing when
      compression is enabled, logging with pr_debug() when doing so:
      
      Original patch:
      
        # perf record -z2
        ^C[ perf record: Woken up 1 times to write data ]
        0x1746e0 [0x76]: failed to process type: 81 [Invalid argument]
        [ perf record: Captured and wrote 1.568 MB perf.data, compressed (original 0.452 MB, ratio is 3.995) ]
        #
      
      After auto-disabling build id processing when compression is enabled:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.292) ]
        $ perf record -v -z2 sleep 1
        Compression enabled, disabling build id collection at the end of the session.
        <SNIP extra -v pr_debug() messages>
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.305) ]
        $
      
      Also, with parts of the patch originally after this one moved to just
      before this one we get:
      
        $ perf record -z2 sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.001 MB perf.data, compressed (original 0.001 MB, ratio is 2.371) ]
        $ perf report -D | grep COMPRESS
        0 0x1b8 [0x155]: PERF_RECORD_COMPRESSED: unhandled!
        0 0x30d [0x80]: PERF_RECORD_COMPRESSED: unhandled!
              COMPRESSED events:          2
              COMPRESSED events:          0
        $
      
      I.e. when faced with PERF_RECORD_COMPRESSED records that we still have
      no code to process, we just show them as not being handled, skip them
      and continue, while before we had:
      
        $ perf report -D | grep COMPRESS
        0x1b8 [0x169]: failed to process type: 81 [Invalid argument]
        Error:
        failed to process sample
        0 0x1b8 [0x169]: PERF_RECORD_COMPRESSED
        $
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/9ff06518-ae63-a908-e44d-5d9e56dd66d9@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      504c1ad1
    • A
      perf report: Add stub processing of compressed events for -D · 61a7773c
      Alexey Budankov committed
      Committer note:
      
      Split from a larger patch, this only dumps PERF_RECORD_COMPRESSED as
      unhandled, so that when we introduce the record part in the next patch,
      we don't see unhandled events when using 'perf report -D'.
      
      Changed it so that we dump the event if the handler is just a stub, i.e.
      for the case where we don't have ZSTD linked but we're processing a
      perf.data file generated by a tool with that linked.
      
      Also when failing to decompress we can't just dump the uncompressed
      event and return 0, we have to propagate the error.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/304b0a59-942c-3fe1-da02-aa749f87108b@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61a7773c
    • A
      perf record: Implement compression for AIO trace streaming · ef781128
      Alexey Budankov committed
      Compression is implemented using the functions from zstd.c. The
      mmap->aio.data[] buffers serve as the working memory for the
      compression. If the Zstd streaming compression API fails for some
      reason, the data to be compressed is just copied into the memory
      buffers using plain memcpy().
      
      A compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
      records. Each element of the array is no longer than
      PERF_SAMPLE_MAX_SIZE and consists of a perf_event_header followed by the
      compressed chunk, which is decompressed at the loading stage.
      
      perf_mmap__aio_push() is replaced by perf_mmap__push(), which is now
      used in both the serial and AIO streaming cases. perf_mmap__push() is
      extended with positive return values to signify the absence of data
      ready for processing.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/77db2b2c-5d03-dbb0-aeac-c4dd92129ab9@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ef781128
    • A
      perf record: Implement compression for serial trace streaming · 5d7f4116
      Alexey Budankov committed
      Compression is implemented using the functions from zstd.c. The
      mmap->data buffer serves as the working memory for the compression.
      
      If the Zstd streaming compression API fails for some reason, the data
      to be compressed is just copied into the memory buffers using plain
      memcpy().
      
      A compressed trace frame consists of an array of PERF_RECORD_COMPRESSED
      records. Each element of the array is no longer than
      PERF_SAMPLE_MAX_SIZE and consists of a perf_event_header followed by the
      compressed chunk, which is decompressed at the loading stage.
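
The frame layout described above can be sketched as follows. This is only an illustration under stated assumptions: zlib from the Python standard library stands in for Zstd, the 8-byte header is packed as u32 type, u16 misc, u16 size, the record type value 81 is the one shown in the 'failed to process type: 81' output elsewhere in this log, and the 0xFFFF cap and both function names are hypothetical choices made for the example rather than perf's own.

```python
import struct
import zlib

PERF_RECORD_COMPRESSED = 81       # type value seen in the log output
HEADER = struct.Struct("<IHH")    # assumed header layout: u32 type, u16 misc, u16 size
MAX_RECORD_SIZE = 0xFFFF          # assumed cap so the size fits the u16 field


def to_records(raw: bytes, max_record_size: int = MAX_RECORD_SIZE) -> list:
    """Compress raw data and split the stream into header + chunk records."""
    stream = zlib.compress(raw, 1)            # zlib stands in for Zstd here
    chunk = max_record_size - HEADER.size     # payload bytes available per record
    records = []
    for off in range(0, len(stream), chunk):
        payload = stream[off:off + chunk]
        header = HEADER.pack(PERF_RECORD_COMPRESSED, 0, HEADER.size + len(payload))
        records.append(header + payload)
    return records


def from_records(records: list) -> bytes:
    """Loading stage: feed every record's chunk to a streaming decompressor."""
    decomp = zlib.decompressobj()
    out = bytearray()
    for rec in records:
        rtype, _misc, size = HEADER.unpack_from(rec)
        if rtype != PERF_RECORD_COMPRESSED or size != len(rec):
            raise ValueError("malformed compressed record")
        out += decomp.decompress(rec[HEADER.size:])
    out += decomp.flush()
    return bytes(out)


data = b"sample event payload " * 1000
records = to_records(data, max_record_size=16)
print(len(records) > 1, from_records(records) == data)
```

Because each record carries its own size in the header, a reader can walk the array record by record and hand the chunks to the decompressor in order.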
      
      Committer notes:
      
      Undo some unnecessary line breaks, remove some unnecessary () around
      zstd_data to then just get its address, and fix conflicts with
      BPF_PROG_INFO/BPF_BTF patchkits.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/744df43f-3932-2594-ddef-1e99a3cad03a@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5d7f4116
    • A
      perf tools: Introduce Zstd streaming based compression API · f24c1d75
      Alexey Budankov committed
      The implemented functions are based on the Zstd streaming compression
      API.
      
      The functions are used at run time to compress data that comes from the
      mmaped kernel buffer. zstd_init() and zstd_fini() are used for
      initialization and finalization, to allocate and deallocate internal
      zstd objects. zstd_compress_stream_to_records() is used to convert parts
      of the mmaped kernel buffer into an array of PERF_RECORD_COMPRESSED
      records.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/18bf36f3-b85a-1fe2-dd83-10e0c6069568@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f24c1d75
    • A
      perf mmap: Implement dedicated memory buffer for data compression · 51255a8a
      Alexey Budankov committed
      Implement an mmap data buffer that serves as the working memory when
      compressing data in the serial trace streaming case.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/49b31321-0f70-392b-9a4f-649d3affe090@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      51255a8a
    • A
      perf record: Implement COMPRESSED event record and its attributes · 42e1fd80
      Alexey Budankov committed
      Implement the PERF_RECORD_COMPRESSED event, related data types, the
      header feature and functions to write, read and print feature attributes
      from the trace header section.
      
      comp_mmap_len preserves the size of the mmaped kernel buffer that was
      used during collection. It is used at the loading stage as the size of
      the decomp buffer for decompressing the content of COMPRESSED events.
      
      Committer notes:
      
      Fixed up conflict with BPF_PROG_INFO and BPF_BTF header features.
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/ebbaf031-8dda-3864-ebc6-7922d43ee515@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      42e1fd80
    • A
      perf session: Define 'bytes_transferred' and 'bytes_compressed' metrics · d3c8c08e
      Alexey Budankov committed
      Define 'bytes_transferred' and 'bytes_compressed' metrics to calculate
      the compression ratio at the end of data collection:
      
      	compression ratio = bytes_transferred / bytes_compressed
      
      The 'bytes_transferred' metric accumulates the number of bytes that were
      extracted from the mmaped kernel buffers for compression, while
      'bytes_compressed' accumulates the number of bytes that were produced
      by the compression.
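
A minimal sketch of the ratio computation, assuming the two counters are simple running totals; the class and method names are hypothetical and the byte counts are made-up sample values, not measurements.

```python
class CompressionStats:
    """Accumulates the two metrics across successive compression calls."""

    def __init__(self) -> None:
        self.bytes_transferred = 0   # bytes taken from the buffers for compression
        self.bytes_compressed = 0    # bytes produced by the compressor

    def account(self, transferred: int, compressed: int) -> None:
        self.bytes_transferred += transferred
        self.bytes_compressed += compressed

    def ratio(self) -> float:
        # Guard against division by zero when nothing was compressed.
        if self.bytes_compressed == 0:
            return 0.0
        return self.bytes_transferred / self.bytes_compressed


stats = CompressionStats()
stats.account(transferred=4096, compressed=1024)
stats.account(transferred=4096, compressed=1024)
print(stats.ratio())
```

Keeping totals rather than per-chunk ratios means the final figure weights every chunk by its size.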
      Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1d4bf499-cb03-26dc-6fc6-f14fec7622ce@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d3c8c08e