1. 16 4月, 2020 5 次提交
    • A
      perf tools: Support CAP_PERFMON capability · 6b3e0e2e
      Alexey Budankov 提交于
      Extend error messages to mention CAP_PERFMON capability as an option to
      substitute CAP_SYS_ADMIN capability for secure system performance
      monitoring and observability operations. Make
      perf_event_paranoid_check() and __cmd_ftrace() to be aware of
      CAP_PERFMON capability.
      
      CAP_PERFMON implements the principle of least privilege for performance
      monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
      principle of least privilege: A security design principle that states
      that a process or program be granted only those privileges (e.g.,
      capabilities) necessary to accomplish its legitimate function, and only
      for the time that such privileges are actually required)
      
      For backward compatibility reasons access to perf_events subsystem remains
      open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
      secure perf_events monitoring is discouraged with respect to CAP_PERFMON
      capability.
      
      Committer testing:
      
      Using a libcap with this patch:
      
        diff --git a/libcap/include/uapi/linux/capability.h b/libcap/include/uapi/linux/capability.h
        index 78b2fd4c8a95..89b5b0279b60 100644
        --- a/libcap/include/uapi/linux/capability.h
        +++ b/libcap/include/uapi/linux/capability.h
        @@ -366,8 +366,9 @@ struct vfs_ns_cap_data {
      
         #define CAP_AUDIT_READ       37
      
        +#define CAP_PERFMON	     38
      
        -#define CAP_LAST_CAP         CAP_AUDIT_READ
        +#define CAP_LAST_CAP         CAP_PERFMON
      
         #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
      
      Note that using '38' in place of 'cap_perfmon' works to some degree with
      an old libcap, its only when cap_get_flag() is called that libcap
      performs an error check based on the maximum value known for
      capabilities that it will fail.
      
      This makes determining the default of perf_event_attr.exclude_kernel to
      fail, as it can't determine if CAP_PERFMON is in place.
      
      Using 'perf top -e cycles' avoids the default check and sets
      perf_event_attr.exclude_kernel to 1.
      
      As root, with a libcap supporting CAP_PERFMON:
      
        # groupadd perf_users
        # adduser perf -g perf_users
        # mkdir ~perf/bin
        # cp ~acme/bin/perf ~perf/bin/
        # chgrp perf_users ~perf/bin/perf
        # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" ~perf/bin/perf
        # getcap ~perf/bin/perf
        /home/perf/bin/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
        # ls -la ~perf/bin/perf
        -rwxr-xr-x. 1 root perf_users 16968552 Apr  9 13:10 /home/perf/bin/perf
      
      As the 'perf' user in the 'perf_users' group:
      
        $ perf top -a --stdio
        Error:
        Failed to mmap with 1 (Operation not permitted)
        $
      
      Either add the cap_ipc_lock capability to the perf binary or reduce the
      ring buffer size to some smaller value:
      
        $ perf top -m10 -a --stdio
        rounding mmap pages size to 64K (16 pages)
        Error:
        Failed to mmap with 1 (Operation not permitted)
        $ perf top -m4 -a --stdio
        Error:
        Failed to mmap with 1 (Operation not permitted)
        $ perf top -m2 -a --stdio
         PerfTop: 762 irqs/sec  kernel:49.7%  exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 4 CPUs)
        ------------------------------------------------------------------------------------------------------
      
           9.83%  perf                [.] __symbols__insert
           8.58%  perf                [.] rb_next
           5.91%  [kernel]            [k] module_get_kallsym
           5.66%  [kernel]            [k] kallsyms_expand_symbol.constprop.0
           3.98%  libc-2.29.so        [.] __GI_____strtoull_l_internal
           3.66%  perf                [.] rb_insert_color
           2.34%  [kernel]            [k] vsnprintf
           2.30%  [kernel]            [k] string_nocheck
           2.16%  libc-2.29.so        [.] _IO_getdelim
           2.15%  [kernel]            [k] number
           2.13%  [kernel]            [k] format_decode
           1.58%  libc-2.29.so        [.] _IO_feof
           1.52%  libc-2.29.so        [.] __strcmp_avx2
           1.50%  perf                [.] rb_set_parent_color
           1.47%  libc-2.29.so        [.] __libc_calloc
           1.24%  [kernel]            [k] do_syscall_64
           1.17%  [kernel]            [k] __x86_indirect_thunk_rax
      
        $ perf record -a sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.552 MB perf.data (74 samples) ]
        $ perf evlist
        cycles
        $ perf evlist -v
        cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
        $ perf report | head -20
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 74  of event 'cycles'
        # Event count (approx.): 15694834
        #
        # Overhead  Command          Shared Object               Symbol
        # ........  ...............  ..........................  ......................................
        #
            19.62%  perf             [kernel.vmlinux]            [k] strnlen_user
            13.88%  swapper          [kernel.vmlinux]            [k] intel_idle
            13.83%  ksoftirqd/0      [kernel.vmlinux]            [k] pfifo_fast_dequeue
            13.51%  swapper          [kernel.vmlinux]            [k] kmem_cache_free
             6.31%  gnome-shell      [kernel.vmlinux]            [k] kmem_cache_free
             5.66%  kworker/u8:3+ix  [kernel.vmlinux]            [k] delay_tsc
             4.42%  perf             [kernel.vmlinux]            [k] __set_cpus_allowed_ptr
             3.45%  kworker/2:1-eve  [kernel.vmlinux]            [k] shmem_truncate_range
             2.29%  gnome-shell      libgobject-2.0.so.0.6000.7  [.] g_closure_ref
        $
      Signed-off-by: NAlexey Budankov <alexey.budankov@linux.intel.com>
      Reviewed-by: NJames Morris <jamorris@linux.microsoft.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Acked-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Igor Lubashev <ilubashe@akamai.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Serge Hallyn <serge@hallyn.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: intel-gfx@lists.freedesktop.org
      Cc: linux-doc@vger.kernel.org
      Cc: linux-man@vger.kernel.org
      Cc: linux-security-module@vger.kernel.org
      Cc: selinux@vger.kernel.org
      Link: http://lore.kernel.org/lkml/a66d5648-2b8e-577e-e1f2-1d56c017ab5e@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      6b3e0e2e
    • J
      perf annotate: Add basic support for bpf_image · 3c29d448
      Jiri Olsa 提交于
      Add the DSO_BINARY_TYPE__BPF_IMAGE dso binary type to recognize BPF
      images that carry trampoline or dispatcher.
      
      Upcoming patches will add support to read the image data, store it
      within the BPF feature in perf.data and display it for annotation
      purposes.
      
      Currently we only display following message:
      
        # ./perf annotate bpf_trampoline_24456 --stdio
         Percent |      Source code & Disassembly of . for cycles (504  ...
        --------------------------------------------------------------- ...
                 :       to be implemented
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@redhat.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jesper Dangaard Brouer <hawk@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20200312195610.346362-16-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      3c29d448
    • J
      perf machine: Set ksymbol dso as loaded on arrival · 7eddf7e7
      Jiri Olsa 提交于
      There's no special load action for ksymbol data on map__load/dso__load
      action, where the kernel is getting loaded. It only gets confused with
      kernel kallsyms/vmlinux load for bpf object, which fails and could mess
      up with the map.
      
      Disabling any further load of the map for ksymbol related dso/map.
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@redhat.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jesper Dangaard Brouer <hawk@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20200312195610.346362-15-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7eddf7e7
    • J
      perf tools: Synthesize bpf_trampoline/dispatcher ksymbol event · 943930e4
      Jiri Olsa 提交于
      Synthesize bpf images (trampolines/dispatchers) on start, as ksymbol
      events from /proc/kallsyms. Having this perf can recognize samples from
      those images and perf report and top shows them correctly.
      
      The rest of the ksymbol handling is already in place from for the bpf
      programs monitoring, so only the initial state was needed.
      
      perf report output:
      
        # Overhead  Command     Shared Object                  Symbol
      
          12.37%  test_progs  [kernel.vmlinux]                 [k] entry_SYSCALL_64
          11.80%  test_progs  [kernel.vmlinux]                 [k] syscall_return_via_sysret
           9.63%  test_progs  bpf_prog_bcf7977d3b93787c_prog2  [k] bpf_prog_bcf7977d3b93787c_prog2
           6.90%  test_progs  bpf_trampoline_24456             [k] bpf_trampoline_24456
           6.36%  test_progs  [kernel.vmlinux]                 [k] memcpy_erms
      
      Committer notes:
      
      Use scnprintf() instead of strncpy() to overcome this on fedora:32,
      rawhide and OpenMandriva Cooker:
      
          CC       /tmp/build/perf/util/bpf-event.o
        In file included from /usr/include/string.h:495,
                         from /git/linux/tools/lib/bpf/libbpf_common.h:12,
                         from /git/linux/tools/lib/bpf/bpf.h:31,
                         from util/bpf-event.c:4:
        In function 'strncpy',
            inlined from 'process_bpf_image' at util/bpf-event.c:323:2,
            inlined from 'kallsyms_process_symbol' at util/bpf-event.c:358:9:
        /usr/include/bits/string_fortified.h:106:10: error: '__builtin_strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation]
          106 |   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
              |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      Signed-off-by: NJiri Olsa <jolsa@kernel.org>
      Acked-by: NSong Liu <songliubraving@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andriin@fb.com>
      Cc: Björn Töpel <bjorn.topel@intel.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David S. Miller <davem@redhat.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jesper Dangaard Brouer <hawk@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20200312195610.346362-14-jolsa@kernel.org/Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      943930e4
    • A
      perf stat: Honour --timeout for forked workloads · cfbd41b7
      Arnaldo Carvalho de Melo 提交于
      When --timeout is used and a workload is specified to be started by
      'perf stat', i.e.
      
        $ perf stat --timeout 1000 sleep 1h
      
      The --timeout wasn't being honoured, i.e. the workload, 'sleep 1h' in
      the above example, should be terminated after 1000ms, but it wasn't,
      'perf stat' was waiting for it to finish.
      
      Fix it by sending a SIGTERM when the timeout expires.
      
      Now it works:
      
        # perf stat -e cycles --timeout 1234 sleep 1h
        sleep: Terminated
      
         Performance counter stats for 'sleep 1h':
      
                 1,066,692      cycles
      
               1.234314838 seconds time elapsed
      
               0.000750000 seconds user
               0.000000000 seconds sys
      
        #
      
      Fixes: f1f8ad52 ("perf stat: Add support to print counts after a period of time")
      Reported-by: NKonstantin Kharlamov <hi-angel@yandex.ru>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207243Tested-by: NKonstantin Kharlamov <hi-angel@yandex.ru>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Tested-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: yuzhoujian <yuzhoujian@didichuxing.com>
      Link: https://lore.kernel.org/lkml/20200415153803.GB20324@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      cfbd41b7
  2. 14 4月, 2020 7 次提交
    • A
      tools headers: Synchronize linux/bits.h with the kernel sources · e3698b23
      Arnaldo Carvalho de Melo 提交于
      To pick up the changes in these csets:
      
        295bcca8 ("linux/bits.h: add compile time sanity check of GENMASK inputs")
        3945ff37 ("linux/bits.h: Extract common header for vDSO")
      
      To address this tools/perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/linux/bits.h' differs from latest version at 'include/linux/bits.h'
        diff -u tools/include/linux/bits.h include/linux/bits.h
      
      This clashes with usage of userspace's static_assert(), that, at least
      on glibc, is guarded by a ifnded/endif pair, do the same to our copy of
      build_bug.h and avoid that diff in check_headers.sh so that we continue
      checking for drifts with the kernel sources master copy.
      
      This will all be tested with the set of build containers that includes
      uCLibc, musl libc, lots of glibc versions in lots of distros and cross
      build environments.
      
      The tools/objtool, tools/bpf, etc were tested as well.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      e3698b23
    • A
      tools headers: Update x86's syscall_64.tbl with the kernel sources · d8ed4d7a
      Arnaldo Carvalho de Melo 提交于
      To pick the changes from:
      
        d3b1b776 ("x86/entry/64: Remove ptregs qualifier from syscall table")
        cab56d34 ("x86/entry: Remove ABI prefixes from functions in syscall tables")
        27dd84fa ("x86/entry/64: Use syscall wrappers for x32_rt_sigreturn")
      
      Addressing this tools/perf build warning:
      
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
      
      That didn't result in any tooling changes, as what is extracted are just
      the first two columns, and these patches touched only the third.
      
        $ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp
        $ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
        $ make -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory '/home/acme/git/perf/tools/perf'
          BUILD:   Doing 'make -j12' parallel build
          DESCEND  plugins
          CC       /tmp/build/perf/util/syscalltbl.o
          INSTALL  trace_plugins
          LD       /tmp/build/perf/util/perf-in.o
          LD       /tmp/build/perf/perf-in.o
          LINK     /tmp/build/perf/perf
        $ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d8ed4d7a
    • A
      tools headers UAPI: Sync linux/mman.h with the kernel · f60b3878
      Arnaldo Carvalho de Melo 提交于
      To get the changes in:
      
        e346b381 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()")
      
      Add that to 'perf trace's mremap 'flags' decoder.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/mman.h' differs from latest version at 'include/uapi/linux/mman.h'
        diff -u tools/include/uapi/linux/mman.h include/uapi/linux/mman.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Brian Geffon <bgeffon@google.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f60b3878
    • A
      tools headers UAPI: Sync sched.h with the kernel · 027fa8fb
      Arnaldo Carvalho de Melo 提交于
      To get the changes in:
      
        ef2c41cf ("clone3: allow spawning processes into cgroups")
      
      Add that to 'perf trace's clone 'flags' decoder.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/sched.h' differs from latest version at 'include/uapi/linux/sched.h'
        diff -u tools/include/uapi/linux/sched.h include/uapi/linux/sched.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      027fa8fb
    • A
      tools headers: Update linux/vdso.h and grab a copy of vdso/const.h · ca64d84e
      Arnaldo Carvalho de Melo 提交于
      To get in line with:
      
        8165b57b ("linux/const.h: Extract common header for vDSO")
      
      And silence this tools/perf/ build warning:
      
        Warning: Kernel ABI header at 'tools/include/linux/const.h' differs from latest version at 'include/linux/const.h'
        diff -u tools/include/linux/const.h include/linux/const.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ca64d84e
    • J
      perf stat: Fix no metric header if --per-socket and --metric-only set · 8358f698
      Jin Yao 提交于
      We received a report that was no metric header displayed if --per-socket
      and --metric-only were both set.
      
      It's hard for script to parse the perf-stat output. This patch fixes this
      issue.
      
      Before:
      
        root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket
        ^C
         Performance counter stats for 'system wide':
      
        S0        8                  2.6
      
               2.215270071 seconds time elapsed
      
        root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket -I1000
        #           time socket cpus
             1.000411692 S0        8                  2.2
             2.001547952 S0        8                  3.4
             3.002446511 S0        8                  3.4
             4.003346157 S0        8                  4.0
             5.004245736 S0        8                  0.3
      
      After:
      
        root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket
        ^C
         Performance counter stats for 'system wide':
      
                                     CPI
        S0        8                  2.1
      
               1.813579830 seconds time elapsed
      
        root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket -I1000
        #           time socket cpus                  CPI
             1.000415122 S0        8                  3.2
             2.001630051 S0        8                  2.9
             3.002612278 S0        8                  4.3
             4.003523594 S0        8                  3.0
             5.004504256 S0        8                  3.7
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200331180226.25915-1-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8358f698
    • A
      perf python: Check if clang supports -fno-semantic-interposition · 9a00df31
      Arnaldo Carvalho de Melo 提交于
      The set of C compiler options used by distros to build python bindings
      may include options that are unknown to clang, we check for a variety of
      such options, add -fno-semantic-interposition to that mix:
      
      This fixes the build on, among others, Manjaro Linux:
      
          GEN      /tmp/build/perf/python/perf.so
        clang-9: error: unknown argument: '-fno-semantic-interposition'
        error: command 'clang' failed with exit status 1
        make: Leaving directory '/git/perf/tools/perf'
      
        [perfbuilder@602aed1c266d ~]$ gcc -v
        Using built-in specs.
        COLLECT_GCC=gcc
        COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lto-wrapper
        Target: x86_64-pc-linux-gnu
        Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-pkgversion='Arch Linux 9.3.0-1' --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-shared --enable-threads=posix --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release --enable-default-pie --enable-default-ssp --enable-cet=auto gdc_include_dir=/usr/include/dlang/gdc
        Thread model: posix
        gcc version 9.3.0 (Arch Linux 9.3.0-1)
        [perfbuilder@602aed1c266d ~]$
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9a00df31
  3. 03 4月, 2020 22 次提交
    • A
      perf python: Fix clang detection to strip out options passed in $CC · 9ff76cea
      Arnaldo Carvalho de Melo 提交于
      The clang check in the python setup.py file expected $CC to be just the
      name of the compiler, not the compiler + options, i.e. all options were
      expected to be passed in $CFLAGS, this ends up making it fail in systems
      where CC is set to, e.g.:
      
       "aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot"
      
      Like this:
      
        $ python3
        >>> from subprocess import Popen
        >>> a = Popen(["aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot", "-v"])
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
            restore_signals, start_new_session)
          File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
            raise child_exception_type(errno_num, err_msg, err_filename)
        FileNotFoundError: [Errno 2] No such file or directory: 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot': 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot'
        >>>
      
      Make it more robust, covering this case, by passing cc.split()[0] as the
      first arg to popen().
      
      Fixes: a7ffd416 ("perf python: Fix clang detection when using CC=clang-version")
      Reported-by: NDaniel Díaz <daniel.diaz@linaro.org>
      Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org>
      Tested-by: NDaniel Díaz <daniel.diaz@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ilie Halip <ilie.halip@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200401124037.GA12534@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      9ff76cea
    • S
      perf tools: Support Python 3.8+ in Makefile · b9c9ce4e
      Sam Lunt 提交于
      Python 3.8 changed the output of 'python-config --ldflags' to no longer
      include the '-lpythonX.Y' flag (this apparently fixed an issue loading
      modules with a statically linked Python executable).  The libpython
      feature check in linux/build/feature fails if the Python library is not
      included in FEATURE_CHECK_LDFLAGS-libpython variable.
      
      This adds a check in the Makefile to determine if PYTHON_CONFIG accepts
      the '--embed' flag and passes that flag alongside '--ldflags' if so.
      
      tools/perf is the only place the libpython feature check is used.
      Signed-off-by: NSam Lunt <samuel.j.lunt@gmail.com>
      Tested-by: NHe Zhe <zhe.he@windriver.com>
      Link: http://lore.kernel.org/lkml/c56be2e1-8111-9dfe-8298-f7d0f9ab7431@windriver.comAcked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: trivial@kernel.org
      Cc: stable@kernel.org
      Link: http://lore.kernel.org/lkml/20200131181123.tmamivhq4b7uqasr@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b9c9ce4e
    • A
      perf script: Fix invalid read of directory entry after closedir() · 27486a85
      Andreas Gerstmayr 提交于
      closedir(lang_dir) frees the memory of script_dirent->d_name, which
      gets accessed in the next line in a call to scnprintf().
      
      Valgrind report:
      
        Invalid read of size 1
        ==413557==    at 0x483CBE6: strlen (vg_replace_strmem.c:461)
        ==413557==    by 0x4DD45FD: __vfprintf_internal (vfprintf-internal.c:1688)
        ==413557==    by 0x4DE6679: __vsnprintf_internal (vsnprintf.c:114)
        ==413557==    by 0x53A037: vsnprintf (stdio2.h:80)
        ==413557==    by 0x53A037: scnprintf (vsprintf.c:21)
        ==413557==    by 0x435202: get_script_path (builtin-script.c:3223)
        ==413557==  Address 0x52e7313 is 1,139 bytes inside a block of size 32,816 free'd
        ==413557==    at 0x483AA0C: free (vg_replace_malloc.c:540)
        ==413557==    by 0x4E303C0: closedir (closedir.c:50)
        ==413557==    by 0x4351DC: get_script_path (builtin-script.c:3222)
      Signed-off-by: NAndreas Gerstmayr <agerstmayr@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200402124337.419456-1-agerstmayr@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      27486a85
    • A
      perf script report: Fix SEGFAULT when using DWARF mode · 1a4025f0
      Andreas Gerstmayr 提交于
      When running perf script report with a Python script and a callgraph in
      DWARF mode, intr_regs->regs can be 0 and therefore crashing the regs_map
      function.
      
      Added a check for this condition (same check as in builtin-script.c:595).
      Signed-off-by: NAndreas Gerstmayr <agerstmayr@redhat.com>
      Tested-by: NKim Phillips <kim.phillips@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: http://lore.kernel.org/lkml/20200402125417.422232-1-agerstmayr@redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      1a4025f0
    • I
      perf script: add -S/--symbols documentation · 628d736d
      Ian Rogers 提交于
      Capture both that this option exists and that symbols can be hexadecimal
      addresses.
      Signed-off-by: NIan Rogers <irogers@google.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200402174130.140319-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      628d736d
    • J
      perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric · 8ed1faf0
      Jin Yao 提交于
      The kernel utilization metric does multiplexing currently and is somewhat
      unreliable. The problem is that it uses two instances of the fixed counter,
      and the kernel has to multipleplex which causes errors. So should use
      CPU_CLK_UNHALTED.THREAD instead.
      
      Before:
      
        # perf stat -M Kernel_Utilization -- sleep 1
      
        Performance counter stats for 'sleep 1':
      
                1,419,425      cpu_clk_unhalted.ref_tsc:k
            <not counted>      cpu_clk_unhalted.ref_tsc	(0.00%)
      
      After:
      
        # perf stat -M Kernel_Utilization -- sleep 1
      
        Performance counter stats for 'sleep 1':
      
                  746,688      cpu_clk_unhalted.thread:k #      0.7 Kernel_Utilization
                1,088,348      cpu_clk_unhalted.thread
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Reviewed-by: NAndi Kleen <ak@linux.intel.com>
      Reviewed-by: NKan Liang <kan.liang@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200309013125.7559-1-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8ed1faf0
    • A
      perf events parser: Add missing Intel CPU events to parser · 47327f56
      Adrian Hunter 提交于
      perf list expects CPU events to be parseable by name, e.g.
      
          # perf list | grep el-capacity-read
            el-capacity-read OR cpu/el-capacity-read/          [Kernel PMU event]
      
      But the event parser does not recognize them that way, e.g.
      
          # perf test -v "Parse event"
          <SNIP>
          running test 54 'cycles//u'
          running test 55 'cycles:k'
          running test 0 'cpu/config=10,config1,config2=3,period=1000/u'
          running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u'
          running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/'
          running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp'
          -> cpu/event=0,umask=0x11/
          -> cpu/event=0,umask=0x13/
          -> cpu/event=0x54,umask=0x1/
          failed to parse event 'el-capacity-read:u,cpu/event=el-capacity-read/u', err 1, str 'parser error'
          event syntax error: 'el-capacity-read:u,cpu/event=el-capacity-read/u'
                                 \___ parser error test child finished with 1
          ---- end ----
          Parse event definition strings: FAILED!
      
      This happens because the parser splits names by '-' in order to deal
      with cache events. For example 'L1-dcache' is a token in
      parse-events.l which is matched to 'L1-dcache-load-miss' by the
      following rule:
      
          PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config
      
      And so there is special handling for 2-part PMU names i.e.
      
          PE_PMU_EVENT_PRE '-' PE_PMU_EVENT_SUF sep_dc
      
      but no handling for 3-part names, which are instead added as tokens e.g.
      
          topdown-[a-z-]+
      
      While it would be possible to add a rule for 3-part names, that would
      not work if the first parts were also a valid PMU name e.g.
      'el-capacity-read' would be matched to 'el-capacity' before the parser
      reached the 3rd part.
      
      The parser would need significant change to rationalize all this, so
      instead fix for now by adding missing Intel CPU events with 3-part names
      to the event parser as tokens.
      
      Missing events were found by using:
      
          grep -r EVENT_ATTR_STR arch/x86/events/intel/core.c
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NJiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Link: http://lore.kernel.org/lkml/90c7ae07-c568-b6d3-f9c4-d0c1528a0610@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      47327f56
    • S
      perf script: Allow --symbol to accept hexadecimal addresses · d2bedb78
      Stephane Eranian 提交于
      This patch extends the perf script --symbols option to filter on
      hexadecimal addresses in addition to symbol names. This makes it easier
      to handle cases where symbols are aliased.
      
      With this patch, it is possible to mix and match symbols and hexadecimal
      addresses using the --symbols option.
      
        $ perf script --symbols=noploop,0x4007a0
      Signed-off-by: NStephane Eranian <eranian@google.com>
      Reviewed-by: NIan Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325220802.15039-1-irogers@google.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d2bedb78
    • A
      perf report/top TUI: Fix title line formatting · 376c3c22
      Arnaldo Carvalho de Melo 提交于
      In d10ec006 ("perf hists browser: Allow passing an initial hotkey")
      the hist_entry__title() call was cut'n'pasted to a function where the
      'title' variable is a pointer, not an array, so the sizeof(title)
      continues syntactically valid but ends up reducing the real size of the
      buffer where to format the first line in the screen to 8 bytes, which
      makes the formatting at the title at each refresh to produce just the
      string "Samples ", duh, fix it by passing the size of the buffer.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: d10ec006 ("perf hists browser: Allow passing an initial hotkey")
      Link: http://lore.kernel.org/lkml/20200330154314.GB4576@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      376c3c22
    • J
      perf top: Support hotkey to change sort order · 2605af0f
      Jin Yao 提交于
      It would be nice if we can use a hotkey in perf top browser to select a
      event for sorting.
      
      For example:
      
        perf top --group -e cycles,instructions,cache-misses
      
        Samples
                        Overhead  Shared Object             Symbol
          40.03%  45.71%   0.03%  div                       [.] main
          20.46%  14.67%   0.21%  libc-2.27.so              [.] __random_r
          20.01%  19.54%   0.02%  libc-2.27.so              [.] __random
           9.68%  10.68%   0.00%  div                       [.] compute_flag
           4.32%   4.70%   0.00%  libc-2.27.so              [.] rand
           3.84%   3.43%   0.00%  div                       [.] rand@plt
           0.05%   0.05%   2.33%  libc-2.27.so              [.] __strcmp_sse2_unaligned
           0.04%   0.08%   2.43%  perf                      [.] perf_hpp__is_dynamic_en
           0.04%   0.02%   6.64%  perf                      [.] rb_next
           0.04%   0.01%   3.87%  perf                      [.] dso__find_symbol
           0.04%   0.04%   1.77%  perf                      [.] sort__dso_cmp
      
      When user press hotkey '2' (event index, starting from 0), it indicates
      to sort output by the third event in group (cache-misses).
      
        Samples
                        Overhead  Shared Object               Symbol
           4.07%   1.28%   6.68%  perf                        [.] rb_next
           3.57%   3.98%   4.11%  perf                        [.] __hists__insert_output
           3.67%  11.24%   3.60%  perf                        [.] perf_hpp__is_dynamic_e
           3.67%   3.20%   3.20%  perf                        [.] hpp__sort_overhead
           0.81%   0.06%   3.01%  perf                        [.] dso__find_symbol
           1.62%   5.47%   2.51%  perf                        [.] hists__match
           2.70%   1.86%   2.47%  libc-2.27.so                [.] _int_malloc
           0.19%   0.00%   2.29%  [kernel]                    [k] copy_page
           0.41%   0.32%   1.98%  perf                        [.] hists__decay_entries
           1.84%   3.67%   1.68%  perf                        [.] sort__dso_cmp
           0.16%   0.00%   1.63%  [kernel]                    [k] clear_page_erms
      
      Now the output is sorted by cache-misses.
      
       v2:
       ---
       Zero the history if hotkey is pressed.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200324220711.6025-2-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      2605af0f
    • J
      perf top: Support --group-sort-idx to change the sort order · df7deb2c
      Jin Yao 提交于
      'perf report' supports the option --group-sort-idx, which sorts the
      output by the event at the index n in event group.
      
      For example:
      
        perf record -e cycles,instructions,cache-misses
        perf report --group --group-sort-idx 2 --stdio
      
      The perf-report output is sorted by cache-misses.
      
      This patch supports --group-sort-idx in perf-top.
      
      For example:
      
        perf top --group -e cycles,instructions,cache-misses --group-sort-idx 2
      
      The perf-top output is sorted by cache-misses.
      Signed-off-by: NJin Yao <yao.jin@linux.intel.com>
      Suggested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200324220711.6025-1-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      df7deb2c
    • K
      perf symbols: Fix arm64 gap between kernel start and module end · 78886f3e
      Kemeng Shi 提交于
      During execution of command 'perf report' in my arm64 virtual machine,
      this error message is showed:
      
      failed to process sample
      
      __symbol__inc_addr_samples(860): ENOMEM! sym->name=__this_module,
          start=0x1477100, addr=0x147dbd8, end=0x80002000, func: 0
      
      The error is caused with path:
      cmd_report
       __cmd_report
        perf_session__process_events
         __perf_session__process_events
          ordered_events__flush
           __ordered_events__flush
            oe->deliver (ordered_events__deliver_event)
             perf_session__deliver_event
              machines__deliver_event
               perf_evlist__deliver_sample
                tool->sample (process_sample_event)
                 hist_entry_iter__add
                  iter->add_entry_cb(hist_iter__report_callback)
                   hist_entry__inc_addr_samples
                    symbol__inc_addr_samples
                     __symbol__inc_addr_samples
                      h = annotated_source__histogram(src, evidx) (NULL)
      
      annotated_source__histogram failed is caused with path:
      ...
       hist_entry__inc_addr_samples
        symbol__inc_addr_samples
         symbol__hists
          annotated_source__alloc_histograms
           src->histograms = calloc(nr_hists, sizeof_sym_hist) (failed)
      
      Calloc failed as the symbol__size(sym) is too huge. As show in error
      message: start=0x1477100, end=0x80002000, size of symbol is about 2G.
      
      This is the same problem as 'perf annotate: Fix s390 gap between kernel
      end and module start (b9c0a649)'. Perf gets symbol information from
      /proc/kallsyms in __dso__load_kallsyms. A part of symbol in /proc/kallsyms
      from my virtual machine is as follows:
       #cat /proc/kallsyms | sort
       ...
       ffff000001475080 d rpfilter_mt_reg      [ip6t_rpfilter]
       ffff000001475100 d $d   [ip6t_rpfilter]
       ffff000001475100 d __this_module        [ip6t_rpfilter]
       ffff000080080000 t _head
       ffff000080080000 T _text
       ffff000080080040 t pe_header
       ...
      
      Take line 'ffff000001475100 d __this_module [ip6t_rpfilter]' as example.
      The start and end of symbol are both set to ffff000001475100 in
      dso__load_all_kallsyms. Then symbols__fixup_end will set the end of symbol
      to next big address to ffff000001475100 in /proc/kallsyms, ffff000080080000
      in this example. Then sizeof of symbol will be about 2G and cause the
      problem.
      
      The start of module in my machine is
       ffff000000a62000 t $x   [dm_mod]
      
      The start of kernel in my machine is
       ffff000080080000 t _head
      
      There is a big gap between end of module and begin of kernel if a samll
      amount of memory is used by module. And the last symbol in module will
      have a large address range as caotaining the big gap.
      
      Give that the module and kernel text segment sequence may change in
      the future, fix this by limiting range of last symbol in module and kernel
      to 4K in arch arm64.
      Signed-off-by: NKemeng Shi <shikemeng@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hewenliang <hewenliang4@huawei.com>
      Cc: Hu Shiyuan <hushiyuan@huawei.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: http://lore.kernel.org/lkml/33fd24c4-0d5a-9d93-9b62-dffa97c992ca@huawei.com
      [ refreshed the patch on current codebase, added string.h include as strchr() is used ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      78886f3e
    • A
      perf build-test: Honour JOBS to override detection of number of cores · 7b1642f2
      Arnaldo Carvalho de Melo 提交于
      When one does:
      
        $ make -C tools/perf build-test
      
      The makefile in tools/perf/tests/ will, just like the main one, detect
      how many cores are in the system and use it with -j.
      
      Sometimes we may need to override that, for instance, when using
      icecream or distcc to use multiple machines in the build process, then
      we need to, as with the main makefile, use:
      
        $ make JOBS=N -C tools/perf build-test
      
      Fix the tests makefile to honour that.
      Acked-by: NJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20200330130301.GA31702@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      7b1642f2
    • N
      perf script: Add --show-cgroup-events option · 160d4af9
      Namhyung Kim 提交于
      The --show-cgroup-events option is to print CGROUP events in the
      output like others.
      
      Committer testing:
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.039 MB perf.data (487 samples) ]
        [root@seventh ~]# perf script --show-cgroup-events | grep PERF_RECORD_CGROUP -B2 -A2
                 swapper     0     0.000000: PERF_RECORD_CGROUP cgroup: 1 /
                    perf 12145 11200.440730:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                    perf 12145 11200.440733:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12145 11200.440739:     193472 cycles:  ffffffffb90f6fbc commit_creds+0x1fc (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12145 11200.440790:    2691608 cycles:      7fa2cb43019b _dl_sysdep_start+0x7cb (/usr/lib64/ld-2.29.so)
                  cgtest 12145 11200.440962: PERF_RECORD_CGROUP cgroup: 83 /sub
                  cgtest 12147 11200.441054:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441057:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        --
                  cgtest 12148 11200.441103:      10227 cycles:  ffffffffb9a0153d end_repeat_nmi+0x48 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441106:     273295 cycles:  ffffffffb99ecbc7 copy_page+0x7 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12147 11200.441133: PERF_RECORD_CGROUP cgroup: 88 /sub/cgrp1
                  cgtest 12147 11200.441143:    2788845 cycles:  ffffffffb94676c2 security_genfs_sid+0x102 (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
                  cgtest 12148 11200.441162: PERF_RECORD_CGROUP cgroup: 93 /sub/cgrp2
                  cgtest 12148 11200.441182:    2669546 cycles:            401020 _init+0x20 (/wb/cgtest)
                  cgtest 12149 11200.441247:          1 cycles:  ffffffffb900d58b __intel_pmu_enable_all.constprop.0+0x3b (/lib/modules/5.6.0-rc6-00008-gfe2413eefd7f/build/vmlinux)
        [root@seventh ~]#
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-10-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      160d4af9
    • N
      perf top: Add --all-cgroups option · f382842f
      Namhyung Kim 提交于
      The --all-cgroups option is to enable cgroup profiling support.  It
      tells kernel to record CGROUP events in the ring buffer so that 'perf
      top' can identify task/cgroup association later.
      
      Committer testing:
      
      Use:
      
        # perf top --all-cgroups -s cgroup_id,cgroup,pid
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-9-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      f382842f
    • N
      perf record: Add --all-cgroups option · 8fb4b679
      Namhyung Kim 提交于
      The --all-cgroups option is to enable cgroup profiling support.  It
      tells kernel to record CGROUP events in the ring buffer so that perf
      report can identify task/cgroup association later.
      
        [root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ]
        [root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 558  of event 'cycles'
        # Event count (approx.): 458017341
        #
        # Overhead  cgroup id (dev/inode)  Cgroup          Pid:Command
        # ........  .....................  ..........  ...............
        #
            33.15%  4/0xeffffffb           /sub           9615:looper0
            32.83%  4/0xf00002f5           /sub/cgrp2     9620:looper2
            32.79%  4/0xf00002f4           /sub/cgrp1     9619:looper1
             0.35%  4/0xf00002f5           /sub/cgrp2     9618:cgtest
             0.34%  4/0xf00002f4           /sub/cgrp1     9617:cgtest
             0.32%  4/0xeffffffb           /              9615:looper0
             0.11%  4/0xeffffffb           /sub           9617:cgtest
             0.10%  4/0xeffffffb           /sub           9618:cgtest
      
        #
        # (Tip: Sample related events with: perf record -e '{cycles,instructions}:S')
        #
        [root@seventh ~]#
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-8-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      8fb4b679
    • N
      perf record: Support synthesizing cgroup events · ab64069f
      Namhyung Kim 提交于
      Synthesize cgroup events by iterating cgroup filesystem directories.
      The cgroup event only saves the portion of cgroup path after the mount
      point and the cgroup id (which actually is a file handle).
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-7-namhyung@kernel.org
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ab64069f
    • N
      perf report: Add 'cgroup' sort key · b629f3e9
      Namhyung Kim 提交于
      The cgroup sort key is to show cgroup membership of each task.
      Currently it shows full path in the cgroupfs (not relative to the root
      of cgroup namespace) since it'd be more intuitive IMHO.  Otherwise root
      cgroup in different namespaces will all show same name - "/".
      
      The cgroup sort key should come before cgroup_id otherwise
      sort_dimension__add() will match it to cgroup_id as it only matches with
      the given substring.
      
      For example it will look like following.  Note that record patch adding
      --all-cgroups patch will come later.
      
        $ perf record -a --namespace --all-cgroups  cgtest
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.208 MB perf.data (4090 samples) ]
      
        $ perf report -s cgroup_id,cgroup,pid
        ...
        # Overhead  cgroup id (dev/inode)  Cgroup          Pid:Command
        # ........  .....................  ..........  ...............
        #
            93.96%  0/0x0                  /                 0:swapper
             1.25%  3/0xeffffffb           /               278:looper0
             0.86%  3/0xf000015f           /sub/cgrp1      280:cgtest
             0.37%  3/0xf0000160           /sub/cgrp2      281:cgtest
             0.34%  3/0xf0000163           /sub/cgrp3      282:cgtest
             0.22%  3/0xeffffffb           /sub            278:looper0
             0.20%  3/0xeffffffb           /               280:cgtest
             0.15%  3/0xf0000163           /sub/cgrp3      285:looper3
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-6-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      b629f3e9
    • N
      perf cgroup: Maintain cgroup hierarchy · d1277aa3
      Namhyung Kim 提交于
      Each cgroup is kept in the perf_env's cgroup_tree sorted by the cgroup
      id.  Hist entries have cgroup id can compare it directly and later it
      can be used to find a group name using this tree.
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-5-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      d1277aa3
    • N
      perf tools: Basic support for CGROUP event · ba78c1c5
      Namhyung Kim 提交于
      Implement basic functionality to support cgroup tracking.  Each cgroup
      can be identified by inode number which can be read from userspace too.
      The actual cgroup processing will come in the later patch.
      Reported-by: Nkernel test robot <rong.a.chen@intel.com>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      [ fix perf test failure on sampling parsing ]
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200325124536.2800725-4-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      ba78c1c5
    • N
      perf tools: Add file-handle feature test · 49f550ea
      Namhyung Kim 提交于
      The file handle (FHANDLE) support is configurable so some systems might not
      have it.  So add a config feature item to check it on build time so that we
      don't add the cgroup tracking feature based on that.
      
      Committer notes:
      
      Had to make the test use the same construct as its later use in
      synthetic-events.c, in the next patch in this series. i.e. make it be:
      
      	struct {
      		struct file_handle fh;
      		uint64_t cgroup_id;
      	} handle;
      
      To cope with:
      
          CC       /tmp/build/perf/util/cloexec.o
        util/synthetic-events.c:428:22: error: field 'fh' with   CC       /tmp/build/perf/util/call-path.o
        variable sized type 'struct file_handle' not at the end of a struct or class is a GNU
              extension [-Werror,-Wgnu-variable-sized-type-not-at-end]
                        struct file_handle fh;
                                           ^
        1 error generated.
      
      Deal with this at some point, i.e. investigate if the right thing is to
      remove that -Wgnu-variable-sized-type-not-at-end from our CFLAGS, for
      now do the test the same way as it is used looks more sensible.
      Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: NNamhyung Kim <namhyung@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200402015249.3800462-1-namhyung@kernel.org
      [ split from a larger patch, removed blank line at EOF ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      49f550ea
    • A
      perf python: Include rwsem.c in the pythong biding · 460c3ed9
      Arnaldo Carvalho de Melo 提交于
      We'll need it for the cgroup patches, and its better to have it in a
      separate patch in case we need to later revert the cgroup patches.
      
      I.e. without this we have:
      
        [root@five ~]# perf test -v python
        19: 'import perf' in python                               :
        --- start ---
        test child forked, pid 148447
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.cpython-37m-x86_64-linux-gnu.so: undefined symbol: down_write
        test child finished with -1
        ---- end ----
        'import perf' in python: FAILED!
        [root@five ~]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20200403123606.GC23243@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      460c3ed9
  4. 27 3月, 2020 2 次提交
    • H
      perf script: Introduce --deltatime option · 26567ed7
      Hagen Paul Pfeifer 提交于
      For some kind of analysis a deltatime output is more human friendly and
      reduce the cognitive load for further analysis.
      
      The following output demonstrate the new option "deltatime": calculate
      the time difference in relation to the previous event.
      
        $ perf script --deltatime
        test  2525 [001]     0.000000:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000091:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000051: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000685:            sdt_libev:ev_add: (5635e72a5ebd)
        test  2525 [001]     0.000048:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000104: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.003895:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.996034: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000058:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     1.000004: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000064:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999934: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
        test  2525 [001]     0.000056:  sdt_libev:epoll_wait_enter: (5635e72a76a9)
        test  2525 [001]     0.999930: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
      
      Committer testing:
      
      So go from default output to --reltime and then this new --deltatime, to
      contrast the various timestamp presentation modes for a random perf.data file I
      had laying around:
      
        [root@five ~]# perf script --reltime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000004:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000006:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000009: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000036:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000038:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000040:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000041:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000044: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script --deltatime | head
           perf 442394 [000]     0.000000:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000001:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]     0.000002: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000027:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000001:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]     0.000002: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]# perf script | head
           perf 442394 [000]  7600.157861:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157864:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157866:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157867:  128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [000]  7600.157870: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157897:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157900:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157901:   16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157903:  224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
           perf 442394 [001]  7600.157906: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
        [root@five ~]#
      
      Andi suggested we better implement it as a new field, i.e. -F deltatime, like:
      
        [root@five ~]# perf script -F deltatime
        Invalid field requested.
      
         Usage: perf script [<options>]
            or: perf script [<options>] record <script> [<record-options>] <command>
            or: perf script [<options>] report <script> [script-args]
            or: perf script [<options>] <script> [<record-options>] <command>
            or: perf script [<options>] <top-script> [script-args]
      
            -F, --fields <str>    comma separated output fields prepend with 'type:'. +field to add and -field to remove.Valid types: hw,sw,trace,raw,synth. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,srcline,period,iregs,uregs,brstack,brstacksym,flags,bpf-output,brstackinsn,brstackoff,callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc
        [root@five ~]#
      
      I.e. we have -F for maximum flexibility:
      
        [root@five ~]# perf script -F comm,pid,cpu,time | head
                  perf 442394 [000]  7600.157861:
                  perf 442394 [000]  7600.157864:
                  perf 442394 [000]  7600.157866:
                  perf 442394 [000]  7600.157867:
                  perf 442394 [000]  7600.157870:
                  perf 442394 [001]  7600.157897:
                  perf 442394 [001]  7600.157900:
                  perf 442394 [001]  7600.157901:
                  perf 442394 [001]  7600.157903:
                  perf 442394 [001]  7600.157906:
        [root@five ~]#
      
      But since we already have --reltime, having --deltatime, documented one after
      the other is sensible.
      Signed-off-by: NHagen Paul Pfeifer <hagen@jauu.net>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lore.kernel.org/lkml/20200204173709.489161-1-hagen@jauu.net
      [ Added 'perf script' man page entry for --deltatime ]
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      26567ed7
    • A
      perf test x86: Add CET instructions to the new instructions test · 26cec748
      Adrian Hunter 提交于
      Add to the "x86 instruction decoder - new instructions" test the
      following instructions:
      
      	incsspd
      	incsspq
      	rdsspd
      	rdsspq
      	saveprevssp
      	rstorssp
      	wrssd
      	wrssq
      	wrussd
      	wrussq
      	setssbsy
      	clrssbsy
      	endbr32
      	endbr64
      
      And the "notrack" prefix for indirect calls and jumps.
      
      For information about the instructions, refer Intel Control-flow
      Enforcement Technology Specification May 2019 (334525-003).
      
      Committer testing:
      
        $ perf test instr
        67: x86 instruction decoder - new instructions            : Ok
        $
      
      Then use verbose mode and check one of those new instructions:
      
        $ perf test -v instr |& grep saveprevssp
        Decoded ok: f3 0f 01 ea          	saveprevssp
        Decoded ok: f3 0f 01 ea          	saveprevssp
        $
      Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi v. Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20200204171425.28073-3-yu-cheng.yu@intel.comSigned-off-by: NYu-cheng Yu <yu-cheng.yu@intel.com>
      Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
      26cec748
  5. 26 3月, 2020 4 次提交