1. 12 May, 2021 3 commits
  2. 10 May, 2021 7 commits
    • A
      tools headers UAPI: Sync perf_event.h with the kernel sources · 71d7924b
      Arnaldo Carvalho de Melo committed
      To pick up the changes in:
      
        2b26f0aa ("perf: Support only inheriting events if cloned with CLONE_THREAD")
        2e498d0a ("perf: Add support for event removal on exec")
        547b6098 ("perf: aux: Add flags for the buffer format")
        55bcf6ef ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
        7dde5176 ("perf: aux: Add CoreSight PMU buffer formats")
        97ba62b2 ("perf: Add support for SIGTRAP on perf events")
        d0d1dd62 ("perf core: Add PERF_COUNT_SW_CGROUP_SWITCHES event")
      
      Also change the expected sizeof(struct perf_event_attr) from 120 to 128 due to
      fields being added for the SIGTRAP changes.
      
      Addressing this perf build warning:
      
        Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
        diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
      
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      71d7924b
    • A
      tools headers UAPI: Sync files changed by landlock, quotactl_path and mount_setattr new syscalls · f8bcb061
      Arnaldo Carvalho de Melo committed
      To pick the changes in these csets:
      
        a49f4f81 ("arch: Wire up Landlock syscalls")
        2a186721 ("fs: add mount_setattr()")
        fa8b9007 ("quota: wire up quotactl_path")
      
      This silences the perf build warnings listed below and adds support for
      those new syscalls in tools such as 'perf trace'.
      
      For instance, this is now possible:
      
        # ~acme/bin/perf trace -v -e landlock*
        event qualifier tracepoint filter: (common_pid != 129365 && common_pid != 3502) && (id == 444 || id == 445 || id == 446)
        ^C#
      
      That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
      tracepoints.
      
        $ grep landlock tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
        444	common	landlock_create_ruleset	sys_landlock_create_ruleset
        445	common	landlock_add_rule	sys_landlock_add_rule
        446	common	landlock_restrict_self	sys_landlock_restrict_self
        $
      
      This addresses these perf build warnings:
      
        Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
        Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
        Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
        diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
        Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
        diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
        Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
        diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
      
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: James Morris <jamorris@linux.microsoft.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mickaël Salaün <mic@linux.microsoft.com>
      Cc: Sascha Hauer <s.hauer@pengutronix.de>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f8bcb061
    • M
      perf tools: Fix a build error on arm64 with clang · a00b7e39
      Masami Hiramatsu committed
      Since clang's -Wmissing-field-initializers warns if a data
      structure is initialized with a single NULL as below,
      
       ----
       tools/perf $ make CC=clang LLVM=1
       ...
       arch/arm64/util/kvm-stat.c:74:9: error: missing field 'ops' initializer [-Werror,-Wmissing-field-initializers]
               { NULL },
                      ^
       1 error generated.
       ----
      
      add another field initializer explicitly, as the other
      arches' kvm-stat.c code does.
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Link: http://lore.kernel.org/lkml/162037767540.94840.15758657049033010518.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a00b7e39
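The warning described in this commit can be reproduced with any two-field struct whose table terminator names only the first field. The following is a minimal sketch, not the kvm-stat.c code: the `demo_*` types and names are hypothetical stand-ins, and only the idea of spelling out both fields in the terminator comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-event operations structure. */
struct demo_ops {
	int id;
};

/* Hypothetical two-field table entry, similar in shape to the one in the error. */
struct demo_reg_ops {
	const char *name;      /* first field */
	struct demo_ops *ops;  /* second field, reported as missing by clang */
};

static struct demo_ops demo_vmexit = { .id = 1 };

static struct demo_reg_ops demo_table[] = {
	{ "vmexit", &demo_vmexit },
	/* Terminator: both fields are written out, so no initializer is implicit. */
	{ NULL, NULL },
};

/* Count entries up to the NULL-named terminator. */
static size_t demo_table_len(const struct demo_reg_ops *t)
{
	size_t n = 0;

	while (t[n].name != NULL)
		n++;
	return n;
}
```

Writing `{ NULL }` instead would produce the same object, since C zero-initializes the remaining members; the explicit form only avoids the diagnostic when it is promoted to an error.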
    • J
      perf tools: Fix dynamic libbpf link · ad1237c3
      Jiri Olsa committed
      Justin reported a broken build with LIBBPF_DYNAMIC=1.
      
      When linking libbpf dynamically we need to use perf's
      hashmap object, because it's not exported in libbpf.so
      (only in libbpf.a).
      
      The following build now passes:
      
        $ make LIBBPF_DYNAMIC=1
          BUILD:   Doing 'make -j8' parallel build
          ...
        $ ldd perf | grep libbpf
              libbpf.so.0 => /lib64/libbpf.so.0 (0x00007fa7630db000)
      
      Fixes: eee19501 ("perf tools: Grab a copy of libbpf's hashmap")
      Reported-by: Justin M. Forbes <jforbes@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210508205020.617984-1-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ad1237c3
    • D
      perf session: Fix swapping of cpu_map and stat_config records · a11c9a6e
      Dmitry Koshelev committed
      'data' field in perf_record_cpu_map_data struct is 16-bit
      wide and so should be swapped using bswap_16().
      
      'nr' field in perf_record_stat_config struct should be
      swapped before being used for size calculation.
      Signed-off-by: Dmitry Koshelev <karaghiozis@gmail.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210506131244.13328-1-karaghiozis@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a11c9a6e
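Both points of this commit (swap 16-bit fields with a 16-bit swap, and swap a count before using it) can be illustrated on a single toy record. This is only a sketch under assumptions: `demo_cpu_map` and its helpers are hypothetical and do not mirror the real perf_record_cpu_map_data or perf_record_stat_config layouts.

```c
#include <stdint.h>

#define DEMO_MAX_CPUS 8

/* Hypothetical record: a count followed by 16-bit entries. */
struct demo_cpu_map {
	uint16_t nr;
	uint16_t cpu[DEMO_MAX_CPUS];
};

/* Portable 16-bit byte swap, equivalent in effect to bswap_16(). */
static uint16_t demo_bswap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Swap the count first, then use the now native-endian value as the
 * loop bound. Using the unswapped count would walk the wrong number
 * of entries.
 */
static void demo_cpu_map_swap(struct demo_cpu_map *m)
{
	uint16_t i;

	m->nr = demo_bswap16(m->nr);
	for (i = 0; i < m->nr && i < DEMO_MAX_CPUS; i++)
		m->cpu[i] = demo_bswap16(m->cpu[i]);
}
```

The bound check against `DEMO_MAX_CPUS` is an addition of this sketch, so that a corrupt count cannot run past the fixed-size array.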
    • I
      perf jevents: Silence warning for ArchStd files · 7aa3c9ea
      Ian Rogers committed
      JSON files in the level 1 directory are used for ArchStd events (see
      preprocess_arch_std_files), as such they shouldn't be warned about.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210506225640.1461000-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7aa3c9ea
    • N
      perf record: Disallow -c and -F option at the same time · e8c11676
      Namhyung Kim committed
      It's confusing which one is effective when both options are given.
      The current code happens to use -c in this case but users might not be
      aware of it.  We can change it to complain about that instead of relying
      on the implicit priority.
      
      Before:
      
        $ perf record -c 111111 -F 99 true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.031 MB perf.data (8 samples) ]
      
        $ perf evlist -F
        cycles: sample_period=111111
        $
      
      After:
        $ perf record -c 111111 -F 99 true
        cannot set frequency and period at the same time
        $
      
      So this change can break existing usage, but I think it's rare to pass
      both options, and such invocations are better off being fixed.
      Suggested-by: Alexey Alexandrov <aalexand@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20210402094020.28164-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e8c11676
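The behavior change in this commit amounts to rejecting a pair of mutually exclusive options instead of silently preferring one. A minimal sketch of that kind of check follows; the `demo_sample_opts` struct and function name are hypothetical, and only the error string is taken from the "After" output in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option state; the real perf record options are more involved. */
struct demo_sample_opts {
	bool period_set;            /* -c was given */
	bool freq_set;              /* -F was given */
	unsigned long long period;
	unsigned int freq;
};

/* Return 0 when the options are consistent, -1 when both were given. */
static int demo_check_sample_opts(const struct demo_sample_opts *o)
{
	if (o->period_set && o->freq_set) {
		fprintf(stderr, "cannot set frequency and period at the same time\n");
		return -1;
	}
	return 0;
}
```

Tracking whether each option was explicitly set, rather than comparing values against defaults, is a design choice of this sketch; it keeps a user-supplied value that happens to equal the default from being misread as "not given".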
  3. 29 April, 2021 30 commits
    • A
      perf build: Defer printing detected features to the end of all feature checks · c6e3bf43
      Arnaldo Carvalho de Melo committed
      We were doing it in tools/build/Makefile.feature, after running the
      feature checks, but then in tools/perf/Makefile.config we can call more
      feature checks when we notice that some feature check failed, like when
      libbfd wasn't detected and we add libraries to the LDFLAGS of its
      feature check to try again, etc.
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c6e3bf43
    • J
      perf build: Regenerate the FEATURE_DUMP file after extra feature checks · fbed59f8
      Jiri Olsa committed
      Feature detection is done in tools/build/Makefile.feature, we may exit
      there with some features not detected and then, in
      tools/perf/Makefile.config try adding extra libraries to link and then
      do extra feature checks to see if we now find the feature.
      
      This is the case with disassembler-four-args, which checks if the
      disassembler() function in libopcodes (binutils) has a signature with
      one or with four arguments, as this is not ABI and they changed it at
      some point.
      
      This is not a problem when doing normal builds, for instance:
      
        $ make -C tools/perf O=/tmp/build/perf
      
      since we don't use what is in FEATURE-DUMP at that point. It is a problem,
      however, if we pass FEATURE_DUMP=/previously-detected-features as we do in
      'make -C tools/perf build-test' to reuse the feature detection in the
      many build combinations we test there.
      
      When that is done feature-disassembler-four-args will be set to 0, but
      opensuse 15.1 has the four arguments function signature in
      disassembler(). The build thus fails.
      
      Fix it by rewriting the FEATURE-DUMP file at the end of
      tools/perf/Makefile.config to register features we retested in that make
      file.
      Signed-off-by: Jiri Olsa <jolsa@redhat.com>
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fbed59f8
    • L
      perf session: Dump PERF_RECORD_TIME_CONV event · 81e70d7e
      Leo Yan committed
      Currently the perf tool uses the common stub function process_event_op2_stub()
      for dumping the TIME_CONV event, so it doesn't output the clock parameters
      contained in the event.
      
      This patch adds the callback function for dumping the hardware clock
      parameters in TIME_CONV event.
      
      Before:
      
        # perf report -D
      
        0x978 [0x38]: event: 79
        .
        . ... raw event: size 56 bytes
        .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
        .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
        .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
        .  0030:  01 01 00 00 00 00 00 00                          ........
      
        0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
        : unhandled!
      
        [...]
      
      After:
      
        # perf report -D
      
        0x978 [0x38]: event: 79
        .
        . ... raw event: size 56 bytes
        .  0000:  4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00  O.....8.........
        .  0010:  00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff  ..@........<BF><DF><FF><FF><FF>
        .  0020:  d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00  <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>.
        .  0030:  01 01 00 00 00 00 00 00                          ........
      
        0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV
        ... Time Shift      21
        ... Time Muliplier  20971520
        ... Time Zero       18446743935180835206
        ... Time Cycles     13852918225
        ... Time Mask       0xffffffffffffff
        ... Cap Time Zero   1
        ... Cap Time Short  1
        : unhandled!
      
        [...]
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      81e70d7e
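The clock parameters dumped above (shift, multiplier, zero, cycles, mask and the two capability bits) are the inputs for converting a hardware counter value to perf time. The sketch below follows the conversion formula documented in the comments of include/uapi/linux/perf_event.h; the `demo_*` struct and function names are hypothetical, and this is not a copy of perf's own helper.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the parameters carried by a TIME_CONV event. */
struct demo_tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
	uint64_t time_cycles;
	uint64_t time_mask;
	bool     cap_user_time_short;
};

/* Convert a raw cycle count to a perf timestamp. */
static uint64_t demo_cyc_to_time(uint64_t cyc, const struct demo_tsc_conv *tc)
{
	uint64_t quot, rem;

	/* Short counters: extend the truncated value around time_cycles. */
	if (tc->cap_user_time_short)
		cyc = tc->time_cycles + ((cyc - tc->time_cycles) & tc->time_mask);

	/* Split the multiplication to limit intermediate overflow. */
	quot = cyc >> tc->time_shift;
	rem  = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}
```

All arithmetic is unsigned 64-bit, so a large Time Zero such as the one in the dump simply wraps modulo 2^64, as it would for a negative offset stored in an unsigned field.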
    • L
      perf session: Add swap operation for event TIME_CONV · 050ffc44
      Leo Yan committed
      Since commit d110162c ("perf tsc: Support cap_user_time_short for
      event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data
      structure for clock parameters.
      
      To be backwards-compatible, this patch adds a dedicated swap operation
      for the event PERF_RECORD_TIME_CONV. By checking whether the event
      contains the field "time_cycles", it supports both the old and the new
      event formats.
      
      Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      050ffc44
    • L
      perf jit: Let convert_timestamp() to be backwards-compatible · aa616f5a
      Leo Yan committed
      Commit d110162c ("perf tsc: Support cap_user_time_short for
      event TIME_CONV") supports the extended parameters for event TIME_CONV,
      but it broke the backwards compatibility, so any perf data file with old
      event format fails to convert timestamp.
      
      This patch introduces a helper event_contains() to check if an event
      contains a specific member or not.  For backwards compatibility, the
      extended parameters are copied only if the event size confirms that
      the TIME_CONV event carries them.
      
      Committer notes:
      
      To make this compiler backwards compatible add this patch:
      
        -       struct perf_tsc_conversion tc = { 0 };
        +       struct perf_tsc_conversion tc = { .time_shift = 0, };
      
      Fixes: d110162c ("perf tsc: Support cap_user_time_short for event TIME_CONV")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steve MacLean <Steve.MacLean@Microsoft.com>
      Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
      Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      aa616f5a
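The size-based check described in this commit can be sketched as a comparison between the recorded event size and the offset of the member in question. The struct layout and the `demo_event_contains` macro below are assumptions made for illustration; the real helper and record definition live in the perf sources and may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event header carrying the recorded size in bytes. */
struct demo_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Hypothetical TIME_CONV-like record with old and extended fields. */
struct demo_time_conv {
	struct demo_header header;
	uint64_t time_shift;
	uint64_t time_mult;
	uint64_t time_zero;
	/* Fields below exist only in the extended format. */
	uint64_t time_cycles;
	uint64_t time_mask;
	uint8_t  cap_user_time_zero;
	uint8_t  cap_user_time_short;
};

/* True when the recorded size reaches past the start of the given member. */
#define demo_event_contains(ev, member) \
	((ev).header.size > offsetof(struct demo_time_conv, member))
```

A reader of an old-format file then leaves the extended fields at their zero defaults, while a new-format file passes the check and has them copied (or byte-swapped) as well.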
    • M
      perf tools: Enable libtraceevent dynamic linking · 56d32d4c
      Michael Petlan committed
      Currently we support only static linking with kernel's libtraceevent
      (tools/lib/traceevent). This patch adds libtraceevent package detection
      and support to link perf with it dynamically.
      
        The libtraceevent package status is displayed with:
        $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
        ...
        ...                 libtraceevent: [ on  ]
      
      Default behavior remains the same (static linking).
      
      Committer testing:
      
        $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
        Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
        $
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      56d32d4c
    • J
      perf Documentation: Document intel-hybrid support · 2750ce1d
      Jin Yao committed
      Add some text and examples to help in understanding the
      Intel hybrid perf support.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2750ce1d
    • J
      perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid · a37f3b88
      Jin Yao committed
      Currently we don't support shadow stat for hybrid.
      
        root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
            12,883,109,591      cpu_core/cycles/
             6,405,163,221      cpu_atom/cycles/
               555,553,778      cpu_core/instructions/
               841,158,734      cpu_atom/instructions/
      
               1.002644773 seconds time elapsed
      
      No shadow stat 'insn per cycle' is reported for now. We will support
      it later; for the time being just skip the 'perf stat metrics (shadow stat) test'.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-26-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a37f3b88
    • J
      perf tests: Support 'Convert perf time to TSC' test for hybrid · d9da6f70
      Jin Yao committed
      On a hybrid platform, "cycles:u" creates two "cycles" events, so
      the second evsel in the evlist also needs initialization.
      
      With this patch,
      
        # ./perf test 71
        71: Convert perf time to TSC                                        : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-25-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d9da6f70
    • J
      perf tests: Support 'Session topology' test for hybrid · c1020388
      Jin Yao committed
      Force the creation of a single event "cpu_core/cycles/" by default; otherwise
      the 'if (evlist->core.nr_entries == 1)' check in evlist__valid_sample_type
      would fail.
      
        # ./perf test 41
        41: Session topology                                                : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-24-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c1020388
    • J
      perf tests: Support 'Parse and process metrics' test for hybrid · 6081e876
      Jin Yao committed
      Some events are not supported on hybrid, so only a subset of the cases is picked up for hybrid.
      
        # ./perf test 68
        68: Parse and process metrics                                       : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-23-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6081e876
    • J
      perf tests: Support 'Track with sched_switch' test for hybrid · 43eb05d0
      Jin Yao committed
      On a hybrid platform, "cycles:u" creates two "cycles" events, so
      the number of events in the evlist is not what the following test
      steps expect. For now, use just one event, "cpu_core/cycles:u/", on hybrid.
      
        # ./perf test 35
        35: Track with sched_switch                                         : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-22-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      43eb05d0
    • J
      perf tests: Skip 'Setup struct perf_event_attr' test for hybrid · f15da0b1
      Jin Yao committed
      For hybrid, attr.type consists of the pmu type id + the original type.
      This test would need many changes, so for now we temporarily
      skip this test case and leave it as a TODO.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-21-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f15da0b1
    • J
      perf tests: Add hybrid cases for 'Roundtrip evsel->name' test · afff9f31
      Jin Yao committed
      On hybrid, two hybrid events are created for one hw event.
      
      For example,
      
      evsel->idx      evsel__name(evsel)
      0               cycles
      1               cycles
      2               instructions
      3               instructions
      ...
      
      So for comparing the evsel name on hybrid, the evsel->idx
      needs to be divided by 2.
      
        # ./perf test 14
        14: Roundtrip evsel->name                                           : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-20-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      afff9f31
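The index mapping in the table above (two evsels per requested event, so the evsel index is halved before looking up the expected name) can be sketched as follows. The names are hypothetical and the expected-name table is a toy; the actual test iterates over perf's own event name tables.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical expected names, one per requested hardware event. */
static const char *const demo_expected[] = { "cycles", "instructions" };

/* On hybrid, two evsels exist per requested event, so halve the index. */
static int demo_expected_index(int evsel_idx, bool hybrid)
{
	return hybrid ? evsel_idx / 2 : evsel_idx;
}

/* Compare an evsel's name against the name expected for its index. */
static bool demo_name_matches(int evsel_idx, const char *name, bool hybrid)
{
	return strcmp(name, demo_expected[demo_expected_index(evsel_idx, hybrid)]) == 0;
}
```

With this mapping, indexes 0 and 1 both resolve to "cycles" and indexes 2 and 3 to "instructions" on hybrid, matching the table in the commit message.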
    • J
      perf tests: Add hybrid cases for 'Parse event definition strings' test · 2541cb63
      Jin Yao committed
      Add basic hybrid test cases for 'Parse event definition strings' test.
      
        # perf test 6
         6: Parse event definition strings                                  : Ok
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-19-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2541cb63
    • J
      perf record: Uniquify hybrid event name · 91c0f5ec
      Jin Yao committed
      For perf-record, it would be useful to tell the user which pmu the
      event belongs to.
      
      For example,
      
        # perf record -a -- sleep 1
        # perf report
      
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 106  of event 'cpu_core/cycles/'
        # Event count (approx.): 22043448
        #
        # Overhead  Command       Shared Object            Symbol
        # ........  ............  .......................  ............................
        #
        ...
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      91c0f5ec
    • J
      perf stat: Warn group events from different hybrid PMU · 660e533e
      Jin Yao committed
      If a group has events from different hybrid PMUs,
      show a warning:
      
      "WARNING: events in group from different hybrid PMUs!"
      
      This is to remind the user not to put a core event and an atom
      event into one group.
      
      Grouping is then disabled.
      
        # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
        WARNING: events in group from different hybrid PMUs!
        WARNING: grouped events cpus do not match, disabling group:
          anon group { cpu_core/cycles/, cpu_atom/cycles/ }
      
         Performance counter stats for 'system wide':
      
                 5,438,125      cpu_core/cycles/
                 3,914,586      cpu_atom/cycles/
      
               1.004250966 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      660e533e
    • J
      perf stat: Filter out unmatched aggregation for hybrid event · 92637cc7
      Jin Yao committed
      perf-stat supports several aggregation modes, such as --per-core
      and --per-socket. A hybrid event, however, may be available only
      on a subset of the cpus. So for --per-core we need to filter out the
      unavailable cores, for --per-socket the unavailable
      sockets, and so on.
      
      Before:
      
        # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
        S0-D0-C0           2            479,530      cpu_core/cycles/
        S0-D0-C4           2            175,007      cpu_core/cycles/
        S0-D0-C8           2            166,240      cpu_core/cycles/
        S0-D0-C12          2            704,673      cpu_core/cycles/
        S0-D0-C16          2            865,835      cpu_core/cycles/
        S0-D0-C20          2          2,958,461      cpu_core/cycles/
        S0-D0-C24          2            163,988      cpu_core/cycles/
        S0-D0-C28          2            164,729      cpu_core/cycles/
        S0-D0-C32          0      <not counted>      cpu_core/cycles/
        S0-D0-C33          0      <not counted>      cpu_core/cycles/
        S0-D0-C34          0      <not counted>      cpu_core/cycles/
        S0-D0-C35          0      <not counted>      cpu_core/cycles/
        S0-D0-C36          0      <not counted>      cpu_core/cycles/
        S0-D0-C37          0      <not counted>      cpu_core/cycles/
        S0-D0-C38          0      <not counted>      cpu_core/cycles/
        S0-D0-C39          0      <not counted>      cpu_core/cycles/
      
               1.003597211 seconds time elapsed
      
      After:
      
        # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
        S0-D0-C0           2            210,428      cpu_core/cycles/
        S0-D0-C4           2            444,830      cpu_core/cycles/
        S0-D0-C8           2            435,241      cpu_core/cycles/
        S0-D0-C12          2            423,976      cpu_core/cycles/
        S0-D0-C16          2            859,350      cpu_core/cycles/
        S0-D0-C20          2          1,559,589      cpu_core/cycles/
        S0-D0-C24          2            163,924      cpu_core/cycles/
        S0-D0-C28          2            376,610      cpu_core/cycles/
      
               1.003621290 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Co-developed-by: Jiri Olsa <jolsa@redhat.com>
      Reviewed-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-16-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      92637cc7
    • J
      perf stat: Add default hybrid events · ac2dc29e
      Jin Yao committed
      When '-e' is not specified, perf stat adds some software events and
      hardware events to the evlist by default.
      
      Before:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,044.40 msec cpu-clock                 #   23.946 CPUs utilized
                        99      context-switches          #    4.117 /sec
                        24      cpu-migrations            #    0.998 /sec
                         3      page-faults               #    0.125 /sec
                 7,000,244      cycles                    #    0.000 GHz
                 2,955,024      instructions              #    0.42  insn per cycle
                   608,941      branches                  #   25.326 K/sec
                    31,991      branch-misses             #    5.25% of all branches
      
               1.004106859 seconds time elapsed
      
      Among the events, cycles, instructions, branches and branch-misses
      are hardware events.
      
      On a hybrid platform, two hardware events are created for each
      hardware event:
      
      cpu_core/cycles/,
      cpu_atom/cycles/,
      cpu_core/instructions/,
      cpu_atom/instructions/,
      cpu_core/branches/,
      cpu_atom/branches/,
      cpu_core/branch-misses/,
      cpu_atom/branch-misses/
      
      These events are added to the evlist on a hybrid platform.
      
      Since parse_events() already creates two hardware events for one
      event on a hybrid platform, we just call parse_events(evlist,
      "cycles,instructions,branches,branch-misses") to create the default
      events and add them to the evlist.
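
      As a rough illustration of the expansion described above, the sketch
      below turns the default event string into one name per hybrid pmu. It
      is not perf's parse_events() implementation; expand_hybrid_events()
      and HYBRID_PMUS are hypothetical names used only for this example.

```python
# Hypothetical helper; perf's real parse_events() is C code and far more general.
HYBRID_PMUS = ["cpu_core", "cpu_atom"]  # pmu names taken from the commit message


def expand_hybrid_events(event_list, pmus=HYBRID_PMUS):
    """Return one 'pmu/event/' name per pmu for each comma-separated event."""
    names = []
    for event in event_list.split(","):
        event = event.strip()
        if not event:
            continue  # tolerate stray commas
        for pmu in pmus:
            names.append(f"{pmu}/{event}/")
    return names


default_events = expand_hybrid_events("cycles,instructions,branches,branch-misses")
print("\n".join(default_events))
```

      The resulting order (core first, then atom, per event) matches the
      list of eight events given above.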
      
      After:
      
        # perf stat -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                 24,043.99 msec cpu-clock                 #   23.991 CPUs utilized
                       139      context-switches          #    5.781 /sec
                        25      cpu-migrations            #    1.040 /sec
                         6      page-faults               #    0.250 /sec
                10,381,751      cpu_core/cycles/          #  431.782 K/sec
                 1,264,216      cpu_atom/cycles/          #   52.579 K/sec
                 3,406,958      cpu_core/instructions/    #  141.697 K/sec
                   414,588      cpu_atom/instructions/    #   17.243 K/sec
                   705,149      cpu_core/branches/        #   29.327 K/sec
                    82,358      cpu_atom/branches/        #    3.425 K/sec
                    40,821      cpu_core/branch-misses/   #    1.698 K/sec
                     9,086      cpu_atom/branch-misses/   #  377.891 /sec
      
               1.002228863 seconds time elapsed
      
      We can see two events are created for one hardware event.
      
      One TODO: the shadow stats look a bit different; they are now just
      a plain per-second rate (e.g. 'K/sec' above).
      
      perf_stat__update_shadow_stats() and perf_stat__print_shadow_stats()
      need to be improved in the future if we want to get the original
      shadow stats back.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ac2dc29e
    • J
      perf record: Create two hybrid 'cycles' events by default · b53a0755
      Jin Yao committed
      When the evlist is empty, for example when no '-e' is specified in
      perf record, one default 'cycles' event is added to the evlist.
      
      On a hybrid platform, two default 'cycles' events need to be created:
      one for cpu_core and one for cpu_atom.
      
      This patch calls evsel__new_cycles() twice to create the two 'cycles'
      events.
      
        # ./perf record -vv -a -- sleep 1
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
        sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
        sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
        sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
        sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
        sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
        sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
        sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
        sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
        sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
        sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
        sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
        sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
        sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
        sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          { sample_period, sample_freq }   4000
          sample_type                      IP|TID|TIME|ID|CPU|PERIOD
          read_format                      ID
          disabled                         1
          inherit                          1
          freq                             1
          precise_ip                       3
          sample_id_all                    1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
        sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
        sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
        sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
        sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
        sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
        sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
        ------------------------------------------------------------
      
      We have to create evlist-hybrid.c; otherwise, due to the symbol
      dependency, 'perf test python' would fail.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b53a0755
    • J
      perf parse-events: Support event inside hybrid pmu · 5e4edd1f
      Jin Yao committed
      On a hybrid platform, the user may want to enable events on only one
      pmu.
      
      The following syntax is supported:
      
      cpu_core/<event>/
      cpu_atom/<event>/
      
      But this syntax doesn't work for cache events.
      
      Before:
      
        # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
        event syntax error: 'cpu_core/LLC-loads/'
                                      \___ unknown term 'LLC-loads' for pmu 'cpu_core'
      
      Cache events are a bit complex, and we can't create aliases for them,
      so we use another solution. For example, for "cpu_core/LLC-loads/",
      term->config in parse_events_add_pmu() is "LLC-loads".
      
      We then create a new parser to scan "LLC-loads", and
      parse_events_add_cache() is called during that parse.
      parse_state->hybrid_pmu_name identifies the pmu on which the event
      should be enabled.
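
      The two-step idea above (split off the pmu, then parse the remaining
      term again while remembering the pmu) can be sketched as follows.
      This is a simplified model, not perf's parser: split_pmu_event() and
      reparse_term() are hypothetical helpers, and the dictionary only
      stands in for the hybrid_pmu_name field of parse_state.

```python
def split_pmu_event(spec):
    """Split 'pmu/term/' into (pmu, term); return (None, spec) for a bare event."""
    if spec.endswith("/") and spec.count("/") >= 2:
        pmu, _, rest = spec.partition("/")
        term = rest[:-1]  # drop the trailing '/'
        if pmu and term:
            return pmu, term
    return None, spec


def reparse_term(spec):
    """Return a stand-in parse state plus the term that the second parse scans."""
    pmu, term = split_pmu_event(spec)
    parse_state = {"hybrid_pmu_name": pmu}  # stand-in for parse_state->hybrid_pmu_name
    return parse_state, term


state, term = reparse_term("cpu_core/LLC-loads/")
print(state["hybrid_pmu_name"], term)
```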
      
      After:
      
        # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
      
         Performance counter stats for 'system wide':
      
                    24,593      cpu_core/LLC-loads/
      
               1.003911601 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5e4edd1f
    • J
      perf parse-events: Compare with hybrid pmu name · c93afadc
      Jin Yao committed
      On a hybrid platform, the user may want to enable an event on only
      one pmu. The following syntax will be supported:
      
      cpu_core/<event>/
      cpu_atom/<event>/
      
      For hardware events, hardware cache events and raw events, two events
      are created by default. We pass the specified pmu name in parse_state
      and check it before event creation, so that only the event for the
      specified pmu is created.
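
      A minimal sketch of the check described above, assuming the detected
      pmus are available as a list of names. select_pmus() is a
      hypothetical helper, not a function from the perf sources.

```python
def select_pmus(detected_pmus, hybrid_pmu_name=None):
    """Pick the pmus an event is created on.

    With no pmu name given, every detected hybrid pmu gets an event (the
    default behaviour); otherwise only the pmu whose name matches is kept.
    """
    if hybrid_pmu_name is None:
        return list(detected_pmus)
    return [pmu for pmu in detected_pmus if pmu == hybrid_pmu_name]


detected = ["cpu_core", "cpu_atom"]
print(select_pmus(detected))              # both pmus
print(select_pmus(detected, "cpu_atom"))  # only the requested pmu
```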
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c93afadc
    • J
      perf parse-events: Create two hybrid raw events · 94da591b
      Jin Yao committed
      On a hybrid platform, the same raw event may be available on both the
      cpu_core pmu and the cpu_atom pmu, so two raw events are created for
      one event encoding. For raw events, attr.type is the PMU type.
      
        # perf stat -e r3c -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             4
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             8
          size                             120
          config                           0x3c
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        r3c: 0: 434449 1001412521 1001412521
        r3c: 1: 173162 1001482031 1001482031
        r3c: 2: 231710 1001524974 1001524974
        r3c: 3: 110012 1001563523 1001563523
        r3c: 4: 191517 1001593221 1001593221
        r3c: 5: 956458 1001628147 1001628147
        r3c: 6: 416969 1001715626 1001715626
        r3c: 7: 1047527 1001596650 1001596650
        r3c: 8: 103877 1001633520 1001633520
        r3c: 9: 70571 1001637898 1001637898
        r3c: 10: 550284 1001714398 1001714398
        r3c: 11: 1257274 1001738349 1001738349
        r3c: 12: 107797 1001801432 1001801432
        r3c: 13: 67471 1001836281 1001836281
        r3c: 14: 286782 1001923161 1001923161
        r3c: 15: 815509 1001952550 1001952550
        r3c: 0: 95994 1002071117 1002071117
        r3c: 1: 105570 1002142438 1002142438
        r3c: 2: 115921 1002189147 1002189147
        r3c: 3: 72747 1002238133 1002238133
        r3c: 4: 103519 1002276753 1002276753
        r3c: 5: 121382 1002315131 1002315131
        r3c: 6: 80298 1002248050 1002248050
        r3c: 7: 466790 1002278221 1002278221
        r3c: 6821369 16026754282 16026754282
        r3c: 1162221 8017758990 8017758990
      
         Performance counter stats for 'system wide':
      
                 6,821,369      cpu_core/r3c/
                 1,162,221      cpu_atom/r3c/
      
               1.002289965 seconds time elapsed
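
      The per-CPU lines in the verbose output above add up to the two
      totals that perf stat reports. The sketch below checks that
      arithmetic. It assumes, based only on the output shown here, that
      the index restarts at 0 when the second event begins; aggregate() is
      a hypothetical helper, not part of perf.

```python
# Per-CPU lines copied from the 'perf stat -e r3c -a -vv' output above.
VERBOSE_LINES = """\
r3c: 0: 434449 1001412521 1001412521
r3c: 1: 173162 1001482031 1001482031
r3c: 2: 231710 1001524974 1001524974
r3c: 3: 110012 1001563523 1001563523
r3c: 4: 191517 1001593221 1001593221
r3c: 5: 956458 1001628147 1001628147
r3c: 6: 416969 1001715626 1001715626
r3c: 7: 1047527 1001596650 1001596650
r3c: 8: 103877 1001633520 1001633520
r3c: 9: 70571 1001637898 1001637898
r3c: 10: 550284 1001714398 1001714398
r3c: 11: 1257274 1001738349 1001738349
r3c: 12: 107797 1001801432 1001801432
r3c: 13: 67471 1001836281 1001836281
r3c: 14: 286782 1001923161 1001923161
r3c: 15: 815509 1001952550 1001952550
r3c: 0: 95994 1002071117 1002071117
r3c: 1: 105570 1002142438 1002142438
r3c: 2: 115921 1002189147 1002189147
r3c: 3: 72747 1002238133 1002238133
r3c: 4: 103519 1002276753 1002276753
r3c: 5: 121382 1002315131 1002315131
r3c: 6: 80298 1002248050 1002248050
r3c: 7: 466790 1002278221 1002278221
r3c: 6821369 16026754282 16026754282
r3c: 1162221 8017758990 8017758990
"""


def aggregate(text):
    """Sum the per-CPU counts; an index of 0 starts a new event group."""
    totals = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip the summary lines, which have only 4 fields
        index, count = int(fields[1].rstrip(":")), int(fields[2])
        if index == 0 or not totals:
            totals.append(0)
        totals[-1] += count
    return totals


core_total, atom_total = aggregate(VERBOSE_LINES)
print(f"{core_total:,} cpu_core/r3c/")
print(f"{atom_total:,} cpu_atom/r3c/")
```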
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      94da591b
    • J
      perf parse-events: Create two hybrid cache events · 30def61f
      Jin Yao committed
      Cache events have pre-defined configs. The kernel needs to know
      which pmu a cache event comes from (e.g. the cpu_core pmu or the
      cpu_atom pmu), but the perf type PERF_TYPE_HW_CACHE can't carry pmu
      information.
      
      So the type PERF_TYPE_HW_CACHE is now extended to be a PMU-aware type.
      The PMU type ID is stored in attr.config[63:32].
      
      When a hybrid cache event is enabled without a specified pmu, as in
      'perf stat -e LLC-loads -a', two events are created automatically:
      one for atom and one for core.
      
        # perf stat -e LLC-loads -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x400000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          type                             3
          size                             120
          config                           0x800000002
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        LLC-loads: 0: 1507 1001800280 1001800280
        LLC-loads: 1: 666 1001812250 1001812250
        LLC-loads: 2: 3353 1001813453 1001813453
        LLC-loads: 3: 514 1001848795 1001848795
        LLC-loads: 4: 627 1001952832 1001952832
        LLC-loads: 5: 4399 1001451154 1001451154
        LLC-loads: 6: 1240 1001481052 1001481052
        LLC-loads: 7: 478 1001520348 1001520348
        LLC-loads: 8: 691 1001551236 1001551236
        LLC-loads: 9: 310 1001578945 1001578945
        LLC-loads: 10: 1018 1001594354 1001594354
        LLC-loads: 11: 3656 1001622355 1001622355
        LLC-loads: 12: 882 1001661416 1001661416
        LLC-loads: 13: 506 1001693963 1001693963
        LLC-loads: 14: 3547 1001721013 1001721013
        LLC-loads: 15: 1399 1001734818 1001734818
        LLC-loads: 0: 1314 1001793826 1001793826
        LLC-loads: 1: 2857 1001752764 1001752764
        LLC-loads: 2: 646 1001830694 1001830694
        LLC-loads: 3: 1612 1001864861 1001864861
        LLC-loads: 4: 2244 1001912381 1001912381
        LLC-loads: 5: 1255 1001943889 1001943889
        LLC-loads: 6: 4624 1002021109 1002021109
        LLC-loads: 7: 2703 1001959302 1001959302
        LLC-loads: 24793 16026838264 16026838264
        LLC-loads: 17255 8015078826 8015078826
      
         Performance counter stats for 'system wide':
      
                    24,793      cpu_core/LLC-loads/
                    17,255      cpu_atom/LLC-loads/
      
               1.001970988 seconds time elapsed
      
      0x4 in 0x400000002 indicates the cpu_core pmu.
      0x8 in 0x800000002 indicates the cpu_atom pmu.
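
      To make the two config values above concrete, the sketch below
      splits a PERF_TYPE_HW_CACHE config into the PMU type ID (bits 63:32)
      and the cache id, operation and result packed into the low 32 bits
      (id | op << 8 | result << 16, as laid out in perf_event.h).
      decode_hw_cache_config() is a hypothetical helper written for this
      illustration.

```python
# Name tables in enum order from include/uapi/linux/perf_event.h.
CACHE_IDS = ["L1D", "L1I", "LL", "DTLB", "ITLB", "BPU", "NODE"]
CACHE_OPS = ["READ", "WRITE", "PREFETCH"]
CACHE_RESULTS = ["ACCESS", "MISS"]


def decode_hw_cache_config(config):
    """Return (pmu_type, cache, op, result) for an extended HW_CACHE config."""
    pmu_type = config >> 32        # attr.config[63:32]
    low = config & 0xFFFFFFFF      # original cache event encoding
    cache = CACHE_IDS[low & 0xFF]
    op = CACHE_OPS[(low >> 8) & 0xFF]
    result = CACHE_RESULTS[(low >> 16) & 0xFF]
    return pmu_type, cache, op, result


core_event = decode_hw_cache_config(0x400000002)
atom_event = decode_hw_cache_config(0x800000002)
print(core_event)
print(atom_event)
```

      Both values decode to a last-level cache read access (LLC-loads);
      only the PMU type in the upper half differs.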
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      30def61f
    • J
      perf parse-events: Create two hybrid hardware events · 9cbfa2f6
      Jin Yao committed
      Hardware events currently use the special perf type
      PERF_TYPE_HARDWARE, which doesn't pass the PMU type through the user
      interface. On a hybrid system, the kernel therefore doesn't know
      which PMU an event belongs to.
      
      So this type is now extended to be a PMU-aware type. The PMU type ID
      is stored in attr.config[63:32].
      
      The PMU type ID is retrieved from sysfs.
      
        root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
        8
      
        root@lkp-adl-d01:/sys/devices/cpu_core# cat type
        4
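
      Reading the type ID programmatically is a one-file read. The sketch
      below is an illustration rather than perf's own code: read_pmu_type()
      and its sysfs_root parameter are hypothetical, and the demo runs
      against a temporary directory that mimics the layout shown above so
      that it also works on machines without hybrid pmus.

```python
import tempfile
from pathlib import Path


def read_pmu_type(pmu_name, sysfs_root="/sys/devices"):
    """Return the integer in <sysfs_root>/<pmu_name>/type, or None if unreadable."""
    try:
        return int((Path(sysfs_root) / pmu_name / "type").read_text().strip())
    except (OSError, ValueError):
        return None


# Build a throwaway tree with the two 'type' files from the transcript above.
with tempfile.TemporaryDirectory() as root:
    for name, type_id in (("cpu_core", 4), ("cpu_atom", 8)):
        pmu_dir = Path(root) / name
        pmu_dir.mkdir()
        (pmu_dir / "type").write_text(f"{type_id}\n")
    demo_types = {
        name: read_pmu_type(name, root)
        for name in ("cpu_core", "cpu_atom", "cpu_none")
    }

print(demo_types)
```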
      
      When a hybrid hardware event is enabled without a specified pmu, as
      in 'perf stat -e cycles -a', two events are created automatically:
      one for atom and one for core.
      
        # perf stat -e cycles -a -vv -- sleep 1
        Control descriptor is not initialized
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x400000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
        ------------------------------------------------------------
        ...
        ------------------------------------------------------------
        perf_event_attr:
          size                             120
          config                           0x800000000
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
        cycles: 0: 836272 1001525722 1001525722
        cycles: 1: 628564 1001580453 1001580453
        cycles: 2: 872693 1001605997 1001605997
        cycles: 3: 70417 1001641369 1001641369
        cycles: 4: 88593 1001726722 1001726722
        cycles: 5: 470495 1001752993 1001752993
        cycles: 6: 484733 1001840440 1001840440
        cycles: 7: 1272477 1001593105 1001593105
        cycles: 8: 209185 1001608616 1001608616
        cycles: 9: 204391 1001633962 1001633962
        cycles: 10: 264121 1001661745 1001661745
        cycles: 11: 826104 1001689904 1001689904
        cycles: 12: 89935 1001728861 1001728861
        cycles: 13: 70639 1001756757 1001756757
        cycles: 14: 185266 1001784810 1001784810
        cycles: 15: 171094 1001825466 1001825466
        cycles: 0: 129624 1001854843 1001854843
        cycles: 1: 122533 1001840421 1001840421
        cycles: 2: 90055 1001882506 1001882506
        cycles: 3: 139607 1001896463 1001896463
        cycles: 4: 141791 1001907838 1001907838
        cycles: 5: 530927 1001883880 1001883880
        cycles: 6: 143246 1001852529 1001852529
        cycles: 7: 667769 1001872626 1001872626
        cycles: 6744979 16026956922 16026956922
        cycles: 1965552 8014991106 8014991106
      
         Performance counter stats for 'system wide':
      
                 6,744,979      cpu_core/cycles/
                 1,965,552      cpu_atom/cycles/
      
               1.001882711 seconds time elapsed
      
      0x4 in 0x400000000 indicates the cpu_core pmu.
      0x8 in 0x800000000 indicates the cpu_atom pmu.
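
      The encoding can be reproduced with a shift and an OR. This is a
      small sketch under the layout stated above (PMU type ID in
      attr.config[63:32], hardware event id in the low 32 bits);
      hybrid_hw_config() is a hypothetical helper, while the two constants
      mirror values from perf_event.h.

```python
PERF_PMU_TYPE_SHIFT = 32       # PMU type ID lives in attr.config[63:32]
PERF_COUNT_HW_CPU_CYCLES = 0   # hardware event id of 'cycles'


def hybrid_hw_config(pmu_type, hw_event_id):
    """Combine a PMU type ID and a hardware event id into one config value."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | hw_event_id


core_config = hybrid_hw_config(4, PERF_COUNT_HW_CPU_CYCLES)  # cpu_core type is 4
atom_config = hybrid_hw_config(8, PERF_COUNT_HW_CPU_CYCLES)  # cpu_atom type is 8
print(hex(core_config), hex(atom_config))
```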
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9cbfa2f6
    • J
      perf stat: Uniquify hybrid event name · 12279429
      Jin Yao committed
      It is useful to let the user know which pmu an event belongs to.
      perf-stat already supports the '--no-merge' option, which prints the
      pmu name after the event name, such as:
      
      "cycles [cpu_core]"
      
      This option is now enabled by default on hybrid platforms, but the
      format is changed to:
      
      "cpu_core/cycles/"
      
      If the user configures a name, the user-specified name is still used.
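
      The naming rules above can be summarised in a few lines. This is
      only a sketch of the described behaviour, not the code from perf
      stat; uniquify_event_name() and its parameters are hypothetical.

```python
def uniquify_event_name(event, pmu, user_name=None, hybrid=True):
    """Return the display name for an event according to the rules above."""
    if user_name:
        return user_name          # a user-configured name always wins
    if hybrid:
        return f"{pmu}/{event}/"  # hybrid default format
    return f"{event} [{pmu}]"     # '--no-merge' format on other platforms


print(uniquify_event_name("cycles", "cpu_core"))
print(uniquify_event_name("cycles", "cpu_core", hybrid=False))
print(uniquify_event_name("cycles", "cpu_core", user_name="my_cycles"))
```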
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      12279429
    • J
      perf pmu: Add hybrid helper functions · c5a26ea4
      Jin Yao committed
      The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
      can be used to identify a hybrid platform and to return the matching
      hybrid cpu pmu. All the detected hybrid pmus have been saved in the
      'perf_pmu__hybrid_pmus' list, so we just need to search this list.
      
      perf_pmu__hybrid_type_to_pmu converts the user-specified string
      to a hybrid pmu name. This is used to support the '--cputype' option
      in later patches.
      
      perf_pmu__has_hybrid checks whether a hybrid pmu exists. Note that
      we have to define it in pmu.c (so that there is no symbol dependency
      on pmu-hybrid.c); otherwise 'perf test python' would fail.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c5a26ea4
    • J
      perf pmu: Save detected hybrid pmus to a global pmu list · 44462430
      Jin Yao committed
      We identify the cpu_core pmu and the cpu_atom pmu by explicitly
      checking the following files:
      
      For cpu_core:
      "/sys/bus/event_source/devices/cpu_core/cpus"
      
      For cpu_atom:
      "/sys/bus/event_source/devices/cpu_atom/cpus"
      
      If the 'cpus' file exists and contains data, the pmu exists.
      
      However, to avoid hardcoding "cpu_core" and "cpu_atom", the check is
      made generic:
      
      If the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the
      hybrid pmu exists. All the detected hybrid pmus are linked to a global
      list, 'perf_pmu__hybrid_pmus', so afterwards we only need to iterate
      over that list with perf_pmu__for_each_hybrid_pmu to get every hybrid
      pmu.
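
      The generic detection rule can be sketched as a directory scan. This
      is an illustration of the rule as described above, not the C code in
      perf: find_hybrid_pmus() and its devices_dir parameter are
      hypothetical, and the demo uses a temporary directory in place of
      the real sysfs tree.

```python
import tempfile
from pathlib import Path


def find_hybrid_pmus(devices_dir="/sys/bus/event_source/devices"):
    """Return names of cpu_* entries whose 'cpus' file exists and has data."""
    pmus = []
    base = Path(devices_dir)
    if not base.is_dir():
        return pmus
    for entry in sorted(base.glob("cpu_*")):
        try:
            if (entry / "cpus").read_text().strip():
                pmus.append(entry.name)
        except OSError:
            continue  # no 'cpus' file, or not a directory: not a hybrid pmu
    return pmus


# Fake tree: two hybrid pmus, one entry with an empty 'cpus' file, and a
# plain 'cpu' device that must not match the cpu_* pattern.
with tempfile.TemporaryDirectory() as root:
    layout = {"cpu_core": "0-15\n", "cpu_atom": "16-23\n", "cpu_empty": "", "cpu": "0-23\n"}
    for name, cpus in layout.items():
        pmu_dir = Path(root) / name
        pmu_dir.mkdir()
        (pmu_dir / "cpus").write_text(cpus)
    demo_pmus = find_hybrid_pmus(root)

print(demo_pmus)
```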
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      44462430
    • J
      perf pmu: Save pmu name · 32705de7
      Jin Yao committed
      On a hybrid platform, an event may be available on only one pmu
      (for example, on cpu_core or on cpu_atom).
      
      This patch saves the pmu name to the pmu field of struct perf_pmu_alias,
      so that we later know which pmu the event can be enabled on.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      32705de7
    • J
      perf pmu: Simplify arguments of __perf_pmu__new_alias · eab35953
      Jin Yao committed
      Simplify the arguments of __perf_pmu__new_alias() by passing the whole
      'struct pme_event' pointer.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eab35953