- 26 Mar, 2022 1 commit
-
-
Committed by Wei Li
We support the short commands 'rec*' for 'record' and 'rep*' for 'report' in many sub-commands, but the matching is currently not strict. This can be puzzling: for example, mistyping 'report' as 'recport' silently runs 'record' instead, without any message.

To fix this, add a check to ensure that the short command is a valid prefix of the real command.

Committer testing:

    [root@quaco ~]# perf c2c re sleep 1

     Usage: perf c2c {record|report}

        -v, --verbose         be more verbose (show counter open errors, etc)

    # perf c2c rec sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.038 MB perf.data (16 samples) ]
    # perf c2c recport sleep 1

     Usage: perf c2c {record|report}

        -v, --verbose         be more verbose (show counter open errors, etc)

    # perf c2c record sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.038 MB perf.data (15 samples) ]
    # perf c2c records sleep 1

     Usage: perf c2c {record|report}

        -v, --verbose         be more verbose (show counter open errors, etc)

    #

Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rui Xiang <rui.xiang@huawei.com>
Link: http://lore.kernel.org/lkml/20220325092032.2956161-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
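The strict matching described in this commit can be sketched as follows. This is a minimal illustration, not the patch itself: the helper name `is_short_cmd` and the `min_len` parameter are hypothetical, and the minimum length of 3 used in the note below is an assumption inferred from the committer testing, where 're' is rejected and 'rec' is accepted.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not perf's actual code): accept `arg` as a short
 * form of `cmd` only if it has at least `min_len` characters and is a
 * prefix of `cmd`.
 */
static bool is_short_cmd(const char *arg, const char *cmd, size_t min_len)
{
    size_t len = strlen(arg);

    /* Too short to be unambiguous, or longer than the full command. */
    if (len < min_len || len > strlen(cmd))
        return false;

    /* Every character typed must match the start of the real command. */
    return strncmp(cmd, arg, len) == 0;
}
```

With `min_len` set to 3, 'rec' and 'record' match "record", while 're', 'recport' and 'records' do not, which is the behaviour shown in the committer testing.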
-
- 23 Mar, 2022 1 commit
-
-
Committed by Kan Liang
When analyzing with 'perf script', it's useful to understand the captured instruction and the next sequential instruction. To calculate the address of the next sequential instruction, the length of the captured instruction is required. For example, you can’t know the next sequential instruction after an unconditional branch unless you calculate that based on its length.

For branch stacks, 'perf script' only prints the instruction bytes with 'brstackinsn', but lacks the instruction length.

Add 'brstackinsnlen' to print the instruction length.

    $ perf script -F ip,brstackinsn,brstackinsnlen --xed
    7fa555be8f75
        _start:
        00007fa555be8090        mov %rsp, %rdi
        ilen: 3
        00007fa555be8093        callq  0x7fa555be8ea0
        ilen: 5 # PRED 102 cycles [102] 0.02 IPC
        _dl_start+38:
        00007fa555be8ec6        movq  %rdx,0x227853(%rip)
        ilen: 7
        00007fa555be8ecd        leaq  0x227f94(%rip),%rdx
        ilen: 7

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/1647871212-184070-1-git-send-email-kan.liang@linux.intel.com
[ Added the new field to tools/perf/Documentation/perf-script.txt ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
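The arithmetic this field enables is simply address plus length. A minimal sketch, assuming a hypothetical helper name `next_sequential_ip` that is not part of perf:

```c
#include <stdint.h>

/*
 * Hypothetical helper: address of the instruction that follows `ip`
 * in memory, given the length in bytes reported as "ilen".
 */
static uint64_t next_sequential_ip(uint64_t ip, unsigned int ilen)
{
    return ip + ilen;
}
```

Using the sample output in this commit, the 'mov' at 00007fa555be8090 with ilen 3 is followed by the instruction at 00007fa555be8093, and the 'movq' at 00007fa555be8ec6 with ilen 7 is followed by the 'leaq' at 00007fa555be8ecd.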
-
- 08 Mar, 2022 2 commits
-
-
Committed by James Clark
The type info is saved when using '-j save_type'. Output this in 'perf script' so it can be accessed by other tools or for debugging.

It's appended to the end of the list of fields so any existing tools that split on / and access fields via an index are not affected. Also output '-' instead of 'N/A' when the branch type isn't saved because / is used as a field separator.

Entries before this change look like this:

    0xaaaadb350838/0xaaaadb3507a4/P/-/-/0

And afterwards like this:

    0xaaaadb350838/0xaaaadb3507a4/P/-/-/0/CALL

or this if no type info is saved:

    0x7fb57586df6b/0x7fb5758731f0/P/-/-/143/-

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220307171917.2555829-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
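A consumer that splits entries on '/' and reads fields by index, as the message describes, might look like the sketch below. The function names and `BR_MAX_FIELDS` are hypothetical; the only facts taken from the commit are the '/' separator, the branch type being the last (seventh) field, and '-' meaning that no type was saved.

```c
#include <string.h>

#define BR_MAX_FIELDS 8

/*
 * Split one brstack entry on '/' in place, storing a pointer to each
 * field in `fields`. Returns the number of fields found.
 */
static int split_brstack_entry(char *entry, char *fields[], int max_fields)
{
    int n = 0;
    char *p = entry;

    while (n < max_fields) {
        fields[n++] = p;
        p = strchr(p, '/');
        if (!p)
            break;
        *p++ = '\0';    /* terminate this field, move to the next */
    }
    return n;
}

/*
 * The branch type is the 7th field (index 6) when present. Older
 * output has only six fields, which is treated like "-" here.
 */
static const char *brstack_type(char *fields[], int nfields)
{
    return nfields > 6 ? fields[6] : "-";
}
```

Because the type is appended at the end, code that only reads indices 0 to 5 sees the same values before and after the change.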
-
Committed by James Clark
Remove duplicate code so that future changes to flags are always made to all 3 printing variations.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220307171917.2555829-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 Feb, 2022 1 commit
-
-
Committed by German Gomez
In SPE traces the 'weight' field can't be printed in 'perf script' because the 'dummy:u' event doesn't have the WEIGHT attribute set.

Use evsel__do_check_stype(..) to check this field, as it's done with other fields such as "phys_addr".

Before:

    $ perf record -e arm_spe_0// -- sleep 1
    $ perf script -F event,ip,weight
    Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field.

After:

    $ perf script -F event,ip,weight
    l1d-access:               12 ffffaf629d4cb320
    tlb-access:               12 ffffaf629d4cb320
    memory:                   12 ffffaf629d4cb320

Fixes: b0fde9c6 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT")
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220221171707.62960-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 Feb, 2022 3 commits
-
-
Committed by Adrian Hunter
Amend the display to include D and t flags in the same way as the x flag.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Similar to other Intel PT synth events, display changes to the interrupt flag represented by the MODE.Exec packet.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Similar to other Intel PT synth events, display Event Trace events recorded by CFE / EVD packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 Jan, 2022 1 commit
-
-
Committed by Yao Jin
perf script failed to print the phys_addr for SPE profiling. SPE profiling adds a 'dummy' event that doesn't have the PHYS_ADDR attribute set, so perf script exits with an error.

Now, as is done for 'addr', use evsel__do_check_stype() to check the type.

Before:

    # perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
        store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
    # perf script -F pid,tid,addr,phys_addr
    Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.

After:

    # perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
        store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
    # perf script -F pid,tid,addr,phys_addr
    4064384/4064384 ffff802f921be0d0      2f921be0d0
    4064384/4064384 ffff802f921be0d0      2f921be0d0

Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Yao Jin <jinyao5@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220121065954.2121900-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 Jan, 2022 3 commits
-
-
Committed by Ian Rogers
A common problem is confusing CPU map indices with the CPU; wrapping the CPU in a struct avoids this. This approach is similar to atomic_t.

Committer notes:

To make it build with BUILD_BPF_SKEL=1 these files needed the conversions to 'struct perf_cpu' usage:

    tools/perf/util/bpf_counter.c
    tools/perf/util/bpf_counter_cgroup.c
    tools/perf/util/bpf_ftrace.c

Also perf_env__get_cpu() was removed back in "perf cpumap: Switch cpu_map__build_map to cpu function".

Additionally these needed to be fixed for the ARM builds to complete:

    tools/perf/arch/arm/util/cs-etm.c
    tools/perf/arch/arm64/util/pmu.c

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
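The idea of wrapping the CPU number in a struct can be sketched as below. Only the name 'struct perf_cpu' comes from the commit; its single `cpu` field, the `example_cpu_map` type, and both functions are assumptions made for illustration and are not the libperf API.

```c
/* A CPU number wrapped in a struct, in the spirit of atomic_t. */
struct perf_cpu {
    int cpu;
};

/* Hypothetical, simplified map for illustration: index -> CPU number. */
struct example_cpu_map {
    int nr;
    struct perf_cpu map[8];
};

/* Takes an index into the map and returns the CPU stored there. */
static struct perf_cpu example_cpu_map__cpu(const struct example_cpu_map *cpus, int idx)
{
    struct perf_cpu invalid = { .cpu = -1 };

    if (idx < 0 || idx >= cpus->nr)
        return invalid;
    return cpus->map[idx];
}

/* Takes a CPU and returns its index in the map, or -1 if it is absent. */
static int example_cpu_map__idx(const struct example_cpu_map *cpus, struct perf_cpu cpu)
{
    for (int i = 0; i < cpus->nr; i++) {
        if (cpus->map[i].cpu == cpu.cpu)
            return i;
    }
    return -1;
}
```

With distinct types, passing a plain index where a CPU is expected (or the reverse) becomes a compile error rather than a silent bug, since an `int` does not convert implicitly to `struct perf_cpu`.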
-
Committed by Ian Rogers
perf_counts are accessed by the densely packed index.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-47-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Ian Rogers
Use perf_cpu_map__for_each_cpu() to help with readability.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-33-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 Dec, 2021 1 commit
-
-
Committed by Adrian Hunter
CPU filtering was not being applied to a script's switch events.

Fixes: 5bf83c29 ("perf script: Add scripting operation process_switch()")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211215080636.149562-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 Dec, 2021 2 commits
-
-
Committed by Alexandre Truong
Enable dwarf_callchain_users on arm64 which will be needed to do a DWARF unwind in order to get the caller of the leaf frame.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-5-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Alexandre Truong
Refactor script__setup_sample_type() to use callchain_param_setup() in place of the duplicated code that sets up the callchain parameters.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-4-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 Dec, 2021 1 commit
-
-
Committed by German Gomez
When reading a perf.data file with register values, there is a mismatch between the names and the values of the registers because the tool is built using only the register names from the local architecture.

Reading a perf.data file that was recorded on ARM64 gives the following erroneous output on an X86 machine:

    # perf report -i perf_arm64.data -D
    [...]
    24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
    ... user regs: mask 0x1ffffffff ABI 64-bit
    .... AX    0x0000ffffd1515817
    .... BX    0x0000ffffd1515480
    .... CX    0x0000aaaadabf6c80
    .... DX    0x000000000000002e
    .... SI    0x0000000040100401
    .... DI    0x0040600200000080
    .... BP    0x0000ffffd1510e10
    .... SP    0x0000000000000000
    .... IP    0x00000000000000dd
    .... FLAGS 0x0000ffffd1510cd0
    .... CS    0x0000000000000000
    .... SS    0x0000000000000030
    .... DS    0x0000ffffa569a208
    .... ES    0x0000000000000000
    .... FS    0x0000000000000000
    .... GS    0x0000000000000000
    .... R8    0x0000aaaad3de9650
    .... R9    0x0000ffffa57397f0
    .... R10   0x0000000000000001
    .... R11   0x0000ffffa57fd000
    .... R12   0x0000ffffd1515817
    .... R13   0x0000ffffd1515480
    .... R14   0x0000aaaadabf6c80
    .... R15   0x0000000000000000
    .... unknown 0x0000000000000001
    .... unknown 0x0000000000000000
    .... unknown 0x0000000000000000
    .... unknown 0x0000000000000000
    .... unknown 0x0000000000000000
    .... unknown 0x0000ffffd1510d90
    .... unknown 0x0000ffffa5739b90
    .... unknown 0x0000ffffd1510d80
    .... XMM0  0x0000ffffa57392c8
    ... thread: perf-exec:43239
    ...... dso: [kernel.kallsyms]

As can be seen, the register names correspond to X86 registers, even though the perf.data file was recorded on an ARM64 system. After this patch, the output of the command displays the correct register names:

    # perf report -i perf_arm64.data -D
    [...]
    24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
    ... user regs: mask 0x1ffffffff ABI 64-bit
    .... x0    0x0000ffffd1515817
    .... x1    0x0000ffffd1515480
    .... x2    0x0000aaaadabf6c80
    .... x3    0x000000000000002e
    .... x4    0x0000000040100401
    .... x5    0x0040600200000080
    .... x6    0x0000ffffd1510e10
    .... x7    0x0000000000000000
    .... x8    0x00000000000000dd
    .... x9    0x0000ffffd1510cd0
    .... x10   0x0000000000000000
    .... x11   0x0000000000000030
    .... x12   0x0000ffffa569a208
    .... x13   0x0000000000000000
    .... x14   0x0000000000000000
    .... x15   0x0000000000000000
    .... x16   0x0000aaaad3de9650
    .... x17   0x0000ffffa57397f0
    .... x18   0x0000000000000001
    .... x19   0x0000ffffa57fd000
    .... x20   0x0000ffffd1515817
    .... x21   0x0000ffffd1515480
    .... x22   0x0000aaaadabf6c80
    .... x23   0x0000000000000000
    .... x24   0x0000000000000001
    .... x25   0x0000000000000000
    .... x26   0x0000000000000000
    .... x27   0x0000000000000000
    .... x28   0x0000000000000000
    .... x29   0x0000ffffd1510d90
    .... lr    0x0000ffffa5739b90
    .... sp    0x0000ffffd1510d80
    .... pc    0x0000ffffa57392c8
    ... thread: perf-exec:43239
    ...... dso: [kernel.kallsyms]

Tester comments:

Athira reports:

"Looks good to me. Tested this patchset in powerpc by capturing regs in powerpc and doing perf report to read the data from x86."

Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211207180653.1147374-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
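Choosing register names from the architecture recorded in the file, rather than from the build host, could be sketched like this. The function `example_reg_name`, the arch strings "x86" and "arm64", and the truncated tables are all assumptions for illustration; the names and their order are taken from the first nine entries of the two dumps in this commit.

```c
#include <string.h>

/* Truncated illustrative tables; real register sets are larger. */
static const char *const example_x86_regs[] = {
    "AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP", "IP",
};
static const char *const example_arm64_regs[] = {
    "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8",
};

#define EXAMPLE_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical lookup: the table is selected by the architecture the
 * data was recorded on, not by the machine running the tool.
 */
static const char *example_reg_name(const char *arch, unsigned int id)
{
    if (!arch)
        return "unknown";
    if (!strcmp(arch, "x86") && id < EXAMPLE_ARRAY_SIZE(example_x86_regs))
        return example_x86_regs[id];
    if (!strcmp(arch, "arm64") && id < EXAMPLE_ARRAY_SIZE(example_arm64_regs))
        return example_arm64_regs[id];
    return "unknown";
}
```

Register 0 is then printed as "x0" for a file recorded on ARM64 and as "AX" for one recorded on X86, regardless of where the report is generated.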
-
- 07 Nov, 2021 1 commit
-
-
Committed by James Clark
Only perf report checked the validity of these arguments so apply the same check to all tools that read them for consistency.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20211018134844.2627174-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 31 Oct, 2021 2 commits
-
-
Committed by Kan Liang
-F weight in perf script is broken.

    # ./perf mem record
    # ./perf script -F weight
    Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot print 'weight' field.

The sample type PERF_SAMPLE_WEIGHT_STRUCT is an alternative to the PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The lower 32 bits are exactly the same for both sample types. The higher 32 bits may differ between architectures. For a new kernel on x86, PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other ARCHs, PERF_SAMPLE_WEIGHT is used.

With -F weight, current perf script only checks the input string "weight" against the PERF_SAMPLE_WEIGHT sample type, because commit ea8d0ed6 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't update perf script for the PERF_SAMPLE_WEIGHT_STRUCT sample type. For a new kernel on x86, the check fails.

Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to replace PERF_SAMPLE_WEIGHT.

Fixes: ea8d0ed6 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
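The check against a combined mask can be sketched as follows. The `EX_`-prefixed names are local stand-ins, and their bit positions are an assumption made for this sketch; the authoritative definitions are in the kernel's perf_event.h. The helper names are hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins; bit positions are assumptions, see perf_event.h. */
#define EX_SAMPLE_WEIGHT        (1ULL << 14)
#define EX_SAMPLE_WEIGHT_STRUCT (1ULL << 24)
#define EX_SAMPLE_WEIGHT_TYPE   (EX_SAMPLE_WEIGHT | EX_SAMPLE_WEIGHT_STRUCT)

/* True if the event carries a weight under either sample type. */
static bool example_has_weight(uint64_t sample_type)
{
    return (sample_type & EX_SAMPLE_WEIGHT_TYPE) != 0;
}

/* The lower 32 bits are described as identical for both sample types. */
static uint32_t example_weight_low32(uint64_t raw_weight)
{
    return (uint32_t)(raw_weight & 0xffffffffULL);
}
```

Testing against only `EX_SAMPLE_WEIGHT` would reject an event that sets `EX_SAMPLE_WEIGHT_STRUCT`, which is the failure mode the message describes for a new kernel on x86.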
-
Committed by Song Liu
When perf.data is not written cleanly, we would like to process existing data as much as possible (please see the f_header.data.size == 0 condition in perf_session__read_header). However, perf.data with partial data may crash perf. Specifically, we see a crash in 'perf script' for a NULL session->header.env.arch.

Fix this by checking session->header.env.arch before using it to determine native_arch. Also split the if condition so it is easier to read.

Committer notes:

If it is a pipe, we already assume it is a native arch, so there is no need to check session->header.env.arch.

Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
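The shape of the NULL check can be sketched as below. This is not the patch: `example_is_native_arch` and its parameters are hypothetical, and treating a missing arch string as "not native" is an assumption about the intended fallback.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: decide whether recorded data is native to the
 * host without dereferencing a NULL arch string.
 */
static bool example_is_native_arch(bool is_pipe, const char *host_machine,
                                   const char *recorded_arch)
{
    /* Pipe input is assumed to be native, so the header is not consulted. */
    if (is_pipe)
        return true;

    /* A partially written file may have no arch recorded at all. */
    if (!recorded_arch)
        return false;

    return strcmp(host_machine, recorded_arch) == 0;
}
```

Splitting the conditions this way keeps the pipe case, the missing-arch case and the comparison on separate lines, in line with the message's note about readability.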
-
- 28 Oct, 2021 1 commit
-
-
Committed by Kan Liang
The instruction latency information can be recorded on some platforms, e.g., the Intel Sapphire Rapids server. With both memory latency (weight) and the new instruction latency information, users can easily locate the expensive load instructions, and also understand the time spent in different stages. The users can optimize their applications in different pipeline stages.

Add a new field "ins_lat" to filter the instruction latency information, which is available with sample type PERF_SAMPLE_WEIGHT_STRUCT.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: https://lore.kernel.org/r/1632929894-102778-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 Sep, 2021 1 commit
-
-
Committed by Adrian Hunter
set_print_ip_opts() was not being called when type != attr->type because there is not a one-to-one relationship between output types and attr->type. That resulted in ip not printing.

The attr_type() function is removed, and the match of attr->type to output type is corrected.

Example on ADL using taskset to select an atom cpu:

    # perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]

Before:

    # perf script | head
    taskset   428 [-01] 10394.179041:          1 cpu_atom/cpu-cycles/:
    taskset   428 [-01] 10394.179043:          1 cpu_atom/cpu-cycles/:
    taskset   428 [-01] 10394.179044:         11 cpu_atom/cpu-cycles/:
    taskset   428 [-01] 10394.179045:        407 cpu_atom/cpu-cycles/:
    taskset   428 [-01] 10394.179046:      16789 cpu_atom/cpu-cycles/:
    taskset   428 [-01] 10394.179052:     676300 cpu_atom/cpu-cycles/:
    uname     428 [-01] 10394.179278:    4079859 cpu_atom/cpu-cycles/:

After:

    # perf script | head
    taskset   428 10394.179041:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
    taskset   428 10394.179043:          1 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
    taskset   428 10394.179044:         11 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
    taskset   428 10394.179045:        407 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
    taskset   428 10394.179046:      16789 cpu_atom/cpu-cycles/:  ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
    taskset   428 10394.179052:     676300 cpu_atom/cpu-cycles/:      7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
    uname     428 10394.179278:    4079859 cpu_atom/cpu-cycles/:  ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210911133053.15682-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 Sep, 2021 1 commit
-
-
Committed by Stephen Brennan
perf_events may sometimes throttle an event due to creating too many samples during a given timer tick. As of now, the perf tool will not report on throttling, which means this is a silent error. Implement a callback for the throttle and unthrottle events within the Python scripting engine, which can allow scripts to detect and report when events may have been lost due to throttling.

The simplest script to report throttle events is:

    def throttle(*args):
        print("throttle" + repr(args))

    def unthrottle(*args):
        print("unthrottle" + repr(args))

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210901210815.133251-1-stephen.s.brennan@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 Aug, 2021 1 commit
-
-
Committed by Adrian Hunter
machine_resolve() may have already been called. Test for that to avoid calling it again unnecessarily.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210811101036.17986-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 Aug, 2021 1 commit
-
-
Committed by Namhyung Kim
The repipe argument is only used by perf inject; all the other callers pass 'false'. Let's remove it from the function signature and add __perf_session__new() to be called from perf inject directly.

This is preparation for changing the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 Jul, 2021 2 commits
-
-
Committed by Riccardo Mancini
ASan reports several memory leaks while running:

    # perf test "82: Use vfs_getname probe to get syscall args filenames"

Two of these are caused by some refcounts not being decreased on perf-script exit, namely script.threads and script.cpus.

This patch adds the missing __put calls in a new perf_script__exit function, which is called at the end of cmd_script.

This patch concludes the fixes of all remaining memory leaks in perf test "82: Use vfs_getname probe to get syscall args filenames".

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: cfc8874a ("perf script: Process cpu/threads maps")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/5ee73b19791c6fa9d24c4d57f4ac1a23609400d7.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Riccardo Mancini
ASan reports several memory leaks while running:

    # perf test "82: Use vfs_getname probe to get syscall args filenames"

One of the leaks is caused by zstd data not being released on exit in perf-script.

This patch adds the missing zstd_fini().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: b13b04d9 ("perf script: Initialize zstd_data")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/39388e8cc2f85ca219ea18697a17b7bd8f74b693.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 Jul, 2021 1 commit
-
-
Committed by Jiri Olsa
Move evsel::leader to perf_evsel::leader, so we can move the group interface to libperf.

Also add several evsel helpers to ease the transition:

    struct evsel *evsel__leader(struct evsel *evsel);
    - get leader evsel

    bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
    - true if evsel has leader as leader

    bool evsel__is_leader(struct evsel *evsel);
    - true if evsel is its own leader

    void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
    - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

    tools/perf/util/bpf_counter.c

    -       if (evsel->leader->core.nr_members > 1) {
    +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
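The semantics of the four helpers listed in this commit can be sketched on a stand-in type. `struct example_evsel` and the `example_` function names are hypothetical; the real helpers operate on perf's `struct evsel`, whose layout is not reproduced here.

```c
#include <stdbool.h>

/* Stand-in for an event selector; only the leader pointer is modelled. */
struct example_evsel {
    struct example_evsel *leader;
};

/* Get the leader of `evsel`. */
static struct example_evsel *example_evsel__leader(struct example_evsel *evsel)
{
    return evsel->leader;
}

/* True if `evsel` has `leader` as its leader. */
static bool example_evsel__has_leader(struct example_evsel *evsel,
                                      struct example_evsel *leader)
{
    return evsel->leader == leader;
}

/* True if `evsel` is its own leader. */
static bool example_evsel__is_leader(struct example_evsel *evsel)
{
    return evsel->leader == evsel;
}

/* Set the leader for `evsel`. */
static void example_evsel__set_leader(struct example_evsel *evsel,
                                      struct example_evsel *leader)
{
    evsel->leader = leader;
}
```

Callers that go through accessors like these do not need to know where the leader pointer is stored, which is what allows the field to move between structs.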
-
- 02 Jul, 2021 4 commits
-
-
Committed by Adrian Hunter
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can be repeated to pass more than 1 argument.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Add option --list-dlfilters to list dlfilters in the current directory or the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must come before option --list-dlfilters) to show long descriptions.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
filter_event_early() can be more than 30% faster than filter_event() because it is called before internal filtering. In other respects it is the same as filter_event(), except that it will be passed events that have yet to be filtered out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
In some cases, users want to filter very large amounts of data (e.g. from AUX area tracing like Intel PT) looking for something specific. While scripting such as Python can be used, Python is 10 to 20 times slower than C. So define a C API so that custom filters can be written and loaded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 Jun, 2021 3 commits
-
-
Committed by Adrian Hunter
Share the addr_location of 'addr' so that it need not be resolved more than once.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
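The resolve-once-and-share idea can be illustrated with a small C sketch. This is not perf's code: `struct demo_location`, `struct demo_addr_cache`, `demo_lookup_symbol()` and `demo_get_location()` are hypothetical names, and perf's real addr_location carries far more state. The sketch only shows the pattern of caching the result of an expensive lookup next to the sample so that every later consumer reuses it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a resolved address location. */
struct demo_location {
	uint64_t addr;
	const char *symbol;
};

/* Per-sample cache: the location is resolved on first use and then shared. */
struct demo_addr_cache {
	bool resolved;
	struct demo_location loc;
};

/* Counts how often the (pretend) expensive lookup actually runs. */
unsigned int demo_lookup_calls;

/* Stand-in for an expensive symbol lookup. */
static const char *demo_lookup_symbol(uint64_t addr)
{
	demo_lookup_calls++;
	return addr >= 0x1000 ? "demo_func" : "[unknown]";
}

/* Return the shared location, resolving it only if no consumer has done so yet. */
const struct demo_location *demo_get_location(struct demo_addr_cache *cache,
					      uint64_t addr)
{
	if (!cache->resolved) {
		cache->loc.addr = addr;
		cache->loc.symbol = demo_lookup_symbol(addr);
		cache->resolved = true;
	}
	return &cache->loc;
}
```

Two consumers that both call `demo_get_location()` on the same cache trigger a single lookup; the second call just returns the stored result.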
-
Committed by Adrian Hunter
To make it possible to use filtering with scripts, move filtering before scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Generally, it should be more efficient if filter_cpu() comes before machine__resolve() because filter_cpu() is much less code than machine__resolve().

Example:

  $ perf record --sample-cpu -- make -C tools/perf >/dev/null

Before:

  $ perf stat -- perf script -C 0 >/dev/null

   Performance counter stats for 'perf script -C 0':

            116.94 msec task-clock        #    0.992 CPUs utilized
                 2      context-switches  #   17.103 /sec
                 0      cpu-migrations    #    0.000 /sec
             8,187      page-faults       #   70.011 K/sec
       478,351,812      cycles            #    4.091 GHz
       564,785,464      instructions      #    1.18  insn per cycle
       114,341,105      branches          #  977.789 M/sec
         2,615,495      branch-misses     #    2.29% of all branches

       0.117840576 seconds time elapsed

       0.085040000 seconds user
       0.032396000 seconds sys

After:

  $ perf stat -- perf script -C 0 >/dev/null

   Performance counter stats for 'perf script -C 0':

            107.45 msec task-clock        #    0.992 CPUs utilized
                 3      context-switches  #   27.919 /sec
                 0      cpu-migrations    #    0.000 /sec
             7,964      page-faults       #   74.117 K/sec
       438,417,260      cycles            #    4.080 GHz
       522,571,855      instructions      #    1.19  insn per cycle
       105,187,488      branches          #  978.921 M/sec
         2,356,261      branch-misses     #    2.24% of all branches

       0.108282546 seconds time elapsed

       0.095935000 seconds user
       0.011991000 seconds sys

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210621150514.32159-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
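In the example above the cycle count drops from 478,351,812 to 438,417,260, roughly 8% fewer. The ordering principle behind it, running the cheap rejection test before the expensive work, can be sketched in C as follows. None of this is perf's code: `demo_cpu_filtered()` and `demo_resolve()` are hypothetical stand-ins for filter_cpu() and machine__resolve(), and the 64-bit CPU mask is an assumption made to keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down sample record. */
struct demo_sample {
	int cpu;
	uint64_t ip;
};

/* Counts calls to the (pretend) expensive resolution step. */
size_t demo_resolve_calls;

/* Cheap test: is the sample's CPU outside the selected set (a 64-bit mask)? */
bool demo_cpu_filtered(uint64_t cpu_mask, const struct demo_sample *s)
{
	if (s->cpu < 0 || s->cpu >= 64)
		return true;
	return !(cpu_mask & (UINT64_C(1) << s->cpu));
}

/* Stand-in for an expensive step such as resolving maps and symbols. */
bool demo_resolve(const struct demo_sample *s)
{
	demo_resolve_calls++;
	return s->ip != 0;
}

/* Process samples, rejecting by CPU before paying for resolution. */
size_t demo_process(uint64_t cpu_mask, const struct demo_sample *samples, size_t n)
{
	size_t processed = 0;

	for (size_t i = 0; i < n; i++) {
		if (demo_cpu_filtered(cpu_mask, &samples[i]))
			continue; /* cheap check first: no resolve for this sample */
		if (!demo_resolve(&samples[i]))
			continue;
		processed++;
	}
	return processed;
}
```

With the checks in this order, `demo_resolve_calls` grows only for samples on selected CPUs; swapping the two tests would give the same result but call the expensive step for every sample.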
-
- 01 Jun, 2021 2 commits
-
-
Committed by Adrian Hunter
Factor out script_fetch_insn() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
This is preparation for allowing a script to set the itrace options for the session if they have not already been set.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210530192308.7382-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 May, 2021 4 commits
-
-
Committed by Adrian Hunter
Add auxtrace_error to general python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Factor out perf_sample__sprintf_flags() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
If sample addr correlates to a symbol, add "addr_dso", "addr_symbol", and "addr_symoff" to python scripting.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210525095112.1399-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Adrian Hunter
Allow perf script to find a script in the exec path.

Example:

Before:

  $ perf record -a -e intel_pt/branch=0/ sleep 0.1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.954 MB perf.data ]
  $ perf script intel-pt-events.py 2>&1 | head -3
    Error: Couldn't find script `intel-pt-events.py'
   See perf script -l for available scripts.
  $ perf script -s intel-pt-events.py 2>&1 | head -3
  Can't open python script "intel-pt-events.py": No such file or directory
  $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
    Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py'
   See perf script -l for available scripts.
  $

After:

  $ perf script intel-pt-events.py 2>&1 | head -3
  Intel PT Power Events and PTWRITE
      perf  8123/8123  [000]  551.230753986  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
      perf  8123/8123  [001]  551.230808216  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
  $ perf script -s intel-pt-events.py 2>&1 | head -3
  Intel PT Power Events and PTWRITE
      perf  8123/8123  [000]  551.230753986  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
      perf  8123/8123  [001]  551.230808216  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
  $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
  Intel PT Power Events and PTWRITE
      perf  8123/8123  [000]  551.230753986  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
      perf  8123/8123  [001]  551.230808216  cbr: 42  freq: 4219 MHz  (156%)  0 [unknown] ([unknown])
  $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210524065718.11421-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-