- 20 3月, 2019 12 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Using gcc's ASan, Changbin reports: ================================================================= ==7494==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) #1 0x5625e5330a5e in zalloc util/util.h:23 #2 0x5625e5330a9b in perf_counts__new util/counts.c:10 #3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 #4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 #5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 #6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 #7 0x5625e51528e6 in run_test tests/builtin-test.c:358 #8 0x5625e5152baf in test_and_print tests/builtin-test.c:388 #9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 #10 0x5625e515572f in cmd_test tests/builtin-test.c:722 #11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 #15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) #1 0x5625e532560d in zalloc util/util.h:23 #2 0x5625e532566b in xyarray__new util/xyarray.c:10 #3 0x5625e5330aba in perf_counts__new util/counts.c:15 #4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 #5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 #6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 #7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 #8 0x5625e51528e6 in run_test tests/builtin-test.c:358 #9 0x5625e5152baf in test_and_print tests/builtin-test.c:388 #10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 #11 0x5625e515572f in cmd_test tests/builtin-test.c:722 #12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 #16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) His patch took care of evsel->prev_raw_counts, but the above backtraces are about evsel->counts, so fix that instead. Reported-by: NChangbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
The array str[] should have six elements. ================================================================= ==4322==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56463844e300 at pc 0x564637e7ad0d bp 0x7f30c8c89d10 sp 0x7f30c8c89d00 READ of size 8 at 0x56463844e300 thread T9 #0 0x564637e7ad0c in __ordered_events__flush util/ordered-events.c:316 #1 0x564637e7b0e4 in ordered_events__flush util/ordered-events.c:338 #2 0x564637c6a57d in process_thread /home/changbin/work/linux/tools/perf/builtin-top.c:1073 #3 0x7f30d173a163 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8163) #4 0x7f30cfffbdee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11adee) 0x56463844e300 is located 32 bytes to the left of global variable 'flags' defined in 'util/trace-event-parse.c:229:26' (0x56463844e320) of size 192 0x56463844e300 is located 0 bytes to the right of global variable 'str' defined in 'util/ordered-events.c:268:28' (0x56463844e2e0) of size 32 SUMMARY: AddressSanitizer: global-buffer-overflow util/ordered-events.c:316 in __ordered_events__flush Shadow bytes around the buggy address: 0x0ac947081c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c50: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00 =>0x0ac947081c60:[f9]f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c70: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 0x0ac947081c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Thread T9 created by T0 here: #0 0x7f30d179de5f in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x4ae5f) #1 0x564637c6b954 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1253 #2 0x564637c7173c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642 #3 0x564637d85038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #4 0x564637d85577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #5 0x564637d8597b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #6 0x564637d860e9 in main /home/changbin/work/linux/tools/perf/perf.c:520 #7 0x7f30cff0509a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@kernel.org> Fixes: 16c66bc1 ("perf top: Add processing thread") Fixes: 68ca5d07 ("perf ordered_events: Add ordered_events__flush_time interface") Link: http://lkml.kernel.org/r/20190316080556.3075-13-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
Add function __maps__purge_names() to purge all maps from the names tree. We need to cleanup the names tree in maps__exit(). Detected with gcc's ASan. Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e628569 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-12-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
There are two trees for each map inserted by maps__insert(), so remove it from the 'names' tree in __maps__remove(). Detected with gcc's ASan. Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e628569 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-11-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
We need to map__put() before returning from failure of sample__resolve_callchain(). Detected with gcc's ASan. Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 9c68ae98 ("perf callchain: Reference count maps") Link: http://lkml.kernel.org/r/20190316080556.3075-10-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
Detected with gcc's ASan: Direct leak of 4356 byte(s) in 120 object(s) allocated from: #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215 #2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339 #3 0x55719af66272 in print_events util/parse-events.c:2542 #4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 #5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520 #9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 40218dae ("perf list: Show SDT and pre-cached events") Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
Detected with gcc's ASan: Direct leak of 66 byte(s) in 5 object(s) allocated from: #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x560c8761034d in collect_config util/config.c:597 #2 0x560c8760d9cb in get_value util/config.c:169 #3 0x560c8760dfd7 in perf_parse_file util/config.c:285 #4 0x560c8760e0d2 in perf_config_from_file util/config.c:476 #5 0x560c876108fd in perf_config_set__init util/config.c:661 #6 0x560c87610c72 in perf_config_set__new util/config.c:709 #7 0x560c87610d2f in perf_config__init util/config.c:718 #8 0x560c87610e5d in perf_config util/config.c:730 #9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442 #10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 20105ca1 ("perf config: Introduce perf_config_set class") Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Changbin Du 提交于
Detected via gcc's ASan: Direct leak of 2048 byte(s) in 64 object(s) allocated from: 6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370) 7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43 8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85 9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250 10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382 11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514 12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520 17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: NChangbin Du <changbin.du@gmail.com> Reviewed-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 89896051 ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
The multiplexing scaling in perf stat mysteriously adds 0.5 to the value. This dates back to the original perf tool. Other scaling code doesn't use that strange convention. Remove the extra 0.5. Before: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,403,580 cycles (81.62%) 6,404,341 cycles (81.64%) 6,402,983 cycles (81.62%) 6,399,941 cycles (81.63%) 6,399,451 cycles (81.62%) 6,436,105 cycles (91.87%) 0.005843799 seconds time elapsed 0.002905000 seconds user 0.002902000 seconds sys After: $ perf stat -e 'cycles,cycles,cycles,cycles,cycles,cycles' grep -rq foo Performance counter stats for 'grep -rq foo': 6,422,704 cycles (81.68%) 6,401,842 cycles (81.68%) 6,398,432 cycles (81.68%) 6,397,098 cycles (81.68%) 6,396,074 cycles (81.67%) 6,434,980 cycles (91.62%) 0.005884437 seconds time elapsed 0.003580000 seconds user 0.002356000 seconds sys Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-10-andi@firstfloor.org Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
The -c option to enable multiplex scaling has been useless for quite some time because scaling is default. It's only useful as --no-scale to disable scaling. But the non scaling code path has bitrotted and doesn't print anything because perf output code relies on value run/ena information. Also even when we don't want to scale a value it's still useful to show its multiplex percentage. This patch: - Fixes help and documentation to show --no-scale instead of -c - Removes -c, only keeps the long option because -c doesn't support negatives. - Enables running/enabled even with --no-scale - And fixes some other problems in the no-scale output. Before: $ perf stat --no-scale -e cycles true Performance counter stats for 'true': <not counted> cycles 0.000984154 seconds time elapsed After: $ ./perf stat --no-scale -e cycles true Performance counter stats for 'true': 706,070 cycles 0.001219821 seconds time elapsed Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Print [TID] tid %d instead of the crypted /tmp/perf-%d.map default. % cat >loop.java public class loop { public static void main(String[] args) { for (;;); } } ^D % javac loop.java % perf record java loop ^C Before: % perf report --stdio ... 56.09% java perf-34724.map [.] 0x00007fd5bd021896 19.12% java perf-34724.map [.] 0x00007fd5bd021887 9.79% java perf-34724.map [.] 0x00007fd5bd021783 8.97% java perf-34724.map [.] 0x00007fd5bd02175b After: % perf report --stdio ... 56.09% java [JIT] tid 34724 [.] 0x00007fd5bd021896 19.12% java [JIT] tid 34724 [.] 0x00007fd5bd021887 9.79% java [JIT] tid 34724 [.] 0x00007fd5bd021783 8.97% java [JIT] tid 34724 [.] 0x00007fd5bd02175b Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-7-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-r17l6py9g0sezb7mi1f286gt@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Show all the supported sort keys in the command line help output, so that it's not needed to refer to the manpage. Before: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): pid, comm, dso, symbol, parent, cpu, srcline, ... Please refer the man page for the complete list. After: % perf report -h ... -s, --sort <key[,key2...]> sort by key(s): overhead overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children sample period pid comm dso symbol parent cpu ... Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-5-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-9r3uz2ch4izoi1uln3f889co@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 3月, 2019 1 次提交
-
-
由 Andi Kleen 提交于
When doing long term recording and waiting for some event to snapshot on, we often only care about the last minute or so. The --switch-output command line option supports rotating the perf.data file when the size exceeds a threshold. But the disk would still be filled with unnecessary old files. Add a new option to only keep a number of rotated files, so that the disk space usage can be limited. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> LPU-Reference: 20190314225002.30108-3-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-y5u2lik0ragt4vlktz6qc9ks@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 3月, 2019 3 次提交
-
-
由 Andi Kleen 提交于
Now 'perf report' can show whole time periods with 'perf script', but the user still has to find individual samples of interest manually. It would be expensive and complicated to search for the right samples in the whole perf file. Typically users only need to look at a small number of samples for useful analysis. Also the full scripts tend to show samples of all CPUs and all threads mixed up, which can be very confusing on larger systems. Add a new --samples option to save a small random number of samples per hist entry. Use a reservoir sample technique to select a representatve number of samples. Then allow browsing the samples using 'perf script' as part of the hist entry context menu. This automatically adds the right filters, so only the thread or cpu of the sample is displayed. Then we use less' search functionality to directly jump the to the time stamp of the selected sample. It uses different menus for assembler and source display. Assembler needs xed installed and source needs debuginfo. Currently it only supports as many samples as fit on the screen due to some limitations in the slang ui code. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311174605.GA29294@tassilo.jf.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
The scripts menu traditionally only showed custom perf scripts. Allow to run standard perf script with useful default options too. - Normal perf script - perf script with assembler (needs xed installed) - perf script with source code output (needs debuginfo) - perf script with custom arguments Then we automatically select the right options to display the information in the perf.data file. For example with -b display branch contexts. It's not easily possible to check for xed's existence in advance. perf script usually gives sensible error messages when it's not available. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-7-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add a time sort key to perf report to display samples for different time quantums separately. This allows easier analysis of workloads that change over time, and also will allow looking at the context of samples. % perf record ... % perf report --sort time,overhead,symbol --time-quantum 1ms --stdio ... 0.67% 277061.87300 [.] _dl_start 0.50% 277061.87300 [.] f1 0.50% 277061.87300 [.] f2 0.33% 277061.87300 [.] main 0.29% 277061.87300 [.] _dl_lookup_symbol_x 0.29% 277061.87300 [.] dl_main 0.29% 277061.87300 [.] do_lookup_x 0.17% 277061.87300 [.] _dl_debug_initialize 0.17% 277061.87300 [.] _dl_init_paths 0.08% 277061.87300 [.] check_match 0.04% 277061.87300 [.] _dl_count_modids 1.33% 277061.87400 [.] f1 1.33% 277061.87400 [.] f2 1.33% 277061.87400 [.] main 1.17% 277061.87500 [.] main 1.08% 277061.87500 [.] f1 1.08% 277061.87500 [.] f2 1.00% 277061.87600 [.] main 0.83% 277061.87600 [.] f1 0.83% 277061.87600 [.] f2 1.00% 277061.87700 [.] main Committer notes: Rename 'time' argument to hist_time() to htime to overcome this in older distros: cc1: warnings being treated as errors util/hist.c: In function 'hist_time': util/hist.c:251: error: declaration of 'time' shadows a global declaration /usr/include/time.h:186: error: shadowed declaration is here Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20190311144502.15423-4-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 3月, 2019 10 次提交
-
-
由 Jiri Olsa 提交于
Adding callback function to reader object so callers can process data in different ways. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-7-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The data files layout is described by HEADER_DIR_FORMAT feature. Currently it holds only version number (1): uint64_t version; The current version holds only version value (1) means that data files: - Follow the 'data.*' name format. - Contain raw events data in standard perf format as read from kernel (and need to be sorted) Future versions are expected to describe different data files layout according to special needs. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-6-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Make perf_data__size() return proper size for directory data, summing up all the individual file sizes. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-5-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Add perf_data__update_dir() to update the size for every file within the perf.data directory. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The caller needs to set 'struct perf_data::is_dir flag and the path will be treated as a directory. The 'struct perf_data::file' is initialized and open as 'path/header' file. Add a check to the direcory interface functions to check the is_dir flag. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190308134745.5057-2-jolsa@kernel.org [ Be consistent on how to signal failure, i.e. use -1 and let users check errno ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
Since commit 4d99e413 ("perf machine: Workaround missing maps for x86 PTI entry trampolines"), perf tools has been creating more than one kernel map, however 'perf probe' assumed there could be only one. Fix by using machine__kernel_map() to get the main kernel map. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NJoseph Qi <joseph.qi@linux.alibaba.com> Acked-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Xu Yu <xuyu@linux.alibaba.com> Fixes: 4d99e413 ("perf machine: Workaround missing maps for x86 PTI entry trampolines") Fixes: d83212d5 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/2ed432de-e904-85d2-5c36-5897ddc5b23b@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Many workloads change over time. 'perf report' currently aggregates the whole time range reported in perf.data. This patch adds an option for a time quantum to quantisize the perf.data over time. This just adds the option, will be used in follow on patches for a time sort key. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org [ Use NSEC_PER_[MU]SEC ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add a utility function to print nanosecond timestamps. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-11-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Upcoming changes add timestamp output in perf report. Add a --ns argument similar to perf script to support nanoseconds resolution when needed. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
perf script -F +insn was only working for PT traces because the PT instruction decoder was filling in the insn/insn_len sample attributes. Support it for non PT samples too on x86 using the existing x86 instruction decoder. This adds some extra checking to ensure that we don't try to decode instructions when using perf.data from a different architecture. % perf record -a sleep 1 % perf script -F ip,sym,insn --xed ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb50 intel_bts_enable_local retq ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi) ffffffff810f1f79 generic_exec_single xor %eax, %eax ffffffff811704c9 remote_function movl %eax, 0x18(%rbx) ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx ffffffff81048610 native_apic_mem_write mov %edi, %edi ... Committer testing: Before: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068804 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) ffffffffa4068806 native_write_msr addb %al, (%rax) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" # After: # perf script -F ip,sym,insn --xed | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5 ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068804 native_write_msr wrmsr ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1) # More examples: # perf script -F ip,sym,insn --xed | grep -v native_write_msr | head ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip) ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13 ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx ffffffffa4956bf3 nmi_cpu_backtrace popq %r12 ffffffffa4171d5c smp_call_function_single pause ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d ffffffffa4797e4d menu_select cmp $0x190, %rax ffffffffa4171d5c smp_call_function_single pause ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0 ffffffffa4797f7a menu_select shr $0x3, %rax # Which matches the annotate output modulo resolving callqs: # perf annotate --stdio2 nmi_cpu_backtrace_handler Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period] nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8105a7d0 <nmi_cpu_backtrace_handler>: nmi_cpu_backtrace_handler(): nmi_trigger_cpumask_backtrace(mask, exclude_self, nmi_raise_cpu_backtrace); } static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs) { 24.45 → callq __fentry__ if (nmi_cpu_backtrace(regs)) mov %rsi,%rdi 75.55 → callq nmi_cpu_backtrace return NMI_HANDLED; movzbl %al,%eax return NMI_DONE; } ← retq # # perf annotate --stdio2 __hrtimer_next_event_base Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period] __hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux Percent Disassembly of section .text: ffffffff8115b910 <__hrtimer_next_event_base>: __hrtimer_next_event_base(): static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base, const struct hrtimer *exclude, unsigned int active, ktime_t expires_next) { → callq __fentry__ <SNIP> 4a: add $0x1,%r14 77.31 mov 0x18(%rax),%rdx shl $0x6,%r14 sub 0x38(%rbx,%r14,1),%rdx if (expires < expires_next) { cmp %r12,%rdx ↓ jge 68 <SNIP> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org [ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ] [ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 3月, 2019 8 次提交
-
-
由 Jiri Olsa 提交于
Making sure the data->file.path is zeroed on perf_data__open error path and in perf_data__close, so we don't double free it in case someone call it twice. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-9-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
We can't call perf_data__close and subsequently perf_session__delete, because it will call perf_data__close again and cause double free for data->file.path. $ perf report -i . incompatible file format (rerun with -v to learn more) free(): double free detected in tcache 2 Aborted (core dumped) In fact we don't need to call perf_data__close at all, because at the time the got out_close is reached, session->data is already initialized, so the perf_data__close call will be triggered from perf_session__delete. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Fixes: 2d4f2799 ("perf data: Add global path holder") Link: http://lkml.kernel.org/r/20190305152536.21035-8-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Currently we probe for precise_ip with user specified perf_event_attr, which might fail because of unsupported kernel features, which would get disabled during the open time anyway. Switching the probe to take place on simple hw cycles, so the following record sets proper precise_ip: # perf record -e cycles:P ls # perf evlist -v cycles:P: size: 112, ... precise_ip: 3, ... Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-7-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Read the caps/max_precise value and store it in struct perf_pmu to be used when setting the maximum precise_ip field in following patch. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-5-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
We can't allocate he->srcline unconditionaly, only when new hist_entry is created. Moving he->srcline allocation into hist_entry__init function. Original-patch-by: NJonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Suggested-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nageswara R Sastry <nasastry@in.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-4-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
Adding error path into hist_entry__init to unify error handling, so every new member does not need to free everything else. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: nageswara r sastry <nasastry@in.ibm.com> Link: http://lkml.kernel.org/r/20190305152536.21035-3-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Add a utility function to fetch executable code. Convert one user over to it. There are more places doing that, but they do significantly different actions, so they are not easy to fit into a single library function. Committer changes: . No need to cast around, make 'buf' be a void pointer. . Rename it to thread__memcpy() to reflect the fact it is about copying a chunk of memory from a thread, i.e. from its address space. . No need to have it in a separate object file, move it to thread.[ch] . Check the return of map__load(), the original code didn't do it, but since we're moving this around, check that as well. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
We were hardcoding '6' as the max instruction name, and we have lots that are longer than that, see the diff from two 'P' printed TUI annotations for a libc function that uses instructions with long names, such as 'vpmovmskb' with its 9 chars: --- __strcmp_avx2.annotation.before 2019-03-06 16:31:39.368020425 -0300 +++ __strcmp_avx2.annotation 2019-03-06 16:32:12.079450508 -0300 @@ -2,284 +2,284 @@ Event: cycles:ppp Percent endbr64 - 0.10 mov %edi,%eax + 0.10 mov %edi,%eax - xor %edx,%edx + xor %edx,%edx - 3.54 vpxor %ymm7,%ymm7,%ymm7 + 3.54 vpxor %ymm7,%ymm7,%ymm7 - or %esi,%eax + or %esi,%eax - and $0xfff,%eax + and $0xfff,%eax - cmp $0xf80,%eax + cmp $0xf80,%eax - ↓ jg 370 + ↓ jg 370 - 27.07 vmovdqu (%rdi),%ymm1 + 27.07 vmovdqu (%rdi),%ymm1 - 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0 + 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0 - 2.15 vpminub %ymm1,%ymm0,%ymm0 + 2.15 vpminub %ymm1,%ymm0,%ymm0 - 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0 + 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0 - 0.43 vpmovmskb %ymm0,%ecx + 0.43 vpmovmskb %ymm0,%ecx - 1.53 test %ecx,%ecx + 1.53 test %ecx,%ecx - ↓ je b0 + ↓ je b0 - 5.26 tzcnt %ecx,%edx + 5.26 tzcnt %ecx,%edx - 18.40 movzbl (%rdi,%rdx,1),%eax + 18.40 movzbl (%rdi,%rdx,1),%eax - 7.09 movzbl (%rsi,%rdx,1),%edx + 7.09 movzbl (%rsi,%rdx,1),%edx - 3.34 sub %edx,%eax + 3.34 sub %edx,%eax 2.37 vzeroupper ← retq nop - 50: tzcnt %ecx,%edx + 50: tzcnt %ecx,%edx - movzbl 0x20(%rdi,%rdx,1),%eax + movzbl 0x20(%rdi,%rdx,1),%eax - movzbl 0x20(%rsi,%rdx,1),%edx + movzbl 0x20(%rsi,%rdx,1),%edx - sub %edx,%eax + sub %edx,%eax vzeroupper ← retq - data16 nopw %cs:0x0(%rax,%rax,1) + data16 nopw %cs:0x0(%rax,%rax,1) Reported-by: NTravis Downs <travis.downs@gmail.com> LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-89wsdd9h9g6bvq52sgp6d0u4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 3月, 2019 1 次提交
-
-
由 Yang Wei 提交于
Delete a superfluous semicolon in getBPFObjectFromModule(). Signed-off-by: NYang Wei <yang.wei9@zte.com.cn> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Wei <albin_yang@163.com> Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 3月, 2019 3 次提交
-
-
由 Adrian Hunter 提交于
The call_path can be used to find the parent symbol for a call but not the exact parent call. To do that add parent_id to the call_return export. This enables the creation of a call tree from the exported data. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lkml.kernel.org/n/tip-6j7tzdxo67cox6kan7k22oo6@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
When TSC is not available, "timeless" decoding is used but a divide by zero occurs if perf_time_to_tsc() is called. Ensure the divisor is not zero. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.9+ Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Adrian Hunter 提交于
The message does not indicate the possibility that the symbol is not found because the file does not exist. Before: $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls Symbol 'strcmp' not found. Note that symbols must be functions. Failed to parse address filter: 'filter strcmp / strcpy @ foo ' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. After: $ perf record -e intel_pt//u --filter 'filter strcmp / strcpy @ foo ' ls File 'foo' not found or has no symbols. Symbol 'strcmp' not found. Note that symbols must be functions. Failed to parse address filter: 'filter strcmp / strcpy @ foo ' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. Reported-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lkml.kernel.org/n/tip-dvngzxd0jkplzw1ary69dilb@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 3月, 2019 2 次提交
-
-
由 Jin Yao 提交于
Jiri points out that we don't need any time checking and time string parsing if the --time option is not set. That makes sense. This patch refactors the time range parsing code, move the duplicated code from perf report and perf script to time_utils and check if --time option is set before parsing the time string. This patch is no logic change expected. So the usage of --time is same as before. For example: Select the first and second 10% time slices: perf report --time 10%/1,10%/2 perf script --time 10%/1,10%/2 Select the slices from 0% to 10% and from 30% to 40%: perf report --time 0%-10%,30%-40% perf script --time 0%-10%,30%-40% Select the time slices from timestamp 3971 to 3973 perf report --time 3971,3973 perf script --time 3971,3973 Committer testing: Using the above examples, check before and after to see if it remains the same: $ perf record -F 10000 -- find . -name "*.[ch]" -exec cat {} + > /dev/null [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ] $ $ perf report --time 10%/1,10%/2 > /tmp/report.before.1 $ perf script --time 10%/1,10%/2 > /tmp/script.before.1 $ perf report --time 0%-10%,30%-40% > /tmp/report.before.2 $ perf script --time 0%-10%,30%-40% > /tmp/script.before.2 $ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3 $ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3 For example, the 3rd test produces this slice: $ cat /tmp/script.before.3 cat 3147 180457.375844: 2143 cycles:uppp: 7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so) cat 3147 180457.375986: 2245 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat) cat 3147 180457.376012: 2164 cycles:uppp: 7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so) cat 3147 180457.376140: 2921 cycles:uppp: 558b70f3a554 [unknown] (/usr/bin/cat) cat 3147 180457.376296: 2844 cycles:uppp: 7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so) cat 3147 180457.376431: 2717 cycles:uppp: 558b70f3b0ca [unknown] (/usr/bin/cat) cat 3147 180457.376667: 2630 cycles:uppp: 558b70f3d86e [unknown] (/usr/bin/cat) cat 3147 180457.376795: 2442 cycles:uppp: 7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so) cat 3147 180457.376927: 2376 cycles:uppp: ffffffff9aa00163 [unknown] ([unknown]) cat 3147 180457.376954: 2307 cycles:uppp: 7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so) cat 3147 180457.377116: 3091 cycles:uppp: 7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so) cat 3147 180457.377362: 2945 cycles:uppp: 558b70f3a3b0 [unknown] (/usr/bin/cat) cat 3147 180457.377517: 2727 cycles:uppp: 558b70f3a9aa [unknown] (/usr/bin/cat) $ Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the above chunk translates into this 'perf report' output: $ cat /tmp/report.before.3 # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 13 of event 'cycles:uppp' (time slices: 180457.375844,180457.377717) # Event count (approx.): 33552 # # Overhead Command Shared Object Symbol # ........ ....... ................ ...................... # 17.69% cat libc-2.28.so [.] malloc 14.53% cat cat [.] 0x000000000000586e 13.33% cat libc-2.28.so [.] _int_malloc 8.78% cat cat [.] 0x00000000000023b0 8.71% cat cat [.] 0x0000000000002554 8.13% cat cat [.] 0x00000000000029aa 8.10% cat cat [.] 0x00000000000030ca 7.28% cat libc-2.28.so [.] read 7.08% cat [unknown] [k] 0xffffffff9aa00163 6.39% cat libc-2.28.so [.] cfree@GLIBC_2.2.5 # # (Tip: Order by the overhead of source file name and line number: perf report -s srcline) # $ Now lets see after applying this patch, nothing should change: $ perf report --time 10%/1,10%/2 > /tmp/report.after.1 $ perf script --time 10%/1,10%/2 > /tmp/script.after.1 $ perf report --time 0%-10%,30%-40% > /tmp/report.after.2 $ perf script --time 0%-10%,30%-40% > /tmp/script.after.2 $ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3 $ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3 $ diff -u /tmp/report.before.1 /tmp/report.after.1 $ diff -u /tmp/script.before.1 /tmp/script.after.1 $ diff -u /tmp/report.before.2 /tmp/report.after.2 --- /tmp/report.before.2 2019-03-01 11:01:53.526094883 -0300 +++ /tmp/report.after.2 2019-03-01 11:09:18.231770467 -0300 @@ -352,5 +352,5 @@ # -# (Tip: Generate a script for your data: perf script -g <lang>) +# (Tip: Treat branches as callchains: perf report --branch-history) # $ diff -u /tmp/script.before.2 /tmp/script.after.2 $ diff -u /tmp/report.before.3 /tmp/report.after.3 --- /tmp/report.before.3 2019-03-01 11:03:08.890045588 -0300 +++ /tmp/report.after.3 2019-03-01 11:09:40.660224002 -0300 @@ -22,5 +22,5 @@ # -# (Tip: Order by the overhead of source file name and line number: perf report -s srcline) +# (Tip: List events using substring match: perf list <keyword>) # $ diff -u /tmp/script.before.3 /tmp/script.after.3 $ Cool, just the 'perf report' tips changed, QED. Signed-off-by: NJin Yao <yao.jin@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1551435186-6008-1-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jakub Kicinski 提交于
For historical reasons the helper to loop over maps in an object is called bpf_map__for_each while it really should be called bpf_object__for_each_map. Rename and add a correctly named define for backward compatibility. Switch all in-tree users to the correct name (Quentin). Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: NQuentin Monnet <quentin.monnet@netronome.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-