- 07 1月, 2016 7 次提交
-
-
由 Namhyung Kim 提交于
Support '*' character for field name to add all (non-common) fields as sort keys easily. $ perf report -s 'switch.*' --stdio ... # Overhead prev_comm prev_pid prev_prio prev_state next_comm next_pid next_prio # ........ ........... ......... ......... .......... ............ ........ ......... # 3.82% swapper/0 0 120 0 netctl-auto 18711 120 3.75% netctl-auto 18711 120 1 swapper/0 0 120 2.24% swapper/1 0 120 0 netctl-auto 18709 120 2.24% netctl-auto 18709 120 1 swapper/1 0 120 1.80% swapper/2 0 120 0 rcu_preempt 7 120 1.80% swapper/2 0 120 0 netctl-auto 18711 120 1.80% rcu_preempt 7 120 1 swapper/2 0 120 1.80% netctl-auto 18711 120 1 swapper/2 0 120 ... Suggested-and-acked-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-11-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The dynamic sort key requires event name but specifying full event name is rather inconvenient. This patch adds more ways to identify the event in a more compact way. 1. If session has just one event, event name can be omitted. 2. Events can be accessed by index preceded by a percent sign. 3. A part of the name can be used, if it's not ambiguous. The partial name should not contain ':' in it. 4. Full system + event name is still used, it should contain ':'. So in the below example all does same thing: $ perf record -e sched:sched_switch -a sleep 1 $ perf report -s next_pid,next_comm $ perf report -s %1.next_pid,%1.next_comm $ perf report -s switch.next_pid,switch.next_comm $ perf report -s sched:sched_switch.next_pid,sched:sched_switch.next_comm Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-10-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The --raw-trace option allows disabling pretty printing by the event's print_fmt or plugin. Besides that, each dynamic sort key now can receive a 'raw' suffix separated by '/' to ask for the raw trace of a specific field. $ perf report -s comm,kmem:kmalloc.gfp_flags ... # Overhead Command gfp_flags # ........ ....... ................... # 99.89% perf GFP_NOFS|GFP_ZERO 0.06% sleep GFP_KERNEL 0.03% perf GFP_KERNEL|GFP_ZERO 0.01% perf GFP_KERNEL Now $ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace or $ perf report -s comm,kmem:kmalloc.gfp_flags/raw ... # Overhead Command gfp_flags # ........ ....... .......... # 99.89% perf 32848 0.06% sleep 208 0.03% perf 32976 0.01% perf 208 Suggested-and-Acked-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-9-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The 'trace' sort key is to show tracepoint event output using either print fmt or plugin. For example sched_switch event (using plugin) will show output like below: # perf record -e sched:sched_switch -a usleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.197 MB perf.data (69 samples) ] # $ perf report -s trace --stdio ... # Overhead Trace output # ........ ................................................... # 9.48% swapper/0:0 [120] R ==> transmission-gt:17773 [120] 9.48% transmission-gt:17773 [120] S ==> swapper/0:0 [120] 9.04% swapper/2:0 [120] R ==> transmission-gt:17773 [120] 8.92% transmission-gt:17773 [120] S ==> swapper/2:0 [120] 5.25% swapper/0:0 [120] R ==> kworker/0:1H:109 [100] 5.21% kworker/0:1H:109 [100] S ==> swapper/0:0 [120] 1.78% swapper/3:0 [120] R ==> transmission-gt:17773 [120] 1.78% transmission-gt:17773 [120] S ==> swapper/3:0 [120] 1.53% Xephyr:6524 [120] S ==> swapper/0:0 [120] 1.53% swapper/0:0 [120] R ==> Xephyr:6524 [120] 1.17% swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49] 1.13% irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120] Note that the 'trace' sort key works only for tracepoint events. If it's used to other type of events, just "N/A" will be printed. Suggested-and-acked-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-8-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Each tracepoint event has format string for print to improve readability. Try to parse the output and match the field name. If it finds one, use that for the result. If not, fallbacks to the original output. For example, sort on kmem:kmalloc.gfp_flags looks like below: (Note: libtraceevent plugins are not installed on my system. They might affect the output below) Before: # Overhead Command gfp_flags # ........ ....... .......... # 99.89% perf 32848 0.06% sleep 208 0.03% perf 32976 0.01% perf 208 After: # Overhead Command gfp_flags # ........ ....... ................... # 99.89% perf GFP_NOFS|GFP_ZERO 0.06% sleep GFP_KERNEL 0.03% perf GFP_KERNEL|GFP_ZERO 0.01% perf GFP_KERNEL Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-7-git-send-email-namhyung@kernel.org [ Fixed clash with earlier, updated patch in this patchkit ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The existing sort keys are less useful for tracepoint events in that they are always sampled at the same place, the function where the tracepoint is located. For example, a 'perf report' on sched:sched_switch event looks like the following: # Overhead Command Shared Object Symbol # ........ ............... ................ .............. # 47.22% swapper [kernel.vmlinux] [k] __schedule 21.67% transmission-gt [kernel.vmlinux] [k] __schedule 8.23% netctl-auto [kernel.vmlinux] [k] __schedule 5.53% kworker/0:1H [kernel.vmlinux] [k] __schedule 1.98% Xephyr [kernel.vmlinux] [k] __schedule 1.33% irq/33-iwlwifi [kernel.vmlinux] [k] __schedule 1.17% wpa_cli [kernel.vmlinux] [k] __schedule 1.13% rcu_preempt [kernel.vmlinux] [k] __schedule 0.85% ksoftirqd/0 [kernel.vmlinux] [k] __schedule 0.77% Timer [kernel.vmlinux] [k] __schedule In fact, tracepoints have meaningful information in their fields but there's no way to use in 'perf report' currently. The dynamic sort keys are introduced in this patc to overcome this limitation. The sched:sched_switch events have following fields: # sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format name: sched_switch ID: 268 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[16]; offset:8; size:16; signed:1; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[16]; offset:40; size:16; signed:1; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, REC->prev_state & (2048-1) ? __print_flags(REC->prev_state & (2048-1), "|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" }, { 64, "x" }, { 128, "K"}, { 256, "W" }, { 512, "P" }, { 1024, "N" }) : "R", REC->prev_state & 2048 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio With dynamic sort keys, you can use <event.field> as a sort key. Those dynamic keys are checked and created on demand. For instance, below is to sort by next_pid field output on the same data file: $ perf report -s comm,sched:sched_switch.next_pid --stdio ... # Overhead Command next_pid # ........ ............... .......... # 21.23% transmission-gt 0 20.86% swapper 17773 6.62% netctl-auto 0 5.25% swapper 109 5.21% kworker/0:1H 0 1.98% Xephyr 0 1.98% swapper 6524 1.98% swapper 27478 1.37% swapper 27476 1.17% swapper 233 Multiple dynamic sort keys are also supported: $ perf report -s comm,sched:sched_switch.next_pid,sched:sched_switch.next_comm --stdio ... # Overhead Command next_pid next_comm # ........ ............... .......... ................ # 20.86% swapper 17773 transmission-gt 9.64% transmission-gt 0 swapper/0 9.16% transmission-gt 0 swapper/2 5.25% swapper 109 kworker/0:1H 5.21% kworker/0:1H 0 swapper/0 2.14% netctl-auto 0 swapper/2 1.98% netctl-auto 0 swapper/0 1.98% swapper 6524 Xephyr 1.98% swapper 27478 netctl-auto 1.78% transmission-gt 0 swapper/3 1.53% Xephyr 0 swapper/0 1.29% netctl-auto 0 swapper/1 1.29% swapper 27476 netctl-auto 1.21% netctl-auto 0 swapper/3 1.17% swapper 233 irq/33-iwlwifi Note that pid 0 exists for each cpu so have comm of 'swapper/N'. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-6-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
This is a preparation to support dynamic sort keys for tracepoint events. Dynamic sort keys can be created for specific fields in trace events so it needs the event information. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org [ Moving the evlist creation earlier in top was split to a previous patch ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 10月, 2015 2 次提交
-
-
由 Jiri Olsa 提交于
This function will allow to register output column from ui code and respect taken sort/output dimensions. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444134312-29136-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
There's no need to call reset_dimensions within __setup_output_field function. It's already called in its caller setup_sorting right before perf_hpp__init, which will be changed in following patch to respect taken dimension. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444134312-29136-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 10月, 2015 1 次提交
-
-
由 Don Zickus 提交于
Sorting on 'symbol' gives to broad a resolution as it can cover a range of IP address. Use the iaddr instead to get proper sorting on IP addresses. Need to use the 'mem_sort' feature of perf record. New sort option is: symbol_iaddr, header label is 'Code Symbol'. $ perf mem report --stdio -F +symbol_iaddr # Overhead Samples Code Symbol Local Weight # ........ ............ ........................ ............ # 54.08% 1 [k] nmi_handle 192 4.51% 1 [k] finish_task_switch 16 3.66% 1 [.] malloc 13 3.10% 1 [.] __strcoll_l 11 Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444068369-20978-8-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 9月, 2015 1 次提交
-
-
由 Kan Liang 提交于
This patch enable perf report to sort by processor socket: $ perf report --stdio --sort socket,comm,dso,symbol # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 686 of event 'cycles' # Event count (approx.): 349215462 # # Overhead SOCKET Command Shared Object Symbol # ........ ...... ....... ................ ............................ # 97.05% 000 test test [.] plusB_c 0.98% 000 test test [.] plusA_c 0.93% 001 perf [kernel.vmlinux] [k] smp_call_function_single 0.19% 001 perf [kernel.vmlinux] [k] page_fault 0.19% 001 swapper [kernel.vmlinux] [k] pm_qos_request 0.16% 000 test [kernel.vmlinux] [k] add_mm_counter_fast Signed-off-by: NKan Liang <kan.liang@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1441377946-44429-2-git-send-email-kan.liang@intel.com [ Fix col calc, un-allcapsify col header & read the topology when not using perf.data ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 9月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
When profiling the kernel with the 'srcfile' sort key it's common to "get stuck" in include. For example a lot of code uses current or other inlines, so they get accounted to some random include file. This is not very useful as a high level categorization. For example just profiling the idle loop usually shows mostly inlines, so you never see the actual cpuidle file. This patch changes the 'srcfile' sort key to always unwind the inline stack using BFD/DWARF. So we always account to the base function that called the inline. In a few cases include is still shown (for example for MSR accesses), but that is because they get inlining expanded as part of assigning to a global function pointer. For the majority it works fine though. v2: Use simpler while loop. Add maximum iteration count. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1441133239-31254-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 8月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
Handle the SRCLINE_UNKNOWN case correctly when processing "srcfile". Commiter note: We can't just free it, as it was't allocated via malloc, its a guard variable. Reported-by: NNamhyung Kim <namhyung@kernel.org> Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20150811133655.GC4524@tassilo.jf.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 8月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
In some cases it's useful to characterize samples by file. This is useful to get a higher level categorization, for example to map cost to subsystems. Add a srcfile sort key to perf report. It builds on top of the existing srcline support. Commiter notes: E.g.: # perf record -F 10000 usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ] [root@zoo ~]# perf report -s srcfile --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File # ........ ........... 60.99% . 20.62% paravirt.h 14.23% rmap.c 4.04% signal.c 0.11% msr.h # The first line is collecting all the files for which srcfiles couldn't somehow get resolved to: # perf report -s srcfile,dso --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File Shared Object # ........ ........... ................ 40.97% . ld-2.20.so 20.62% paravirt.h [kernel.vmlinux] 20.02% . libc-2.20.so 14.23% rmap.c [kernel.vmlinux] 4.04% signal.c [kernel.vmlinux] 0.11% msr.h [kernel.vmlinux] # XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't seen this on Fedora 22. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 8月, 2015 2 次提交
-
-
由 Andi Kleen 提交于
Display the cycles by default in branch sort mode. To make enough room for the new column I removed dso_to. It is usually redundant with dso_from. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-9-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
cycles is a new branch_info field available on some CPUs that indicates the time deltas between branches in the LBR. Add a sort key and output code for the cycles to allow to display the basic block cycles individually in perf report. We also pass in the cycles for weight when LBRs are processed, which allows to get global and local weight, to get an estimate of the total cost. And also print the cycles information for perf report -D. I also added printing for the previously missing LBR flags (mispredict etc.) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 6月, 2015 1 次提交
-
-
由 Yannick Brosseau 提交于
When using a map file from a JIT, due to memory reuse, we can obtain multiple symbols with the same start address but a different length. The symbols__find does check for the end so not doing it in sort__sym_cmp was causing the hist_entry in the annotate part of a report to match to the wrong entry, causing a fatal error. Signed-off-by: NYannick Brosseau <scientist@fb.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@fb.com Link: http://lkml.kernel.org/r/1434584470-17771-1-git-send-email-scientist@fb.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 5月, 2015 1 次提交
-
-
由 Jiri Olsa 提交于
Currently the se_cmp and se_collapse use pointer comparison, which is ok for for testing equality of strings. It's not ok as comparing function for rbtree insertion, because it gives different results based on current pointer values. We saw test 32 (hists cumulation test) failing based on different environment setup. Having all sort functions straightened fix the test for us. Reported-by: NJan Stancek <jstancek@redhat.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 2月, 2015 1 次提交
-
-
由 Kan Liang 提交于
Currently, the perf diff only works with same binaries. That's because it compares the symbol start address. It doesn't work if the perf.data comes from different binaries. This patch matches the symbol names. Actually, perf diff once intended to compare the symbol names. The commit as below can look for a pair by name. 604c5c92 (perf diff: Change the default sort order to "dso,symbol") However, at that time, perf diff used a global list of dsos. That means the binaries which has same name can only be loaded once. That's a problem for comparing different binaries. For example, we have an old binary and an updated binary. They very likely have same name and most of the functions, so only dsos from old binary will be loaded. When processing the data from updated binary, perf still use the symbol information from old binary. That's wrong. Then the commit as below used IP to replace symbol name. 9c443dfd ("perf diff: Fix support for all --sort combinations") >From that time, perf diff starts to compare the symbol address. The global dsos is discarded from a patch in 2010. a1645ce1 ("perf: 'perf kvm' tool for monitoring guest performance from host") However, at that time, perf diff already compared by address. So perf diff cannot work for different binaries as well. This patch actually rolls back the perf diff to original design. The document is also changed, so everybody knows the original design is to compare the symbol names. Here are some examples: The only difference between example_v1.c and example_v2.c is the location of f2 and f3. There is no change in behavior, but the previous perf diff display the wrong differential profile. example_v1.c noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } example_v2.c noinline void f2(void) { volatile int a = 100, b, c; for (b = 0; b < 10000; b++) c = a * b; } noinline void f3(void) { volatile int i; for (i = 0; i < 10000;) { if(i%2) i++; else i++; } } noinline void f1(void) { f2(); f3(); } int main() { int i; for (i = 0; i < 100000; i++) f1(); } [lk@localhost perf_diff]$ gcc example_v1.c -o example [lk@localhost perf_diff]$ perf record -o example_v1.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ] [lk@localhost perf_diff]$ gcc example_v2.c -o example [lk@localhost perf_diff]$ perf record -o example_v2.data ./example [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ] Old perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data Event 'cycles' Baseline Delta Shared Object Symbol ........ ....... ................ ............................... [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% example [.] f2 53.71% -7.55% example [.] f3 +53.81% example [.] f3 0.02% example [.] main New perf diff result: [lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data [kernel.vmlinux] [k] __perf_event_task_sched_out 0.00% [kernel.vmlinux] [k] apic_timer_interrupt [kernel.vmlinux] [k] idle_cpu [kernel.vmlinux] [k] intel_pstate_timer_func [kernel.vmlinux] [k] native_read_msr_safe 0.00% [kernel.vmlinux] [k] native_read_tsc 0.00% [kernel.vmlinux] [k] native_write_msr_safe [kernel.vmlinux] [k] ntp_tick_length 0.00% [kernel.vmlinux] [k] rb_erase 0.00% [kernel.vmlinux] [k] tick_sched_timer 0.00% [kernel.vmlinux] [k] unmap_single_vma 0.00% [kernel.vmlinux] [k] update_wall_time 0.00% example [.] f1 46.24% -0.08% example [.] f2 53.71% +0.11% example [.] f3 0.02% example [.] main Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 1月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
Currently ->cmp, ->collapse and ->sort callbacks doesn't pass corresponding fmt. But it'll be needed by upcoming changes in perf diff command. Suggested-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1420677949-6719-6-git-send-email-namhyung@kernel.org [ fix build by passing perf_hpp_fmt pointer to hist_entry__cmp_ methods ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 11月, 2014 1 次提交
-
-
由 Andi Kleen 提交于
When the source line is not found fall back to sym + offset. This is generally much more useful than a raw address. For this we need to pass in the symbol from the caller. For some callers it's awkward to compute, so we stay at the old behaviour. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-10-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 11月, 2014 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Problem introduced in: commit 5b591669 "perf report: Honor column width setting" Where the left justification signal was after the width, which ended up, when the width was, say, 11, always printing: %11.11-s Instead of src:line left justified and limited to 11 chars. Resulting in a like: 70.93% %11.11-s [.] f2 tcall When it should instead be: 70.93% tcall.c:5 [.] f2 tcall Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-2xnt0vqkoox52etq2qhyetr0@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 10月, 2014 7 次提交
-
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F dso_from perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-8-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F dso_to perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-7-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F symbol_from perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-6-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F symbol_to perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-5-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F mispredict perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-4-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F in_tx perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jiri Olsa 提交于
The branch field sorting code assumes hist_entry::branch_info is allocated, which is wrong and following perf session ends up with report segfault. $ perf record ls $ perf report -F abort perf: Segmentation fault Checking that hist_entry::branch_info is valid and display "N/A" string in snprint callback if it's not. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1413468427-31049-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 10月, 2014 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Not all tools need a hists instance per perf_evsel, so lets pave the way to remove evsel->hists while leaving a way to access the hists from a specially allocated evsel, one that comes with space at the end where lives the evsel. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qlktkhe31w4mgtbd84035sr2@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 9月, 2014 1 次提交
-
-
由 Jiri Olsa 提交于
Adding support to add field(s) to default sort order via using the '+' prefix, like for report: $ perf report Samples: 2K of event 'cycles', Event count (approx.): 882172583 Overhead Command Shared Object Symbol 7.39% swapper [kernel.kallsyms] [k] intel_idle 1.97% firefox libpthread-2.17.so [.] pthread_mutex_lock 1.39% firefox [snd_hda_intel] [k] azx_get_position 1.11% firefox libpthread-2.17.so [.] pthread_mutex_unlock $ perf report -s +cpu Samples: 2K of event 'cycles', Event count (approx.): 882172583 Overhead Command Shared Object Symbol CPU 2.89% swapper [kernel.kallsyms] [k] intel_idle 000 2.61% swapper [kernel.kallsyms] [k] intel_idle 002 1.20% swapper [kernel.kallsyms] [k] intel_idle 001 0.82% firefox libpthread-2.17.so [.] pthread_mutex_lock 002 Works in general for commands using --sort option. v2 with changes suggested: - Use dynamic memory instead static buffer - Fix error message typo Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20140823125948.GA1193@krava.brq.redhat.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 8月, 2014 1 次提交
-
-
由 Jiri Olsa 提交于
Adding support to add field(s) to default field order via using the '+' prefix, like for report: $ perf report Samples: 10 of event 'cycles', Event count (approx.): 4463799 Overhead Command Shared Object Symbol 32.40% ls [kernel.kallsyms] [k] filemap_fault 28.19% ls [kernel.kallsyms] [k] get_page_from_freelist 23.38% ls [kernel.kallsyms] [k] enqueue_entity 15.04% ls [kernel.kallsyms] [k] mmap_region $ perf report -F +period,sample Samples: 10 of event 'cycles', Event count (approx.): 4463799 Overhead Period Samples Command Shared Object Symbol 32.40% 1446493 1 ls [kernel.kallsyms] [k] filemap_fault 28.19% 1258486 1 ls [kernel.kallsyms] [k] get_page_from_freelist 23.38% 1043754 1 ls [kernel.kallsyms] [k] enqueue_entity 15.04% 671160 1 ls [kernel.kallsyms] [k] mmap_region Works in general for commands using --field option. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1408715919-25990-2-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 8月, 2014 4 次提交
-
-
由 Namhyung Kim 提交于
It makes the code a bit simpler and easier to debug IMHO. I guess it can also remove similar code in perf diff, but let's keep it for a future work. :) Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406785662-5534-7-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Set column width and do not change it if user gives -w/--column-widths option. It'll truncate longer symbols than the width if exists. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406785662-5534-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Save column length in the hpp format and pass it to print functions. This is a preparation for users to control column width in the output. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406785662-5534-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Now perf left-aligns column headers but the contents does not. It should have same alignment. This requires a change in pid sort key - it consists of two part (pid and comm). As length of comm can be vary it'd be better to change the order of them. Thanks to Jiri Olsa for pointing this out. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406785662-5534-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 7月, 2014 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Looks better and avoids it moving to the end of the screen as the column width changes over time in 'perf top'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-yc144ai5jye3yl3h5yxw0scd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 6月, 2014 1 次提交
-
-
由 Don Zickus 提交于
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a particular sample instruction. A bunch of those details relate to the data address. One interesting thing you can do with data addresses is to convert them into a unique cacheline they belong too. Organizing these data cachelines into similar groups and sorting them can reveal cache contention. This patch creates an alogorithm based on various sample details that can help group entries together into data cachelines and allows 'perf report' to sort on it. The algorithm relies on having proper mmap2 support in the kernel to help determine if the memory map the data address belongs to is private to a pid or globally shared. The alogortithm is as follows: o group cpumodes together o group entries with discovered maps together o sort on major, minor, inode and inode generation numbers o if userspace anon, then sort on pid o sort on cachelines based on data addresses The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'. Sample output: # # Samples: 206 of event 'cpu/mem-loads/pp' # Total weight : 2534 # Sort order : dcacheline,pid # # Overhead Samples Data Cacheline Command: Pid # ........ ............ ...................................................................... .................. # 13.22% 1 [k] 0xffff88042f08ebc0 swapper: 0 9.27% 1 [k] 0xffff88082e8cea80 swapper: 0 3.59% 2 [k] 0xffffffff819ba180 swapper: 0 0.32% 1 [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0 swapper: 0 0.32% 1 [k] timekeeper_seq+0xfffffffffffffff8 swapper: 0 Note: Added a '+1' to symlen size in hists__calc_col_len to prevent the next column from prematurely tabbing over and mis-aligning. Not sure what the problem is. Signed-off-by: NDon Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 04 6月, 2014 2 次提交
-
-
由 Jiri Olsa 提交于
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'. The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for: $ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio Performance report: 9.70% perf [.] perf_hpp__should_skip Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read. Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types. Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% ) 150.943975206 seconds time elapsed ( +- 0.62% ) now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% ) 116.416670507 seconds time elapsed ( +- 0.31% ) Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
由 Jiri Olsa 提交于
There's no need to setup elide of sort_dso sort entry again with symbol_conf.dso_list list. The only difference were list names of memory mode data, which does not make much sense to me. Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1400858147-7155-2-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-