- 15 11月, 2016 2 次提交
-
-
由 Jin Yao 提交于
Create some branch counters in per callchain list entry. Each counter is for a branch flag. For example, predicted_count counts all the *predicted* branches. The counters get updated by processing the callchain cursor nodes. It also provides functions to retrieve or print the values of counters in callchain list. Besides the counting for branch flags, it also counts and returns the average number of iterations. Signed-off-by: NYao Jin <yao.jin@linux.intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Linux-kernel@vger.kernel.org Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/1477876794-30749-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Jin Yao 提交于
Since the branch ip has been added to call stack for easier browsing, this patch adds more branch information. For example, add a flag to indicate if this ip is a branch, and also add with the branch flag. Then we can know if the cursor node represents a branch and know what the branch flag it has. The branch history code has a loop detection pass that removes loops. It would be nice for knowing how many loops were removed then in next steps, we can compute out the average number of iterations. For example: Before remove_loops(), entry0: from = 0x100, to = 0x200 entry1: from = 0x300, to = 0x250 entry2: from = 0x300, to = 0x250 entry3: from = 0x300, to = 0x250 entry4: from = 0x700, to = 0x800 After remove_loops() entry0: from = 0x100, to = 0x200 entry1: from = 0x300, to = 0x250 entry2: from = 0x700, to = 0x800 The original entry2 and entry3 are removed. So the number of iterations (from = 0x300, to = 0x250) is equal to removed number + 1 (2 + 1). iterations = removed number + 1; average iteractions = Sum(iteractions) / number of samples This formula ignores other cases, for example, iterations cross multiple buffers and one buffer contains 2+ loops. Because in practice, it's good enough. Signed-off-by: NYao Jin <yao.jin@linux.intel.com> Acked-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Linux-kernel@vger.kernel.org Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/n/1477876794-30749-2-git-send-email-yao.jin@linux.intel.com [ Renamed 'iter' to 'nr_loop_iter' for clarity ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 11月, 2016 1 次提交
-
-
由 Rabin Vincent 提交于
Since 841e3558 ("perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support"), --call-graph dwarf is allowed in 'perf record' even without unwind support. A couple of other places don't reflect this yet though: the help text should list dwarf as a valid record mode and the dump_size config should be respected too. Signed-off-by: NRabin Vincent <rabinv@axis.com> Cc: He Kuang <hekuang@huawei.com> Fixes: 841e3558 ("perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support") Link: http://lkml.kernel.org/r/1470837148-7642-1-git-send-email-rabin.vincent@axis.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 05 7月, 2016 1 次提交
-
-
由 Jiri Olsa 提交于
User's values from .perfconfig could overload the default callchain setup and cause this test to fail. Making sure the test is using default callchain_param values. Signed-off-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1467634583-29147-3-git-send-email-jolsa@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 30 5月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The tooling counterpart, now it is possible to do: # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ] # perf evlist -v sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10 cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4 cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should be further configurable by means of some .perfconfig knob. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 4月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To be able to call it outside option parsing, like when setting a default --call-graph parameter in 'perf trace' when just --min-stack is used. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 4月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The recent perf_evsel__fprintf_callchain() move to evsel.c added several new symbol requirements to the python binding, for instance: # perf test -v python 16: Try 'import perf' in python, checking link problems : --- start --- test child forked, pid 18030 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: callchain_cursor test child finished with -1 ---- end ---- Try 'import perf' in python, checking link problems: FAILED! # This would require linking against callchain.c to access to the global callchain_cursor variables. Since lots of functions already receive as a parameter a callchain_cursor struct pointer, make that be the case for some more function so that we can start phasing out usage of yet another global variable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 3月, 2016 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-w246stf7ponfamclsai6b9zo@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 1月, 2016 1 次提交
-
-
由 Namhyung Kim 提交于
It missed to decay periods in callchains when decaying hist entries. This resulted in more than 100 percent overhead in callchains in the fractal style output. Reported-by: NArnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1451963160-17196-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 11月, 2015 1 次提交
-
-
由 Jiri Olsa 提交于
Adding missing parent_val callchain_node initialization. It's causing segfault in perf top: $ sudo perf top -g perf: Segmentation fault -------- backtrace -------- free_callchain_node(+0x29) in perf [0x4a4b3e] free_callchain(+0x29) in perf [0x4a5a83] hist_entry__delete(+0x126) in perf [0x4c6649] hists__delete_entry(+0x6e) in perf [0x4c66dc] hists__decay_entries(+0x7d) in perf [0x4c6776] perf_top__sort_new_samples(+0x7c) in perf [0x436a78] hist_browser__run(+0xf2) in perf [0x507760] perf_evsel__hists_browse(+0x1da) in perf [0x507c8d] perf_evlist__tui_browse_hists(+0x3e) in perf [0x5088cf] display_thread_tui(+0x7f) in perf [0x437953] start_thread(+0xc5) in libpthread-2.21.so [0x7f7068fbb555] __clone(+0x6d) in libc-2.21.so [0x7f7066fc3b9d] [0x0] Reported-and-Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NNamhyung Kim <namhyung@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 4b3a3212 ("perf hists browser: Support flat callchains") Link: http://lkml.kernel.org/r/20151121102355.GA17313@krava.localSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 11月, 2015 5 次提交
-
-
由 Namhyung Kim 提交于
The flat callchain mode is to print all chains in a single, simple hierarchy so make it easy to see. Currently perf report --tui doesn't show flat callchains properly. With flat callchains, only leaf nodes are added to the final rbtree so it should show entries in parent nodes. To do that, add parent_val list to struct callchain_node and show them along with the (normal) val list. For example, consider following callchains with '-g graph'. $ perf report -g graph - 39.93% swapper [kernel.vmlinux] [k] intel_idle intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle - cpu_startup_entry 28.63% start_secondary - 11.30% rest_init start_kernel x86_64_start_reservations x86_64_start_kernel Before: $ perf report -g flat - 39.93% swapper [kernel.vmlinux] [k] intel_idle 28.63% start_secondary - 11.30% rest_init start_kernel x86_64_start_reservations x86_64_start_kernel After: $ perf report -g flat - 39.93% swapper [kernel.vmlinux] [k] intel_idle - 28.63% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_secondary - 11.30% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_kernel x86_64_start_reservations x86_64_start_kernel Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Tested-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-8-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Now -g/--call-graph option supports how to display callchain values. Possible values are 'percent', 'period' and 'count'. The percent is same as before and it's the default behavior. The period displays the raw period value rather than the percentage. The count displays the number of occurrences. $ perf report --no-children --stdio -g percent ... 39.93% swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--28.63%-- start_secondary | --11.30%-- rest_init $ perf report --no-children --show-total-period --stdio -g period ... 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--9334403-- start_secondary | --3684302-- rest_init $ perf report --no-children --show-nr-samples --stdio -g count ... 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--57-- start_secondary | --23-- rest_init Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
It's to track the count of occurrences of the callchains. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
This is a preparation to support for printing other type of callchain value like count or period. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-4-git-send-email-namhyung@kernel.org [ renamed new _sprintf_ operation to _scnprintf_ ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Add new call chain option (-g) 'folded' to print callchains in a line. The callchains are separated by semicolons, and preceded by (absolute) percent values and a space. For example, the following 20 lines can be printed in 3 lines with the folded output mode: $ perf report -g flat --no-children | grep -v ^# | head -20 60.48% swapper [kernel.vmlinux] [k] intel_idle 54.60% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry start_secondary 5.88% intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry rest_init start_kernel x86_64_start_reservations x86_64_start_kernel $ perf report -g folded --no-children | grep -v ^# | head -3 60.48% swapper [kernel.vmlinux] [k] intel_idle 54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary 5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel This mode is supported only for --stdio now and intended to be used by some scripts like in FlameGraphs[1]. Support for other UI might be added later. [1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.htmlRequested-and-Tested-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 10月, 2015 4 次提交
-
-
由 Namhyung Kim 提交于
The --call-graph option is complex so we should provide better guide for users. Also change help message to be consistent with config option names. Now perf top will show help like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]> setup and enables call-graph (stack chain/backtrace): record_mode: call graph recording mode (fp|dwarf|lbr) record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>) default: 8192 (bytes) print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: fp,graph,0.5,caller,function Requested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
The caller callchain order is useful with --children option since it can show 'overview' style output, but other commands which don't use --children feature like 'perf script' or even 'perf report/top' without --children are better to keep callee order. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NBrendan Gregg <brendan.d.gregg@gmail.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445499946-29817-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
Currently 'perf top --call-graph' option is same as 'perf record'. But 'perf top' also need to receive display options in 'perf report'. To do that, change parse_callchain_report_opt() to allow record options too. Now perf top can receive display options like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]> setup and enables call-graph (stack chain/backtrace) recording: fp dwarf lbr, output_type (graph, flat, fractal, or none), min percent threshold, optional print limit, callchain order, key (function or address), add branches $ perf top --call-graph callee,graph,fp Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445495330-25416-2-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
These messages will be used by 'perf top' in the next patch. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Chandler Carruth <chandlerc@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 8月, 2015 1 次提交
-
-
由 Kan Liang 提交于
Move callchain option parse related code to util.c, to avoid dragging more object files into the python binding. Signed-off-by: NKan Liang <kan.liang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438890294-33409-1-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 8月, 2015 1 次提交
-
-
由 Kan Liang 提交于
Pass global callchain_param into parse_callchain_record_opt and perf_evsel__config_callgraph as parameter. So we can reuse these functions to parse/config local param for callchain. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438677022-34296-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 5月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
The has_children and unfolded fields don't belong to the struct map_symbol since they're used by the TUI only. Move those fields out of map_symbol since the struct is also used by other places. This will also help to compact the sizeof struct hist_entry. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1429687101-4360-11-git-send-email-namhyung@kernel.org Link: http://lkml.kernel.org/r/1430837746-5439-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 2月, 2015 1 次提交
-
-
由 Kan Liang 提交于
Currently, there are two call chain recording options, fp and dwarf. Haswell has a new feature that utilizes the existing LBR facility to record call chains. Kernel side LBR support code provides this as a third option to record call chains. This patch enables the lbr call stack support on the tooling side. LBR call stack has some limitations: - It reuses current LBR facility, so LBR call stack and branch record can not be enabled at the same time. - It is only available for user-space callchains. However, it also offers some advantages: - LBR call stack can work on user apps which don't have frame-pointers or dwarf debug info compiled. It is a good alternative when nothing else works. Tested-by: NJiri Olsa <jolsa@kernel.org> Signed-off-by: NKan Liang <kan.liang@intel.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Jacob Shin <jacob.w.shin@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masanari Iida <standby24x7@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Rodrigo Campos <rodrigo@sdfg.com.ar> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 1月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
Markus reported that "perf top -g" can leak ~300MB per second on his machine. This is partly because it missed to free callchains when hist entries are deleted. Fix it. Reported-by: NMarkus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20141230053813.GD6081@sejongSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 12月, 2014 1 次提交
-
-
由 Andi Kleen 提交于
Currently branch stacks can be only shown as edge histograms for individual branches. I never found this display particularly useful. This implements an alternative mode that creates histograms over complete branch traces, instead of individual branches, similar to how normal callgraphs are handled. This is done by putting it in front of the normal callgraph and then using the normal callgraph histogram infrastructure to unify them. This way in complex functions we can understand the control flow that lead to a particular sample, and may even see some control flow in the caller for short functions. Example (simplified, of course for such simple code this is usually not needed), please run this after the whole patchkit is in, as at this point in the patch order there is no --branch-history, that will be added in a patch after this one: tcall.c: volatile a = 10000, b = 100000, c; __attribute__((noinline)) f2() { c = a / b; } __attribute__((noinline)) f1() { f2(); f2(); } main() { int i; for (i = 0; i < 1000000; i++) f1(); } % perf record -b -g ./tsrc/tcall [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ] % perf report --no-children --branch-history ... 54.91% tcall.c:6 [.] f2 tcall | |--65.53%-- f2 tcall.c:5 | | | |--70.83%-- f1 tcall.c:11 | | f1 tcall.c:10 | | main tcall.c:18 | | main tcall.c:18 | | main tcall.c:17 | | main tcall.c:17 | | f1 tcall.c:13 | | f1 tcall.c:13 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:12 | | f1 tcall.c:12 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:11 | | | --29.17%-- f1 tcall.c:12 | f1 tcall.c:12 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:11 | f1 tcall.c:10 | main tcall.c:18 | main tcall.c:18 | main tcall.c:17 | main tcall.c:17 | f1 tcall.c:13 | f1 tcall.c:13 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:12 The default output is unchanged. This is only implemented in perf report, no change to record or anywhere else. This adds the basic code to report: - add a new "branch" option to the -g option parser to enable this mode - when the flag is set include the LBR into the callstack in machine.c. The rest of the history code is unchanged and doesn't know the difference between LBR entry and normal call entry. - detect overlaps with the callchain - remove small loop duplicates in the LBR Current limitations: - The LBR flags (mispredict etc.) are not shown in the history and LBR entries have no special marker. - It would be nice if annotate marked the LBR entries somehow (e.g. with arrows) v2: Various fixes. v3: Merge further patches into this one. Fix white space. v4: Improve manpage. Address review feedback. v5: Rename functions. Better error message without -g. Fix crash without -b. v6: Rebase v7: Rebase. Use NO_ENTRY in memset. v8: Port to latest tip. Move add_callchain_ip to separate patch. Skip initial entries in callchain. Minor cleanups. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 11月, 2014 1 次提交
-
-
由 Andi Kleen 提交于
For lbr-as-callgraph we need to see the line number in the history, because many LBR entries can be in a single function, and just showing the same function name many times is not useful. When the history code is configured to sort by address, also try to resolve the address to a file:srcline and display this in the browser. If that doesn't work still display the address. This can be also useful without LBRs for understanding which call in a large function (or in which inlined function) called something else. Contains fixes from Namhyung Kim v2: Refactor code into common function v3: Fix GTK build v4: Rebase Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-7-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 11月, 2014 1 次提交
-
-
由 Andi Kleen 提交于
Refactor the duplicated code to resolve the symbol name or the address of a symbol into a single function. Used in next patch to add common functionality. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-6-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 10月, 2014 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
So stop passing both machine and thread to several thread methods, reducing function signature length. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ckcy19dcp1jfkmdihdjcqdn1@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 10月, 2014 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
It was lost in hist.h, move it to where it belongs, callchain.h, as there are places that gets hist.h by means of evsel.h, and since evsel.h is being untangled from hist.h... Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jean Pihet <jean.pihet@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-0rg7ji1jnbm6q6gj35j37jby@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 9月, 2014 3 次提交
-
-
由 Namhyung Kim 提交于
This patch adds support for following config options to ~/.perfconfig file. [call-graph] record-mode = dwarf dump-size = 8192 print-type = fractal order = callee threshold = 0.5 print-limit = 128 sort-key = function Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-5-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
And rename record_callchain_parse() to parse_callchain_record_opt() in accordance to parse_callchain_report_opt(). Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-4-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Namhyung Kim 提交于
So that all callchain config parameters can be read/written to a single place. It's a preparation to consolidate handling of all callchain options. Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Milian Wolff <mail@milianw.de> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1411434104-5307-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 6月, 2014 1 次提交
-
-
由 Sukadev Bhattiprolu 提交于
When saving the callchain on Power, the kernel conservatively saves excess entries in the callchain. A few of these entries are needed in some cases but not others. We should use the DWARF debug information to determine when the entries are needed. Eg: the value in the link register (LR) is needed only when it holds the return address of a function. At other times it must be ignored. If the unnecessary entries are not ignored, we end up with duplicate arcs in the call-graphs. Use the DWARF debug information to determine if any callchain entries should be ignored when building call-graphs. Callgraph before the patch: 14.67% 2234 sprintft libc-2.18.so [.] __random | --- __random | |--61.12%-- __random | | | |--97.15%-- rand | | do_my_sprintf | | main | | generic_start_main.isra.0 | | __libc_start_main | | 0x0 | | | --2.85%-- do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --38.88%-- rand | |--94.01%-- rand | do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --5.99%-- do_my_sprintf main generic_start_main.isra.0 __libc_start_main 0x0 Callgraph after the patch: 14.67% 2234 sprintft libc-2.18.so [.] __random | --- __random | |--95.93%-- rand | do_my_sprintf | main | generic_start_main.isra.0 | __libc_start_main | 0x0 | --4.07%-- do_my_sprintf main generic_start_main.isra.0 __libc_start_main 0x0 TODO: For split-debug info objects like glibc, we can only determine the call-frame-address only when both .eh_frame and .debug_info sections are available. We should be able to determin the CFA even without the .eh_frame section. Fix suggested by Anton Blanchard. Thanks to valuable input on DWARF debug information from Ulrich Weigand. Reported-by: NMaynard Johnson <maynard@us.ibm.com> Tested-by: NMaynard Johnson <maynard@us.ibm.com> Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140625154903.GA29607@us.ibm.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 01 6月, 2014 2 次提交
-
-
由 Namhyung Kim 提交于
The callchain_cursor_snapshot() is for saving current status of the callchain. It'll be used to accumulate callchain information for each node. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArun Sharma <asharma@fb.com> Tested-by: NRodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-9-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
由 Namhyung Kim 提交于
The cpumode and level in struct addr_localtion was set for a sample and but updated as cumulative callchains were added. This led to have non-matching symbol and cpumode in the output. Update it accordingly based on the fact whether the map is a part of the kernel or not. This is a reverse of what thread__find_addr_map() does. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArun Sharma <asharma@fb.com> Tested-by: NRodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-7-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 05 5月, 2014 1 次提交
-
-
由 Jiri Olsa 提交于
Into util/callchain.h header where all callchain related structures should be. Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1399293219-8732-8-git-send-email-jolsa@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 22 4月, 2014 1 次提交
-
-
由 Don Zickus 提交于
This takes the parse_callchain_opt function and copies it into the callchain.c file. Now the c2c tool can use it too without duplicating. Update perf-report to use the new routine too. Signed-off-by: NDon Zickus <dzickus@redhat.com> Reviewed-by: NNamhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1396896924-129847-5-git-send-email-dzickus@redhat.com [ Adding missing braces to multiline if condition ] Signed-off-by: NJiri Olsa <jolsa@redhat.com>
-
- 16 1月, 2014 1 次提交
-
-
由 Namhyung Kim 提交于
The report__resolve_callchain() can be shared with perf top code as it doesn't really depend on the perf report code. Factor it out as sample__resolve_callchain(). The same goes to the hist_entry__append_ callchain() too. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arun Sharma <asharma@fb.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rodrigo Campos <rodrigo@sdfg.com.ar> Link: http://lkml.kernel.org/r/1389677157-30513-3-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 12月, 2013 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Reduce typing, functions use class__method convention, so unlikely to clash with other libraries. This actually was discussed in the "Link:" referenced message below. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.netSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 10月, 2013 1 次提交
-
-
由 Jiri Olsa 提交于
Splitting -g and --call-graph for record command, so we could use '-g' with no option. The '-g' option now takes NO argument and enables the configured unwind method, which is currently the frame pointers method. It will be possible to configure unwind method via config file in upcoming patches. All current '-g' arguments is overtaken by --call-graph option. Signed-off-by: NJiri Olsa <jolsa@redhat.com> Tested-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NIngo Molnar <mingo@kernel.org> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1382797536-32303-2-git-send-email-jolsa@redhat.com [ reordered -g/--call-graph on --help and expanded the man page according to comments by David Ahern and Namhyung Kim ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-