- 01 10月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
The --max_stack option was added as an optimization to reduce processing time, so people specifying --max-stack might get a increased processing time if combined with synthesized callchains, but otherwise no real harm. A warning about setting both --max_stack and the synthesized callchains max depth seems like overkill. Amend the documentation. Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/560A5155.4060105@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 9月, 2015 2 次提交
-
-
由 Kan Liang 提交于
Introduce --socket-filter option for 'perf report' to only show entries for a processor socket that match this filter. $ perf report --socket-filter 1 --stdio # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 752 of event 'cycles' # Event count (approx.): 350995599 # Processor Socket: 1 # # Overhead Command Shared Object Symbol # ........ ......... ................ ................................. # 97.02% test test [.] plusB_c 0.97% test test [.] plusA_c 0.23% swapper [kernel.vmlinux] [k] acpi_idle_do_entry 0.09% rcu_sched [kernel.vmlinux] [k] dyntick_save_progress_counter 0.01% swapper [kernel.vmlinux] [k] task_waking_fair 0.00% swapper [kernel.vmlinux] [k] run_timer_softirq Signed-off-by: NKan Liang <kan.liang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1441377946-44429-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Kan Liang 提交于
This patch enable perf report to sort by processor socket: $ perf report --stdio --sort socket,comm,dso,symbol # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 686 of event 'cycles' # Event count (approx.): 349215462 # # Overhead SOCKET Command Shared Object Symbol # ........ ...... ....... ................ ............................ # 97.05% 000 test test [.] plusB_c 0.98% 000 test test [.] plusA_c 0.93% 001 perf [kernel.vmlinux] [k] smp_call_function_single 0.19% 001 perf [kernel.vmlinux] [k] page_fault 0.19% 001 swapper [kernel.vmlinux] [k] pm_qos_request 0.16% 000 test [kernel.vmlinux] [k] add_mm_counter_fast Signed-off-by: NKan Liang <kan.liang@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1441377946-44429-2-git-send-email-kan.liang@intel.com [ Fix col calc, un-allcapsify col header & read the topology when not using perf.data ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 8月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
perf script, report and inject all have the same itrace options. Put them into an asciidoc include file. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-10-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 8月, 2015 1 次提交
-
-
由 Kan Liang 提交于
Introduce --show-ref-call-graph for perf report to print reference callgraph for no callgraph event. Here is an example. perf report --show-ref-call-graph --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 5 of event 'cpu/cpu-cycles,call-graph=fp/' # Event count (approx.): 144985 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 72.30% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--22.62%-- __GI___libc_nanosleep --77.38%-- [...] ...... # Samples: 6 of event 'cpu/instructions,call-graph=no/', show reference callgraph # Event count (approx.): 172780 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 73.16% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--31.44%-- __GI___libc_nanosleep --68.56%-- [...] Signed-off-by: NKan Liang <kan.liang@intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 8月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
In some cases it's useful to characterize samples by file. This is useful to get a higher level categorization, for example to map cost to subsystems. Add a srcfile sort key to perf report. It builds on top of the existing srcline support. Commiter notes: E.g.: # perf record -F 10000 usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ] [root@zoo ~]# perf report -s srcfile --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File # ........ ........... 60.99% . 20.62% paravirt.h 14.23% rmap.c 4.04% signal.c 0.11% msr.h # The first line is collecting all the files for which srcfiles couldn't somehow get resolved to: # perf report -s srcfile,dso --stdio # Total Lost Samples: 0 # # Samples: 13 of event 'cycles' # Event count (approx.): 869878 # # Overhead Source File Shared Object # ........ ........... ................ 40.97% . ld-2.20.so 20.62% paravirt.h [kernel.vmlinux] 20.02% . libc-2.20.so 14.23% rmap.c [kernel.vmlinux] 4.04% signal.c [kernel.vmlinux] 0.11% msr.h [kernel.vmlinux] # XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't seen this on Fedora 22. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Tested-by: NArnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org [ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 8月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
For perf report/script srcline currently only the base file name of the source file is printed. This is a good default because it usually fits on the screen. But in some cases we want to know the full file name, for example to aggregate hits per file. In the later case we need more than the base file name to resolve file naming collisions: for example the kernel source has ~70 files named "core.c" It's also useful as input to post processing tools which want to point to the right file. Add a flag to allow full file name output. Add an option to perf report/script to enable this option. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1438986245-15191-1-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 8月, 2015 1 次提交
-
-
由 Andi Kleen 提交于
cycles is a new branch_info field available on some CPUs that indicates the time deltas between branches in the LBR. Add a sort key and output code for the cycles to allow to display the basic block cycles individually in perf report. We also pass in the cycles for weight when LBRs are processed, which allows to get global and local weight, to get an estimate of the total cost. And also print the cycles information for perf report -D. I also added printing for the previously missing LBR flags (mispredict etc.) Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 5月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
The 'perf record -s' and 'perf report -T' should be used together to see per-thread event counts. Document the relation of these commands. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1431184784-30525-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 5月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
Add AUX area tracing option 'x' to synthesize events for transactions. This will be used by Intel PT to synthesize an event record for each TSX start, commit or abort. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-6-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 05 5月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
Unwittingly the itrace options for perf report ended up below the Overhead Calculation section. Move it back with the other options. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1430404667-10593-2-git-send-email-adrian.hunter@intel.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 5月, 2015 1 次提交
-
-
由 Adrian Hunter 提交于
Add support for decoding an AUX area assuming it contains instruction tracing data. Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1429903807-20559-4-git-send-email-adrian.hunter@intel.com [ Do not use -Z as an alternative to --itrace ] [ Fixed initialization of itrace_synth_opts struct fields on older gcc versions ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 29 4月, 2015 1 次提交
-
-
由 Namhyung Kim 提交于
As the --children option changes the output of perf report (and perf top) it sometimes confuses users. Add more words and examples to help understanding of the option's behavior - and how to disable it ;-). Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Reviewed-by: NIngo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1429684425-14987-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 3月, 2015 1 次提交
-
-
由 David Ahern 提交于
The 'record' and 'top' tools already allow a user to specify a CSV of pids and/or tids of tasks to collect data. Add those options to the 'report' and 'script' analysis commands to only consider samples related to the given pids/tids. This is also inline with the existing comm option. Signed-off-by: NDavid Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1427212361-7066-1-git-send-email-dsahern@gmail.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 12月, 2014 2 次提交
-
-
由 Andi Kleen 提交于
Add a --branch-history option to perf report that changes all the settings necessary for using the branches in callstacks. This is just a short cut to make this nicer to use, it does not enable any functionality by itself. v2: Change sort order. Rename option to --branch-history to be less confusing. v3: Updates v4: Fix conflict with newer perf base v5: Port to latest tip v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove unnecessary branch_mode setting. Use a boolean. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-5-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Andi Kleen 提交于
Currently branch stacks can be only shown as edge histograms for individual branches. I never found this display particularly useful. This implements an alternative mode that creates histograms over complete branch traces, instead of individual branches, similar to how normal callgraphs are handled. This is done by putting it in front of the normal callgraph and then using the normal callgraph histogram infrastructure to unify them. This way in complex functions we can understand the control flow that lead to a particular sample, and may even see some control flow in the caller for short functions. Example (simplified, of course for such simple code this is usually not needed), please run this after the whole patchkit is in, as at this point in the patch order there is no --branch-history, that will be added in a patch after this one: tcall.c: volatile a = 10000, b = 100000, c; __attribute__((noinline)) f2() { c = a / b; } __attribute__((noinline)) f1() { f2(); f2(); } main() { int i; for (i = 0; i < 1000000; i++) f1(); } % perf record -b -g ./tsrc/tcall [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ] % perf report --no-children --branch-history ... 54.91% tcall.c:6 [.] f2 tcall | |--65.53%-- f2 tcall.c:5 | | | |--70.83%-- f1 tcall.c:11 | | f1 tcall.c:10 | | main tcall.c:18 | | main tcall.c:18 | | main tcall.c:17 | | main tcall.c:17 | | f1 tcall.c:13 | | f1 tcall.c:13 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:12 | | f1 tcall.c:12 | | f2 tcall.c:7 | | f2 tcall.c:5 | | f1 tcall.c:11 | | | --29.17%-- f1 tcall.c:12 | f1 tcall.c:12 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:11 | f1 tcall.c:10 | main tcall.c:18 | main tcall.c:18 | main tcall.c:17 | main tcall.c:17 | f1 tcall.c:13 | f1 tcall.c:13 | f2 tcall.c:7 | f2 tcall.c:5 | f1 tcall.c:12 The default output is unchanged. This is only implemented in perf report, no change to record or anywhere else. This adds the basic code to report: - add a new "branch" option to the -g option parser to enable this mode - when the flag is set include the LBR into the callstack in machine.c. The rest of the history code is unchanged and doesn't know the difference between LBR entry and normal call entry. - detect overlaps with the callchain - remove small loop duplicates in the LBR Current limitations: - The LBR flags (mispredict etc.) are not shown in the history and LBR entries have no special marker. - It would be nice if annotate marked the LBR entries somehow (e.g. with arrows) v2: Various fixes. v3: Merge further patches into this one. Fix white space. v4: Improve manpage. Address review feedback. v5: Rename functions. Better error message without -g. Fix crash without -b. v6: Rebase v7: Rebase. Use NO_ENTRY in memset. v8: Port to latest tip. Move add_callchain_ip to separate patch. Skip initial entries in callchain. Minor cleanups. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 9月, 2014 1 次提交
-
-
由 Avi Kivity 提交于
Some Linux symbols (for example __vt_event_wait) are interpreted by the demangler as C++ mangled names, which of course they aren't. Disable kernel symbol demangling by default to avoid this, and allow enabling it with a new option --demangle-kernel for those who wish it. Reported-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NAvi Kivity <avi@cloudius-systems.com> Acked-by: NJiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1410581705-26968-1-git-send-email-avi@cloudius-systems.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 8月, 2014 1 次提交
-
-
由 Namhyung Kim 提交于
Add -w/--column-widths option like perf report does so that users are able to see symbols even with some very long C++ library/functions. It can be a list separated by comma for each column. $ perf top -w 0,20,30 The value of 0 means there's no limit. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1406785662-5534-6-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 6月, 2014 2 次提交
-
-
由 Don Zickus 提交于
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a particular sample instruction. A bunch of those details relate to the data address. One interesting thing you can do with data addresses is to convert them into a unique cacheline they belong too. Organizing these data cachelines into similar groups and sorting them can reveal cache contention. This patch creates an alogorithm based on various sample details that can help group entries together into data cachelines and allows 'perf report' to sort on it. The algorithm relies on having proper mmap2 support in the kernel to help determine if the memory map the data address belongs to is private to a pid or globally shared. The alogortithm is as follows: o group cpumodes together o group entries with discovered maps together o sort on major, minor, inode and inode generation numbers o if userspace anon, then sort on pid o sort on cachelines based on data addresses The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'. Sample output: # # Samples: 206 of event 'cpu/mem-loads/pp' # Total weight : 2534 # Sort order : dcacheline,pid # # Overhead Samples Data Cacheline Command: Pid # ........ ............ ...................................................................... .................. # 13.22% 1 [k] 0xffff88042f08ebc0 swapper: 0 9.27% 1 [k] 0xffff88082e8cea80 swapper: 0 3.59% 2 [k] 0xffffffff819ba180 swapper: 0 0.32% 1 [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0 swapper: 0 0.32% 1 [k] timekeeper_seq+0xfffffffffffffff8 swapper: 0 Note: Added a '+1' to symlen size in hists__calc_col_len to prevent the next column from prematurely tabbing over and mis-aligning. Not sure what the problem is. Signed-off-by: NDon Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
由 Don Zickus 提交于
Add mem-mode sorting types and mem-mode itself to perf-report documentation. Signed-off-by: NDon Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/r/1400526833-141779-5-git-send-email-dzickus@redhat.comSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 01 6月, 2014 1 次提交
-
-
由 Namhyung Kim 提交于
The --children option is for showing accumulated overhead (period) value as well as self overhead. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Tested-by: NArun Sharma <asharma@fb.com> Tested-by: NRodrigo Campos <rodrigo@sdfg.com.ar> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1401335910-16832-16-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 21 5月, 2014 2 次提交
-
-
由 Namhyung Kim 提交于
The -F/--fields option is to allow user setup output field in any order. It can receive any sort keys and following (hpp) fields: overhead, overhead_sys, overhead_us, sample and period If guest profiling is enabled, overhead_guest_{sys,us} will be available too. The output fields also affect sort order unless you give -s/--sort option. And any keys specified on -s option, will also be added to the output field list automatically. $ perf report -F sym,sample,overhead ... # Symbol Samples Overhead # .......................... ............ ........ # [.] __cxa_atexit 2 2.50% [.] __libc_csu_init 4 5.00% [.] __new_exitfn 3 3.75% [.] _dl_check_map_versions 1 1.25% [.] _dl_name_match_p 4 5.00% [.] _dl_setup_hash 1 1.25% [.] _dl_sysdep_start 1 1.25% [.] _init 5 6.25% [.] _setjmp 6 7.50% [.] a 8 10.00% [.] b 8 10.00% [.] brk 1 1.25% [.] c 8 10.00% Note that, the example output above is captured after applying next patch which fixes sort/comparing behavior. Requested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NIngo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1400480762-22852-12-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
由 Namhyung Kim 提交于
Add overhead{,_sys,_us,_guest_sys,_guest_us}, sample and period sort keys so that they can be selected with --sort/-s option. $ perf report -s period,comm --stdio ... # Overhead Period Command # ........ ............ ............... # 47.06% 152 swapper 13.93% 45 qemu-system-arm 12.38% 40 synergys 3.72% 12 firefox 2.48% 8 xchat Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NIngo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1400480762-22852-9-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@kernel.org>
-
- 16 4月, 2014 1 次提交
-
-
由 Namhyung Kim 提交于
The --percentage option is for controlling overhead percentage displayed. It can only receive either of "relative" or "absolute". "relative" means it's relative to filtered entries only so that the sum of shown entries will be always 100%. "absolute" means it retains the original value before and after the filter is applied. $ perf report -s comm # Overhead Command # ........ ............ # 74.19% cc1 7.61% gcc 6.11% as 4.35% sh 4.14% make 1.13% fixdep ... $ perf report -s comm -c cc1,gcc --percentage absolute # Overhead Command # ........ ............ # 74.19% cc1 7.61% gcc $ perf report -s comm -c cc1,gcc --percentage relative # Overhead Command # ........ ............ # 90.69% cc1 9.31% gcc Note that it has zero effect if no filter was applied. Suggested-by: NArnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.orgSigned-off-by: NJiri Olsa <jolsa@redhat.com>
-
- 11 12月, 2013 1 次提交
-
-
由 Jiri Olsa 提交于
Currently the perf.data header is always displayed for stdio output, which is no always useful. Disabling header information by default and adding following options to control header output: --header - display header information (old default) --header-only - display header information only w/o further processing, forces stdio output Signed-off-by: NJiri Olsa <jolsa@redhat.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1386583370-1699-2-git-send-email-jolsa@redhat.com [ Added single line explaining talking about the new --header* options, to address David Ahern comment; better man page entry for the new options, from Namhyung Kim ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 10月, 2013 1 次提交
-
-
由 Waiman Long 提交于
When callgraph data was included in the perf data file, it may take a long time to scan all those data and merge them together especially if the stored callchains are long and the perf data file itself is large, like a Gbyte or so. The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127). This is a large value. Usually the callgraph data that developers are most interested in are the first few levels, the rests are usually not looked at. This patch adds a new --max-stack option to perf-report to limit the depth of callchain stack data to look at to reduce the time it takes for perf-report to finish its processing. It trades the presence of trailing stack information with faster speed. The following table shows the elapsed time of doing perf-report on a perf.data file of size 985,531,828 bytes. --max_stack Elapsed Time Output data size ----------- ------------ ---------------- not set 88.0s 124,422,651 64 87.5s 116,303,213 32 87.2s 112,023,804 16 86.6s 94,326,380 8 59.9s 33,697,248 4 40.7s 10,116,637 -g none 27.1s 2,555,810 Signed-off-by: NWaiman Long <Waiman.Long@hp.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Scott J Norton <scott.norton@hp.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 10月, 2013 2 次提交
-
-
由 Andi Kleen 提交于
Add support for recording and displaying the transaction flags. They are essentially a new sort key. Also display them in a nice way to the user. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andi Kleen 提交于
Extend the perf branch sorting code to support sorting by in_tx or abort_tx qualifiers. Also print out those qualifiers. This also fixes up some of the existing sort key documentation. We do not support no_tx here, because it's simply not showing the in_tx flag. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NJiri Olsa <jolsa@redhat.com> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 7月, 2013 1 次提交
-
-
由 Andi Kleen 提交于
With programs with very large functions it can be useful to distinguish the callgraph nodes on more than just function names. So for example if you have multiple calls to the same function, it ends up being separate nodes in the chain. This patch adds a new key field to the callgraph options, that allows comparing nodes on functions (as today, default) and addresses. Longer term it would be nice to also handle src lines, but that would need more changes and address is a reasonable proxy for it today. I right now reference the global params, as there was no simple way to register a params pointer. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-0uskktybf0e7wrnoi5e9b9it@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 13 7月, 2013 1 次提交
-
-
由 Greg Price 提交于
For example, in an application with an expensive function implemented with deeply nested recursive calls, the default call-graph presentation is dominated by the different callchains within that function. By ignoring these callees, we can collect the callchains leading into the function and compactly identify what to blame for expensive calls. For example, in this report the callers of garbage_collect() are scattered across the tree: $ perf report -d ruby 2>- | grep -m10 ^[^#]*[a-z] 22.03% ruby [.] gc_mark --- gc_mark |--59.40%-- mark_keyvalue | st_foreach | gc_mark_children | |--99.75%-- rb_gc_mark | | rb_vm_mark | | gc_mark_children | | gc_marks | | |--99.00%-- garbage_collect If we ignore the callees of garbage_collect(), its callers are coalesced: $ perf report --ignore-callees garbage_collect -d ruby 2>- | grep -m10 ^[^#]*[a-z] 72.92% ruby [.] garbage_collect --- garbage_collect vm_xmalloc |--47.08%-- ruby_xmalloc | st_insert2 | rb_hash_aset | |--98.45%-- features_index_add | | rb_provide_feature | | rb_require_safe | | vm_call_method Signed-off-by: NGreg Price <price@mit.edu> Tested-by: NJiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu Cc: Fengguang Wu <fengguang.wu@intel.com> [ remove spaces at beginning of line, reported by Fengguang Wu ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 5月, 2013 1 次提交
-
-
由 Namhyung Kim 提交于
The --percent-limit option is for not showing small overhead entries in the output. Maybe we want to set a certain default value like 0.1. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NPekka Enberg <penberg@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1368497347-9628-7-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 4月, 2013 1 次提交
-
-
由 Andi Kleen 提交于
perf record has a new option -W that enables weightened sampling. Add sorting support in top/report for the average weight per sample and the total weight sum. This allows to both compare relative cost per event and the total cost over the measurement period. Add the necessary glue to perf report, record and the library. v2: Merge with new hist refactoring. v3: Fix manpage. Remove value check. Rename global_weight to weight and weight to local_weight. v4: Readd sort keys to manpage v5: Move weight to end v6: Move weight to template v7: Rename weight key. Original patch from Andi modified by Stephane Eranian <eranian@google.com> to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com [ committer note: changed to cope with fc5871ed and the hists_link perf test entry ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 27 3月, 2013 1 次提交
-
-
由 Namhyung Kim 提交于
It's sometimes useful to see undemangled raw symbol name for example other tools using the perf output to do manipulation of binaries. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Suggested-by: NWilliam Cohen <wcohen@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: William Cohen <wcohen@redhat.com> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=55571 Link: http://lkml.kernel.org/r/1364203098-17741-1-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 2月, 2013 1 次提交
-
-
由 Namhyung Kim 提交于
Add --group option to enable event grouping. When enabled, all the group members information will be shown together with the leader. $ perf report --group ... # group: {ref-cycles,cycles} # ======== # # Samples: 7K of event 'anon group { ref-cycles, cycles }' # Event count (approx.): 6876107743 # # Overhead Command Shared Object Symbol # ................ ....... ................. .......................... # 99.84% 99.76% noploop noploop [.] main 0.07% 0.00% noploop ld-2.15.so [.] strcmp 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del 0.03% 0.03% noploop [kernel.kallsyms] [k] sched_clock_cpu 0.02% 0.00% noploop [kernel.kallsyms] [k] account_user_time 0.01% 0.00% noploop [kernel.kallsyms] [k] __alloc_pages_nodemask 0.00% 0.00% noploop [kernel.kallsyms] [k] native_write_msr_safe 0.00% 0.11% noploop [kernel.kallsyms] [k] _raw_spin_lock 0.00% 0.06% noploop [kernel.kallsyms] [k] find_get_page 0.00% 0.02% noploop [kernel.kallsyms] [k] rcu_check_callbacks 0.00% 0.02% noploop [kernel.kallsyms] [k] __current_kernel_time Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1358845787-1350-18-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 1月, 2013 1 次提交
-
-
由 Namhyung Kim 提交于
Add description of sort keys to the perf-report document and also add missing cpu and srcline keys to the command line help string. Signed-off-by: NNamhyung Kim <namhyung@kernel.org> Acked-by: NJiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1356599507-14226-11-git-send-email-namhyung@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 06 9月, 2012 1 次提交
-
-
由 Maciek Borzecki 提交于
When analyzing perf data from hosts of other architecture than one of the local host it's useful to call objdump that is part of a toolchain for that architecture. Instead of calling regular objdump, call one that user specified in command line. Signed-off-by: NMaciek Borzecki <maciek.borzecki@gmail.com> Acked-by: NDavid Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1346754750.16299.3.camel@localhost.localdomainSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 6月, 2012 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Using addr2line for now, requires debuginfo, needs more work to support detached debuginfo, aka foo-debuginfo packages. Example: [root@sandy ~]# perf record -a sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.555 MB perf.data (~24236 samples) ] [root@sandy ~]# perf report -s dso,srcline 2>&1 | grep -v ^# | head -5 22.41% [kernel.kallsyms] /home/git/linux/drivers/idle/intel_idle.c:280 4.79% [kernel.kallsyms] /home/git/linux/drivers/cpuidle/cpuidle.c:148 4.78% [kernel.kallsyms] /home/git/linux/arch/x86/include/asm/atomic64_64.h:121 4.49% [kernel.kallsyms] /home/git/linux/kernel/sched/core.c:1690 4.30% [kernel.kallsyms] /home/git/linux/include/linux/seqlock.h:90 [root@sandy ~]# [root@sandy ~]# perf top -U -s dso,symbol,srcline Samples: 1K of event 'cycles', Event count (approx.): 589617389 18.66% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:143 7.83% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:39 6.59% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:38 3.66% [kernel] [k] page_fault /home/git/linux/arch/x86/kernel/entry_64.S:1379 3.25% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:40 3.12% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:37 2.74% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:36 2.39% [kernel] [k] clear_page /home/git/linux/arch/x86/lib/clear_page_64.S:43 2.12% [kernel] [k] ioread32 /home/git/linux/lib/iomap.c:90 1.51% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:144 1.19% [kernel] [k] copy_user_generic_unrolled /home/git/linux/arch/x86/lib/copy_user_64.S:154 Suggested-by: NAndi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pdmqbng9twz06jzkbgtuwbp8@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 3月, 2012 1 次提交
-
-
由 Pekka Enberg 提交于
This patch adds a simple GTK2-based browser to 'perf report' that's based on the TTY-based browser in builtin-report.c. To launch "perf report" using the new GTK interface just type: $ perf report --gtk The interface is somewhat limited in features at the moment: - No callgraph support - No KVM guest profiling support - No color coding for percentages - No sorting from the UI - ..and many, many more! That said, I think this patch a reasonable start to build future features on. Signed-off-by: NPekka Enberg <penberg@kernel.org> Cc: Colin Walters <walters@verbum.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202231952410.6689@tux.localdomain [ committer note: Added #pragma to make gtk no strict prototype problem go away as suggested by Colin Walters modulo avoiding push/pop ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 19 3月, 2012 1 次提交
-
-
由 Namhyung Kim 提交于
Add missing description of --symbol-filter in Documentation/perf-report.txt. Reported-by: NArnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: NNamhyung Kim <namhyung.kim@lge.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1332125628-23088-1-git-send-email-namhyung.kim@lge.comSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 09 3月, 2012 1 次提交
-
-
由 Stephane Eranian 提交于
This patch enhances perf report to auto-detect when the perf.data file contains samples with branch stacks. That way it is not necessary to use the -b option. To force branch view mode to off, simply use --no-branch-stack. Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: asharma@fb.com Cc: ravitillo@lbl.gov Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1331246868-19905-4-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-