- 09 3月, 2012 1 次提交
-
-
由 Roberto Agostino Vitillo 提交于
This patch adds: - ability to parse samples with PERF_SAMPLE_BRANCH_STACK - sort on branches (dso_from, symbol_from, dso_to, symbol_to, mispredict) - build histograms on branches Signed-off-by: NRoberto Agostino Vitillo <ravitillo@lbl.gov> Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: acme@redhat.com Cc: robert.richter@amd.com Cc: ming.m.lin@intel.com Cc: andi@firstfloor.org Cc: asharma@fb.com Cc: vweaver1@eecs.utk.edu Cc: khandual@linux.vnet.ibm.com Cc: dsahern@gmail.com Link: http://lkml.kernel.org/r/1328826068-11713-12-git-send-email-eranian@google.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 1月, 2012 1 次提交
-
-
由 Akihiro Nagai 提交于
Add the offset field specifier 'symoff' to show the offset from the symbols in the output of perf-script. We can get the more detailed address information. Output sample: ffffffff81467612 irq_return+0x0 => 301ec016b0 _start+0x0 ffffffff81467612 irq_return+0x0 => 301ec016b0 _start+0x0 301ec016b3 _start+0x3 => 301ec04b70 _dl_start+0x0 ffffffff81467612 irq_return+0x0 => 301ec04b70 _dl_start+0x0 ffffffff81467612 irq_return+0x0 => 301ec04b96 _dl_start+0x26 ffffffff81467612 irq_return+0x0 => 301ec04b9d _dl_start+0x2d 301ec04beb _dl_start+0x7b => 301ec04c0d _dl_start+0x9d 301ec04c11 _dl_start+0xa1 => 301ec04bf0 _dl_start+0x80 [snip] Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: yrl.pp-manager.tt@hitachi.com Link: http://lkml.kernel.org/r/20120130044314.2384.67094.stgit@linux3Signed-off-by: NAkihiro Nagai <akihiro.nagai.hw@hitachi.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 12月, 2011 1 次提交
-
-
由 Robert Richter 提交于
If filename is NULL there is an out-of-bound access to struct perf_session if it would be used with perf_session__open(). Shouldn't actually happen in current implementation as filename is always !NULL. Fixing this by always null-terminating filename. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1323248577-11268-3-git-send-email-robert.richter@amd.comSigned-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 12月, 2011 1 次提交
-
-
由 Andrew Vagin 提交于
It's the counterpart of perf_session__parse_sample. v2: fixed mistakes found by David Ahern. v3: s/data/sample/ s/perf_event__change_sample/perf_event__synthesize_sample Reviewed-by: NDavid Ahern <dsahern@gmail.com> Cc: Arun Sharma <asharma@fb.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: devel@openvz.org Link: http://lkml.kernel.org/r/1323266161-394927-3-git-send-email-avagin@openvz.orgSigned-off-by: NAndrew Vagin <avagin@openvz.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 11月, 2011 6 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To better reflect that it became the base class for all tools, that must be in each tool struct and where common stuff will be put. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Reducing the exposure of perf_session further, so that we can use the classes in cases where no perf.data file is created. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
So that we don't need to have that many globals. Next steps will remove the 'session' pointer, that in most cases is not needed. Then we can rename perf_event_ops to 'perf_tool' that better describes this class hierarchy. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Since we have it in evsel->hists.callchain_cursor, remove it from perf_session. One more step in disentangling several places from requiring a perf_session pointer. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-rxr5dj3di7ckyfmnz0naku1z@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Removing another case where a perf_session is required when processing events. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ug1wtjbnva4bxwknflkkrlrh@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
The 'machine' abstraction was introduced with 'perf kvm' where we could have samples for the host and multiple guests, but at the time we ended up keeping the list of all machines threads all in session->host_machine. Move the threads rb_tree to struct machine to separate the namespaces. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-mdg7sm6j3va09vtgj49gbsrp@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 02 11月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
So that for large perf.data files the user can have visual feedback that activity is being performed. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-3ysn01mpspfrbsy56gznzqqz@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 10月, 2011 1 次提交
-
-
由 Stephane Eranian 提交于
The goal of this patch is to include more information about the host environment into the perf.data so it is more self-descriptive. Overtime, profiles are captured on various machines and it becomes hard to track what was recorded, on what machine and when. This patch provides a way to solve this by extending the perf.data file with basic information about the host machine. To add those extensions, we leverage the feature bits capabilities of the perf.data format. The change is backward compatible with existing perf.data files. We define the following useful new extensions: - HEADER_HOSTNAME: the hostname - HEADER_OSRELEASE: the kernel release number - HEADER_ARCH: the hw architecture - HEADER_CPUDESC: generic CPU description - HEADER_NRCPUS: number of online/avail cpus - HEADER_CMDLINE: perf command line - HEADER_VERSION: perf version - HEADER_TOPOLOGY: cpu topology - HEADER_EVENT_DESC: full event description (attrs) - HEADER_CPUID: easy-to-parse low level CPU identication The small granularity for the entries is to make it easier to extend without breaking backward compatiblity. Many entries are provided as ASCII strings. Perf report/script have been modified to print the basic information as easy-to-parse ASCII strings. Extended information about CPU and NUMA topology may be requested with the -I option. Thanks to David Ahern for reviewing and testing the many versions of this patch. $ perf report --stdio # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # ======== # ... $ perf report --stdio -I # ======== # captured on : Mon Sep 26 15:22:14 2011 # hostname : quad # os release : 3.1.0-rc4-tip # perf version : 3.1.0-rc4 # arch : x86_64 # nrcpus online : 4 # nrcpus avail : 4 # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz # cpuid : GenuineIntel,6,15,11 # total memory : 8105360 kB # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31, # sibling cores : 0-3 # sibling threads : 0 # sibling threads : 1 # sibling threads : 2 # sibling threads : 3 # node0 meminfo : total = 8320608 kB, free = 7571024 kB # node0 cpu list : 0-3 # ======== # ... Reviewed-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NDavid Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20110930134040.GA5575@quadSigned-off-by: NStephane Eranian <eranian@google.com> [ committer notes: Use --show-info in the tools as was in the docs, rename perf_header_fprintf_info to perf_file_section__fprintf_info, fixup conflict with f69b64f7 "perf: Support setting the disassembler style" ] Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 9月, 2011 1 次提交
-
-
由 David Ahern 提交于
Currently, analyzing PPC data files on x86 the cpu field is always 0 and the tid and pid are backwards. For example, analyzing a PPC file on PPC the pid/tid fields show: rsyslogd 1210/1212 and analyzing the same PPC file using an x86 perf binary shows: rsyslogd 1212/1210 The problem is that the swap_op method for samples is perf_event__all64_swap which assumes all elements in the sample_data struct are u64s. cpu, tid and pid are u32s and need to be handled individually. Given that the swap is done before the sample is parsed, the simplest solution is to undo the 64-bit swap of those elements when the sample is parsed and do the proper swap. The RAW data field is generic and perf cannot have programmatic knowledge of how to treat that data. Instead a warning is given to the user. Thanks to Anton Blanchard for providing a data file for a mult-CPU PPC system so I could verify the fix for the CPU fields. v3 -> v4: - fixed use of WARN_ONCE v2 -> v3: - used WARN_ONCE for message regarding raw data - removed struct wrapper around union - fixed whitespace issues v1 -> v2: - added a union for undoing the byte-swap on u64 and redoing swap on u32's to address compiler errors (see git commit 65014ab3) Cc: Anton Blanchard <anton@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1315321946-16993-1-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 7月, 2011 1 次提交
-
-
由 David Ahern 提交于
The perf_event_attr struct has two __u32's at the top and they need to be swapped individually. With this change I was able to analyze a perf.data collected in a 32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit binaries for the Intel analysis side; both read the PPC perf.data file correctly. -v2: - changed the existing perf_event__attr_swap() to swap only elements of perf_event_attr and exported it for use in swapping the attributes in the file header - updated swap_ops used for processing events Signed-off-by: NDavid Ahern <dsahern@gmail.com> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: acme@ghostprotocols.net Cc: peterz@infradead.org Cc: paulus@samba.org Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 7月, 2011 1 次提交
-
-
由 Anton Blanchard 提交于
Add an option to perf report/annotate/script to specify which CPUs to operate on. This enables us to take a single system wide profile and analyse each CPU (or group of CPUs) in isolation. This was useful when profiling a multiprocess workload where the bottleneck was on one CPU but this was hidden in the overall profile. Per process and per thread breakdowns didn't help because multiple processes were running on each CPU and no single process consumed an entire CPU. The patch converts the list of CPUs returned by cpu_map__new into a bitmap for fast lookup. I wanted to use -C to be consistent with perf top/record/stat, but unfortunately perf report already uses -C <comms>. v2: Incorporate suggestions from David Ahern: - Added -c to perf script - Check that SAMPLE_CPU is set when -c is used - Update documentation v3: Create perf_session__cpu_bitmap() Signed-off-by: NAnton Blanchard <anton@samba.org> Acked-by: NDavid Ahern <dsahern@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/20110704215750.11647eb9@krytenSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 03 6月, 2011 2 次提交
-
-
由 David Ahern 提交于
The 'sym' option displays both the function name and the DSO it comes from. Split the display of the dso into a separate option. This allows display of the ip address and symbol without the dso, thus shortening line lengths - and decluttering the output a bit. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 David Ahern 提交于
Currently the "sym" output field is used to dump instruction pointers and callchain stack. Sample addresses can also be converted to symbols, so the meaning of "sym" needs to be fixed. This patch adds an "ip" option and if it is selected the user can also opt to dump symbols for them. If the user opts to dump IP without syms only the address is shown. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.comSigned-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 5月, 2011 1 次提交
-
-
由 Frederic Weisbecker 提交于
Check that the total size of the sample fields having a fixed size do not exceed the one of the whole event. This robustifies the sample parsing. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com>
-
- 20 4月, 2011 1 次提交
-
-
由 David Ahern 提交于
Check for required sample attributes using evsel rather than sample_type in the session header. If the attribute for a default field is not present for the event type (e.g., new command operating on file from older kernel) the field is removed from the output list. Expected event types must exist. For example, if a user specifies -f trace:time,trace -f sw:time,cpu,sym the perf.data file must contain both tracepoints and software events (ie., it is an error if either does not exist in the file). Attribute checking is done once at the beginning of perf-script rather than for each sample. v1 -> v2: - addressed comments from acme Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.comSigned-off-by: NDavid Ahern <daahern@cisco.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 3月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Resolving the sample->id to an evsel since the most advanced tools, report and annotate, and the others will too when they evolve to properly support multi-event perf.data files. Good also because it does an extra validation, checking that the ID is valid when present. When that is not the case, the overhead is just a branch + function call (perf_evlist__id2evsel). Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 3月, 2011 1 次提交
-
-
由 David Ahern 提交于
Add option to dump symbols found in events. e.g., perf script -f comm,pid,tid,time,trace,sym swapper 0/0 537.037184: prev_comm=swapper prev_pid=0 prev_prio=120... ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms]) ffffffff81382ac5 schedule ([kernel.kallsyms]) ffffffff8100134a cpu_idle ([kernel.kallsyms]) ffffffff81370b39 rest_init ([kernel.kallsyms]) ffffffff81696c23 start_kernel ([kernel.kallsyms].init.text) ffffffff816962af x86_64_start_reservations ([kernel.kallsyms].init.text) ffffffff816963b9 x86_64_start_kernel ([kernel.kallsyms].init.text) sshd 1675/1675 537.037309: prev_comm=sshd prev_pid=1675 prev_prio=120... ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms]) ffffffff81382ac5 schedule ([kernel.kallsyms]) ffffffff813837aa schedule_hrtimeout_range_clock ([kernel.kallsyms]) ffffffff81383886 schedule_hrtimeout_range ([kernel.kallsyms]) ffffffff8110c4f9 poll_schedule_timeout ([kernel.kallsyms]) ffffffff8110cd20 do_select ([kernel.kallsyms]) ffffffff8110ced8 core_sys_select ([kernel.kallsyms]) ffffffff8110d00d sys_select ([kernel.kallsyms]) ffffffff81002bc2 system_call ([kernel.kallsyms]) 7f1647e56e93 __GI_select (/lib64/libc-2.12.90.so) netstat 1692/1692 537.038664: prev_comm=netstat prev_pid=1692 prev_prio=... ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms]) ffffffff81382ac5 schedule ([kernel.kallsyms]) ffffffff81002c3a sysret_careful ([kernel.kallsyms]) 7f7a6cd1b210 __GI___libc_read (/lib64/libc-2.12.90.so) Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1299734608-5223-6-git-send-email-daahern@cisco.com> Signed-off-by: NDavid Ahern <daahern@cisco.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 3月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
So that we can reuse things like the id to attr lookup routine (perf_evlist__id2evsel) that uses a hash table instead of the linear lookup done in the older perf_header_attr routines, etc. Also to make evsels/evlist more pervasive an API, simplyfing using the emerging perf lib. cc: Arun Sharma <arun@sharma-home.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 07 3月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
By creating an perf_evlist out of the attributes in the perf.data file header, so that we can use evlists and evsels when reading recorded sessions in addition to when we record sessions. More work is needed to allow tools to allow the user to select which events are wanted when browsing sessions, be it just one or a subset of them, aggregated or showed at the same time but with different indications on the UI to allow seeing workloads thru different views at the same time. But the overall goal/trend is to more uniformly use evsels and evlists. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 30 1月, 2011 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
And move the event_t methods to the perf_event__ too. No code changes, just namespace consistency. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
Making the namespace more uniform. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 1月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
To avoid linking more stuff in the python binding I'm working on, future csets will make the sample type be taken from the evsel itself, but for that we need to first have one file per cpu and per sample_type, not a single perf.data file. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 23 1月, 2011 1 次提交
-
-
由 Frederic Weisbecker 提交于
The callchains are fed with an array of a fixed size. As a result we iterate over each callchains three times: - 1st to resolve symbols - 2nd to filter out context boundaries - 3rd for the insertion into the tree This also involves some pairs of memory allocation/deallocation everytime we insert a callchain, for the filtered out array of addresses and for the array of symbols that comes along. Instead, feed the callchains through a linked list with persistent allocations. It brings several pros like: - Merge the 1st and 2nd iterations in one. That was possible before but in a way that would involve allocating an array slightly taller than necessary because we don't know in advance the number of context boundaries to filter out. - Much lesser allocations/deallocations. The linked list keeps persistent empty entries for the next usages and is extendable at will. - Makes it easier for multiple sources of callchains to feed a stacktrace together. This is deemed to pave the way for cfi based callchains wherein traditional frame pointer based kernel stacktraces will precede cfi based user ones, producing an overall callchain which size is hardly predictable. This requirement makes the static array obsolete and makes a linked list based iterator a much more flexible fit. Basic testing on a big perf file containing callchains (~ 176 MB) has shown a throughput gain of about 11% with perf report. Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 04 1月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Not really something to be exported from session.c. Rename it to 'readn' as others did in the past. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 22 12月, 2010 1 次提交
-
-
由 Ian Munsie 提交于
If we are running the new perf on an old kernel without support for sample_id_all, we should fall back to the old unordered processing of events. If we didn't than we would *always* process events without timestamps out of order, whether or not we hit a reordering race. In other words, instead of there being a chance of not attributing samples correctly, we would guarantee that samples would not be attributed. While processing all events without timestamps before events with timestamps may seem like an intuitive solution, it falls down as PERF_RECORD_EXIT events would also be processed before any samples. Even with a workaround for that case, samples before/after an exec would not be attributed correctly. This patch allows commands to indicate whether they need to fall back to unordered processing, so that commands that do not care about timestamps on every event will not be affected. If we do fallback, this will print out a warning if report -D was invoked. This patch adds the test in perf_session__new so that we only need to test once per session. Commands that do not use an event_ops (such as record and top) can simply pass NULL in it's place. Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <1291951882-sup-6069@au1.ibm.com> Signed-off-by: NIan Munsie <imunsie@au1.ibm.com> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 05 12月, 2010 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME: $ perf record -aT $ perf report -D | grep PERF_RECORD_ <SNIP> 3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3 3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3 3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811) 3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3 3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853 3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find 3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3 3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so 3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso] 3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1 3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so 3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so 3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1 3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3 <SNIP> First column is the cpu and the second the timestamp. That way we can investigate problems in the event stream. If the new perf binary is run on an older kernel, it will disable this feature automatically. Tested-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NIan Munsie <imunsie@au1.ibm.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
At perf_session__process_event, so that we reduce the number of lines in eache tool sample processing routine that now receives a sample_data pointer already parsed. This will also be useful in the next patch, where we'll allow sample the identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu, timestamp) just after before every event. Also validate callchains in perf_session__process_event, i.e. as early as possible, and keep a counter of the number of events discarded due to invalid callchains, warning the user about it if it happens. There is an assumption that was kept that all events have the same sample_type, that will be dealt with in the future, when this preexisting limitation will be removed. Tested-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NIan Munsie <imunsie@au1.ibm.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 01 12月, 2010 3 次提交
-
-
由 Thomas Gleixner 提交于
The ordered sample code allocates singular reference objects struct sample_queue which have 48byte size on 64bit and 20 bytes on 32bit. That's silly. Allocate ~64k sized chunks and hand them out. Performance gain: ~ 15% Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20101130163820.398713983@linutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Thomas Gleixner 提交于
When the sample queue is flushed we free the sample reference objects. Though we need to malloc new objects when we process further. Stop the malloc/free orgy and cache the already allocated object for resuage. Only allocate when the cache is empty. Performance gain: ~ 10% Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20101130163820.338488630@linutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Thomas Gleixner 提交于
The homebrewn sort algorithm fails to sort in time order. One of the problem spots is that it fails to deal with equal timestamps correctly. My first gut reaction was to replace the fancy list with an rbtree, but the performance is 3 times worse. Rewrite it so it works. Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20101130163819.908482530@linutronix.de> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 6月, 2010 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Move them to a session->dead_threads list just like we do with maps that are replaced, because we may have hist_entries pointing to them. This fixes a bug when inserting maps for a new thread that reused the TID, mixing maps for two different threads, causing an endless loop. The code for insering maps should be made more robust but for .35 this is the minimalistic patch. Reported-by: NIngo Molnar <mingo@elte.hu> Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 5月, 2010 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The changes made to support host and guest machines in a session, that started when the 'perf kvm' tool was introduced ended up introducing a bug where the host_machine was not having its DSOs traversed for build-id processing. Fix it by moving some methods to the right classes and considering the host_machine when processing build-ids. Reported-by: NTom Zanussi <tzanussi@gmail.com> Reported-by: NStephane Eranian <eranian@google.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 5月, 2010 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
This is one more thing that started global but are more useful per hist or per session. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 11 5月, 2010 2 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
Those are really not specific to the newt code, can be used by other UI frontends. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Arnaldo Carvalho de Melo 提交于
In cbbc79a5 we introduced support for multiple events by introducing a new "event_stat_id" struct and then made several perf_session methods receive a point to it instead of a pointer to perf_session, and kept the event_stats and hists rb_tree in perf_session. While working on the new newt based browser, I realised that it would be better to introduce a new class, "hists" (short for "histograms"), renaming the "event_stat_id" struct and the perf_session methods that were really "hists" methods, as they manipulate only struct hists members, not touching anything in the other perf_session members. Other optimizations, such as calculating the maximum lenght of a symbol name present in an hists instance will be possible as we add them, avoiding a re-traversal just for finding that information. The rationale for the name "hists" to replace "event_stat_id" is that we may have multiple sets of hists for the same event_stat id, as, for instance, the 'perf diff' tool has, so event stat id is not what characterizes what this struct and the functions that manipulate it do. Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
- 10 5月, 2010 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
We have just one host on a given session, and that is the most common setup right now, so embed a ->host_machine struct machine instance directly in the perf_session class, check if we're looking for it before going to the rb_tree. This also fixes a problem found when we try to process old perf.data files where we didn't have MMAP events for the kernel and modules and thus don't create the kernel maps, do it in event__preprocess_sample if it wasn't already. Reported-by: NIngo Molnar <mingo@elte.hu> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-