- 12 Apr, 2012 1 commit
-
-
Committed by Robert Richter
Use the cpu-clock-tick sw counter for cpu-cycles only if there is no hw pmu available. This is the case if the syscall reports ENOENT. In other cases (e.g. invalid attributes) we don't want the sw counter to be used.
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1333643188-26895-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
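The fallback rule described in this commit can be modelled roughly as below. This is a sketch in Python rather than the actual C code in perf; the function name `should_fall_back` is hypothetical, and the constants are assumed to match the values in the kernel's perf_event.h.

```python
import errno

# Assumed to mirror the kernel's perf_event.h values.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0


def should_fall_back(err, attr_type, attr_config):
    """Return True only when a hw cpu-cycles open failed with ENOENT.

    Hypothetical helper: any other errno (e.g. EINVAL for invalid
    attributes) must not trigger the software-counter fallback.
    """
    return (
        err == errno.ENOENT
        and attr_type == PERF_TYPE_HARDWARE
        and attr_config == PERF_COUNT_HW_CPU_CYCLES
    )
```

Keeping the decision in one predicate makes it easy to see that only the "no hw pmu" case is covered.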
-
- 09 Mar, 2012 3 commits
-
-
Committed by Stephane Eranian
This patch adds a new feature bit, namely, HEADER_BRANCH_STACK. When present, it indicates that sample records may contain a branch stack. This could be useful to a viewer to switch to branch mode without having to parse all the samples or without a specific cmdline option. This will be used in a subsequent patch to enhance perf report with branch stacks.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Stephane Eranian
This patch changes the logic of the -b, --branch-stack options of perf record. Based on users' request, the patch provides a default filter mode with the -b (or --branch-any) option. With the option, any type of taken branches is sampled. With -j (or --branch-filter), the user can specify any valid combination of branch types and privilege levels if supported by the underlying hardware. The -b (--branch-any) is a shortcut for: --branch-filter any.
  $ perf record -b foo
or:
  $ perf record --branch-filter any foo
For more specific filtering:
  $ perf record --branch-filter ind_call,u foo
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: asharma@fb.com
Cc: ravitillo@lbl.gov
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1331246868-19905-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Roberto Agostino Vitillo
This patch adds a new option to enable taken branch stack sampling, i.e., leverage the PERF_SAMPLE_BRANCH_STACK feature of perf_events. There is a new option to activate this mode: -b. It is possible to pass a set of filters to select the type of branches to sample. The following filters are available:
- any : any type of branches
- any_call : any function call or system call
- any_ret : any function return or system call return
- any_ind : any indirect branch
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the branch target is in the hypervisor
Filters can be combined by passing a comma separated list to the option:
  $ perf record -b any_call,u -e cycles:u branchy
Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-13-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
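Combining filters from a comma separated list amounts to OR-ing one bit per filter name. The sketch below illustrates that idea only; `parse_branch_filter` is a hypothetical name, and the bit positions are an assumption modelled on the ordering of the kernel's PERF_SAMPLE_BRANCH_* flags, not a copy of the perf parser.

```python
# Assumed bit layout, for illustration; check perf_event.h before relying on it.
BRANCH_FILTERS = {
    "u": 1 << 0,         # branch target at user level
    "k": 1 << 1,         # branch target in the kernel
    "hv": 1 << 2,        # branch target in the hypervisor
    "any": 1 << 3,       # any type of branch
    "any_call": 1 << 4,  # any function call or system call
    "any_ret": 1 << 5,   # any function return or system call return
    "any_ind": 1 << 6,   # any indirect branch
}


def parse_branch_filter(spec):
    """Turn e.g. 'any_call,u' into a bitmask; reject unknown filter names."""
    mask = 0
    for name in spec.split(","):
        name = name.strip()
        if name not in BRANCH_FILTERS:
            raise ValueError(f"unknown branch filter: {name!r}")
        mask |= BRANCH_FILTERS[name]
    return mask
```

With this layout, the `any_call,u` example from the commit message would select two bits, one for the branch type and one for the privilege level.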
-
- 03 Mar, 2012 2 commits
-
-
Committed by Arnaldo Carvalho de Melo
Just fall back to resetting those fields, if set, warning the user that that feature is not available. If guest samples appear they will just be discarded because no struct machine will be found and thus the event will be accounted as not handled and dropped, see 0c095715.
Reported-by: Namhyung Kim <namhyung@gmail.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by David Ahern
A recent refactoring of perf-record introduced the following:
  perf record -a -B
  Couldn't generating buildids. Use --no-buildid to profile anyway.
  sleep: Terminated
I believe the triple negative was meant to be only a double negative. :-) While I'm there, fixed the grammar on the error message.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 15 Feb, 2012 2 commits
-
-
Committed by Arnaldo Carvalho de Melo
Instead of requiring that users of perf_record_opts set .sample_id_all_avail to true, just invert the logic, using .sample_id_all_missing, that doesn't need to be explicitly initialized since gcc will zero members omitted in a struct initialization. Just like the newly introduced .exclude_{guest,host} feature test.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ab772uzk78cwybihf0vt7kxw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Just fall back to resetting those fields, if set, warning the user that that feature is not available. If guest samples appear they will just be discarded because no struct machine will be found and thus the event will be accounted as not handled and dropped, see 0c095715.
Reported-by: Namhyung Kim <namhyung@gmail.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 14 Feb, 2012 1 commit
-
-
Committed by David Ahern
Allow a user to collect events for multiple threads or processes using a comma separated list. e.g., collect data on a VM and its vhost thread:
  perf top -p 21483,21485
  perf stat -p 21483,21485 -ddd
  perf record -p 21483,21485
or monitoring vcpu threads:
  perf top -t 21488,21489
  perf stat -t 21488,21489 -ddd
  perf record -t 21488,21489
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328718772-16688-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
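Parsing such a comma separated pid or tid list can be sketched as follows. This is only an illustration in Python of the input format shown above; `parse_id_list` is a hypothetical helper, not the function perf uses.

```python
def parse_id_list(spec):
    """Parse '21483,21485' into [21483, 21485].

    Hypothetical helper: every field must be a non-negative decimal
    number, otherwise the whole list is rejected.
    """
    ids = []
    for field in spec.split(","):
        field = field.strip()
        if not field.isdigit():
            raise ValueError(f"invalid pid/tid: {field!r}")
        ids.append(int(field))
    return ids
```

Rejecting the whole list on the first bad field avoids silently monitoring fewer tasks than the user asked for.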
-
- 09 Feb, 2012 1 commit
-
-
Committed by David Ahern
A recent refactoring of perf-record introduced the following:
  perf record -a -B
  Couldn't generating buildids. Use --no-buildid to profile anyway.
  sleep: Terminated
I believe the triple negative was meant to be only a double negative. :-) While I'm there, fixed the grammar on the error message.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 03 Feb, 2012 1 commit
-
-
Committed by Robert Richter
Loop over all features to enable them instead of explicitly enabling every single feature. This reduces duplicate code and makes it more robust to later changes, e.g. when adding more features.
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/1323966762-8574-3-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
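The idea of enabling features in a loop rather than one by one can be sketched like this. The feature numbering (`HEADER_FIRST_FEATURE`, `HEADER_LAST_FEATURE`) and the helper names are placeholders chosen for the example, not the values or functions from perf's header code.

```python
# Hypothetical feature range: first valid bit and an exclusive end marker.
HEADER_FIRST_FEATURE = 1
HEADER_LAST_FEATURE = 16


def enable_all_features():
    """Set every feature bit in the range with one loop."""
    bits = 0
    for feat in range(HEADER_FIRST_FEATURE, HEADER_LAST_FEATURE):
        bits |= 1 << feat
    return bits


def clear_feature(bits, feat):
    """Disable a single feature, e.g. after the user opts out of it."""
    return bits & ~(1 << feat)


def has_feature(bits, feat):
    return bool(bits & (1 << feat))
```

A feature added later only needs the end marker to move; the enabling code itself stays untouched.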
-
- 25 Jan, 2012 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
The new --uid command line option will show only the tasks for a given user, using the proc interface to figure out the existing tasks. Kernel work is needed to close races at startup, but this should already be useful in many use cases.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bdnspm000gw2l984a2t53o8z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 24 Dec, 2011 3 commits
-
-
Committed by Robert Richter
The features HEADER_TRACE_INFO and HEADER_BUILD_ID are handled differently when writing the feature section. All other features are simply disabled on failure and writing the section goes on without returning an error. There is no reason for these special cases. This patch unifies handling of the features. This should be ok since all features can be parsed independently. Offset and size of a feature's block is stored in struct perf_file_section right after the data block of perf.data (see perf_session__write_header()). Thus, if a feature does not exist then other features can be processed anyway. Also moving special code for HEADER_BUILD_ID out to write_build_id().
v2:
* perf record throws an error now if buildids may not be generated, which can be disabled with the --no-buildid option.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-6-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Nelson Elhage
Now that we automatically point users at it, let's provide them some guidance so that they hopefully don't just get mysterious EINVAL's from the kernel.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1324301972-22740-4-git-send-email-nelhage@nelhage.com
Signed-off-by: Nelson Elhage <nelhage@nelhage.com>
[ committer note: Made it work after 50a682ce ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Nelson Elhage
This failure is most likely due to running up against the kernel.perf_event_mlock_kb sysctl, so we can tell the user what to do to fix the issue.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1324301972-22740-3-git-send-email-nelhage@nelhage.com
Signed-off-by: Nelson Elhage <nelhage@nelhage.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 20 Dec, 2011 1 commit
-
-
Committed by Andrew Vagin
The problem is that when SAMPLE_PERIOD is not set, the kernel generates a number of samples in proportion to an event's period. The number of these samples may be too big, and the kernel throttles all samples above a defined limit. E.g.: I want to trace when a process sleeps. I created a process which sleeps for 1ms and for 4ms. perf got 100 events in both cases.
  swapper 0 [000] 1141.371830: sched_stat_sleep: comm=foo pid=1801 delay=1386750 [ns]
  swapper 0 [000] 1141.369444: sched_stat_sleep: comm=foo pid=1801 delay=4499585 [ns]
In the first case the kernel wants to send 4499585 events and in the second case it wants to send 1386750 events. perf report shows that the process sleeps an equal time in both places. Instead of this we can get only one sample with a period attribute. As a result we have less data transferred between kernel and user-space and we avoid throttling of samples. The patch "events: Don't divide events if it has field period" added the kernel part of this functionality.
Acked-by: Arun Sharma <asharma@fb.com>
Cc: Arun Sharma <asharma@fb.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: devel@openvz.org
Link: http://lkml.kernel.org/r/1324391565-1369947-1-git-send-email-avagin@openvz.org
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
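The effect of recording one sample with a period, instead of many throttled samples, can be illustrated with a small calculation. This is a sketch under assumptions: the labels `sleep_a` and `sleep_b` are made up for the two sleep sites, and `time_shares` is a hypothetical helper, not how perf report computes overhead.

```python
from collections import defaultdict


def time_shares(samples):
    """Return each key's share of the total, weighting samples by period.

    `samples` is a list of (key, period) pairs.
    """
    totals = defaultdict(int)
    for key, period in samples:
        totals[key] += period
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no samples with a non-zero period")
    return {key: value / grand_total for key, value in totals.items()}


# One sample per sleep, carrying the delay from the trace above as its period.
with_period = time_shares([("sleep_a", 1386750), ("sleep_b", 4499585)])

# Throttled case: both sites capped at 100 samples of equal weight.
throttled = time_shares([("sleep_a", 1)] * 100 + [("sleep_b", 1)] * 100)
```

In the throttled case both sites come out equal, which is the misleading report the commit describes; with the period attached, the longer sleep gets the larger share.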
-
- 29 Nov, 2011 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
At first tools were required to do that, but while writing the python bindings to simplify the API I made them auto-allocate when needed. This just makes record, stat and top use that auto allocation, simplifying them a bit.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-iokhcvkzzijr3keioubx8hlq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 28 Nov, 2011 8 commits
-
-
Committed by Arnaldo Carvalho de Melo
To better reflect that it became the base class for all tools, that must be in each tool struct and where common stuff will be put.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgpc4msetqlwr8y2k7537cxe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Reducing the exposure of perf_session further, so that we can use the classes in cases where no perf.data file is created.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-stua66dcscsezzrcdugvbmvd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
So that we don't need to have that many globals. Next steps will remove the 'session' pointer, that in most cases is not needed. Then we can rename perf_event_ops to 'perf_tool' that better describes this class hierarchy.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-wp4djox7x6w1i2bab1pt4xxp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Will be used in other tools to share the command line parsing code.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8x0yr77r6lrd2t699s499m8n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Tools being developed will need this to allow the user to override this value.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zydc1yhxfm0z35fuy95bsn1l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Every tool that calls this and allows the user to override the value needs this logic.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-lwscxpg57xfzahz5dmdfp9uz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
So that we can easily start a workload in other tools.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zdsksd4aphu0nltg2lpwsw3x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
Out of the code in 'perf record', so that we can share option parsing, etc. Eventually will be used by 'perf top', but first 'trace' will use it.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-hzjqsgnte1esk90ytq0ap98v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 Oct, 2011 2 commits
-
-
Committed by Arnaldo Carvalho de Melo
As it will exit the tool after the user is notified.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vy06m8xzlvkhr8tk7nylhbng@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Committed by Arnaldo Carvalho de Melo
The __perf_evsel__open routine was grouping just the threads for that specific event per cpu when we want to group all threads in all events to the first fd opened on that cpu. So pass the xyarray with the first event, where the other events will be able to get that first per cpu fd. At some point top and record will switch to using perf_evlist__open that takes care of this detail and probably will also handle the fallback from hw to soft counters, etc.
Reported-by: Deng-Cheng Zhu <dczhu@mips.com>
Tested-by: Deng-Cheng Zhu <dczhu@mips.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ebm34rh098i9y9v4cytfdp0x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 08 Oct, 2011 1 commit
-
-
Committed by Stephane Eranian
The goal of this patch is to include more information about the host environment into the perf.data so it is more self-descriptive. Over time, profiles are captured on various machines and it becomes hard to track what was recorded, on what machine and when. This patch provides a way to solve this by extending the perf.data file with basic information about the host machine. To add those extensions, we leverage the feature bits capabilities of the perf.data format. The change is backward compatible with existing perf.data files. We define the following useful new extensions:
- HEADER_HOSTNAME: the hostname
- HEADER_OSRELEASE: the kernel release number
- HEADER_ARCH: the hw architecture
- HEADER_CPUDESC: generic CPU description
- HEADER_NRCPUS: number of online/avail cpus
- HEADER_CMDLINE: perf command line
- HEADER_VERSION: perf version
- HEADER_TOPOLOGY: cpu topology
- HEADER_EVENT_DESC: full event description (attrs)
- HEADER_CPUID: easy-to-parse low level CPU identification
The small granularity for the entries is to make it easier to extend without breaking backward compatibility. Many entries are provided as ASCII strings. Perf report/script have been modified to print the basic information as easy-to-parse ASCII strings. Extended information about CPU and NUMA topology may be requested with the -I option. Thanks to David Ahern for reviewing and testing the many versions of this patch.
  $ perf report --stdio
  # ========
  # captured on : Mon Sep 26 15:22:14 2011
  # hostname : quad
  # os release : 3.1.0-rc4-tip
  # perf version : 3.1.0-rc4
  # arch : x86_64
  # nrcpus online : 4
  # nrcpus avail : 4
  # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
  # cpuid : GenuineIntel,6,15,11
  # total memory : 8105360 kB
  # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
  # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # ========
  # ...
  $ perf report --stdio -I
  # ========
  # captured on : Mon Sep 26 15:22:14 2011
  # hostname : quad
  # os release : 3.1.0-rc4-tip
  # perf version : 3.1.0-rc4
  # arch : x86_64
  # nrcpus online : 4
  # nrcpus avail : 4
  # cpudesc : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
  # cpuid : GenuineIntel,6,15,11
  # total memory : 8105360 kB
  # cmdline : /home/eranian/perfmon/official/tip/build/tools/perf/perf record date
  # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 29, 30, 31,
  # sibling cores : 0-3
  # sibling threads : 0
  # sibling threads : 1
  # sibling threads : 2
  # sibling threads : 3
  # node0 meminfo : total = 8320608 kB, free = 7571024 kB
  # node0 cpu list : 0-3
  # ========
  # ...
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110930134040.GA5575@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer notes: Use --show-info in the tools as was in the docs, rename perf_header_fprintf_info to perf_file_section__fprintf_info, fixup conflict with f69b64f7 "perf: Support setting the disassembler style" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
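Since the header is printed as "# key : value" lines, a consumer could read it back with a few lines of code. The sketch below assumes exactly that layout, as seen in the example output above; `parse_info_header` is a hypothetical helper and not part of perf.

```python
def parse_info_header(text):
    """Collect '# key : value' lines into a dict of value lists.

    Lines without a colon (separators such as '# ========', hints such
    as '... use -I to display') are skipped. Repeated keys, like
    'sibling threads', keep all of their values in order.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue
        key, sep, value = line[1:].partition(":")
        key = key.strip()
        if not sep or not key:
            continue
        info.setdefault(key, []).append(value.strip())
    return info
```

Splitting on the first colon only keeps values such as the capture timestamp intact, even though they contain colons themselves.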
-
- 30 Sep, 2011 1 commit
-
-
Committed by Andi Kleen
When a program crashes under perf there is no message about it, unlike when running it from bash. This can be confusing and lead to wrong actions during debugging. Print fatal signals in perf stat/record. Thanks to Furat Afram for finding the problem originally.
Link: http://lkml.kernel.org/r/1316122302-24306-1-git-send-email-andi@firstfloor.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
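The behaviour described here, reporting a workload that died from a signal, can be sketched in Python. The example relies on the POSIX convention used by `subprocess`, where a negative return code -N means the child was killed by signal N; `describe_exit` is a hypothetical helper, and the message text is not the one perf prints.

```python
import signal


def describe_exit(returncode):
    """Return a message if the child was killed by a signal, else None."""
    if returncode >= 0:
        return None  # normal exit, nothing to report
    signum = -returncode
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"  # number not known on this platform
    return f"terminated by {name}"
```

A launcher would call this with the workload's return code after waiting for it and print the message, if any, to stderr.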
-
- 24 Sep, 2011 1 commit
-
-
Committed by David Ahern
perf-record currently creates events enabled. When doing a system wide collection (-a arg) this causes data collection for perf's initialization activities -- e.g., perf_event__synthesize_threads(). For some events (e.g., context switch S/W event or tracepoints like syscalls) perf's initialization causes a lot of events to be captured, frequently generating "Check IO/CPU overload!" warnings on larger systems (e.g., 2 socket, quad core, hyperthreading). perf's initialization phase can be skipped by creating events disabled and then enabling them once the initialization is done.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1314289075-14706-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 Aug, 2011 1 commit
-
-
Committed by Lin Ming
The group event scheduling command line option is missing in perf record/stat. Add it to perf record/stat, the same as in perf top.
Reported-by: Andi Kleen <andi@firstfloor.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1313577727.2754.5.camel@hp6530s
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 25 Jul, 2011 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
To remove the last case of access to the FD() macro outside the library. Inspired by a patch by Borislav that moved the FD() macro to util.h, for namespace concerns I rather preferred to constrain it to ev{sel,list}.c.
Cc: Borislav Petkov <bp@amd64.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qn893qsstcg366tkucu649qj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 21 Jul, 2011 1 commit
-
-
Committed by Jiri Olsa
Moving out the option parameter from the parse_events function, and adding a new parse_events_option function instead. The option parameter is used only to carry the "struct perf_evlist" pointer for chaining new events. Putting it away enables us to call parse_events from other places without using the option parameter.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 28 May, 2011 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 26 May, 2011 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
Perf uses /proc/modules to figure out where kernel modules are loaded. With the advent of kptr_restrict, non root users get zeroes for all module start addresses. So check if kptr_restrict is non zero and don't generate the synthetic PERF_RECORD_MMAP events for them. Warn the user about it in perf record and in perf report. In perf report the reference relocation symbol being zero means that kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't use it to fixup symbol addresses when using a valid kallsyms (in the buildid cache) or vmlinux (in the vmlinux path) build-id located automatically or specified by the user. Provide an explanation about it in 'perf report' if kernel samples were taken, checking if a suitable vmlinux or kallsyms was found/specified. Restricted /proc/kallsyms don't go to the buildid cache anymore. Example:
  [acme@emilia ~]$ perf record -F 100000 sleep 1
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict.
  Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path.
  Samples in kernel modules won't be resolved at all.
  If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file.
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
  [acme@emilia ~]$
  [acme@emilia ~]$ perf report --stdio
  Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'.
  If some relocation was applied (e.g. kexec) symbols may be misresolved.
  Samples in kernel modules can't be resolved as well.
  # Events: 13 cycles
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .....................
  #
      20.24%  sleep  [kernel.kallsyms]  [k] page_fault
      20.04%  sleep  [kernel.kallsyms]  [k] filemap_fault
      19.78%  sleep  [kernel.kallsyms]  [k] __lru_cache_add
      19.69%  sleep  ld-2.12.so         [.] memcpy
      14.71%  sleep  [kernel.kallsyms]  [k] dput
       4.70%  sleep  [kernel.kallsyms]  [k] flush_signal_handlers
       0.73%  sleep  [kernel.kallsyms]  [k] perf_event_comm
       0.11%  sleep  [kernel.kallsyms]  [k] native_write_msr_safe
  #
  # (For a higher level overview, try: perf report --sort comm,dso)
  #
  [acme@emilia ~]$
This is because it found a suitable vmlinux (build-id checked) in /lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long file name). If we remove that file from the vmlinux path:
  [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
                      /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
  [acme@emilia ~]$ perf report --stdio
  [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562 not found, continuing without symbols
  Kernel address maps (/proc/{kallsyms,modules}) were restricted, check /proc/sys/kernel/kptr_restrict before running 'perf record'.
  As no suitable kallsyms nor vmlinux was found, kernel samples can't be resolved.
  Samples in kernel modules can't be resolved as well.
  # Events: 13 cycles
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  ......
  #
      80.31%  sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
      19.69%  sleep  ld-2.12.so         [.] memcpy
  #
  # (For a higher level overview, try: perf report --sort comm,dso)
  #
  [acme@emilia ~]$
Reported-by: Stephane Eranian <eranian@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Cc: Dave Jones <davej@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
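The check this commit describes, "is kptr_restrict non zero", can be sketched as below. It is a simplified model, not the perf implementation: both function names are hypothetical, and the sketch treats an unreadable sysctl file as "not restricted", which is an assumption made only for the example.

```python
def is_restricted(kptr_restrict_text):
    """Return True if the sysctl contents hold a non-zero value."""
    try:
        return int(kptr_restrict_text.strip()) != 0
    except ValueError:
        return False  # assumption: unparsable content counts as unrestricted


def kernel_maps_restricted(path="/proc/sys/kernel/kptr_restrict"):
    """Read the sysctl file named in the commit message and apply the check."""
    try:
        with open(path) as handle:
            return is_restricted(handle.read())
    except OSError:
        return False  # assumption: a missing file means the knob does not exist
```

A recording tool following this commit would skip synthesizing module PERF_RECORD_MMAP events and print a warning when the check returns True.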
-
- 15 5月, 2011 1 次提交
-
-
由 Arnaldo Carvalho de Melo 提交于
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using --pid when monitoring multithreaded apps, as we can only share a ring buffer for events on the same thread if not doing per cpu.

Fix it by using per thread ring buffers.

Tested with:

    [root@felicio ~]# tuna -t 26131 -CP | nl
         1                      thread       ctxt_switches
         2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
         3  26131  OTHER     0      0,1  10814276      2397830 chromium-browse
         4    642  OTHER     0      0,1     14688            0 chromium-browse
         5  26148  OTHER     0      0,1    713602       115479 chromium-browse
         6  26149  OTHER     0      0,1    801958         2262 chromium-browse
         7  26150  OTHER     0      0,1   1271128          248 chromium-browse
         8  26151  OTHER     0      0,1         3            0 chromium-browse
         9  27049  OTHER     0      0,1     36796            9 chromium-browse
        10    618  OTHER     0      0,1     14711            0 chromium-browse
        11    661  OTHER     0      0,1     14593            0 chromium-browse
        12  29048  OTHER     0      0,1     28125            0 chromium-browse
        13  26143  OTHER     0      0,1   2202789          781 chromium-browse
    [root@felicio ~]#

So 11 threads under pid 26131, then:

    [root@felicio ~]# perf record -F 50000 --pid 26131

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
         1  7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         2  7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         3  7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         4  7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         5  7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         6  7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         7  7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         8  7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         9  7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        10  7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
        11  7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

11 mmaps, one per thread since we didn't specify any CPU list, so we need one mmap per thread and:

    [root@felicio ~]# perf record -F 50000 --pid 26131
    ^C[ perf record: Woken up 79 times to write data ]
    [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
         1   371310 26131
         2    96516 26148
         3    95694 26149
         4    95203 26150
         5     7291 26143
         6       87 27049
         7       76 661
         8       60 29048
         9       47 618
        10       43 642
    [root@felicio ~]#

Ok, one of the threads, 26151 was quiescent, so no samples there, but all the others are there.

Then, if I specify one CPU:

    [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
         1     8444 26131
         2     2584 26149
         3     2518 26148
         4     2324 26150
         5      123 26143
         6        9 661
         7        9 29048
    [root@felicio ~]#

This machine has two cores, so fewer threads appeared on the radar, and:

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
         1  7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

Just one mmap, as now we can use just one per-cpu buffer instead of the per-thread needed in the previous case.

For global profiling:

    [root@felicio ~]# perf record -F 50000 -a
    ^C[ perf record: Woken up 26 times to write data ]
    [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
         1  7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
         2  7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

It uses per-cpu buffers.

For just one thread:

    [root@felicio ~]# perf record -F 50000 --tid 26148
    ^C[ perf record: Woken up 2 times to write data ]
    [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

    [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
         1     9969 26148
    [root@felicio ~]#

    [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
         1  7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
    [root@felicio ~]#

Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
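
The buffer counts seen in the tests above follow a simple rule, which can be sketched in a few lines. This is only an illustration of the behaviour the commit message describes, not the code in perf; the function name `ring_buffer_count` and its parameters are hypothetical.

```python
def ring_buffer_count(thread_ids, cpu_list=None):
    """Return how many ring buffers (mmaps) a session would need.

    Hypothetical helper illustrating the rule from the commit message:
    with a CPU list, one per-cpu buffer is shared by all monitored threads;
    without one, each monitored thread needs its own buffer.
    """
    if cpu_list:
        return len(cpu_list)
    return len(thread_ids)


# The 11 threads of pid 26131 from the tuna listing in the commit message.
threads = [26131, 642, 26148, 26149, 26150, 26151, 27049, 618, 661, 29048, 26143]

print(ring_buffer_count(threads))                # --pid 26131, no CPU list
print(ring_buffer_count(threads, cpu_list=[1]))  # --pid 26131 --cpu 1
print(ring_buffer_count([], cpu_list=[0, 1]))    # -a on the two-core test machine
print(ring_buffer_count([26148]))                # --tid 26148
```

The four calls correspond to the four test runs in the message, which showed 11, 1, 2 and 1 `anon_inode:[perf_event]` mappings respectively.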
-
- 15 Apr, 2011 1 commit
-
-
Committed by Arnaldo Carvalho de Melo
perf stat doesn't mmap and it's perfectly fine for it to use task-bound counters with inheritance. So set attr.inherit in the caller and leave the syscall itself to validate it. When the mmap fails, perf_evlist__mmap will just emit a warning if this is the failure reason.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 31 Mar, 2011 1 commit
-
-
Committed by Frederic Weisbecker
The default setting of perf record is to mmap 128 pages if the user did not override it with -m. However, the page size may vary across architecture settings, giving a different default buffer size on each. Moreover, the kernel side still has a default limit on mlocked memory of 512 kiB + 1 page for unprivileged users, and 128 + 1 pages with a page size > 4096 exceeds this threshold. Thus, better adapt to this limitation and set the default number of pages to fit those 512 kiB + 1 page.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
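
The arithmetic behind this change can be checked with a short sketch. It assumes the default is simply the number of pages that fit in 512 kiB, with the extra page accounted for separately; the helper names `default_mmap_pages` and `fits_limit` are hypothetical and are not taken from the perf source.

```python
import mmap

# The 512 kiB part of the unprivileged mlock limit quoted in the commit message.
MLOCK_LIMIT_BYTES = 512 * 1024


def default_mmap_pages(page_size=mmap.PAGESIZE):
    """Number of data pages that fit in 512 kiB for the given page size."""
    return MLOCK_LIMIT_BYTES // page_size


def fits_limit(data_pages, page_size):
    """True if data_pages plus one extra page stays within 512 kiB + 1 page."""
    return (data_pages + 1) * page_size <= MLOCK_LIMIT_BYTES + page_size


for size in (4096, 16384, 65536):
    pages = default_mmap_pages(size)
    print(f"page size {size:6d}: {pages:3d} pages, "
          f"old default of 128 fits: {fits_limit(128, size)}")
```

With 4096-byte pages the result is the old default of 128 pages, which just fits; with larger pages the fixed 128-page default no longer fits, while the computed value does.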
-
- 30 Mar, 2011 2 commits
-
-
Committed by David Ahern
Resend of patch sent back in January 2011 in light of recent confusion around unsupported events for a given platform. Improve sys_perf_event_open ENOENT return handling in top and record, just like 5a3446bc does for stat. Retry of Arnaldo's patch using ui_warning instead of die, which allows the fallback from hardware cycles to software clock.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <1301080271-20945-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
[ committer note: Some adjustments to make it apply to newer codebase ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
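
The fallback described here can be sketched as follows. This is a simplified model, not the perf implementation: `open_event` is a hypothetical callable standing in for sys_perf_event_open, the event tuples are placeholder labels rather than real perf constants, and re-raising every error other than ENOENT is an assumption of this sketch.

```python
import errno

# Placeholder labels for the two events; not real perf constants.
HW_CPU_CYCLES = ("hardware", "cpu-cycles")
SW_CPU_CLOCK = ("software", "cpu-clock")


def open_with_fallback(open_event, event=HW_CPU_CYCLES, warn=print):
    """Open `event`; on ENOENT for hardware cycles, warn and retry with cpu-clock.

    `open_event` is a hypothetical stand-in for the syscall: it returns a
    handle on success and raises OSError on failure.
    """
    try:
        return event, open_event(event)
    except OSError as err:
        if err.errno == errno.ENOENT and event == HW_CPU_CYCLES:
            # Warn instead of dying, so the retry below can happen.
            warn("cycles event not supported, falling back to cpu-clock")
            return SW_CPU_CLOCK, open_event(SW_CPU_CLOCK)
        raise  # any other error is passed on to the caller


def no_hw_pmu(event):
    """Fake opener modelling a machine without a hardware PMU."""
    if event[0] == "hardware":
        raise OSError(errno.ENOENT, "No such file or directory")
    return 3  # pretend file descriptor


used, fd = open_with_fallback(no_hw_pmu)
print(used, fd)
```

Emitting a warning rather than aborting is what makes the retry possible, which is the point of using ui_warning instead of die.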
-
Committed by Arnaldo Carvalho de Melo
We have to deal with the TUI mode in perf top, so that we don't end up with a garbled screen when, say, a non-root user on a machine with a paranoid setting (the default) tries to use 'perf top'. Introduce a ui__warning_paranoid() routine shared by top and record that tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.

Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
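
A rough sketch of what such a warning could look like is given below. The level descriptions are paraphrased from memory of the kernel's perf_event_paranoid semantics and should be treated as assumptions, as should the helper names `read_paranoid_level` and `paranoid_warning`; this is not the text or code of ui__warning_paranoid().

```python
# Assumed meanings of the sysctl values (paraphrased, not quoted from perf).
PARANOID_LEVELS = {
    -1: "not paranoid at all",
    0: "disallow raw tracepoint access for unprivileged users",
    1: "disallow CPU events for unprivileged users",
    2: "disallow kernel profiling for unprivileged users",
}


def read_paranoid_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the current level as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def paranoid_warning(level):
    """Build a plain-text warning listing the values, marking the current one."""
    lines = ["Permission error. Consider tweaking "
             "/proc/sys/kernel/perf_event_paranoid:"]
    for value, meaning in sorted(PARANOID_LEVELS.items()):
        marker = "*" if value == level else " "
        lines.append(f"{marker} {value:2d} - {meaning}")
    return "\n".join(lines)


print(paranoid_warning(read_paranoid_level()))
```

Building the message as a string and leaving the display to the caller mirrors the motivation in the commit: the same text can go to a TUI dialog or to the terminal without garbling the screen.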
-