- 29 1月, 2010 5 次提交
-
-
由 Peter Zijlstra 提交于
We need this to be u64 for direct assigment, but the bitmask functions all work on unsigned long, leading to cast heaven, solve this by using a union. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100122155535.595961269@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Constraints gets defined an u64 but in long quantities and then cast to long. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100122155535.504916780@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
GCC was complaining the stack usage was too large, so allocate the structure. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <20100122155535.411197266@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Stephane Eranian 提交于
Implement correct fastpath scheduling, i.e., reuse previous assignment. Signed-off-by: NStephane Eranian <eranian@google.com> [ split from larger patch] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4b588464.1818d00a.4456.383b@mx.google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Stephane Eranian 提交于
This patch improves event scheduling by maximizing the use of PMU registers regardless of the order in which events are created in a group. The algorithm takes into account the list of counter constraints for each event. It assigns events to counters from the most constrained, i.e., works on only one counter, to the least constrained, i.e., works on any counter. Intel Fixed counter events and the BTS special event are also handled via this algorithm which is designed to be fairly generic. The patch also updates the validation of an event to use the scheduling algorithm. This will cause early failure in perf_event_open(). The 2nd version of this patch follows the model used by PPC, by running the scheduling algorithm and the actual assignment separately. Actual assignment takes place in hw_perf_enable() whereas scheduling is implemented in hw_perf_group_sched_in() and x86_pmu_enable(). Signed-off-by: NStephane Eranian <eranian@google.com> [ fixup whitespace and style nits as well as adding is_x86_event() ] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <4b5430c6.0f975e0a.1bf9.ffff85fe@mx.google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 1月, 2010 1 次提交
-
-
由 Anton Blanchard 提交于
When running perf across all cpus with backtracing (-a -g), sometimes we get samples without associated backtraces: 23.44% init [kernel] [k] restore 11.46% init eeba0c [k] 0x00000000eeba0c 6.77% swapper [kernel] [k] .perf_ctx_adjust_freq 5.73% init [kernel] [k] .__trace_hcall_entry 4.69% perf libc-2.9.so [.] 0x0000000006bb8c | |--11.11%-- 0xfffa941bbbc It turns out the backtrace code has a check for the idle task and the IP sampling does not. This creates problems when profiling an interrupt heavy workload (in my case 10Gbit ethernet) since we get no backtraces for interrupts received while idle (ie most of the workload). Right now x86 and sh check that current is not NULL, which should never happen so remove that too. Idle task's exclusion must be performed from the core code, on top of perf_event_attr:exclude_idle. Signed-off-by: NAnton Blanchard <anton@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> LKML-Reference: <20100118054707.GT12666@kryten> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
-
- 13 1月, 2010 1 次提交
-
-
由 Frederic Weisbecker 提交于
The check that ignores the debug and nmi stack frames is useless now that we have a frame pointer that makes us start at the right place. We don't anymore have to deal with these. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1262235183-5320-2-git-send-regression-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 12月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
Pass the frame pointer from the regs of the interrupted path to dump_trace() while processing the stack trace. Currently, dump_trace() takes the current bp and starts the callchain from dump_trace() itself. This is wasteful because we need to walk through the entire NMI/DEBUG stack before retrieving the interrupted point. We can fix that by just using the frame pointer from the captured regs. It points exactly where we want to start. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1262235183-5320-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org>
-
- 17 12月, 2009 2 次提交
-
-
由 Frederic Weisbecker 提交于
perf events, x86/stacktrace: Fix performance/softlockup by providing a special frame pointer-only stack walker It's just wasteful for stacktrace users like perf to walk through every entries on the stack whereas these only accept reliable ones, ie: that the frame pointer validates. Since perf requires pure reliable stacktraces, it needs a stack walker based on frame pointers-only to optimize the stacktrace processing. This might solve some near-lockup scenarios that can be triggered by call-graph tracing timer events. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1261024834-5336-2-git-send-regression-fweisbec@gmail.com> [ v2: fix for modular builds and small detail tidyup ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Frederic Weisbecker 提交于
The current print_context_stack helper that does the stack walking job is good for usual stacktraces as it walks through all the stack and reports even addresses that look unreliable, which is nice when we don't have frame pointers for example. But we have users like perf that only require reliable stacktraces, and those may want a more adapted stack walker, so lets make this function a callback in stacktrace_ops that users can tune for their needs. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1261024834-5336-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 12月, 2009 2 次提交
-
-
由 Cyrill Gorcunov 提交于
Ralf Hildebrandt reported this boot warning: | Running a vanilla 2.6.32 as Xen DomU, I'm getting: | | [ 0.000999] CPU: Physical Processor ID: 0 | [ 0.000999] CPU: Processor Core ID: 1 | [ 0.000999] Performance Events: AMD PMU driver. | [ 0.000999] ------------[ cut here ]------------ | [ 0.000999] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_dummy So we need to check if APIC functionality is available, and not just in the P6 driver but elsewhere as well. Reported-by: NRalf Hildebrandt <Ralf.Hildebrandt@charite.de> Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091210165634.GF5086@lenovo> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Xiao Guangrong 提交于
Signed-off-by: NXiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <4B20BAA6.7010609@cn.fujitsu.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 06 12月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
Dumping the callchains from breakpoint events with perf gives strange results: 3.75% perf [kernel] [k] _raw_read_unlock | --- _raw_read_unlock perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_bp_event hw_breakpoint_exceptions_notify notifier_call_chain __atomic_notifier_call_chain atomic_notifier_call_chain notify_die do_debug debug munmap We are infected with all the debug stack. Like the nmi stack, the debug stack is undesired as it is part of the profiling path, not helpful for the user. Ignore it. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: "K. Prasad" <prasad@linux.vnet.ibm.com>
-
- 04 12月, 2009 1 次提交
-
-
由 André Goddard Rosa 提交于
That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: NAndré Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 25 11月, 2009 1 次提交
-
-
由 Stephane Eranian 提交于
The validate_event() was failing on valid event combinations. The function was assuming that if x86_schedule_event() returned 0, it meant error. But x86_schedule_event() returns the counter index and 0 is a perfectly valid value. An error is returned if the function returns a negative value. Furthermore, validate_event() was also failing for event groups because the event->pmu was not set until after hw_perf_event_init(). Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: perfmon2-devel@lists.sourceforge.net Cc: eranian@gmail.com LKML-Reference: <4b0bdf36.1818d00a.07cc.25ae@mx.google.com> Signed-off-by: NIngo Molnar <mingo@elte.hu> -- arch/x86/kernel/cpu/perf_event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
-
- 12 11月, 2009 1 次提交
-
-
由 Hiroshi Shimamoto 提交于
Annotate init functions and data with __init and __initconst. Signed-off-by: NHiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <4AFB721E.8070203@ct.jp.nec.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 13 10月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
There was namespace overlap due to a rename i did - this caused the following build warning, reported by Stephen Rothwell against linux-next x86_64 allmodconfig: arch/x86/kernel/cpu/perf_event.c: In function 'intel_get_event_idx': arch/x86/kernel/cpu/perf_event.c:1445: warning: 'event_constraint' is used uninitialized in this function This is a real bug not just a warning: fix it by renaming the global event-constraints table pointer to 'event_constraints'. Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Stephane Eranian <eranian@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20091013144223.369d616d.sfr@canb.auug.org.au> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 10月, 2009 3 次提交
-
-
由 Peter Zijlstra 提交于
Refuse to add events when the group wouldn't fit onto the PMU anymore. Naive implementation. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@gmail.com> LKML-Reference: <1254911461.26976.239.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Stephane Eranian 提交于
On some Intel processors, not all events can be measured in all counters. Some events can only be measured in one particular counter, for instance. Assigning an event to the wrong counter does not crash the machine but this yields bogus counts, i.e., silent error. This patch changes the event to counter assignment logic to take into account event constraints for Intel P6, Core and Nehalem processors. There is no contraints on Intel Atom. There are constraints on Intel Yonah (Core Duo) but they are not provided in this patch given that this processor is not yet supported by perf_events. As a result of the constraints, it is possible for some event groups to never actually be loaded onto the PMU if they contain two events which can only be measured on a single counter. That situation can be detected with the scaling information extracted with read(). Signed-off-by: NStephane Eranian <eranian@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-3-git-send-email-eranian@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Stephane Eranian 提交于
Intel fixed counters do not support all the filters possible with a generic counter. Thus, if a fixed counter event is passed but with certain filters set, then the fixed_mode_idx() function must fail and the event must be measured in a generic counter instead. Reject filters are: inv, edge, cnt-mask. Signed-off-by: NStephane Eranian <eranian@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1254840129-6198-2-git-send-email-eranian@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 9月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Chris Malley reported that 'perf sched record' sometimes crashes his box with: [ 389.272175] BUG: unable to handle kernel paging request at ffffb300 [ 389.272294] IP: [<c011b0bd>] default_send_IPI_self+0x1d/0x50 [ 389.272366] *pde = 0073f067 *pte = 00000000 [ 389.274708] Call Trace: [ 389.274752] [<c010e3b4>] ? set_perf_event_pending+0x14/0x20 [ 389.274801] [<c01b9751>] ? perf_output_unlock+0x121/0x1a0 [ 389.274848] [<c01b981a>] ? perf_output_end+0x4a/0x70 [ 389.274893] [<c01ba690>] ? __perf_event_overflow+0x240/0x2f0 [ 389.274942] [<c030963e>] ? atomic64_cmpxchg+0x1e/0x30 [ 389.274988] [<c01ba8f4>] ? perf_swevent_ctx_event+0x1b4/0x1c0 [ 389.275035] [<c01ba773>] ? perf_swevent_ctx_event+0x33/0x1c0 [ 389.275081] [<c01ba9a7>] ? do_perf_sw_event+0xa7/0x160 [ 389.275127] [<c01baae2>] ? perf_tp_event+0x82/0xa0 [ 389.275174] [<c012e9c6>] ? ftrace_profile_sched_stat_runtime+0xe6/0x120 [ 389.275224] [<c012e8e0>] ? ftrace_profile_sched_stat_runtime+0x0/0x120 [ 389.275273] [<c013c85a>] ? update_curr+0x18a/0x230 [ 389.275318] [<c013cdc5>] ? put_prev_task_fair+0x155/0x160 [ 389.275366] [<c01618b5>] ? sched_clock_cpu+0xd5/0x110 [ 389.275413] [<c04e7525>] ? _spin_lock_irq+0x45/0x50 [ 389.275458] [<c04e424e>] ? schedule+0x20e/0xb10 The problem is that the box has no lapic enabled: [ 0.042445] Local APIC not detected. Using dummy APIC emulation. The below seems like the best fix. We disabled all lapic bits, except the self-IPI-resend logic. Reported-by: NChris Malley <mail@chrismalley.co.uk> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <7863dc4c0909221409v7893bfd3o4b590d5951a233ba@mail.gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 21 9月, 2009 4 次提交
-
-
由 Ingo Molnar 提交于
- provide compatibility Kconfig entry for existing PERF_COUNTERS .config's - provide courtesy copy of old perf_counter.h, for user-space projects - small indentation fixups - fix up MAINTAINERS - fix small x86 printout fallout - fix up small PowerPC comment fallout (use 'counter' as in register) Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: NStephane Eranian <eranian@google.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NPaul Mackerras <paulus@samba.org> Reviewed-by: NArjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
In preparation to the renames, to avoid a namespace clash. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Dave noticed that we leak the PMU resource reservations when we fail the hardware counter init. Reported-by: NDavid Miller <davem@davemloft.net> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NDavid Miller <davem@davemloft.net> LKML-Reference: <1252483487.7746.164.camel@twins> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 9月, 2009 1 次提交
-
-
由 Markus Metzger 提交于
Draining the BTS buffer on a buffer overflow interrupt takes too long resulting in a kernel lockup when tracing the kernel. Restructure perf_counter sampling into sample creation and sample output. Prepare a single reference sample for BTS sampling and update the from and to address fields when draining the BTS buffer. Drain the entire BTS buffer between a single perf_output_begin() / perf_output_end() pair. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090915130023.A16204@sedona.ch.intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 9月, 2009 3 次提交
-
-
Kernel BTS tracing generates too much data too fast for us to handle, causing the kernel to hang. Fail for BTS requests for kernel code. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Acked-by: NPeter Zijlstra <a.p.zjilstra@chello.nl> LKML-Reference: <20090902140616.901253000@intel.com> [ This is really a workaround - but we want BTS tracing in .32 so make sure we dont regress. The lockup should be fixed ASAP. ] Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
On 32bit, pointers in the DS AREA configuration are cast to u64. The current (long) cast to avoid compiler warnings results in a signed 64bit address. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140615.305889000@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
Reserve PERF_COUNT_HW_BRANCH_INSTRUCTIONS with sample_period == 1 for BTS tracing and fail, if BTS is not available. Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20090902140612.943801000@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 8月, 2009 1 次提交
-
-
由 Ingo Molnar 提交于
Johannes Stezenbach reported that his Pentium-M based laptop does not have the local APIC enabled by default, and hence perfcounters do not get initialized. Add a fallback for this case: allow non-sampled counters and return with an error on sampled counters. This allows 'perf stat' to work out of box - and allows 'perf top' and 'perf record' to fall back on a hrtimer based sampling method. ( Passing 'lapic' on the boot line will allow hardware sampling to occur - but if the APIC is disabled permanently by the hardware then this fallback still allows more systems to use perfcounters. ) Also decouple perfcounter support from X86_LOCAL_APIC. -v2: fix typo breaking counters on all other systems ... Reported-by: NJohannes Stezenbach <js@sig21.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 11 8月, 2009 2 次提交
-
-
由 Ingo Molnar 提交于
Johannes Stezenbach reported that 'perf stat' does not count cache-miss and cache-references events on his Pentium-M based laptop. This is because we left them blank in p6_perfmon_event_map[], fill them in. Reported-by: NJohannes Stezenbach <js@sig21.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Ingo Molnar 提交于
Instead of this garbled bootup on UP Pentium-M systems: [ 0.015048] Performance Counters: [ 0.016004] no Local APIC, try rebooting with lapicno PMU driver, software counters only. Print: [ 0.015050] Performance Counters: [ 0.016004] no APIC, boot with the "lapic" boot parameter to force-enable it. [ 0.017003] no PMU driver, software counters only. Cf: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 09 8月, 2009 1 次提交
-
-
由 Markus Metzger 提交于
Implement a performance counter with: attr.type = PERF_TYPE_HARDWARE attr.config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS attr.sample_period = 1 Using branch trace store (BTS) on x86 hardware, if available. The from and to address for each branch can be sampled using: PERF_SAMPLE_IP for the from address PERF_SAMPLE_ADDR for the to address [ v2: address review feedback, fix bugs ] Signed-off-by: NMarkus Metzger <markus.t.metzger@intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 23 7月, 2009 1 次提交
-
-
由 Peter Zijlstra 提交于
Fix a gcc unused variables warning. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
-
- 13 7月, 2009 1 次提交
-
-
由 Daniel Qarras 提交于
I've attached a patch to remove the Pentium M special casing of EMON and as noticed at least with my Pentium M the hardware PMU now works: Performance counter stats for '/bin/ls /var/tmp': 1.809988 task-clock-msecs # 0.125 CPUs 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 M/sec 224 page-faults # 0.124 M/sec 1425648 cycles # 787.656 M/sec 912755 instructions # 0.640 IPC Vince suggested that this code was trying to address erratum Y17 in Pentium-M's: http://download.intel.com/support/processors/mobile/pm/sb/25266532.pdf But that erratum (related to IA32_MISC_ENABLES.7) does not affect perfcounters as we dont use this toggle to disable RDPMC and WRMSR/RDMSR access to performance counters. We keep cr4's bit 8 (X86_CR4_PCE) clear so unprivileged RDPMC access is not allowed anyway. Cc: Vince Weaver <vince@deater.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@googlemail.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 7月, 2009 3 次提交
-
-
由 Peter Zijlstra 提交于
Ingo noticed that both AMD and P6 call x86_pmu_disable_counter() on *_pmu_enable_counter(). This is because we rely on the side effect of that call to program the event config but not touch the EN bit. We change that for AMD by having enable_all() simply write the full config in, and for P6 by explicitly coding it. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
The P6 doesn't seem to support cache ref/hit/miss counts, so we extend the generic hardware event codes to have 0 and -1 mean the same thing as for the generic cache events. Furthermore, it turns out the 0 event does not count (that is, its reported that on PPro it actually does count something), therefore use a event configuration that's specified not to count to disable the counters. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Vince Weaver 提交于
Add basic P6 PMU support. The P6 uses the EVNTSEL0 EN bit to enable/disable both its counters. We use this for the global enable/disable, and clear all config bits (except EN) to disable individual counters. Actual ia32 hardware doesn't support lfence, so use a locked op without side-effect to implement a full barrier. perf stat and perf record seem to function correctly. [a.p.zijlstra@chello.nl: cleanups and complete the enable/disable code] Signed-off-by: NVince Weaver <vince@deater.net> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <Pine.LNX.4.64.0907081718450.2715@pianoman.cluster.toy> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 7月, 2009 1 次提交
-
-
由 Frederic Weisbecker 提交于
About every callchains recorded with perf record are filled up including the internal perfcounter nmi frame: perf_callchain perf_counter_overflow intel_pmu_handle_irq perf_counter_nmi_handler notifier_call_chain atomic_notifier_call_chain notify_die do_nmi nmi We want ignore this frame as it's not interesting for instrumentation. To solve this, we simply ignore every frames from nmi context. New example of "perf report -s sym -c" after this patch: 9.59% [k] search_by_key 4.88% search_by_key reiserfs_read_locked_inode reiserfs_iget reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 3.19% search_by_key search_by_entry_key reiserfs_find_entry reiserfs_lookup do_lookup __link_path_walk path_walk do_path_lookup user_path_at vfs_fstatat vfs_lstat sys_newlstat system_call_fastpath __lxstat 0x406fb1 [...] For now this patch only solves the problem in x86-64. Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Anton Blanchard <anton@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1246474930-6088-1-git-send-email-fweisbec@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 29 6月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
The print out should read the value before changing the value. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A487017.4090007@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-