- 01 7月, 2011 2 次提交
-
-
由 Peter Zijlstra 提交于
The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Cyrill Gorcunov 提交于
Due to restriction and specifics of Netburst PMU we need a separated event for NMI watchdog. In particular every Netburst event consumes not just a counter and a config register, but also an additional ESCR register. Since ESCR registers are grouped upon counters (i.e. if ESCR is occupied for some event there is no room for another event to enter until its released) we need to pick up the "least" used ESCR (or the most available one) for nmi-watchdog purposes -- so MSR_P4_CRU_ESCR2/3 was chosen. With this patch nmi-watchdog and perf top should be able to run simultaneously. Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Arnaldo Carvalho de Melo <acme@redhat.com> CC: Frederic Weisbecker <fweisbec@gmail.com> Tested-and-reviewed-by: NDon Zickus <dzickus@redhat.com> Tested-and-reviewed-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110623124918.GC13050@sunSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 5月, 2011 1 次提交
-
-
由 Richard Weinberger 提交于
Both warning and warning_symbol are nowhere used. Let's get rid of them. Signed-off-by: NRichard Weinberger <richard@nod.at> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Soeren Sandmann Pedersen <ssp@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: x86 <x86@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Robert Richter <robert.richter@amd.com> Cc: Paul Mundt <lethal@linux-sh.org> Link: http://lkml.kernel.org/r/1305205872-10321-2-git-send-email-richard@nod.atSigned-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
-
- 27 4月, 2011 1 次提交
-
-
由 Don Zickus 提交于
It was noticed that P4 machines were generating double NMIs for each perf event. These extra NMIs lead to 'Dazed and confused' messages on the screen. I tracked this down to a P4 quirk that said the overflow bit had to be cleared before re-enabling the apic LVT mask. My first attempt was to move the un-masking inside the perf nmi handler from before the chipset NMI handler to after. This broke Nehalem boxes that seem to like the unmasking before the counters themselves are re-enabled. In order to keep this change simple for 2.6.39, I decided to just simply move the apic LVT un-masking to the beginning of all the chipset NMI handlers, with the exception of Pentium4's to fix the double NMI issue. Later on we can move the un-masking to later in the handlers to save a number of 'extra' NMIs on those particular chipsets. I tested this change on a P4 machine, an AMD machine, a Nehalem box, and a core2quad box. 'perf top' worked correctly along with various other small 'perf record' runs. Anything high stress breaks all the machines but that is a different problem. Thanks to various people for testing different versions of this patch. Reported-and-tested-by: NShaun Ruffell <sruffell@digium.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu> CC: Cyrill Gorcunov <gorcunov@gmail.com>
-
- 26 4月, 2011 1 次提交
-
-
由 Peter Zijlstra 提交于
Currently the x86 backend incorrectly assumes that any BRANCH_INSN with sample_period==1 is a BTS request. This is not true when we do frequency driven profiling such as 'perf record -e branches'. Solves this error: $ perf record -e branches ./array Error: sys_perf_event_open() syscall returned with 95 (Operation not supported). Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Reported-by: NIngo Molnar <mingo@elte.hu> Cc: "Metzger, Markus T" <markus.t.metzger@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/n/tip-rd2y4ct71hjawzz6fpvsy9hg@git.kernel.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 4月, 2011 1 次提交
-
-
由 Ingo Molnar 提交于
Andi Kleen pointed out that the Intel offcore support patches were merged without user-space tool support to the functionality: | | The offcore_msr perf kernel code was merged into 2.6.39-rc*, but the | user space bits were not. This made it impossible to set the extra mask | and actually do the OFFCORE profiling | Andi submitted a preliminary patch for user-space support, as an extension to perf's raw event syntax: | | Some raw events -- like the Intel OFFCORE events -- support additional | parameters. These can be appended after a ':'. | | For example on a multi socket Intel Nehalem: | | perf stat -e r1b7:20ff -a sleep 1 | | Profile the OFFCORE_RESPONSE.ANY_REQUEST with event mask REMOTE_DRAM_0 | that measures any access to DRAM on another socket. | But this kind of usability is absolutely unacceptable - users should not be expected to type in magic, CPU and model specific incantations to get access to useful hardware functionality. The proper solution is to expose useful offcore functionality via generalized events - that way users do not have to care which specific CPU model they are using, they can use the conceptual event and not some model specific quirky hexa number. We already have such generalization in place for CPU cache events, and it's all very extensible. "Offcore" events measure general DRAM access patters along various parameters. They are particularly useful in NUMA systems. We want to support them via generalized DRAM events: either as the fourth level of cache (after the last-level cache), or as a separate generalization category. That way user-space support would be very obvious, memory access profiling could be done via self-explanatory commands like: perf record -e dram ./myapp perf record -e dram-remote ./myapp ... to measure DRAM accesses or more expensive cross-node NUMA DRAM accesses. These generalized events would work on all CPUs and architectures that have comparable PMU features. ( Note, these are just examples: actual implementation could have more sophistication and more parameter - as long as they center around similarly simple usecases. ) Now we do not want to revert *all* of the current offcore bits, as they are still somewhat useful for generic last-level-cache events, implemented in this commit: e994d7d2: perf: Fix LLC-* events on Intel Nehalem/Westmere But we definitely do not yet want to expose the unstructured raw events to user-space, until better generalization and usability is implemented for these hardware event features. ( Note: after generalization has been implemented raw offcore events can be supported as well: there can always be an odd event that is marginally useful but not useful enough to generalize. DRAM profiling is definitely *not* such a category so generalization must be done first. ) Furthermore, PERF_TYPE_RAW access to these registers was not intended to go upstream without proper support - it was a side-effect of the above e994d7d2 commit, not mentioned in the changelog. As v2.6.39 is nearing release we go for the simplest approach: disable the PERF_TYPE_RAW offcore hack for now, before it escapes into a released kernel and becomes an ABI. Once proper structure is implemented for these hardware events and users are offered usable solutions we can revisit this issue. Reported-by: NAndi Kleen <ak@linux.intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1302658203-4239-1-git-send-email-andi@firstfloor.orgSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 4月, 2011 1 次提交
-
-
由 Robert Richter 提交于
Using ALTERNATIVE() when checking for X86_FEATURE_PERFCTR_CORE avoids an extra pointer chase and data cache hit. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1302913676-14352-4-git-send-email-robert.richter@amd.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 25 3月, 2011 1 次提交
-
-
由 Ingo Molnar 提交于
Eric Dumazet reported that hardware PMU events do not work on his system, due to the BIOS corrupting PMU state: Performance Events: PEBS fmt0+, Core2 events, Broken BIOS detected, using software events only. [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 186 is 43003c) Linus suggested that we continue in the face of such BIOS-induced CPU state corruption: http://lkml.org/lkml/2011/3/24/608 Such BIOSes will have to be fixed - Linux developers rely on a working and fully capable PMU and the BIOS interfering with the CPU's PMU state is simply not acceptable. So this patch changes perf to continue when it detects such BIOS interaction, some hardware events may be unreliable due to the BIOS writing and re-writing them - there's not much the kernel can do about that but to detect the corruption and report it. Reported-and-tested-by: NEric Dumazet <eric.dumazet@gmail.com> Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 20 3月, 2011 1 次提交
-
-
由 Stephane Eranian 提交于
The following patch solves the problems introduced by Robert's commit 41bf4989 and reported by Arun Sharma. This commit gets rid of the base + index notation for reading and writing PMU msrs. The problem is that for fixed counters, the new calculation for the base did not take into account the fixed counter indexes, thus all fixed counters were read/written from fixed counter 0. Although all fixed counters share the same config MSR, they each have their own counter register. Without: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 242202299 unhalted_core_cycles (0.00% scaling, ena=1000790892, run=1000790892) 2389685946 instructions_retired (0.00% scaling, ena=1000790892, run=1000790892) 49473 baclears (0.00% scaling, ena=1000790892, run=1000790892) With: $ task -e unhalted_core_cycles -e instructions_retired -e baclears noploop 1 noploop for 1 seconds 2392703238 unhalted_core_cycles (0.00% scaling, ena=1000840809, run=1000840809) 2389793744 instructions_retired (0.00% scaling, ena=1000840809, run=1000840809) 47863 baclears (0.00% scaling, ena=1000840809, run=1000840809) Signed-off-by: NStephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ming.m.lin@intel.com Cc: robert.richter@amd.com Cc: asharma@fb.com Cc: perfmon2-devel@lists.sf.net LKML-Reference: <20110319172005.GB4978@quad> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 3月, 2011 2 次提交
-
-
由 Namhyung Kim 提交于
Current stack dump code scans entire stack and check each entry contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it could mark whether the pointer is valid or not based on value of the frame pointer. Invalid entries could be preceded by '?' sign. However this was not going to happen because scan start point was always higher than the frame pointer so that they could not meet. Commit 9c0729dc ("x86: Eliminate bp argument from the stack tracing routines") delayed bp acquisition point, so the bp was read in lower frame, thus all of the entries were marked invalid. This patch fixes this by reverting above commit while retaining stack_frame() helper as suggested by Frederic Weisbecker. End result looks like below: before: [ 3.508329] Call Trace: [ 3.508551] [<ffffffff814f35c9>] ? panic+0x91/0x199 [ 3.508662] [<ffffffff814f3739>] ? printk+0x68/0x6a [ 3.508770] [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e [ 3.508876] [<ffffffff81a9821f>] ? mount_root+0x56/0x5a [ 3.508975] [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9 [ 3.509216] [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2 [ 3.509335] [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10 [ 3.509442] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.509542] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.509641] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 after: [ 3.522991] Call Trace: [ 3.523351] [<ffffffff814f35b9>] panic+0x91/0x199 [ 3.523468] [<ffffffff814f3729>] ? printk+0x68/0x6a [ 3.523576] [<ffffffff81a981b2>] mount_block_root+0x257/0x26e [ 3.523681] [<ffffffff81a9821f>] mount_root+0x56/0x5a [ 3.523780] [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9 [ 3.523885] [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2 [ 3.523987] [<ffffffff81003894>] kernel_thread_helper+0x4/0x10 [ 3.524228] [<ffffffff814f6880>] ? restore_args+0x0/0x30 [ 3.524345] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2 [ 3.524445] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10 -v5: * fix build breakage with oprofile -v4: * use 0 instead of regs->bp * separate out printk changes -v3: * apply comment from Frederic * add a couple of printk fixes Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NFrederic Weisbecker <fweisbec@gmail.com> Cc: Soren Sandmann <ssp@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <1300416006-3163-1-git-send-email-namhyung@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Lucas De Marchi 提交于
They were generated by 'codespell' and then manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 3月, 2011 1 次提交
-
-
由 Lin Ming 提交于
PEBS_EVENT_CONSTRAINT() is just a duplicate of INTEL_UEVENT_CONSTRAINT(). Remove it and use INTEL_UEVENT_CONSTRAINT() instead. Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299684089-22835-3-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 3月, 2011 1 次提交
-
-
由 Lin Ming 提交于
Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299119690-13991-5-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 04 3月, 2011 2 次提交
-
-
由 Andi Kleen 提交于
On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the L2 caches, not the real L3 LLC - this was inconsistent with behavior on other CPUs. Fixing this requires the use of the special OFFCORE_RESPONSE events which need a separate mask register. This has been implemented by the previous patch, now use this infrastructure to set correct events for the LLC-* on Nehalem and Westmere. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Andi Kleen 提交于
Change logs against Andi's original version: - Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra) - Fixed a major event scheduling issue. There cannot be a ref++ on an event that has already done ref++ once and without calling put_constraint() in between. (Stephane Eranian) - Use thread_cpumask for percore allocation. (Lin Ming) - Use MSR names in the extra reg lists. (Lin Ming) - Remove redundant "c = NULL" in intel_percore_constraints - Fix comment of perf_event_attr::config1 Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event that can be used to monitor any offcore accesses from a core. This is a very useful event for various tunings, and it's also needed to implement the generic LLC-* events correctly. Unfortunately this event requires programming a mask in a separate register. And worse this separate register is per core, not per CPU thread. This patch: - Teaches perf_events that OFFCORE_RESPONSE needs extra parameters. The extra parameters are passed by user space in the perf_event_attr::config1 field. - Adds support to the Intel perf_event core to schedule per core resources. This adds fairly generic infrastructure that can be also used for other per core resources. The basic code has is patterned after the similar AMD northbridge constraints code. Thanks to Stephane Eranian who pointed out some problems in the original version and suggested improvements. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NLin Ming <ming.m.lin@intel.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 02 3月, 2011 1 次提交
-
-
由 Lin Ming 提交于
This patch adds basic SandyBridge support, including hardware cache events and PEBS events support. It has been tested on SandyBridge CPUs with perf stat and also with PEBS based profiling - both work fine. The patch does not affect other models. v2 -> v3: - fix PEBS event 0xd0 with right umask combinations - move snb pebs constraint assignment to intel_pmu_init v1 -> v2: - add more raw and PEBS events constraints - use offcore events for LLC-* cache events - remove the call to Nehalem workaround enable_all function Signed-off-by: NLin Ming <ming.m.lin@intel.com> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1299072424.2175.24.camel@localhost> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 16 2月, 2011 4 次提交
-
-
由 Robert Richter 提交于
This patch adds support for AMD family 15h core counters. There are major changes compared to family 10h. First, there is a new perfctr msr range for up to 6 counters. Northbridge counters are separate now. This patch only adds support for core counters. Second, certain events may only be scheduled on certain counters. For this we need to extend the event scheduling and constraints. We use cpu feature flags to calculate family 15h msr address offsets. This way we later can implement a faster ALTERNATIVE() version for this. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110215135210.GB5874@erda.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Robert Richter 提交于
Instead of storing the base addresses we can store the counter's msr addresses directly in config_base/event_base of struct hw_perf_event. This avoids recalculating the address with each msr access. The addresses are configured one time. We also need this change to later modify the address calculation. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1296664860-10886-5-git-send-email-robert.richter@amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Robert Richter 提交于
This patch adds helper functions to calculate perfctr msr addresses. We need this to later add support for AMD family 15h cpus. For this we have to change the algorithms to generate the perfctr's msr addresses. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1296664860-10886-3-git-send-email-robert.richter@amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Robert Richter 提交于
Use helper function in x86_pmu_enable_all() to minimize access to x86_pmu.eventsel in the fast path. The counter's msr address is now calculated using struct hw_perf_event. Later we add code that calculates the msr addresses with a table lookup which shouldn't be done in the fast path. Signed-off-by: NRobert Richter <robert.richter@amd.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1296664860-10886-2-git-send-email-robert.richter@amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 1月, 2011 1 次提交
-
-
由 Yinghai Lu 提交于
init_hw_perf_events() is called via early_initcall now. x86_pmu_event_init is x86_pmu member function. So we can change them to static. Signed-off-by: NYinghai Lu <yinghai@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> LKML-Reference: <4D3A16F9.109@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 1月, 2011 2 次提交
-
-
由 Don Zickus 提交于
With priorities in place and no one really understanding the difference between DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI. This also simplifies default_do_nmi() a little bit. Instead of calling the die_notifier in both the if and else part, just pull it out and call it before the if-statement. This has the side benefit of avoiding a call to the ioport to see if there is an external NMI sitting around until after the (more frequent) internal NMIs are dealt with. Patch-Inspired-by: NHuang Ying <ying.huang@intel.com> Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Don Zickus 提交于
In order to consolidate the NMI die_chain events, we need to setup the priorities for the die notifiers. I started by defining a bunch of common priorities that can be used by the notifier blocks. Then I modified the notifier blocks to use the newly created priorities. Now that the priorities are straightened out, it should be easier to remove the event DIE_NMI_IPI. Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 30 12月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
Go through x86 code and replace __get_cpu_var and get_cpu_var instances that refer to a scalar and are not used for address determinations. Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: N"H. Peter Anvin" <hpa@zytor.com> Signed-off-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 16 12月, 2010 2 次提交
-
-
由 Peter Zijlstra 提交于
Extend the perf_pmu_register() interface to allow for named and dynamic pmu types. Because we need to support the existing static types we cannot use dynamic types for everything, hence provide a type argument. If we want to enumerate the PMUs they need a name, provide one. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101117222056.259707703@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Some BIOSes use PMU resources, which can cause various bugs: - Non-working or erratic PMU based statistics - the PMU can end up counting the wrong thing, resulting in misleading statistics - Profiling can stop working or it can profile the wrong thing - A non-working or erratic NMI watchdog that cannot be relied on - The kernel may disturb whatever thing the BIOS tries to use the PMU for - possibly causing hardware malfunction in extreme cases. - ... and other forms of potential misbehavior Various forms of such misbehavior has been observed in practice - there are BIOSes that just corrupt the PMU state, consequences be damned. The PMU is a CPU resource that is handled by the kernel and the BIOS stealing+corrupting it is not acceptable nor robust, so we detect it, warn about it and further refuse to touch the PMU ourselves. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 26 11月, 2010 3 次提交
-
-
由 Peter Zijlstra 提交于
The perf hardware pmu got initialized at various points in the boot, some before early_initcall() some after (notably arch_initcall). The problem is that the NMI lockup detector is ran from early_initcall() and expects the hardware pmu to be present. Sanitize this by moving all architecture hardware pmu implementations to initialize at early_initcall() and move the lockup detector to an explicit initcall right after that. Cc: paulus <paulus@samba.org> Cc: davem <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: NPaul Mundt <lethal@linux-sh.org> Acked-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290707759.2145.119.camel@laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Franck Bui-Huu 提交于
and use it when appropriate. Signed-off-by: NFranck Bui-Huu <fbuihuu@gmail.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290525705-6265-1-git-send-email-fbuihuu@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Don Zickus 提交于
In a kvm virt guests, the perf counters are not emulated. Instead they return zero on a rdmsrl. The perf nmi handler uses the fact that crossing a zero means the counter overflowed (for those counters that do not have specific interrupt bits). Therefore on kvm guests, perf will swallow all NMIs thinking the counters overflowed. This causes problems for subsystems like kgdb which needs NMIs to do its magic. This problem was discovered by running kgdb tests. The solution is to write garbage into a perf counter during the initialization and hopefully reading back the same number. On kvm guests, the value will be read back as zero and we disable perf as a result. Reported-by: NJason Wessel <jason.wessel@windriver.com> Patch-inspired-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NDon Zickus <dzickus@redhat.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1290462923-30734-1-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 18 11月, 2010 2 次提交
-
-
由 Soeren Sandmann Pedersen 提交于
The various stack tracing routines take a 'bp' argument in which the caller is supposed to provide the base pointer to use, or 0 if doesn't have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not defined, this means all callers in principle should either always pass 0, or be conditional on CONFIG_FRAME_POINTER. However, there are only really three use cases for stack tracing: (a) Trace the current task, including IRQ stack if any (b) Trace the current task, but skip IRQ stack (c) Trace some other task In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just be 0. If it _is_ defined, then - in case (a) bp should be gotten directly from the CPU's register, so the caller should pass NULL for regs, - in case (b) the caller should should pass the IRQ registers to dump_trace(), - in case (c) bp should be gotten from the top of the task's stack, so the caller should pass NULL for regs. Hence, the bp argument is not necessary because the combination of task and regs is sufficient to determine an appropriate value for bp. This patch introduces a new inline function stack_frame(task, regs) that computes the desired bp. This function is then called from the two versions of dump_stack(). Signed-off-by: NSoren Sandmann <ssp@redhat.com> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arjan van de Ven <arjan@infradead.org>, Cc: Frederic Weisbecker <fweisbec@gmail.com>, Cc: Arnaldo Carvalho de Melo <acme@redhat.com>, LKML-Reference: <m3oc9rop28.fsf@dhcp-100-3-82.bos.redhat.com>> Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
-
由 Don Zickus 提交于
Now that the bulk of the old nmi_watchdog is gone, remove all the stub variables and hooks associated with it. This touches lots of files mainly because of how the io_apic nmi_watchdog was implemented. Now that the io_apic nmi_watchdog is forever gone, remove all its fingers. Most of this code was not being exercised by virtue of nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to risky here. Signed-off-by: NDon Zickus <dzickus@redhat.com> Cc: fweisbec@gmail.com Cc: gorcunov@openvz.org LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 27 10月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Now that the KM_type stuff is history, clean up the compiler warning. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NChris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 10月, 2010 3 次提交
-
-
由 Peter Zijlstra 提交于
Now that reserve_ds_buffers() never fails, change it to return void and remove all code dealing with the error return. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NStephane Eranian <eranian@google.com> LKML-Reference: <20101019134808.462621937@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Currently PEBS/BTS buffers are allocated when we instantiate the first event, when this fails everything fails. This is a problem because esp. BTS tries to allocate a rather large buffer (64K), which can easily fail. This patch changes the logic such that when either buffer allocation fails, we simply don't allow events that would use these facilities, but continue functioning for all other events. This logic comes from a much larger patch proposed by Stephane. Suggested-by: NStephane Eranian <eranian@google.com> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NStephane Eranian <eranian@google.com> LKML-Reference: <20101019134808.354429461@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
In case we don't have PEBS, the LBR fixup doesn't make sense. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NStephane Eranian <eranian@google.com> LKML-Reference: <20101019134808.354429461@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 19 10月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
Provide a mechanism that allows running code in IRQ context. It is most useful for NMI code that needs to interact with the rest of the system -- like wakeup a task to drain buffers. Perf currently has such a mechanism, so extract that and provide it as a generic feature, independent of perf so that others may also benefit. The IRQ context callback is generated through self-IPIs where possible, or on architectures like powerpc the decrementer (the built-in timer facility) is set to generate an interrupt immediately. Architectures that don't have anything like this get to do with a callback from the timer tick. These architectures can call irq_work_run() at the tail of any IRQ handlers that might enqueue such work (like the perf IRQ handler) to avoid undue latencies in processing the work. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NKyle McMartin <kyle@mcmartin.ca> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> [ various fixes ] Signed-off-by: NHuang Ying <ying.huang@intel.com> LKML-Reference: <1287036094.7768.291.camel@yhuang-dev> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 05 10月, 2010 1 次提交
-
-
由 Andi Kleen 提交于
Just dead code I believe. Signed-off-by: NAndi Kleen <ak@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Cc: andi@firstfloor.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 24 9月, 2010 1 次提交
-
-
由 Robert Richter 提交于
Some cpus still deliver spurious interrupts after disabling a counter. This caused 'undelivered NMI' messages. This patch fixes this. Introduced by: 4177c42a: perf, x86: Try to handle unknown nmis with an enabled PMU Reported-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NRobert Richter <robert.richter@amd.com> Cc: Don Zickus <dzickus@redhat.com> Cc: gorcunov@gmail.com <gorcunov@gmail.com> Cc: fweisbec@gmail.com <fweisbec@gmail.com> Cc: ying.huang@intel.com <ying.huang@intel.com> Cc: ming.m.lin@intel.com <ming.m.lin@intel.com> Cc: yinghai@kernel.org <yinghai@kernel.org> Cc: andi@firstfloor.org <andi@firstfloor.org> Cc: eranian@google.com <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100915162034.GO13563@erda.amd.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 10 9月, 2010 2 次提交
-
-
由 Peter Zijlstra 提交于
Neither the overcommit nor the reservation sysfs parameter were actually working, remove them as they'll only get in the way. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Peter Zijlstra 提交于
Replace pmu::{enable,disable,start,stop,unthrottle} with pmu::{add,del,start,stop}, all of which take a flags argument. The new interface extends the capability to stop a counter while keeping it scheduled on the PMU. We replace the throttled state with the generic stopped state. This also allows us to efficiently stop/start counters over certain code paths (like IRQ handlers). It also allows scheduling a counter without it starting, allowing for a generic frozen state (useful for rotating stopped counters). The stopped state is implemented in two different ways, depending on how the architecture implemented the throttled state: 1) We disable the counter: a) the pmu has per-counter enable bits, we flip that b) we program a NOP event, preserving the counter state 2) We store the counter state and ignore all read/overflow events Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-