- 03 Aug, 2009 1 commit
-
-
Committed by Ingo Molnar
As Andrew noted, my previous patch ("debug lockups: Improve lockup detection") broke/removed SysRq-L support on architectures that do not provide a __trigger_all_cpu_backtrace implementation.

Restore a fallback path and clean up the SysRq-L machinery a bit:

  - Rename the arch method to arch_trigger_all_cpu_backtrace()
  - Simplify the define
  - Document the method a bit - in the hope of more architectures adding support for it.

[ The patch touches Sparc code for the rename. ]

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "David S. Miller" <davem@davemloft.net>
LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
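The fallback described in this commit can be sketched in plain C. This is a minimal user-space illustration, not the kernel source: `sysrq_show_all_cpus()`, the `backtrace_path` enum and the printed message are hypothetical names, and the slower fallback is only represented by a `puts()` call. The assumed idea is that an architecture opts in by defining `arch_trigger_all_cpu_backtrace` as a macro, and generic code takes the fallback path when it is absent.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * An architecture that can interrupt every CPU would define
 * arch_trigger_all_cpu_backtrace as a macro in its own header.
 * Nothing defines it here, so the fallback branch is compiled.
 */
#ifdef arch_trigger_all_cpu_backtrace
static inline bool trigger_all_cpu_backtrace(void)
{
    arch_trigger_all_cpu_backtrace();
    return true;
}
#else
static inline bool trigger_all_cpu_backtrace(void)
{
    return false;   /* no arch support: the caller must fall back */
}
#endif

/* Hypothetical labels for the path a SysRq-L request ends up taking. */
enum backtrace_path { PATH_ARCH, PATH_FALLBACK };

/* Hypothetical SysRq-L handler: prefer the arch method, else fall back. */
static enum backtrace_path sysrq_show_all_cpus(void)
{
    if (trigger_all_cpu_backtrace())
        return PATH_ARCH;

    /* Stand-in for the slower, workqueue-based per-CPU dump. */
    puts("arch method unavailable, using fallback backtrace path");
    return PATH_FALLBACK;
}
```

Returning a `bool` from the wrapper lets the caller decide on the fallback without any `#ifdef` of its own.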
-
- 02 Aug, 2009 1 commit
-
-
Committed by Ingo Molnar
When debugging a recent lockup bug I found various deficiencies in how our current lockup detection helpers work:

  - SysRq-L is not very efficient as it uses a workqueue, hence it cannot punch through hard lockups and cannot see through most soft lockups either.
  - The SysRq-L code depends on the NMI watchdog - which is off by default.
  - We don't print backtraces from the RCU code's built-in 'RCU state machine is stuck' debug code. This debug code tends to be one of the first (and only) mechanisms that show that a lockup has occurred.

This patch changes the code so that we:

  - Trigger the NMI backtrace code from SysRq-L instead of using a workqueue (which cannot punch through hard lockups)
  - Trigger print-all-CPU-backtraces from the RCU lockup detection code

Also decouple the backtrace printing code from the NMI watchdog:

  - Don't use variable size cpumasks (they might not be initialized and they are a bit more fragile anyway)
  - Trigger an NMI immediately via an IPI, instead of waiting for the NMI tick to occur. This is a lot faster and can produce more relevant backtraces. It will also work if the NMI watchdog is disabled.
  - Don't print the 'dazed and confused' message when we print a backtrace from the NMI
  - Do a show_regs() plus a dump_stack() to get maximum info out of the dump. Worst-case we get two stacktraces - which is not a big deal. Sometimes, if register content is corrupted, the precise stack walker in show_regs() won't give us a full backtrace - in this case dump_stack() will do it.

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
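The "fixed-size mask plus immediate NMI" scheme from this commit can be modelled in user space roughly as follows. This is a sketch under stated assumptions: `NR_CPUS`, `request_all_cpu_backtrace()` and `nmi_backtrace_handler()` are hypothetical names, the IPI is only a comment, and plain (non-atomic) bit operations are used where real kernel code would need atomic ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8                      /* hypothetical fixed CPU limit */

/* Fixed-size mask: valid from program start, no allocation needed. */
static unsigned long backtrace_mask;

/* Initiator side: mark every online CPU as owing a backtrace. */
static void request_all_cpu_backtrace(unsigned long online_mask)
{
    backtrace_mask = online_mask & ((1UL << NR_CPUS) - 1);
    /* A real kernel would now send an NMI IPI to those CPUs. */
}

/*
 * Per-CPU NMI side: returns true if this NMI was a backtrace request,
 * so the caller can skip its "unknown NMI" reporting.
 */
static bool nmi_backtrace_handler(int cpu)
{
    unsigned long bit;

    assert(cpu >= 0 && cpu < NR_CPUS);
    bit = 1UL << cpu;
    if (!(backtrace_mask & bit))
        return false;

    printf("NMI backtrace for cpu %d\n", cpu);
    /* show_regs() and dump_stack() would be called here. */
    backtrace_mask &= ~bit;
    return true;
}
```

Because each CPU clears its own bit, the initiator can tell that all dumps are done once the mask reads zero.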
-
- 10 Jul, 2009 1 commit
-
-
Committed by Andi Kleen
Drop the CONFIG_X86_NEW_MCE symbol and change all references to it to check for CONFIG_X86_MCE directly.

No code changes.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 29 May, 2009 1 commit
-
-
Committed by Andi Kleen
The 64bit machine check code is in many ways much better than the 32bit machine check code: it is more specification compliant, is cleaner, only has a single code base versus one per CPU, has better infrastructure for recovery, has a cleaner way to communicate with user space etc. etc.

Use the 64bit code for 32bit too.

This is the second attempt to do this. There was one a couple of years ago to unify this code for 32bit and 64bit. Back then this ran into some trouble with K7s and was reverted.

I believe this time the K7 problems (and some others) are addressed. I went over the old handlers and was very careful to retain all quirks.

But of course this needs a lot of testing on old systems. On newer 64bit capable systems I don't expect many problems because they have already been tested with the 64bit kernel.

I made this a CONFIG for now that still allows selecting the old machine check code. This is mostly to make testing easier: if someone runs into a problem we can ask them to try with the CONFIG switched.

The new code is default y for more coverage.

Once there is confidence the 64bit code works well on older hardware too, CONFIG_X86_OLD_MCE and the associated code can be easily removed.

This causes a behaviour change for 32bit installations. They now have to install the mcelog package to be able to log corrected machine checks.

The 64bit machine check code only handles CPUs which support the standard Intel machine check architecture described in the IA32 SDM. The 32bit code has special support for some older CPUs which have non standard machine check architectures, in particular WinChip C3 and Intel P5. I made those a separate CONFIG option and kept them for now. The WinChip variant could probably be removed without too much pain, it doesn't really do anything interesting. P5 is also disabled by default (like it was before) because many motherboards have it miswired, but according to Alan Cox a few embedded setups use that one.

Forward ported/heavily changed version of old patch, original patch included review/fixes from Thomas Gleixner, Bert Wesarg.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
-
- 21 Apr, 2009 2 commits
-
-
Committed by Rusty Russell
In theory (though not shown in practice) alloc_cpumask_var() doesn't zero memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot. (Bug introduced in fcef8576).

[ Impact: avoid theoretical syslog noise in rare configs ]

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Rusty Russell
fcef8576 converted backtrace_mask to a cpumask_var_t, and assumed check_nmi_watchdog was called before nmi_watchdog_tick was ever called. Steven's oops shows I was wrong.

This is something of a bandaid: I'm not sure we *should* be calling nmi_watchdog_tick before check_nmi_watchdog.

Note that gcc eliminates this test for the CONFIG_CPUMASK_OFFSTACK=n case.

[ Impact: fix boot crash in rare configs ]

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
LKML-Reference: <alpine.DEB.2.00.0904202113520.10097@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
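The two fixes from this date (zero the mask on allocation, and tolerate a mask that has not been allocated yet) can be illustrated with a small user-space model. It is only a sketch: the pointer-typed `backtrace_mask` stands in for an off-stack cpumask, `calloc()` stands in for a zeroing allocator, and `backtrace_mask_init()`, `cpu_wants_backtrace()` and `NR_CPUS` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 64   /* hypothetical limit; fits one 64-bit word */

/* Heap-allocated mask, NULL until the init routine has run. */
static unsigned long long *backtrace_mask;

/* Allocate the mask zeroed so no CPU sees a stale "dump me" bit. */
static bool backtrace_mask_init(void)
{
    if (backtrace_mask == NULL)
        backtrace_mask = calloc(1, sizeof(*backtrace_mask));
    return backtrace_mask != NULL;
}

/* Tick-side check: safe to call even before the mask exists. */
static bool cpu_wants_backtrace(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (backtrace_mask == NULL)
        return false;            /* early tick: nothing requested yet */
    return (*backtrace_mask >> cpu) & 1ULL;
}
```

With a statically sized mask the NULL test is always false, which matches the commit's note that the compiler can drop it in the CONFIG_CPUMASK_OFFSTACK=n case.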
-
- 12 Apr, 2009 1 commit
-
-
Committed by Jaswinder Singh Rajput
Impact: cleanup, no code changed

  - syscalls.h: update declarations due to unifications
  - irq.c: declare smp_generic_interrupt() before it gets used
  - process.c: declare sys_fork() and sys_vfork() before they get used
  - tsc.c: rename tsc_khz shadowed variable
  - apic/probe_32.c: declare apic_default before it gets used
  - apic/nmi.c: prev_nmi_count should be unsigned
  - apic/io_apic.c: declare smp_irq_move_cleanup_interrupt() before it gets used
  - mm/init.c: declare direct_gbpages and free_initrd_mem before they get used

Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 13 Mar, 2009 1 commit
-
-
Committed by Rusty Russell
Impact: cleanup, reduce memory usage for CONFIG_CPUMASK_OFFSTACK=y

I *think* every path calls check_nmi_watchdog before using the watchdog, so that's the right place for the initialization. If that's wrong, we'll get a nice NULL-deref with CONFIG_CPUMASK_OFFSTACK=y, and have uncovered another bug.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 18 Feb, 2009 2 commits
-
-
Committed by Ingo Molnar
arch/x86/kernel/ is getting a bit crowded, and the APIC drivers are scattered into various different files.

Move them to arch/x86/kernel/apic/*, and also remove the 'gen' prefix from those which had it.

Also move APIC related functionality: the IO-APIC driver, the NMI and the IPI code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Ingo Molnar
Impact: cleanup

Remove genapic.h and remove all references to it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 17 Feb, 2009 1 commit
-
-
Committed by Yinghai Lu
Impact: cleanup

Make it simpler; there is no need to have one extra struct.

v2: fix the sgi_uv build

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 29 Jan, 2009 1 commit
-
-
Committed by Ingo Molnar
We are getting rid of subarchitecture support - move the hook files to asm/. (These are now stale and should be replaced with more explicit runtime mechanisms - but the transition is simpler this way.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 18 Jan, 2009 1 commit
-
-
Committed by Brian Gerst
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
-
- 06 Jan, 2009 1 commit
-
-
Committed by Huang Weiyi
Removed duplicated #include's in:

  arch/x86/kernel/mpparse.c
  arch/x86/kernel/nmi.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 03 Jan, 2009 1 commit
-
-
Committed by Jaswinder Singh Rajput
Impact: cleanup, fix style problems

Fixes style problems:

  WARNING: Use #include <linux/smp.h> instead of <asm/smp.h>
  WARNING: Use #include <linux/nmi.h> instead of <asm/nmi.h>

  total: 0 errors, 2 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 31 Oct, 2008 1 commit
-
-
Committed by Cyrill Gorcunov
Impact: introduce nmi_watchdog=lapic and nmi_watchdog=ioapic aliases

Add sensible names, "lapic" and "ioapic", to the nmi_watchdog boot parameter. Sometimes it is not that easy to recall what exactly nmi_watchdog=1 means, so we allow the use of symbolic names here. Old numeric values remain valid.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
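A parser that accepts both the symbolic aliases and the old numeric values might look roughly like the sketch below. Everything here is an assumption for illustration: `parse_nmi_watchdog()` is a hypothetical user-space function, the numeric values assigned to the `NMI_*` constants are assumed rather than taken from this log, and mapping unrecognised input to `NMI_NONE` is a simplification.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed numbering of the watchdog modes (not stated in this log). */
enum { NMI_NONE = 0, NMI_IO_APIC = 1, NMI_LOCAL_APIC = 2, NMI_INVALID = 3 };

/* Hypothetical parser for the value part of "nmi_watchdog=<value>". */
static unsigned int parse_nmi_watchdog(const char *str)
{
    char *end;
    unsigned long val;

    /* Symbolic aliases first. */
    if (strcmp(str, "lapic") == 0)
        return NMI_LOCAL_APIC;
    if (strcmp(str, "ioapic") == 0)
        return NMI_IO_APIC;

    /* Old numeric values remain valid. */
    val = strtoul(str, &end, 0);
    if (end == str || *end != '\0' || val >= NMI_INVALID)
        return NMI_NONE;         /* simplification: reject bad input */

    return (unsigned int)val;
}
```

Checking the aliases before the numeric conversion keeps existing command lines such as `nmi_watchdog=1` behaving as before.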
-
- 28 Oct, 2008 2 commits
-
-
Committed by Aristeu Rozanski
Impact: change NMI watchdog detection and disabling sequence

Currently, if the NMI watchdog fails using the IOAPIC method, it'll only disable interrupts on the 8259 if the timer is passing through it. This patch disables NMI delivery on LINT0 if the NMI watchdog initial test fails, just for safety.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Aristeu Rozanski
Impact: change/improve the way /proc/sys/kernel/nmi_watchdog works

This patch adds support for enabling/disabling the IOAPIC NMI watchdog at runtime via procfs.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 23 Sep, 2008 1 commit
-
-
Committed by Aristeu Rozanski
There's a small window while the NMI watchdog is being set up in which, if any NMIs are triggered, the NMI code will make use of uninitialized wd_ops elements:

    void setup_apic_nmi_watchdog(void *unused)
    {
            if (__get_cpu_var(wd_enabled))
                    return;

            /* cheap hack to support suspend/resume */
            /* if cpu0 is not active neither should the other cpus */
            if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
                    return;

            switch (nmi_watchdog) {
            case NMI_LOCAL_APIC:
                    /* enable it before to avoid race with handler */
    -->             __get_cpu_var(wd_enabled) = 1;
    -->             if (lapic_watchdog_init(nmi_hz) < 0) {
    (...)
    asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
    {
    (...)
            if (nmi_watchdog_tick(regs, reason))
                    return;
    (...)
    notrace __kprobes int nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
    {
    (...)
            if (!__get_cpu_var(wd_enabled))
                    return rc;
            switch (nmi_watchdog) {
            case NMI_LOCAL_APIC:
                    rc |= lapic_wd_event(nmi_hz);
    (...)
    int lapic_wd_event(unsigned nmi_hz)
    {
            struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
            u64 ctr;
    -->     rdmsrl(wd->perfctr_msr, ctr);

and wd->*_msr will be initialized in each processor type specific setup, after enabling NMIs for PMIs. Since the counter was just set, the chance of a performance counter generated NMI is minimal, but any other unknown NMI would trigger the problem.

This patch fixes the problem by setting everything up before enabling performance counter generated NMIs, and sets wd_enabled using a callback function.

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
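The ordering rule behind this fix (finish initialising the control block, then publish the "enabled" flag last) can be shown with a single-threaded user-space model. This is a sketch only: `wd_setup()`, `wd_tick()` and `struct wd_ctlblk` are hypothetical names, the MSR number passed in the usage is an arbitrary example value, and real per-CPU kernel code would also have to consider compiler and CPU reordering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU watchdog control block. */
struct wd_ctlblk {
    uint32_t perfctr_msr;   /* which counter register the tick reads */
};

static struct wd_ctlblk ctl;
static bool wd_enabled;     /* what the NMI path tests first */

/* Setup path: fill in every field, then flip the flag as the last step. */
static void wd_setup(uint32_t perfctr_msr)
{
    ctl.perfctr_msr = perfctr_msr;   /* 1. complete the initialisation */
    wd_enabled = true;               /* 2. only now let the tick use it */
}

/*
 * NMI path: returns false (not handled) while the watchdog is not yet
 * enabled, so an early or unrelated NMI never reads an unset field.
 */
static bool wd_tick(uint32_t *msr_out)
{
    if (!wd_enabled)
        return false;
    *msr_out = ctl.perfctr_msr;
    return true;
}
```

In the buggy ordering the flag was set first, so a tick arriving in between would have read `perfctr_msr` while it was still zero.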
-
- 15 Aug, 2008 2 commits
-
-
Committed by Ingo Molnar
Clean up the failure message - and redirect people to bugzilla instead of lkml.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Aristeu Rozanski
> it just won't work at boot time - the second logic unit will be stuck:
>
> Booting processor 1/2 APIC 0x1
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 1
> CPU1: Thermal monitoring enabled (TM1)
> Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
> Brought up 2 CPUs
> testing NMI watchdog ... <4>WARNING: CPU#1: NMI appears to be stuck (0->0)!

while at it...

  - fix that newline

Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: jvillalo@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 20 Jul, 2008 1 commit
-
-
Committed by Simon Arlott
It's not possible to enable the unknown_nmi_panic sysctl option until init is run. It's useful to be able to panic the kernel during boot too, so this adds a parameter to enable this option.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 18 Jul, 2008 1 commit
-
-
Committed by Maciej W. Rozycki
Use alternatives to select the workaround for the 11AP Pentium erratum for the affected steppings on the fly rather than at build time. Remove the X86_GOOD_APIC configuration option and replace all the calls to apic_write_around() with plain apic_write(), protecting accesses to the ESR as appropriate due to the 3AP Pentium erratum. Remove apic_read_around() and all its invocations altogether as not needed. Remove apic_write_atomic() and all its implementing backends. The use of ASM_OUTPUT2() is not strictly needed for input constraints, but I have used it for readability's sake.

I had the feeling no one else was brave enough to do it, so I went ahead and here it is. Verified by checking the generated assembly and tested with both a 32-bit and a 64-bit configuration, also with the 11AP "feature" forced on and verified with gdb on /proc/kcore to work as expected (as 11AP machines are quite hard to get hands on these days). Some script complained about the use of "volatile", but apic_write() needs it for the same reason and is effectively a replacement for writel(), so I have disregarded it.

I am not sure what the policy wrt defconfig files is; they are generated and there is a risk of a conflict resulting from an unrelated change, so I have left changes to them out. The option will get removed from them at the next run.

Some testing with machines other than mine will be needed to avoid some stupid mistake, but despite its volume, the change is not really that intrusive, so I am fairly confident that because it works for me, it will everywhere.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 12 Jul, 2008 1 commit
-
-
Committed by Maciej W. Rozycki
In the course of the recent unification of the NMI watchdog, an assignment to timer_ack to switch off unnecessary POLL commands to the 8259A in the case of a watchdog failure was accidentally removed. The statement used to be limited to the 32-bit variation since, after the rewrite of the timer code, it has been relevant for the 82489DX only. This change brings it back.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 08 Jul, 2008 3 commits
-
-
Committed by Cyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Cyrill Gorcunov
There is no need to keep the NMI_DISABLED definition and use it for nmi_watchdog by default. Here is why:

  - IO-APIC and APIC chips are programmed for nmi_watchdog support at a very early stage of kernel booting, and not having nmi_watchdog specified as a boot option only leads to nmi_watchdog becoming NMI_NONE anyway
  - enabling nmi_watchdog through /proc/sys/kernel/nmi if it was not specified at boot is not possible either (even with this sysfs entry present)

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Cyrill Gorcunov
Since nmi_watchdog is an unsigned variable we may safely remove the check for a negative value.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: macro@linux-mips.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 19 Jun, 2008 1 commit
-
-
Committed by Glauber Costa
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 12 Jun, 2008 1 commit
-
-
Committed by Cyrill Gorcunov
The check 'if nmi_watchdog > 0' (i.e. NMI_NONE) is quite fast but it has a side effect - it's taken even if nmi_watchdog = NMI_DISABLED. Nowadays nmi_watchdog is set to NMI_NONE by default, so this condition is properly taken most of the time, but we had better show this explicitly.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 05 Jun, 2008 2 commits
-
-
Committed by mingo@elte.hu
fix:

  arch/x86/kernel/built-in.o: In function `proc_nmi_enabled':
  : undefined reference to `nmi_watchdog_default'
  arch/x86/kernel/built-in.o: In function `native_smp_prepare_cpus':
  : undefined reference to `nmi_watchdog_default'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Committed by Cyrill Gorcunov
The 64bit mode bootstrap code sets nmi_watchdog to NMI_NONE by default, and doing the same in 32bit mode is safe too. Doing so saves us several #ifdefs.

Btw, my previous commit

  commit 19ec673c
  Author: Cyrill Gorcunov <gorcunov@gmail.com>
  Date: Wed May 28 23:00:47 2008 +0400

      x86: nmi - fix incorrect NMI watchdog used by default

did not fix the problem completely; moreover, it introduced an additional bug - nmi_watchdog would be set to either NMI_LOCAL_APIC or NMI_IO_APIC _regardless_ of the boot option if enabled through /proc/sys/kernel/nmi_watchdog. Sorry for that. Fix it too.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: macro@linux-mips.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 02 Jun, 2008 2 commits
-
-
Committed by Ingo Molnar
apic.h needs to be included for the apic_write_around() definition.
-
Committed by Hiroshi Shimamoto
before
  total: 1 errors, 6 warnings, 534 lines checked
after
  total: 0 errors, 1 warnings, 532 lines checked

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 29 May, 2008 1 commit
-
-
Committed by Cyrill Gorcunov
The commit

  commit 4b82b277
  Author: Cyrill Gorcunov <gorcunov@gmail.com>
  Date: Sat May 24 19:36:35 2008 +0400

set nmi_watchdog to NMI_IO_APIC by default. This causes hangs on some machines with buggy watchdogs. Fix it - i.e. restore the old behaviour.

Thanks to Sitsofe Wheeler and Adrian Bunk for catching the problem and Maciej W. Rozycki for explaining what is going on there.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
CC: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 26 May, 2008 6 commits
-
-
Committed by Cyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Cyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Cyrill Gorcunov
Since cpu_online_map is touched (by for_each_online_cpu) at a moment when cpu_callin_map is already filled in, we can get rid of the check altogether.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Cyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Cyrill Gorcunov
apic_write_around will be expanded to apic_write in 64bit mode anyway. Only a few CPUs (well, old CPUs to be precise) require such an action. In general it should not hurt and could be cleaned up for apic_write (just in case).

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Committed by Cyrill Gorcunov
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: hpa@zytor.com
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-