1. 22 Dec, 2011: 2 commits
    • x86: Allow NMIs to hit breakpoints in i386 · ccd49c23
      Steven Rostedt committed
      With i386, NMIs and breakpoints use the current stack and they
      do not reset the stack pointer to a fixed point that might corrupt
      a previous NMI or breakpoint (as it does in x86_64). But NMIs are
      still not made to be re-entrant, so we need to prevent an NMI
      that hits a breakpoint (which does an iret) from allowing
      another NMI to run.
      
      The fix is to let the NMI be in 3 different states:
      
      1) not running
      2) executing
      3) latched
      
      When no NMI is executing on a given CPU, the state is "not running".
      When the first NMI comes in, the state is switched to "executing".
      On exit of that NMI, a cmpxchg is performed to switch the state
      back to "not running" and if that fails, the NMI is restarted.
      
      If a breakpoint is hit and does an iret, which re-enables NMIs,
      and another NMI comes in before the first NMI finished, it will
      detect that the state is not in the "not running" state and the
      current NMI is nested. In this case, the state is switched to "latched"
      to let the interrupted NMI know to restart the NMI handler, and
      the nested NMI exits without doing anything.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Paul Turner <pjt@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      ccd49c23
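The three-state scheme described in ccd49c23 can be sketched in user space roughly as follows. This is a simulation under stated assumptions, not the kernel code: the real state is kept per CPU and the handler is entered by hardware, whereas here `sim_nmi`, `handle_nmi_body`, `handler_runs` and `nested_hook` are hypothetical names, and a nested NMI is imitated by calling `sim_nmi` again from inside the handler body.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* The three states named in the commit message. */
enum nmi_state { NMI_NOT_RUNNING = 0, NMI_EXECUTING, NMI_LATCHED };

/* Assumption: one global state word stands in for the per-CPU state. */
static _Atomic int nmi_state = NMI_NOT_RUNNING;
static int handler_runs;          /* how many times the handler body ran */

/* Hypothetical hook: fires once inside the handler body so that a caller
 * can simulate a second NMI arriving before the first one has finished. */
static void (*nested_hook)(void);

static void handle_nmi_body(void)
{
    handler_runs++;
    if (nested_hook) {
        void (*hook)(void) = nested_hook;
        nested_hook = NULL;
        hook();
    }
}

void sim_nmi(void)
{
    int expected = NMI_NOT_RUNNING;

    /* First NMI: "not running" -> "executing". */
    if (!atomic_compare_exchange_strong(&nmi_state, &expected, NMI_EXECUTING)) {
        /* Nested NMI: mark "latched" and exit without doing anything. */
        atomic_store(&nmi_state, NMI_LATCHED);
        return;
    }

    for (;;) {
        handle_nmi_body();

        /* On exit, try "executing" -> "not running". */
        expected = NMI_EXECUTING;
        if (atomic_compare_exchange_strong(&nmi_state, &expected, NMI_NOT_RUNNING))
            break;

        /* The cmpxchg failed, so a nested NMI latched: restart the handler. */
        assert(expected == NMI_LATCHED);
        atomic_store(&nmi_state, NMI_EXECUTING);
    }
}
```

Setting `nested_hook = sim_nmi` before calling `sim_nmi()` makes the handler body run twice for that call: once for the original NMI and once for the latched one.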
    • x86: Keep current stack in NMI breakpoints · 228bdaa9
      Steven Rostedt committed
      We want to allow NMI handlers to have breakpoints to be able to
      remove stop_machine from ftrace, kprobes and jump_labels. But if
      an NMI interrupts a current breakpoint, and then it triggers a
      breakpoint itself, it will switch to the breakpoint stack and
      corrupt the data on it for the breakpoint processing that it
      interrupted.
      
      Instead, have the NMI check if it interrupted breakpoint processing
      by checking if the stack that is currently used is a breakpoint
      stack. If it is, then load a special IDT that changes the IST
      for the debug exception to keep the same stack in kernel context.
      When the NMI is done, it puts it back.
      
      This way, if the NMI does trigger a breakpoint, it will keep
      using the same stack and not stomp on the breakpoint data for
      the breakpoint it interrupted.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      228bdaa9
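As a rough illustration of the check described in 228bdaa9, the sketch below tests whether the interrupted stack pointer lies inside the breakpoint (debug) stack and, if so, switches to an alternate IDT until the NMI finishes. Everything here is an assumption made for illustration: `struct debug_stack`, `nmi_enter_check`, `nmi_exit_restore` and the `current_idt` variable are hypothetical, and the actual IDT load is replaced by a plain assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for "which IDT is loaded"; the kernel would load a real IDT. */
enum idt_kind { IDT_NORMAL, IDT_NMI };
static enum idt_kind current_idt = IDT_NORMAL;

/* Hypothetical description of the breakpoint stack: top address and size. */
struct debug_stack {
    uintptr_t top;
    size_t size;
};

/* True if sp lies within (top - size, top], i.e. on the breakpoint stack. */
bool is_debug_stack(const struct debug_stack *ds, uintptr_t sp)
{
    assert(ds && ds->size > 0 && ds->top >= ds->size);
    return sp <= ds->top && sp > ds->top - ds->size;
}

/* On NMI entry: if breakpoint processing was interrupted, switch to the
 * IDT whose debug exception keeps using the current stack. */
bool nmi_enter_check(const struct debug_stack *ds, uintptr_t interrupted_sp)
{
    if (!is_debug_stack(ds, interrupted_sp))
        return false;
    current_idt = IDT_NMI;
    return true;
}

/* On NMI exit: put the normal IDT back if it was swapped on entry. */
void nmi_exit_restore(bool swapped)
{
    if (swapped)
        current_idt = IDT_NORMAL;
}
```

The return value of `nmi_enter_check` is carried to `nmi_exit_restore` so that the IDT is only restored by the NMI that actually swapped it.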
  2. 11 Nov, 2011: 1 commit
  3. 10 Nov, 2011: 1 commit
    • x86/mrst: Avoid reporting wrong nmi status · 064a59b6
      Jacob Pan committed
      The Moorestown/Medfield platform does not have port 0x61 to report
      NMI status, nor does it have external NMI sources. The only NMI
      sources are from the lapic, as a result of perf counter overflow or
      IPI, e.g. NMI watchdog or spin lock debug.
      
      Reading port 0x61 on Moorestown returns 0xff, which misleads
      NMI handlers into reporting false critical errors such as memory
      parity errors. The subsequent ioport access for NMI handling can
      also cause undefined behavior on Moorestown.
      
      This patch allows the kernel to process NMIs due to the watchdog or
      a backtrace dump without unnecessary hangs.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      [hand applied]
      Signed-off-by: Alan Cox <alan@linux.intel.com>
      064a59b6
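One way to picture the problem in 064a59b6 is a per-platform hook that supplies the NMI reason byte. The sketch below is an assumption-laden model, not the patch itself: `struct platform_ops`, `nmi_is_critical`, the two hook functions and the simulated `fake_port_0x61` value are hypothetical, and the two bit masks are illustrative stand-ins for the "critical error" bits of the status byte.

```c
#include <assert.h>

/* Illustrative masks for the critical-error bits of the status byte. */
#define NMI_REASON_SERR  0x80
#define NMI_REASON_IOCHK 0x40

typedef unsigned char (*nmi_reason_fn)(void);

/* Assumption: a plain variable simulates what reading port 0x61 returns.
 * On Moorestown the commit message says the read yields 0xff. */
unsigned char fake_port_0x61 = 0xff;

/* PC-style hook: trust whatever the (simulated) port read returns. */
unsigned char default_get_nmi_reason(void)
{
    return fake_port_0x61;
}

/* Moorestown-style hook: there is no port 0x61 and no external NMI
 * source, so report "no reason bits set" without touching the port. */
unsigned char mrst_get_nmi_reason(void)
{
    return 0;
}

struct platform_ops {
    nmi_reason_fn get_nmi_reason;
};

/* Returns 1 if the reason byte claims a critical error, else 0. */
int nmi_is_critical(const struct platform_ops *ops)
{
    assert(ops && ops->get_nmi_reason);
    unsigned char reason = ops->get_nmi_reason();
    return (reason & (NMI_REASON_SERR | NMI_REASON_IOCHK)) != 0;
}
```

With the PC-style hook the bogus 0xff is read as a critical error; with the Moorestown-style hook the NMI falls through to the lapic sources (watchdog, backtrace IPI) instead.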
  4. 01 Nov, 2011: 1 commit
    • x86: Fix files explicitly requiring export.h for EXPORT_SYMBOL/THIS_MODULE · 69c60c88
      Paul Gortmaker committed
      These files were implicitly getting EXPORT_SYMBOL via device.h
      which was including module.h, but that will be fixed up shortly.
      
      By fixing these now, we can avoid seeing things like:
      
      arch/x86/kernel/rtc.c:29: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
      arch/x86/kernel/pci-dma.c:20: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL’
      arch/x86/kernel/e820.c:69: warning: type defaults to ‘int’ in declaration of ‘EXPORT_SYMBOL_GPL’
      
      [ with input from Randy Dunlap <rdunlap@xenotime.net> and also
        from Stephen Rothwell <sfr@canb.auug.org.au> ]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      69c60c88
  5. 10 Oct, 2011: 6 commits
    • x86, nmi, drivers: Fix nmi splitup build bug · d48b0e17
      Ingo Molnar committed
      nmi.c needs an #include <linux/mca.h>:
      
       arch/x86/kernel/nmi.c: In function ‘unknown_nmi_error’:
       arch/x86/kernel/nmi.c:286:6: error: ‘MCA_bus’ undeclared (first use in this function)
       arch/x86/kernel/nmi.c:286:6: note: each undeclared identifier is reported only once for each function it appears in
      
      Another one is the hpwdt driver:
      
       drivers/watchdog/hpwdt.c:507:9: error: ‘NMI_DONE’ undeclared (first use in this function)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d48b0e17
    • x86, nmi: Track NMI usage stats · efc3aac5
      Don Zickus committed
      Now that the NMI handlers are broken into lists, increment the appropriate
      stats for each list.  This allows us to see what is going on when they
      get printed out in the next patch.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-6-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      efc3aac5
    • x86, nmi: Add in logic to handle multiple events and unknown NMIs · b227e233
      Don Zickus committed
      Previous patches allow the NMI subsystem to process multiple NMI events
      in one NMI.  As previously discussed this can cause issues when an event
      triggered another NMI but is processed in the current NMI.  This causes the
      next NMI to go unprocessed and become an 'unknown' NMI.
      
      To handle this, we first have to flag whether or not the NMI handler handled
      more than one event.  If it did, then there exists a chance that
      the next NMI might be already processed.  Once the NMI is flagged as a
      candidate to be swallowed, we next look for a back-to-back NMI condition.
      
      This is determined by looking at the %rip from pt_regs.  If it is the same
      as the previous NMI, it is assumed the cpu did not have a chance to jump
      back into a non-NMI context and execute code and instead handled another NMI.
      
      If both of those conditions are true then we will swallow any unknown NMI.
      
      There still exists a chance that we accidentally swallow a real unknown NMI,
      but for now things seem better.
      
      An optimization has also been added to the nmi notifier routine.  Because x86
      can latch up to one NMI while currently processing an NMI, we don't have to
      worry about executing _all_ the handlers in a standalone NMI.  The idea is
      if multiple NMIs come in, the second NMI will represent them.  For those
      back-to-back NMI cases, we have the potential to drop NMIs.  Therefore only
      execute all the handlers in the second half of a detected back-to-back NMI.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b227e233
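The two conditions described in b227e233 (the previous NMI handled more than one event, and this NMI arrived at the same instruction pointer) can be modelled as a small decision function. This is a minimal sketch under assumptions: `struct nmi_cpu_state`, `classify_nmi` and the `nmi_outcome` values are hypothetical names, the per-CPU variables are replaced by an explicit struct, and the number of events handled is passed in rather than produced by real handlers.

```c
#include <assert.h>
#include <stdbool.h>

enum nmi_outcome { NMI_HANDLED, NMI_SWALLOWED, NMI_UNKNOWN };

/* Assumption: this struct stands in for per-CPU bookkeeping. */
struct nmi_cpu_state {
    unsigned long last_rip; /* instruction pointer seen by the previous NMI */
    bool swallow;           /* previous NMI handled more than one event */
};

enum nmi_outcome classify_nmi(struct nmi_cpu_state *cpu,
                              unsigned long rip, int events_handled)
{
    assert(cpu && events_handled >= 0);

    /* Same rip as last time: the CPU never got back to non-NMI code,
     * so this is treated as a back-to-back NMI. */
    bool back_to_back = (rip == cpu->last_rip);

    /* Any progress in normal code clears the "candidate to swallow" flag. */
    if (!back_to_back)
        cpu->swallow = false;
    cpu->last_rip = rip;

    if (events_handled > 0) {
        /* More than one event handled: the next NMI may already be done. */
        if (events_handled > 1)
            cpu->swallow = true;
        return NMI_HANDLED;
    }

    /* Nobody claimed it: swallow only if both conditions hold. */
    if (back_to_back && cpu->swallow)
        return NMI_SWALLOWED;
    return NMI_UNKNOWN;
}
```

As the commit message notes, this heuristic can still swallow a genuine unknown NMI that happens to arrive back-to-back after a multi-event NMI.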
    • x86, nmi: Wire up NMI handlers to new routines · 9c48f1c6
      Don Zickus committed
      Just convert all the files that have an nmi handler to the new routines.
      Most of it is a straightforward conversion.  A couple of places needed some
      tweaking, like kgdb, which separates the debug notifier from the nmi handler,
      and mce, which removes a call to notify_die.
      
      [Thanks to Ying for finding out the history behind that mce call
      
      https://lkml.org/lkml/2010/5/27/114
      
      And Boris responding that he would like to remove that call because of it
      
      https://lkml.org/lkml/2011/9/21/163]
      
      The things that get converted are the registration/unregistration routines
      and the nmi handler itself has its args changed along with code removal
      to check which list it is on (most are on one NMI list except for kgdb
      which has both an NMI routine and an NMI Unknown routine).
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Corey Minyard <minyard@acm.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9c48f1c6
    • x86, nmi: Create new NMI handler routines · c9126b2e
      Don Zickus committed
      The NMI handlers used to rely on the notifier infrastructure.  This worked
      great until we wanted to support handling multiple events better.
      
      One of the key ideas to the nmi handling is to process _all_ the handlers for
      each NMI.  The reason behind this switch is because NMIs are edge triggered.
      If enough NMIs are triggered, then they could be lost because the cpu can
      only latch at most one NMI (besides the one currently being processed).
      
      In order to deal with this we have decided to process all the NMI handlers
      for each NMI.  This allows the handlers to determine if they received an
      event or not (the ones that can not determine this will be left to fend
      for themselves on the unknown NMI list).
      
      As a result of this change it is now possible to have an extra NMI that
      was destined to be received for an already processed event.  Because the
      event was processed in the previous NMI, this NMI gets dropped and becomes
      an 'unknown' NMI.  This of course will cause printks that scare people.
      
      However, we prefer to have extra NMIs as opposed to losing NMIs and as such
      have developed a basic mechanism to catch most of them.  That will be
      a later patch.
      
      To accomplish this idea, I unhooked the nmi handlers from the notifier
      routines and created a new mechanism loosely based on do_IRQ.  The reason
      for this is the notifier routines have a couple of shortcomings.  First, we
      couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
      NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
      handled in each routine (most only handle one, perf can handle more than one).
      Third, I wanted to eventually display which nmi handlers are registered in
      the system in /proc/interrupts to help see who is generating NMIs.
      
      The patch below just implements the new infrastructure but doesn't wire it up
      yet (that is the next patch).  Its design is based on do_IRQ structs and the
      atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
      (as the notifier routines have soaked it) but it should be double checked in
      case I copied the code wrong.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c9126b2e
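The idea in c9126b2e of running every registered handler for each NMI and summing how many events each one claimed can be sketched with a simple linked list. This is a simplified model under assumptions: `struct nmi_action`, `nmi_list_add`, `nmi_list_run` and `pending_events_handler` are hypothetical names, and the RCU protection mentioned in the commit message is left out entirely, so the sketch is not safe against concurrent registration.

```c
#include <assert.h>
#include <stddef.h>

/* A handler returns the number of events it handled (0 if none). */
typedef int (*nmi_handler_t)(void *ctx);

struct nmi_action {
    nmi_handler_t handler;
    void *ctx;
    const char *name;        /* e.g. for showing who is registered */
    struct nmi_action *next;
};

struct nmi_list {
    struct nmi_action *head;
};

/* Register a handler at the head of the list (no locking in this sketch). */
void nmi_list_add(struct nmi_list *list, struct nmi_action *a)
{
    assert(list && a && a->handler);
    a->next = list->head;
    list->head = a;
}

/* Run _all_ handlers, never stopping early, and total the events handled. */
int nmi_list_run(const struct nmi_list *list)
{
    int handled = 0;

    for (const struct nmi_action *a = list->head; a; a = a->next)
        handled += a->handler(a->ctx);
    return handled;
}

/* Example handler: claims however many events are pending in *ctx. */
int pending_events_handler(void *ctx)
{
    int *pending = ctx;
    int n = *pending;

    *pending = 0;
    return n;
}
```

Because the total is returned rather than a stop/continue code, the caller can tell apart "nothing handled" (an unknown NMI), "one event", and "more than one event", which a NOTIFY_STOP style chain could not express.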
    • x86, nmi: Split out nmi from traps.c · 1d48922c
      Don Zickus committed
      The nmi stuff is changing a lot and adding more functionality.  Split it
      out from the traps.c file so it doesn't continue to pollute that file.
      
      This makes it easier to find and expand all the future nmi related work.
      
      No real functional changes here.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      1d48922c
  6. 18 Feb, 2009: 2 commits
  7. 17 Feb, 2009: 1 commit
  8. 29 Jan, 2009: 1 commit
  9. 18 Jan, 2009: 1 commit
  10. 06 Jan, 2009: 1 commit
  11. 03 Jan, 2009: 1 commit
  12. 31 Oct, 2008: 1 commit
  13. 28 Oct, 2008: 2 commits
  14. 23 Sep, 2008: 1 commit
    • x86, NMI watchdog: setup before enabling NMI watchdog · b3e15bde
      Aristeu Rozanski committed
      There's a small window when NMI watchdog is being set up that if any NMIs
      are triggered, the NMI code will make use of uninitialized wd_ops
      elements:
      	void setup_apic_nmi_watchdog(void *unused)
      	{
      		if (__get_cpu_var(wd_enabled))
      			return;
      
      		/* cheap hack to support suspend/resume */
      		/* if cpu0 is not active neither should the other cpus */
      		if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
      			return;
      
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			/* enable it before to avoid race with handler */
      -->			__get_cpu_var(wd_enabled) = 1;
      -->			if (lapic_watchdog_init(nmi_hz) < 0) {
      (...)
      	asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
      	{
      	(...)
      			if (nmi_watchdog_tick(regs, reason))
      				return;
      (...)
      	notrace __kprobes int
      	nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
      	{
      	(...)
      		if (!__get_cpu_var(wd_enabled))
      			return rc;
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			rc |= lapic_wd_event(nmi_hz);
      (...)
      int lapic_wd_event(unsigned nmi_hz)
      {
      	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
      	u64 ctr;
      
      -->	rdmsrl(wd->perfctr_msr, ctr);
      
      and wd->*_msr will be initialized in each processor-type-specific setup, after
      enabling NMIs for PMIs. Since the counter was just set, the chance of a
      performance-counter-generated NMI is minimal, but any other unknown NMI would
      trigger the problem. This patch fixes the problem by setting everything up
      before enabling performance-counter-generated NMIs and will set wd_enabled
      using a callback function.
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Prarit Bhargava <prarit@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b3e15bde
  15. 15 Aug, 2008: 2 commits
    • x86, nmi: clean UP NMI watchdog failure message · 8bb85190
      Ingo Molnar committed
      clean up the failure message - and redirect people to bugzilla
      instead of lkml.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8bb85190
    • x86, NMI: fix watchdog failure message · 15636668
      Aristeu Rozanski committed
      > it just won't work at boot time - the second logic unit will be stuck:
      >
      > Booting processor 1/2 APIC 0x1
      > Initializing CPU#1
      > Calibrating delay using timer specific routine.. 5586.12 BogoMIPS (lpj=2793063)
      > CPU: Trace cache: 12K uops, L1 D cache: 16K
      > CPU: L2 cache: 1024K
      > CPU: Physical Processor ID: 0
      > CPU: Processor Core ID: 1
      > CPU1: Thermal monitoring enabled (TM1)
      >               Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
      > Brought up 2 CPUs
      > testing NMI watchdog ... <4>WARNING: CPU#1: NMI appears to be stuck (0->0)!
      
      while at it... - fix that newline
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Cc: jvillalo@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      15636668
  16. 20 Jul, 2008: 1 commit
  17. 18 Jul, 2008: 1 commit
    • x86: APIC: remove apic_write_around(); use alternatives · 593f4a78
      Maciej W. Rozycki committed
      Use alternatives to select the workaround for the 11AP Pentium erratum
      for the affected steppings on the fly rather than build time.  Remove the
      X86_GOOD_APIC configuration option and replace all the calls to
      apic_write_around() with plain apic_write(), protecting accesses to the
      ESR as appropriate due to the 3AP Pentium erratum.  Remove
      apic_read_around() and all its invocations altogether as not needed.
      Remove apic_write_atomic() and all its implementing backends.  The use of
      ASM_OUTPUT2() is not strictly needed for input constraints, but I have
      used it for readability's sake.
      
      I had the feeling no one else was brave enough to do it, so I went ahead
      and here it is.  Verified by checking the generated assembly and tested
      with both a 32-bit and a 64-bit configuration, also with the 11AP
      "feature" forced on and verified with gdb on /proc/kcore to work as
      expected (as 11AP machines are quite hard to get hold of these days).
      Some script complained about the use of "volatile", but apic_write() needs
      it for the same reason and is effectively a replacement for writel(), so I
      have disregarded it.
      
      I am not sure what the policy wrt defconfig files is, they are generated
      and there is risk of a conflict resulting from an unrelated change, so I
      have left changes to them out.  The option will get removed from them at
      the next run.
      
      Some testing with machines other than mine will be needed to avoid some
      stupid mistake, but despite its volume, the change is not really that
      intrusive, so I am fairly confident that because it works for me, it will
      everywhere.
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      593f4a78
  18. 12 Jul, 2008: 1 commit
  19. 08 Jul, 2008: 3 commits
  20. 19 Jun, 2008: 1 commit
  21. 12 Jun, 2008: 1 commit
  22. 05 Jun, 2008: 2 commits
    • x86, nmi: fix build · 75b9f5d2
      mingo@elte.hu committed
      fix:
      
      arch/x86/kernel/built-in.o: In function `proc_nmi_enabled':
      : undefined reference to `nmi_watchdog_default'
      arch/x86/kernel/built-in.o: In function `native_smp_prepare_cpus':
      : undefined reference to `nmi_watchdog_default'
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      75b9f5d2
    • x86: nmi - consolidate nmi_watchdog_default for 32bit mode · 3ed3f062
      Cyrill Gorcunov committed
      64bit mode bootstrap code does set nmi_watchdog to NMI_NONE
      by default and doing the same on 32bit mode is safe too.
      Such an action saves us from several #ifdef.
      
      Btw, my previous commit
      
      commit 19ec673c
      Author: Cyrill Gorcunov <gorcunov@gmail.com>
      Date:   Wed May 28 23:00:47 2008 +0400
      
          x86: nmi - fix incorrect NMI watchdog used by default
      
      did not fix the problem completely; moreover, it
      introduced an additional bug: nmi_watchdog would be
      set to either NMI_LOCAL_APIC or NMI_IO_APIC
      _regardless_ of the boot option if enabled through
      /proc/sys/kernel/nmi_watchdog. Sorry for that.
      Fix it too.
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: mingo@redhat.com
      Cc: hpa@zytor.com
      Cc: macro@linux-mips.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      3ed3f062
  23. 02 Jun, 2008: 2 commits
  24. 29 May, 2008: 1 commit
  25. 26 May, 2008: 3 commits