1. 08 Jun 2012, 1 commit
    • x86/nmi: Fix section mismatch warnings on 32-bit · eeaaa96a
      Authored by Don Zickus
      It was reported that compiling for 32-bit caused a bunch of
      section mismatch warnings:
      
       VDSOSYM arch/x86/vdso/vdso32-syms.lds
        LD      arch/x86/vdso/built-in.o
        LD      arch/x86/built-in.o
      
       WARNING: arch/x86/built-in.o(.data+0x5af0): Section mismatch in
       reference from the variable test_nmi_ipi_callback_na.10451 to
       the function .init.text:test_nmi_ipi_callback() [...]
      
       WARNING: arch/x86/built-in.o(.data+0x5b04): Section mismatch in
       reference from the variable nmi_unk_cb_na.10399 to the function
       .init.text:nmi_unk_cb() The variable nmi_unk_cb_na.10399
       references the function __init nmi_unk_cb() [...]
      
      Both of these are caused by the internal nmiaction structs
      created by register_nmi_handler().  Those structs are not
      placed in an init section, whereas the rest of the code in
      nmi_selftest.c is.
      
      To resolve this, I created a new #define,
      register_nmi_handler_initonly, that tags the struct as
      __initdata.  This #define should only be used in the rare
      situations where the register/unregister is called during
      init of the kernel.
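
      As a rough sketch (assuming register_nmi_handler() already
      expands to a static struct nmiaction handed to
      __register_nmi_handler(), as described for 72b3fb24 further
      down this log), the init-only variant could look like the
      following; field names are illustrative:

        /* sketch only: same shape as register_nmi_handler(), but the
         * static nmiaction is tagged __initdata so it lands in
         * .init.data next to the __init handler it points at */
        #define register_nmi_handler_initonly(t, fn, fl, n)     \
        ({                                                      \
                static struct nmiaction fn##_na __initdata = {  \
                        .handler = (fn),                        \
                        .name = (n),                            \
                        .flags = (fl),                          \
                };                                              \
                __register_nmi_handler((t), &fn##_na);          \
        })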
      
      Big thanks to Jan Beulich for decoding this for me as I didn't
      have a clue what was going on.
      Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
      Tested-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
      Cc: Jan Beulich <JBeulich@suse.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Link: http://lkml.kernel.org/r/1338991542-23000-1-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 07 May 2012, 1 commit
  3. 25 Apr 2012, 2 commits
    • x86/nmi: Fix page faults by nmiaction if kmemcheck is enabled · 72b3fb24
      Authored by Li Zhong
      This patch fixes page fault exceptions caused by accessing
      the nmiaction structure from NMI context when kmemcheck is
      enabled.
      
      If kmemcheck is enabled, memory allocated through the slab
      allocator is placed in pages that are marked non-present, so
      that some checks can be done in the page fault handling code
      (e.g. whether the memory is read before it is written to).
      
      As nmiaction is allocated this way, it resides in a
      non-present page.  A page fault is then taken while the NMI
      code accesses the nmiaction structure, which triggers the
      warning from WARN_ON_ONCE(in_nmi()) in kmemcheck_fault(),
      called by do_page_fault().
      
      This significantly simplifies the code as well, as the whole
      dynamic allocation dance goes away.
      
      v2: as Peter suggested, changed the nmiaction to use static
          storage.
      
      v3: as Peter suggested, use a macro to shorten the code.  Also
          keep the original usage of register_nmi_handler, so users of
          this call don't need to change.
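
      A hedged sketch of the macro-based approach (field names are
      illustrative; the point is that the nmiaction now has static
      storage and never comes from the slab):

        /* sketch: the nmiaction is static, so kmemcheck never marks
         * its page non-present and the NMI path never faults on it */
        #define register_nmi_handler(t, fn, fl, n)              \
        ({                                                      \
                static struct nmiaction fn##_na = {             \
                        .handler = (fn),                        \
                        .name = (n),                            \
                        .flags = (fl),                          \
                };                                              \
                __register_nmi_handler((t), &fn##_na);          \
        })

        /* callers keep exactly the call they had before, e.g.
         *   register_nmi_handler(NMI_LOCAL, perf_event_nmi_handler, 0, "PMI");
         */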
      Tested-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Fixes: https://lkml.org/lkml/2012/3/2/356
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      [ simplified the wrappers ]
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: thomas.mingarelli@hp.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1333051877-15755-4-git-send-email-dzickus@redhat.com
      [ tidied the patch a bit ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/nmi: Add new NMI queues to deal with IO_CHK and SERR · 553222f3
      Authored by Don Zickus
      In discussions with Thomas Mingarelli about hpwdt, he explained
      to me some issues they were seeing when using their virtual NMI
      button to test the hpwdt driver.
      
      It turns out the virtual NMI button used on HP's machines does
      not send unknown NMIs but instead sends IO_CHK NMIs.  The way
      the kernel code is written, the hpwdt driver cannot register
      itself against that type of NMI and therefore cannot
      successfully capture system information before panicking.
      
      To solve this I created two new NMI queues to allow drivers to
      register against the IO_CHK and SERR NMIs, or in the hpwdt case
      all three (if you include unknown NMIs too).
      
      The change is straightforward and just mimics what the unknown
      NMI does.
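
      From a driver's point of view this could look roughly like the
      sketch below (the handler name and init function are
      illustrative; the NMI type constants mirror the x86 asm/nmi.h
      enum):

        /* NMI types a handler can be registered against */
        enum {
                NMI_LOCAL = 0,
                NMI_UNKNOWN,
                NMI_SERR,       /* new queue: system error NMIs */
                NMI_IO_CHECK,   /* new queue: I/O check NMIs */
                NMI_MAX
        };

        /* illustrative: hpwdt can now hook all three "unclaimed" queues */
        static int __init hpwdt_init_nmi_decoding(void)
        {
                register_nmi_handler(NMI_UNKNOWN,  hpwdt_nmi_handler, 0, "hpwdt");
                register_nmi_handler(NMI_SERR,     hpwdt_nmi_handler, 0, "hpwdt");
                register_nmi_handler(NMI_IO_CHECK, hpwdt_nmi_handler, 0, "hpwdt");
                return 0;
        }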
      Reported-and-tested-by: Thomas Mingarelli <thomas.mingarelli@hp.com>
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1333051877-15755-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 10 Oct 2011, 3 commits
    • x86, nmi: Add in logic to handle multiple events and unknown NMIs · b227e233
      Authored by Don Zickus
      Previous patches allow the NMI subsystem to process multiple NMI events
      in one NMI.  As previously discussed this can cause issues when an event
      triggered another NMI but is processed in the current NMI.  This causes the
      next NMI to go unprocessed and become an 'unknown' NMI.
      
      To handle this, we first have to flag whether the NMI handler handled
      more than one event.  If it did, then there exists a chance that
      the next NMI might be already processed.  Once the NMI is flagged as a
      candidate to be swallowed, we next look for a back-to-back NMI condition.
      
      This is determined by looking at the %rip from pt_regs.  If it is the same
      as in the previous NMI, it is assumed the cpu did not have a chance to jump
      back into a non-NMI context and execute code, and instead handled another NMI.
      
      If both of those conditions are true then we will swallow any unknown NMI.
      
      There still exists a chance that we accidentally swallow a real unknown NMI,
      but for now things seem better.
      
      An optimization has also been added to the nmi notifier routine.  Because x86
      can latch up to one NMI while currently processing an NMI, we don't have to
      worry about executing _all_ the handlers in a standalone NMI.  The idea is
      if multiple NMIs come in, the second NMI will represent them.  For those
      back-to-back NMI cases, we have the potential to drop NMIs.  Therefore only
      execute all the handlers in the second half of a detected back-to-back NMI.
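
      A condensed sketch of that flow (per-cpu variable and helper
      names are close to arch/x86/kernel/nmi.c but trimmed for
      illustration):

        /* in the NMI path: detect a back-to-back NMI via the saved %rip */
        if (regs->ip == __this_cpu_read(last_nmi_rip))
                b2b = true;                     /* no other code ran in between */
        else
                __this_cpu_write(swallow_nmi, false);
        __this_cpu_write(last_nmi_rip, regs->ip);

        handled = nmi_handle(NMI_LOCAL, regs);
        if (handled > 1)                        /* more than one event in this NMI */
                __this_cpu_write(swallow_nmi, true);

        /* in the unknown-NMI path: drop it only if both conditions held */
        if (b2b && __this_cpu_read(swallow_nmi))
                return;                         /* swallow the presumed-extra NMI */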
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-5-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, nmi: Wire up NMI handlers to new routines · 9c48f1c6
      Authored by Don Zickus
      Just convert all the files that have an nmi handler to the new routines.
      Most of it is a straightforward conversion.  A couple of places needed some
      tweaking, like kgdb, which separates the debug notifier from the nmi handler,
      and mce, which removes a call to notify_die.
      
      [Thanks to Ying for finding out the history behind that mce call
      
      https://lkml.org/lkml/2010/5/27/114
      
      And Boris responding that he would like to remove that call because of it
      
      https://lkml.org/lkml/2011/9/21/163]
      
      The things that get converted are the registration/unregistration routines,
      and the nmi handler itself has its args changed along with code removal
      to check which list it is on (most are on one NMI list except for kgdb,
      which has both an NMI routine and an NMI Unknown routine).
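
      A hedged before/after sketch of a typical conversion (the
      handler and its body are illustrative, not taken from any
      particular driver):

        /* old style: a die-notifier that has to decode the notifier value */
        static int my_nmi_notify(struct notifier_block *nb,
                                 unsigned long val, void *data)
        {
                if (val != DIE_NMI)
                        return NOTIFY_DONE;
                /* ... handle the event ... */
                return NOTIFY_STOP;
        }

        /* new style: the handler gets the NMI type and pt_regs directly */
        static int my_nmi_handler(unsigned int type, struct pt_regs *regs)
        {
                /* ... handle the event ... */
                return NMI_HANDLED;     /* NMI_DONE if the event wasn't ours */
        }

        static int __init my_driver_init(void)
        {
                return register_nmi_handler(NMI_LOCAL, my_nmi_handler, 0, "my_driver");
        }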
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Corey Minyard <minyard@acm.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, nmi: Create new NMI handler routines · c9126b2e
      Authored by Don Zickus
      The NMI handlers used to rely on the notifier infrastructure.  This worked
      great until we wanted to support handling multiple events better.
      
      One of the key ideas to the nmi handling is to process _all_ the handlers for
      each NMI.  The reason behind this switch is because NMIs are edge triggered.
      If enough NMIs are triggered, then they could be lost because the cpu can
      only latch at most one NMI (besides the one currently being processed).
      
      In order to deal with this we have decided to process all the NMI handlers
      for each NMI.  This allows the handlers to determine if they received an
      event or not (the ones that can not determine this will be left to fend
      for themselves on the unknown NMI list).
      
      As a result of this change it is now possible to have an extra NMI that
      was destined to be received for an already processed event.  Because the
      event was processed in the previous NMI, this NMI gets dropped and becomes
      an 'unknown' NMI.  This of course will cause printks that scare people.
      
      However, we prefer to have extra NMIs as opposed to losing NMIs, and as such
      have developed a basic mechanism to catch most of them.  That will be
      a later patch.
      
      To accomplish this idea, I unhooked the nmi handlers from the notifier
      routines and created a new mechanism loosely based on doIRQ.  The reason
      for this is the notifier routines have a couple of shortcomings.  First, we
      couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
      NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
      handled in each routine (most only handle one, perf can handle more than one).
      Third, I wanted to eventually display which nmi handlers are registered in
      the system in /proc/interrupts to help see who is generating NMIs.
      
      The patch below just implements the new infrastructure but doesn't wire it up
      yet (that is the next patch).  Its design is based on doIRQ structs and the
      atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
      (as the notifier routines have soaked it) but it should be double checked in
      case I copied the code wrong.
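
      The core of the new infrastructure boils down to something like
      the sketch below (struct fields abbreviated; the real code lives
      in arch/x86/kernel/nmi.c):

        struct nmiaction {
                struct list_head        list;
                nmi_handler_t           handler;
                unsigned long           flags;
                const char              *name;
        };

        /* run every registered handler for this NMI type and count the hits */
        static int nmi_handle(unsigned int type, struct pt_regs *regs)
        {
                struct nmi_desc *desc = nmi_to_desc(type);
                struct nmiaction *a;
                int handled = 0;

                rcu_read_lock();
                list_for_each_entry_rcu(a, &desc->head, list)
                        handled += a->handler(type, regs);
                rcu_read_unlock();

                return handled; /* 0 -> nobody claimed it, falls through to "unknown" */
        }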
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 18 Mar 2011, 1 commit
  6. 18 Feb 2011, 1 commit
  7. 07 Jan 2011, 1 commit
  8. 23 Dec 2010, 1 commit
    • x86, nmi_watchdog: Remove ARCH_HAS_NMI_WATCHDOG and rely on CONFIG_HARDLOCKUP_DETECTOR · 4a7863cc
      Authored by Don Zickus
      The x86 arch has shifted its use of the nmi_watchdog from a
      local implementation to the global one provided by
      kernel/watchdog.c.  This shift has caused a whole bunch of
      compile problems under different config options.  I attempt to
      simplify things with the patch below.
      
      In order to simplify things, I had to come to terms with the
      meaning of two terms ARCH_HAS_NMI_WATCHDOG and
      CONFIG_HARDLOCKUP_DETECTOR.  Basically they mean the same thing,
      the former on a local level and the latter on a global level.
      
      With the old x86 nmi watchdog gone, there is no need to rely on
      defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
      make sense any more.  x86 will now use the global
      implementation.
      
      The changes below do a few things.  First it changes the few
      places that relied on ARCH_HAS_NMI_WATCHDOG to use
      CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
      anyway, so nothing unusual here).  Those pieces of code were
      relying more on local apic functionality than on nmi watchdog
      functionality, so the change should make sense.
      
      Second, I removed the x86 implementation of
      touch_nmi_watchdog().  It isn't needed now; instead x86 will
      rely on kernel/watchdog.c's implementation.
      
      Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
      x86, and tweaked the include/linux/nmi.h file to tell users to
      look for an externally defined touch_nmi_watchdog in the case of
      ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR.  This
      change removes some of the ugliness in that file.
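
      The resulting arrangement in include/linux/nmi.h looks roughly
      like this (a sketch, not necessarily the exact header):

        #if defined(ARCH_HAS_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
        #include <asm/nmi.h>
        extern void touch_nmi_watchdog(void);   /* arch or kernel/watchdog.c */
        #else
        static inline void touch_nmi_watchdog(void)
        {
                /* no NMI watchdog: at least keep the softlockup detector quiet */
                touch_softlockup_watchdog();
        }
        #endif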
      
      Finally, I added a Kconfig dependency for
      CONFIG_HARDLOCKUP_DETECTOR that said you can't have
      ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR.  You can
      only have one nmi_watchdog.
      
      Tested with:
        ARCH=i386:   allnoconfig, defconfig, allyesconfig, (various broken configs)
        ARCH=x86_64: allnoconfig, defconfig, allyesconfig, (various broken configs)
      
      Hopefully, after this patch I won't get any more compile broken
      emails. :-)
      
      v3:
        changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
        prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: fweisbec@gmail.com
      LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 18 Nov 2010, 2 commits
    • x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog · 072b198a
      Authored by Don Zickus
      Now that the bulk of the old nmi_watchdog is gone, remove all
      the stub variables and hooks associated with it.
      
      This touches lots of files mainly because of how the io_apic
      nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
      is forever gone, remove all its fingers.
      
      Most of this code was not being exercised by virtue of
      nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything too
      risky here.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, nmi_watchdog: Remove the old nmi_watchdog · 5f2b0ba4
      Authored by Don Zickus
      Now that we have a new nmi_watchdog that is more generic and
      sits on top of the perf subsystem, we really do not need the old
      nmi_watchdog any more.
      
      In addition, the old nmi_watchdog doesn't really work if you are
      using the default clocksource, hpet.  The old nmi_watchdog code
      relied on local apic interrupts to determine if the cpu is still
      alive.  With hpet as the clocksource, these interrupts don't
      increment any more and the old nmi_watchdog triggers false
      positives.
      
      This piece removes the old nmi_watchdog code and stubs out any
      variables and function calls.  The stubs are the same ones used
      by the new nmi_watchdog code, so it should be well tested.
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 13 May 2010, 1 commit
    • lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Authored by Don Zickus
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      kernel/watchdog.c.
      
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      
      To detect hardlockups, cpus not responding to interrupts, I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
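
      A simplified sketch of that check, assuming a per-cpu count of
      hrtimer interrupts that the perf NMI callback samples (names
      mirror kernel/watchdog.c but are abbreviated):

        static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
        static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved);

        /* called from the perf NMI overflow handler */
        static int is_hardlockup(void)
        {
                unsigned long hrint = __this_cpu_read(hrtimer_interrupts);

                /* hrtimer has not fired since the last NMI sample: IRQs are stuck */
                if (__this_cpu_read(hrtimer_interrupts_saved) == hrint)
                        return 1;

                __this_cpu_write(hrtimer_interrupts_saved, hrint);
                return 0;
        }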
      
      To detect softlockups, tasks not yielding to the scheduler, I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled, neither is anyone else, and the
      warning is printed to the console.
      
      I tested this on x86_64 and both the softlockup and hardlockup paths
      work.
      
      V2:
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - separated out the softlockup case from the perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      
      V3:
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      
      V4:
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      
      V5:
      - split apart warn flags for hard and soft lockups
      
      TODO:
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      
      [fweisbec: merged conflict patch]
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
  11. 14 Feb 2010, 1 commit
  12. 28 Dec 2009, 1 commit
  13. 24 Sep 2009, 1 commit
  14. 03 Aug 2009, 1 commit
    • debug lockups: Improve lockup detection, fix generic arch fallback · 47cab6a7
      Authored by Ingo Molnar
      As Andrew noted, my previous patch ("debug lockups: Improve lockup
      detection") broke/removed SysRq-L support from architectures that do
      not provide a __trigger_all_cpu_backtrace implementation.
      
      Restore a fallback path and clean up the SysRq-L machinery a bit:
      
       - Rename the arch method to arch_trigger_all_cpu_backtrace()
      
       - Simplify the define
      
       - Document the method a bit - in the hope of more architectures
         adding support for it.
      
      [ The patch touches Sparc code for the rename. ]
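
      The fallback boils down to something like this in
      include/linux/nmi.h (a sketch; the arch defines
      arch_trigger_all_cpu_backtrace as a macro when it has an
      implementation):

        #ifdef arch_trigger_all_cpu_backtrace
        static inline bool trigger_all_cpu_backtrace(void)
        {
                arch_trigger_all_cpu_backtrace();
                return true;
        }
        #else
        static inline bool trigger_all_cpu_backtrace(void)
        {
                /* generic fallback: SysRq-L still works, the caller
                 * dumps the stacks itself */
                return false;
        }
        #endif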
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 03 Jul 2009, 1 commit
  16. 07 Jun 2009, 1 commit
  17. 23 Oct 2008, 2 commits
  18. 13 Oct 2008, 1 commit
  19. 23 Sep 2008, 1 commit
    • x86, NMI watchdog: setup before enabling NMI watchdog · b3e15bde
      Authored by Aristeu Rozanski
      There's a small window while the NMI watchdog is being set up: if any NMIs
      are triggered during it, the NMI code will make use of uninitialized wd_ops
      elements:
      	void setup_apic_nmi_watchdog(void *unused)
      	{
      		if (__get_cpu_var(wd_enabled))
      			return;
      
      		/* cheap hack to support suspend/resume */
      		/* if cpu0 is not active neither should the other cpus */
      		if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
      			return;
      
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			/* enable it before to avoid race with handler */
      -->			__get_cpu_var(wd_enabled) = 1;
      -->			if (lapic_watchdog_init(nmi_hz) < 0) {
      (...)
      	asmlinkage notrace __kprobes void default_do_nmi(struct pt_regs *regs)
      	{
      	(...)
      			if (nmi_watchdog_tick(regs, reason))
      				return;
      (...)
      	notrace __kprobes int
      	nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
      	{
      	(...)
      		if (!__get_cpu_var(wd_enabled))
      			return rc;
      		switch (nmi_watchdog) {
      		case NMI_LOCAL_APIC:
      			rc |= lapic_wd_event(nmi_hz);
      (...)
      int lapic_wd_event(unsigned nmi_hz)
      {
      	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
      	u64 ctr;
      
      -->	rdmsrl(wd->perfctr_msr, ctr);
      
      and wd->*_msr will only be initialized in the processor-type-specific setup,
      after enabling NMIs for PMIs.  Since the counter was just set, the chances of
      a performance counter generated NMI are minimal, but any other unknown NMI
      would trigger the problem.  This patch fixes the problem by setting everything
      up before enabling performance counter generated NMIs, and sets wd_enabled
      using a callback function.
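
      In other words, the ordering becomes roughly the following (a
      sketch; in the real patch the final enable happens via a
      callback from the processor-specific setup code):

      	switch (nmi_watchdog) {
      	case NMI_LOCAL_APIC:
      		/* initialize wd_ops and the per-cpu MSR state first ... */
      		if (lapic_watchdog_init(nmi_hz) < 0) {
      			__get_cpu_var(wd_enabled) = 0;
      			return;
      		}
      		/* ... and only mark the watchdog enabled once setup is done */
      		__get_cpu_var(wd_enabled) = 1;
      (...)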
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Acked-by: Prarit Bhargava <prarit@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 23 Jul 2008, 1 commit
    • x86: consolidate header guards · 77ef50a5
      Authored by Vegard Nossum
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers is turned into single
         underscores.
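
      For example, under these rules include/asm-x86/nmi.h ends up
      with a guard along the lines of (illustrative, derived from the
      three rules above):

        /* include/asm-x86/nmi.h */
        #ifndef ASM_X86__NMI_H
        #define ASM_X86__NMI_H
        /* ... declarations ... */
        #endif /* ASM_X86__NMI_H */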
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
  21. 08 Jul 2008, 3 commits
  22. 12 Jun 2008, 1 commit
  23. 05 Jun 2008, 1 commit
    • x86: nmi - consolidate nmi_watchdog_default for 32bit mode · 3ed3f062
      Authored by Cyrill Gorcunov
      The 64-bit bootstrap code sets nmi_watchdog to NMI_NONE
      by default, and doing the same in 32-bit mode is safe too.
      Such an action saves us from several #ifdefs.
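
      A sketch of the consolidated default (assuming nmi_watchdog is
      the shared mode selector used by both 32-bit and 64-bit code):

        /* one shared default for 32-bit and 64-bit, no #ifdef needed */
        int nmi_watchdog = NMI_NONE;

        /* the "nmi_watchdog=" boot option or /proc/sys/kernel/nmi_watchdog
         * then switches it to NMI_LOCAL_APIC or NMI_IO_APIC explicitly */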
      
      Btw, my previous commit
      
      commit 19ec673c
      Author: Cyrill Gorcunov <gorcunov@gmail.com>
      Date:   Wed May 28 23:00:47 2008 +0400
      
          x86: nmi - fix incorrect NMI watchdog used by default
      
      did not fix the problem completely; moreover, it
      introduced an additional bug: nmi_watchdog would be
      set to either NMI_LOCAL_APIC or NMI_IO_APIC
      _regardless_ of the boot option if being enabled through
      /proc/sys/kernel/nmi_watchdog.  Sorry for that.
      Fix it too.
      Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: mingo@redhat.com
      Cc: hpa@zytor.com
      Cc: macro@linux-mips.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  24. 30 May 2008, 1 commit
  25. 29 May 2008, 1 commit
  26. 26 May 2008, 2 commits
  27. 25 May 2008, 2 commits
  28. 17 Apr 2008, 2 commits
  29. 11 Oct 2007, 1 commit