1. 02 Aug, 2009 1 commit
    • debug lockups: Improve lockup detection · c1dc0b9c
      Ingo Molnar committed
      When debugging a recent lockup bug I found various deficiencies
      in how our current lockup detection helpers work:
      
       - SysRq-L is not very efficient as it uses a workqueue, hence
         it cannot punch through hard lockups and cannot see through
         most soft lockups either.
      
       - The SysRq-L code depends on the NMI watchdog - which is off
         by default.
      
       - We don't print backtraces from the RCU code's built-in
         'RCU state machine is stuck' debug code. This debug
         code tends to be one of the first (and only) mechanisms
         that show that a lockup has occurred.
      
      This patch changes the code so that we:
      
       - Trigger the NMI backtrace code from SysRq-L instead of using
         a workqueue (which cannot punch through hard lockups)
      
       - Trigger print-all-CPU-backtraces from the RCU lockup detection
         code
      
      Also decouple the backtrace printing code from the NMI watchdog:
      
       - Don't use variable-size cpumasks (they might not be initialized
         and they are a bit more fragile anyway)
      
       - Trigger an NMI immediately via an IPI, instead of waiting
         for the NMI tick to occur. This is a lot faster and can
         produce more relevant backtraces. It will also work if the
         NMI watchdog is disabled.
      
       - Don't print the 'dazed and confused' message when we print
         a backtrace from the NMI
      
       - Do a show_regs() plus a dump_stack() to get maximum info
         out of the dump. In the worst case we get two stacktraces, which
         is not a big deal. Sometimes, if register content is
         corrupted, the precise stack walker in show_regs() won't
         give us a full backtrace; in this case dump_stack() will
         do it.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
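
The handshake described in the commit message above (a fixed-size pending mask, an immediate broadcast, and each CPU printing its backtrace and then clearing its own bit) can be modelled roughly in user-space C. This is only a single-threaded sketch under stated assumptions, not the kernel code: every `sim_*` name, `SIM_NR_CPUS`, the `responsive` parameter and the poll limit are hypothetical, and real NMI delivery via IPI is replaced by a plain function call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical fixed CPU count; a fixed-size mask needs no allocation or init. */
#define SIM_NR_CPUS 4

/* Bit n set means simulated CPU n still owes us a backtrace. */
static unsigned long sim_backtrace_mask;

/* Stand-ins for the two dump paths named in the commit message. */
static void sim_show_regs(int cpu)
{
    printf("cpu %d: registers and precise stack walk\n", cpu);
}

static void sim_dump_stack(int cpu)
{
    printf("cpu %d: fallback stack dump\n", cpu);
}

/*
 * Model of the per-CPU NMI path: if a backtrace was requested for this CPU,
 * print both dumps and clear the CPU's bit. Returns 1 if a request was
 * handled, 0 if nothing was pending for this CPU.
 */
int sim_nmi_handler(int cpu)
{
    unsigned long bit;

    assert(cpu >= 0 && cpu < SIM_NR_CPUS);
    bit = 1UL << cpu;
    if (!(sim_backtrace_mask & bit))
        return 0;

    printf("NMI backtrace for cpu %d\n", cpu);
    sim_show_regs(cpu);
    sim_dump_stack(cpu);  /* possibly redundant, but covers a failed stack walk */
    sim_backtrace_mask &= ~bit;
    return 1;
}

/*
 * Model of the trigger path: mark every CPU as pending, "deliver" the NMI
 * right away to the CPUs listed in `responsive` (assumption: the others never
 * react), then poll a bounded number of times for the mask to drain.
 * Returns the mask of CPUs that never cleared their bit (0 means all did).
 */
unsigned long sim_trigger_all_cpu_backtrace(unsigned long responsive, int max_polls)
{
    int cpu, poll;

    sim_backtrace_mask = (1UL << SIM_NR_CPUS) - 1;
    printf("sending NMI to all CPUs:\n");

    /* Immediate delivery instead of waiting for a periodic watchdog tick. */
    for (cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
        if (responsive & (1UL << cpu))
            sim_nmi_handler(cpu);
    }

    /* Bounded wait; real code would delay between polls instead of spinning. */
    for (poll = 0; poll < max_polls; poll++) {
        if (sim_backtrace_mask == 0)
            break;
    }

    return sim_backtrace_mask;
}
```

Calling `sim_trigger_all_cpu_backtrace(0xF, 10)` models the case where every CPU answers. Passing a smaller `responsive` mask models CPUs that never react, where the bounded poll keeps the caller from waiting forever.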
  2. 01 Aug, 2009 6 commits
  3. 31 Jul, 2009 25 commits
  4. 30 Jul, 2009 8 commits