1. 03 Aug, 2009 1 commit
    • debug lockups: Improve lockup detection, fix generic arch fallback · 47cab6a7
      Ingo Molnar committed
      As Andrew noted, my previous patch ("debug lockups: Improve lockup
      detection") broke/removed SysRq-L support from architectures that do
      not provide a __trigger_all_cpu_backtrace implementation.
      
      Restore a fallback path and clean up the SysRq-L machinery a bit:
      
       - Rename the arch method to arch_trigger_all_cpu_backtrace()
      
       - Simplify the define
      
       - Document the method a bit - in the hope of more architectures
         adding support for it.
      
      [ The patch touches Sparc code for the rename. ]
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      LKML-Reference: <20090802140809.7ec4bb6b.akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
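The fallback idea described above can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the kernel's header: the wrapper `demo_trigger_all_cpu_backtrace()`, the helper `demo_fallback_backtrace()` and the caller `demo_sysrq_l()` are hypothetical names. Only `arch_trigger_all_cpu_backtrace` comes from the commit message, and this sketch deliberately leaves it undefined so the generic branch is compiled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * An architecture with NMI backtrace support would provide the function
 * and announce it with a same-named macro, for example:
 *
 *   void arch_trigger_all_cpu_backtrace(void);
 *   #define arch_trigger_all_cpu_backtrace arch_trigger_all_cpu_backtrace
 *
 * This sketch does not define it, so the #else branch below is used.
 */
#ifdef arch_trigger_all_cpu_backtrace
static inline bool demo_trigger_all_cpu_backtrace(void)
{
	arch_trigger_all_cpu_backtrace();
	return true;		/* arch support was available */
}
#else
static inline bool demo_trigger_all_cpu_backtrace(void)
{
	return false;		/* caller must use another mechanism */
}
#endif

static int demo_fallback_runs;	/* counts uses of the generic path */

/* Hypothetical stand-in for the slower generic backtrace path. */
static void demo_fallback_backtrace(void)
{
	demo_fallback_runs++;
	puts("generic fallback: requesting backtraces without arch support");
}

/* SysRq-L style caller: try the arch method first, then fall back. */
static bool demo_sysrq_l(void)
{
	if (demo_trigger_all_cpu_backtrace())
		return true;

	demo_fallback_backtrace();
	assert(demo_fallback_runs > 0);	/* the fallback really ran */
	return false;
}
```

Returning a bool from the wrapper lets the caller decide what to do when the architecture offers nothing, instead of silently losing the feature.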
  2. 02 Aug, 2009 1 commit
    • debug lockups: Improve lockup detection · c1dc0b9c
      Ingo Molnar committed
      When debugging a recent lockup bug I found various deficiencies
      in how our current lockup detection helpers work:
      
       - SysRq-L is not very efficient as it uses a workqueue, hence
         it cannot punch through hard lockups and cannot see through
         most soft lockups either.
      
       - The SysRq-L code depends on the NMI watchdog - which is off
         by default.
      
       - We don't print backtraces from the RCU code's built-in
         'RCU state machine is stuck' debug code. This debug
         code tends to be one of the first (and only) mechanisms
         that show that a lockup has occurred.
      
      This patch changes the code so that we:
      
       - Trigger the NMI backtrace code from SysRq-L instead of using
         a workqueue (which cannot punch through hard lockups)
      
       - Trigger print-all-CPU-backtraces from the RCU lockup detection
         code
      
      Also decouple the backtrace printing code from the NMI watchdog:
      
       - Don't use variable-size cpumasks (they might not be initialized
         and are a bit more fragile anyway)
      
       - Trigger an NMI immediately via an IPI, instead of waiting
         for the NMI tick to occur. This is a lot faster and can
         produce more relevant backtraces. It will also work if the
         NMI watchdog is disabled.
      
       - Don't print the 'dazed and confused' message when we print
         a backtrace from the NMI
      
       - Do a show_regs() plus a dump_stack() to get maximum info
         out of the dump. Worst-case we get two stacktraces - which
         is not a big deal. Sometimes, if register content is
         corrupted, the precise stack walker in show_regs() won't
         give us a full backtrace - in this case dump_stack() will
         do it.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
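The request/acknowledge flow described above (mark every CPU in a fixed-size mask, send the NMI at once, let each CPU dump and clear its own bit) can be sketched as a single-threaded simulation. Everything prefixed `demo_` is hypothetical, the CPU count is an illustrative constant, and a real implementation would need atomic bit operations and a real NMI IPI rather than a loop.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_NR_CPUS 8	/* illustrative fixed CPU count */

/* Fixed-size, statically allocated mask: valid from the very start. */
static unsigned long demo_backtrace_mask;

static int demo_dumps[DEMO_NR_CPUS];	/* how often each CPU dumped */

/* What each CPU would do in its NMI handler. Returns 1 if it dumped. */
static int demo_nmi_handler(int cpu)
{
	unsigned long bit = 1UL << cpu;

	if (!(demo_backtrace_mask & bit))
		return 0;	/* this NMI was not a backtrace request */

	/* The kernel would call show_regs() plus dump_stack() here. */
	printf("NMI backtrace for cpu %d\n", cpu);
	demo_dumps[cpu]++;

	demo_backtrace_mask &= ~bit;	/* tell the requester we are done */
	return 1;
}

/* Requester: mark the online CPUs, deliver the "NMI", check the mask drained. */
static void demo_trigger_all_cpu_backtrace(unsigned long online)
{
	/* Only bits for existing CPUs may be set. */
	assert((online >> DEMO_NR_CPUS) == 0);

	demo_backtrace_mask = online;

	/* Stand-in for the immediate NMI IPI to every CPU. */
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		demo_nmi_handler(cpu);

	/* Every marked CPU cleared its own bit. */
	assert(demo_backtrace_mask == 0);
}
```

Because the handler only acts when its bit is set, an unrelated NMI falls through to the normal handling path, and the requester never depends on a periodic watchdog tick.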
  3. 30 Jul, 2009 9 commits
  4. 29 Jul, 2009 1 commit
  5. 28 Jul, 2009 2 commits
  6. 27 Jul, 2009 16 commits
  7. 24 Jul, 2009 5 commits
    • [S390] vdso: clock_gettime of CLOCK_THREAD_CPUTIME_ID with noexec=on · 1277580f
      Martin Schwidefsky committed
      The combination of noexec=on and a clock_gettime call with clock id
      CLOCK_THREAD_CPUTIME_ID is broken. The vdso code switches to the
      access register mode to get access to the per-cpu data structure to
      execute the magic ectg instruction. After the ectg instruction the
      code always switches back to the primary mode but for noexec=on the
      correct mode is the secondary mode. The effect of the bug is that the
      user space program loses access to all mappings without PROT_EXEC,
      e.g. the stack. The problem is fixed by restoring the mode that has
      been active before the switch to the access register mode.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
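The difference between "always switch back to primary mode" and "restore the mode that was active before" can be sketched in plain C. This is only a model of the control flow: the real code is s390 assembly in the vdso, and the enum, the mode variable and all `demo_` functions here are hypothetical.

```c
#include <assert.h>

/* Simplified model of the address-space modes named in the message. */
enum demo_asc_mode {
	DEMO_PRIMARY,
	DEMO_SECONDARY,
	DEMO_ACCESS_REGISTER
};

static enum demo_asc_mode demo_current_mode = DEMO_PRIMARY;

/* Stand-in for reading per-cpu data and executing ectg. */
static int demo_read_percpu(void)
{
	/* Only valid while in access register mode. */
	assert(demo_current_mode == DEMO_ACCESS_REGISTER);
	return 42;	/* arbitrary placeholder value */
}

/* Buggy variant: unconditionally returns to primary mode. */
static int demo_cputime_buggy(void)
{
	int value;

	demo_current_mode = DEMO_ACCESS_REGISTER;
	value = demo_read_percpu();
	demo_current_mode = DEMO_PRIMARY;	/* wrong for noexec=on */
	return value;
}

/* Fixed variant: remember the caller's mode and restore exactly that. */
static int demo_cputime_fixed(void)
{
	enum demo_asc_mode saved = demo_current_mode;
	int value;

	demo_current_mode = DEMO_ACCESS_REGISTER;
	value = demo_read_percpu();
	demo_current_mode = saved;
	return value;
}
```

Saving and restoring keeps the function correct for any caller mode, rather than encoding one configuration's expectation.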
    • [S390] vdso: fix per cpu area allocation · 3a6ba460
      Heiko Carstens committed
      The vdso per-cpu area allocation in smp_prepare_cpus() uses GFP_KERNEL
      while irqs are disabled, which triggers this warning:
      
      Badness at kernel/lockdep.c:2280
      Modules linked in:
      CPU: 0 Not tainted 2.6.30 #2
      Process swapper (pid: 1, task: 000000003fe88000, ksp: 000000003fe87eb8)
      Krnl PSW : 0400c00180000000 0000000000083360 (lockdep_trace_alloc+0xec/0xf8)
      [...]
      Call Trace:
      ([<00000000000832b6>] lockdep_trace_alloc+0x42/0xf8)
       [<00000000000b1880>] __alloc_pages_internal+0x3e8/0x5c4
       [<00000000000b1b4a>] __get_free_pages+0x3a/0xb0
       [<0000000000026546>] vdso_alloc_per_cpu+0x6a/0x18c
       [<00000000005eff82>] smp_prepare_cpus+0x322/0x594
       [<00000000005e8232>] kernel_init+0x76/0x398
       [<000000000001bb1e>] kernel_thread_starter+0x6/0xc
       [<000000000001bb18>] kernel_thread_starter+0x0/0xc
      
      Fix this by moving the allocation out of the irqs disabled section.
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
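The shape of this fix (do the possibly sleeping allocation first, then enter the irqs-disabled section) can be sketched with a user-space stand-in. The flag, the warning counter and all `demo_` functions are hypothetical; `malloc()` stands in for a GFP_KERNEL allocation and the counter stands in for the lockdep complaint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static bool demo_irqs_disabled;	/* simulated interrupt state */
static int demo_warnings;	/* simulated lockdep reports */

/* Stand-in for an allocation that is allowed to sleep. */
static void *demo_alloc_may_sleep(size_t size)
{
	if (demo_irqs_disabled)
		demo_warnings++;	/* sleeping with irqs off is a bug */
	return malloc(size);
}

/* Before the fix: the allocation sits inside the irqs-off section. */
static void *demo_prepare_before(void)
{
	void *area;

	assert(!demo_irqs_disabled);	/* entered with irqs enabled */
	demo_irqs_disabled = true;
	area = demo_alloc_may_sleep(64);
	/* ... CPU setup that genuinely needs irqs off ... */
	demo_irqs_disabled = false;
	return area;
}

/* After the fix: allocate first, then disable irqs for the setup only. */
static void *demo_prepare_after(void)
{
	void *area;

	assert(!demo_irqs_disabled);	/* entered with irqs enabled */
	area = demo_alloc_may_sleep(64);
	demo_irqs_disabled = true;
	/* ... CPU setup that genuinely needs irqs off ... */
	demo_irqs_disabled = false;
	return area;
}
```

The caller owns the returned buffer and releases it with `free()`.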
    • [S390] hibernation: fix register corruption on machine checks · c63b196a
      Heiko Carstens committed
      swsusp_arch_suspend() actually saves all cpu register contents on
      hibernation.
      Machine checks must be disabled since swsusp_arch_suspend() stores
      register contents to their lowcore save areas. That's the same
      place where register contents on machine checks would be saved.
      To avoid register corruption disable machine checks.
      We must also disable machine checks in the new psw mask for
      program checks, since swsusp_arch_suspend() may generate program
      checks.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] hibernation: fix lowcore handling · 5f954c34
      Heiko Carstens committed
      Our swsusp_arch_suspend() backend implementation disables prefixing
      by setting the contents of the prefix register to 0.
      However, common code functions that might access percpu data
      structures are called afterwards.
      Since the lowcore contains, e.g., the percpu base pointer, this isn't
      a good idea. Fix this by copying the hibernating cpu's lowcore to
      absolute address zero.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
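Why the copy matters can be shown with a toy model of prefixing: with a non-zero prefix, accesses to the low addresses land in the CPU's own lowcore, and once the prefix is set to 0 they land at absolute zero instead. The memory array, the tiny sizes and all `demo_` names are hypothetical, and the ordering (copy first, then clear the prefix) is an assumption made for this sketch.

```c
#include <assert.h>
#include <string.h>

#define DEMO_LC_SIZE  16u	/* illustrative; a real lowcore is much larger */
#define DEMO_MEM_SIZE 64u

static unsigned char demo_mem[DEMO_MEM_SIZE];	/* simulated absolute storage */
static unsigned int demo_prefix;		/* simulated prefix register */

/* Real to absolute translation: the prefix area and block 0 are swapped. */
static unsigned int demo_real_to_abs(unsigned int real)
{
	assert(real < DEMO_MEM_SIZE);

	if (real < DEMO_LC_SIZE)
		return demo_prefix + real;
	if (real >= demo_prefix && real < demo_prefix + DEMO_LC_SIZE)
		return real - demo_prefix;
	return real;
}

/* Read a lowcore field the way common code would, via a low address. */
static unsigned char demo_read_lowcore(unsigned int offset)
{
	return demo_mem[demo_real_to_abs(offset)];
}

/* Sketch of the fix: copy this CPU's lowcore to absolute zero, then
 * disable prefixing, so low-address reads still see the same data. */
static void demo_disable_prefixing(void)
{
	/* The prefix is aligned, so source and destination never overlap. */
	assert(demo_prefix % DEMO_LC_SIZE == 0);
	assert(demo_prefix + DEMO_LC_SIZE <= DEMO_MEM_SIZE);

	if (demo_prefix != 0)
		memcpy(&demo_mem[0], &demo_mem[demo_prefix], DEMO_LC_SIZE);
	demo_prefix = 0;
}
```

Without the `memcpy()`, the same read after clearing the prefix would return whatever happened to be stored at absolute zero.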
    • x86: geode: Mark mfgpt irq IRQF_TIMER to prevent resume failure · d6c585a4
      Thomas Gleixner committed
      Timer interrupts are excluded from being disabled during suspend. The
      clock events code manages the disabling of clock events on its own
      because the timer interrupt needs to be functional before the resume
      code reenables the device interrupts.
      
      The mfgpt timer requests its interrupt without setting the IRQF_TIMER
      flag, so suspend_device_irqs() disables it as well, which results in a
      fatal resume failure.
      
      Adding IRQF_TIMER to the interrupt flags when requesting the mfgpt
      timer interrupt solves the problem.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      LKML-Reference: <new-submission>
      Cc: Andres Salomon <dilinger@debian.org>
      Cc: stable@kernel.org
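The effect of the flag on the suspend path can be sketched with a small model: interrupts marked as timer interrupts are skipped when device interrupts are disabled. The struct, the flag value `DEMO_IRQF_TIMER` and the function `demo_suspend_device_irqs()` are hypothetical stand-ins and do not reproduce the kernel's data structures or constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_IRQF_TIMER 0x1u	/* illustrative value, not the kernel's */

struct demo_irq {
	const char *name;
	unsigned int flags;	/* flags given when the irq was requested */
	bool enabled;
};

/* Suspend path: disable every device interrupt except timer interrupts. */
static void demo_suspend_device_irqs(struct demo_irq *irqs, size_t n)
{
	assert(irqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (irqs[i].flags & DEMO_IRQF_TIMER)
			continue;	/* left to the clock events code */
		irqs[i].enabled = false;
	}
}
```

An interrupt requested without the flag is treated like any other device interrupt and is switched off, which is the situation the commit message describes for the mfgpt timer.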
  8. 23 Jul, 2009 4 commits
  9. 22 Jul, 2009 1 commit