1. 19 Jun 2008, 1 commit
    • x86: fix NULL pointer deref in __switch_to · 75118a82
      Authored by Suresh Siddha
      Patrick McHardy reported a crash:
      
      > > I get this oops once a day, its apparently triggered by something
      > > run by cron, but the process is a different one each time.
      > >
      > > Kernel is -git from yesterday shortly before the -rc6 release
      > > (last commit is the usb-2.6 merge, the x86 patches are missing),
      > > .config is attached.
      > >
      > > I'll retry with current -git, but the patches that have gone in
      > > since I last updated don't look related.
      > >
      > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
      > > 000001ff
      > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
      > > [62060.043009] *pde = 00000000
      > > [62060.043009] Oops: 0002 [#1] PREEMPT
      
      Vegard Nossum analyzed it:
      
      > This decodes to
      >
      >    0:   0f ae 00                fxsave (%eax)
      >
      > so it's related to the floating-point context. This is the exact
      > location of the crash:
      >
      > $ addr2line -e arch/x86/kernel/process_32.o -i ab0
      > include/asm/i387.h:232
      > include/asm/i387.h:262
      > arch/x86/kernel/process_32.c:595
      >
      > ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
      > Or maybe it never had any other value.
      
      Somehow (as described below) TS_USEDFPU is set, but the FPU state is
      either not allocated yet or has already been freed.
      
      This is another possible FPU preemption issue with the sleazy FPU
      optimization, which was benign before but is no longer so with the
      dynamic FPU allocation patch.
      
      A new task is being exec'd and is preempted at the point below.
      
      flush_thread() {
      	...
      	/*
      	 * Forget coprocessor state..
      	 */
      	clear_fpu(tsk);
      		<----- Preemption point
      	clear_used_math();
      	...
      }
      
      Now, when the task is context-switched back in, used_math() is still
      set and fpu_counter can be > 5, so we do a math_state_restore(), which
      sets the task's TS_USEDFPU. When the task then continues from the
      preemption point above, it does clear_used_math() and, much later,
      free_thread_xstate().
      
      At the next context switch it is therefore quite possible that xstate
      is NULL, used_math() is not set, and TS_USEDFPU is still set. This
      triggers unlazy_fpu(), causing the kernel oops.
      
      Fix this by clearing the task's fpu_counter before clearing its FPU
      state.
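
      A minimal sketch of that fix, assuming the 2.6.26-era layout in which
      fpu_counter lives directly in struct task_struct:

      void flush_thread(void)
      {
      	struct task_struct *tsk = current;

      	/* ... */

      	/*
      	 * Zero fpu_counter before dropping the FPU state: if we are
      	 * preempted right after clear_fpu(), the eager-restore
      	 * heuristic in __switch_to() (fpu_counter > 5) can no longer
      	 * call math_state_restore() and set TS_USEDFPU on a task
      	 * whose xstate is about to go away.
      	 */
      	tsk->fpu_counter = 0;
      	clear_fpu(tsk);
      	clear_used_math();

      	/* ... */
      }
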
      Reported-by: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 04 Jun 2008, 1 commit
    • x86, fpu: fix CONFIG_PREEMPT=y corruption of application's FPU stack · 870568b3
      Authored by Suresh Siddha
      Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT,
      and bisected it to commit v2.6.19-1363-gacc20761, "i386: add sleazy FPU
      optimization".
      
      Add tsk_used_math() checks to avoid calling math_state_restore(),
      which can sleep, in the !tsk_used_math() case. This prevents a
      blocking call in __switch_to().
      
      Apparently the "fpu_counter > 5" check alone is not enough: in some
      signal-handling and fork/exec scenarios, fpu_counter > 5 together with
      !tsk_used_math() is possible.
      
      It's a side effect though. This is the failing scenario:
      
      process 'A' in save_i387_ia32() just after clear_used_math()
      
      It takes an interrupt and is preempted.
      
      At the next context switch back to process 'A', the kernel tries to
      restore the math state proactively and sees fpu_counter > 0 and
      !tsk_used_math().
      
      This results in init_fpu() being called from __switch_to()'s
      math_state_restore().
      
      The resulting FPU corruption is then saved/restored (save_i387_fxsave
      and restore_i387_fxsave) during the remaining part of the signal
      handling after the context switch.
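
      A minimal sketch of the guard, as the commit message describes it
      (surrounding context assumed from the 2.6.26-era __switch_to() in
      arch/x86/kernel/process_32.c):

      	/*
      	 * Restore eagerly only for tasks that actually own math state:
      	 * math_state_restore() may call init_fpu(), which can sleep,
      	 * so it must never run for a !tsk_used_math() task.
      	 */
      	if (tsk_used_math(next_p) && next_p->fpu_counter > 5)
      		math_state_restore();
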
      Bisected-by: Jürgen Mell <j.mell@t-online.de>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: Jürgen Mell <j.mell@t-online.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
  3. 27 Apr 2008, 1 commit
    • fix idle (arch, acpi and apm) and lockdep · 7f424a8b
      Authored by Peter Zijlstra
      OK, so 25-mm1 gave a lockdep error which made me look into this.
      
      The first thing that I noticed was the horrible mess; the second thing I
      saw was hacks like: 71e93d15
      
      The problem is that arch idle routines are somewhat inconsistent in
      their IRQ state handling, and instead of fixing _that_, we paper over
      the problem.
      
      So what I've tried to do is set a standard for idle routines and fix
      them all up to adhere to it. The rules are:
      
        idle routines are entered with IRQs disabled
        idle routines will exit with IRQs enabled
      
      Nearly all already did this in one form or another.
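
      As a sketch only (not the exact post-patch default_idle()), an idle
      routine honoring both rules might look like this, using the x86
      safe_halt() helper; example_idle is a hypothetical name:

      static void example_idle(void)
      {
      	/* entered with IRQs disabled */
      	if (!need_resched())
      		safe_halt();		/* "sti; hlt": enable IRQs and halt atomically */
      	else
      		local_irq_enable();	/* nothing to wait for; still honor the exit rule */
      	/* exits with IRQs enabled */
      }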
      
      Merge the 32-bit and 64-bit bits so they no longer have different bugs.
      
      As for the actual lockdep warning: __sti_mwait() did a plainly
      un-annotated IRQ enable.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 25 Apr 2008, 1 commit
  5. 20 Apr 2008, 4 commits
  6. 17 Apr 2008, 5 commits
  7. 11 Apr 2008, 1 commit
  8. 01 Mar 2008, 1 commit
  9. 09 Feb 2008, 2 commits
  10. 04 Feb 2008, 1 commit
  11. 30 Jan 2008, 20 commits
  12. 15 Jan 2008, 1 commit
    • Kick CPUs that might be sleeping in cpu_idle_wait · 40d6a146
      Authored by Steven Rostedt
      Sometimes cpu_idle_wait() gets stuck because it can miss CPUs that are
      already in idle, have no tasks waiting to run, and have no interrupts
      going to them. This is common on bootup when switching cpu_idle
      governors.
      
      This patch gives those CPUs that don't check in an IPI kick.
      
       Background:
       -----------
      I noticed this while developing the mcount patches: every once in a
      while the system would hang. Looking deeper, the hang was always at
      boot up, when registering init_menu of the cpu_idle menu governor.
      Talking with Thomas Gleixner, we discovered that one of the CPUs had
      no timer events scheduled for it and was in idle (running with NO_HZ),
      so that CPU would not set its cpu_idle_state bit.
      
      Hitting sysrq-t a few times would eventually route the interrupt to the
      stuck CPU and the system would continue.
      
      Note: I would have used the PDA's isidle flag, but that is set after
      the cpu_idle_state bit is cleared, which would leave a window open
      where we may miss being kicked.
      
      Hmm, looking closer at this, we still have a small race window between
      clearing cpu_idle_state and disabling interrupts (hence the RFC).
      
          CPU0:                          CPU 1:
        ---------                       ---------
       cpu_idle_wait():                 cpu_idle():
            |                           __get_cpu_var(is_idle) = 1;
            |                           if (__get_cpu_var(cpu_idle_state)) /* == 0 */
       per_cpu(cpu_idle_state, 1) = 1;         |
       if (per_cpu(is_idle, 1)) /* == 1 */     |
       smp_call_function(1)                    |
            |                             receives ipi and runs do_nothing.
       wait on map == empty               idle();
         /* waits forever */
      
      So really we need interrupts off for most of this. One might think we
      could simply clear cpu_idle_state from do_nothing, but since cpu_idle
      governors can be removed, that could open a race in which a governor
      is used after its module has been removed.
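
      A condensed sketch of the kick (the real patch modifies
      cpu_idle_wait() in the x86 process code; kick_sleeping_cpus is a
      hypothetical wrapper, and the 2.6.24-era four-argument
      smp_call_function() signature is assumed):

      static void do_nothing(void *unused)
      {
      	/*
      	 * Empty on purpose: the IPI itself is the payload. It forces a
      	 * sleeping CPU through its idle loop once, so the CPU observes
      	 * and clears its cpu_idle_state bit.
      	 */
      }

      static void kick_sleeping_cpus(void)
      {
      	/* dummy IPI to all other CPUs; wait for the handlers to run */
      	smp_call_function(do_nothing, NULL, 0, 1);
      }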
      
      Venki said:
      
        I think your RFC patch is the right solution here.  As I see it, there is
        no race with your RFC patch.  As long as you call a dummy smp_call_function
        on all CPUs, we should be OK.  We can get rid of cpu_idle_state and the
        current wait-forever logic altogether with the dummy smp_call_function, and
        then there won't be any wait-forever scenario.
      
        The whole point of cpu_idle_wait() is to make all CPUs come out of the
        idle loop at least once.  The caller will use cpu_idle_wait() something
        like this.
      
        // Want to change idle handler
      
        - Switch the global idle handler to the always-present default_idle

        - call cpu_idle_wait() so that all CPUs come out of idle for an instant,
          stop using the old idle pointer, and start using default_idle

        - Change the idle handler to a new handler

        - optional cpu_idle_wait() if you want all CPUs to start using the new
          handler immediately
      
        Maybe the below 1s patch is a safe bet for .24.  But for .25, I would say
        we just replace all the complicated logic with a simple dummy
        smp_call_function and remove cpu_idle_state altogether.
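
      A sketch of that simpler scheme (again assuming the 2.6.24-era
      smp_call_function() signature and the do_nothing() handler above):

      void cpu_idle_wait(void)
      {
      	/* make the new idle-handler pointer visible to all CPUs first */
      	smp_mb();
      	/* one dummy IPI per CPU forces each CPU out of idle at least once */
      	smp_call_function(do_nothing, NULL, 0, 1);
      }
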
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Len Brown <lenb@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  13. 20 Dec 2007, 1 commit