1. 08 7月, 2008 1 次提交
  2. 19 6月, 2008 1 次提交
    • S
      x86: fix NULL pointer deref in __switch_to · 75118a82
      Suresh Siddha 提交于
      Patrick McHardy reported a crash:
      
      > > I get this oops once a day, its apparently triggered by something
      > > run by cron, but the process is a different one each time.
      > >
      > > Kernel is -git from yesterday shortly before the -rc6 release
      > > (last commit is the usb-2.6 merge, the x86 patches are missing),
      > > .config is attached.
      > >
      > > I'll retry with current -git, but the patches that have gone in
      > > since I last updated don't look related.
      > >
      > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
      > > 000001ff
      > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118
      > > [62060.043009] *pde = 00000000
      > > [62060.043009] Oops: 0002 [#1] PREEMPT
      
      Vegard Nossum analyzed it:
      
      > This decodes to
      >
      >    0:   0f ae 00                fxsave (%eax)
      >
      > so it's related to the floating-point context. This is the exact
      > location of the crash:
      >
      > $ addr2line -e arch/x86/kernel/process_32.o -i ab0
      > include/asm/i387.h:232
      > include/asm/i387.h:262
      > arch/x86/kernel/process_32.c:595
      >
      > ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
      > Or maybe it never had any other value.
      
      Somehow (as described below) TS_USEDFPU is set but the fpu is not
      allocated or freed.
      
      Another possible FPU pre-emption issue with the sleazy FPU optimization
      which was benign before but not so anymore, with the dynamic FPU allocation
      patch.
      
      New task is getting exec'd and it is prempted at the below point.
      
      flush_thread() {
      	...
      	/*
      	* Forget coprocessor state..
      	*/
      	clear_fpu(tsk);
      		<----- Preemption point
      	clear_used_math();
      	...
      }
      
      Now when it context switches in again, as the used_math() is still set
      and fpu_counter can be > 5, we will do a math_state_restore() which sets
      the task's TS_USEDFPU. After it continues from the above preemption point
      it does clear_used_math() and much later free_thread_xstate().
      
      Now, at the next context switch, it is quite possible that xstate is
      null, used_math() is not set and TS_USEDFPU is still set. This will
      trigger unlazy_fpu() causing kernel oops.
      
      Fix this  by clearing tsk's fpu_counter before clearing task's fpu.
      Reported-by: NPatrick McHardy <kaber@trash.net>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      75118a82
  3. 10 6月, 2008 2 次提交
  4. 04 6月, 2008 1 次提交
    • S
      x86, fpu: fix CONFIG_PREEMPT=y corruption of application's FPU stack · 870568b3
      Suresh Siddha 提交于
      Jürgen Mell reported an FPU state corruption bug under CONFIG_PREEMPT,
      and bisected it to commit v2.6.19-1363-gacc20761, "i386: add sleazy FPU
      optimization".
      
      Add tsk_used_math() checks to prevent calling math_state_restore()
      which can sleep in the case of !tsk_used_math(). This prevents
      making a blocking call in __switch_to().
      
      Apparently "fpu_counter > 5" check is not enough, as in some signal handling
      and fork/exec scenarios, fpu_counter > 5 and !tsk_used_math() is possible.
      
      It's a side effect though. This is the failing scenario:
      
      process 'A' in save_i387_ia32() just after clear_used_math()
      
      Got an interrupt and pre-empted out.
      
      At the next context switch to process 'A' again, kernel tries to restore
      the math state proactively and sees a fpu_counter > 0 and !tsk_used_math()
      
      This results in init_fpu() during the __switch_to()'s math_state_restore()
      
      And resulting in fpu corruption which will be saved/restored
      (save_i387_fxsave and restore_i387_fxsave) during the remaining
      part of the signal handling after the context switch.
      Bisected-by: NJürgen Mell <j.mell@t-online.de>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: NJürgen Mell <j.mell@t-online.de>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      870568b3
  5. 27 4月, 2008 1 次提交
    • P
      fix idle (arch, acpi and apm) and lockdep · 7f424a8b
      Peter Zijlstra 提交于
      OK, so 25-mm1 gave a lockdep error which made me look into this.
      
      The first thing that I noticed was the horrible mess; the second thing I
      saw was hacks like: 71e93d15
      
      The problem is that arch idle routines are somewhat inconsitent with
      their IRQ state handling and instead of fixing _that_, we go paper over
      the problem.
      
      So the thing I've tried to do is set a standard for idle routines and
      fix them all up to adhere to that. So the rules are:
      
        idle routines are entered with IRQs disabled
        idle routines will exit with IRQs enabled
      
      Nearly all already did this in one form or another.
      
      Merge the 32 and 64 bit bits so they no longer have different bugs.
      
      As for the actual lockdep warning; __sti_mwait() did a plainly un-annotated
      irq-enable.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: NBob Copeland <me@bobcopeland.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7f424a8b
  6. 25 4月, 2008 1 次提交
  7. 20 4月, 2008 4 次提交
  8. 17 4月, 2008 5 次提交
  9. 11 4月, 2008 1 次提交
  10. 01 3月, 2008 1 次提交
  11. 09 2月, 2008 2 次提交
  12. 04 2月, 2008 1 次提交
  13. 30 1月, 2008 19 次提交