1. 17 4月, 2015 1 次提交
    • B
      x86/fpu: Load xsave pointer *after* initialization · 18ecb3bf
      Borislav Petkov 提交于
      So I was playing with gdb today and did this simple thing:
      
      	gdb /bin/ls
      
      	...
      
      	(gdb) run
      
      Box exploded with this splat:
      
      	BUG: unable to handle kernel NULL pointer dereference at 00000000000001d0
      	IP: [<ffffffff8100fe5a>] xstateregs_get+0x7a/0x120
      	[...]
      
      	Call Trace:
      	 ptrace_regset
      	 ptrace_request
      	 ? wait_task_inactive
      	 ? preempt_count_sub
      	 arch_ptrace
      	 ? ptrace_get_task_struct
      	 SyS_ptrace
      	 system_call_fastpath
      
      ... because we do cache &target->thread.fpu.state->xsave into the
      local variable xsave but that pointer is NULL at that time and
      it gets initialized later, in init_fpu(), see:
      
      	e7f180dc ("x86/fpu: Change xstateregs_get()/set() to use ->xsave.i387 rather than ->fxsave")
      
      The fix is simple: load xsave *after* init_fpu() has run.
      
      Also do the same in xstateregs_set(), as suggested by Oleg Nesterov.
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NOleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1429209697-5902-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
      18ecb3bf
  2. 23 3月, 2015 2 次提交
  3. 10 3月, 2015 2 次提交
  4. 23 2月, 2015 2 次提交
  5. 19 2月, 2015 3 次提交
  6. 04 2月, 2015 1 次提交
  7. 20 1月, 2015 3 次提交
    • O
      x86, fpu: Fix math_state_restore() race with kernel_fpu_begin() · 7575637a
      Oleg Nesterov 提交于
      math_state_restore() can race with kernel_fpu_begin() if irq comes
      right after __thread_fpu_begin(), __save_init_fpu() will overwrite
      fpu->state we are going to restore.
      
      Add 2 simple helpers, kernel_fpu_disable() and kernel_fpu_enable()
      which simply set/clear in_kernel_fpu, and change math_state_restore()
      to exclude kernel_fpu_begin() in between.
      
      Alternatively we could use local_irq_save/restore, but probably these
      new helpers can have more users.
      
      Perhaps they should disable/enable preemption themselves, in this case
      we can remove preempt_disable() in __restore_xstate_sig().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: matt.fleming@intel.com
      Cc: bp@suse.de
      Cc: pbonzini@redhat.com
      Cc: luto@amacapital.net
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Link: http://lkml.kernel.org/r/20150115192028.GD27332@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      7575637a
    • O
      x86, fpu: Don't abuse has_fpu in __kernel_fpu_begin/end() · 33a3ebdc
      Oleg Nesterov 提交于
      Now that we have in_kernel_fpu we can remove __thread_clear_has_fpu()
      in __kernel_fpu_begin(). And this allows to replace the asymmetrical
      and nontrivial use_eager_fpu + tsk_used_math check in kernel_fpu_end()
      with the same __thread_has_fpu() check.
      
      The logic becomes really simple; if _begin() does save() then _end()
      needs restore(), this is controlled by __thread_has_fpu(). Otherwise
      they do clts/stts unless use_eager_fpu().
      
      Not only this makes begin/end symmetrical and imo more understandable,
      potentially this allows to change irq_fpu_usable() to avoid all other
      checks except "in_kernel_fpu".
      
      Also, with this patch __kernel_fpu_end() does restore_fpu_checking()
      and WARNs if it fails instead of math_state_restore(). I think this
      looks better because we no longer need __thread_fpu_begin(), and it
      would be better to report the failure in this case.
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Cc: matt.fleming@intel.com
      Cc: bp@suse.de
      Cc: pbonzini@redhat.com
      Cc: luto@amacapital.net
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Link: http://lkml.kernel.org/r/20150115192005.GC27332@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      33a3ebdc
    • O
      x86, fpu: Introduce per-cpu in_kernel_fpu state · 14e153ef
      Oleg Nesterov 提交于
      interrupted_kernel_fpu_idle() tries to detect if kernel_fpu_begin()
      is safe or not. In particular it should obviously deny the nested
      kernel_fpu_begin() and this logic looks very confusing.
      
      If use_eager_fpu() == T we rely on a) __thread_has_fpu() check in
      interrupted_kernel_fpu_idle(), and b) on the fact that _begin() does
      __thread_clear_has_fpu().
      
      Otherwise we demand that the interrupted task has no FPU if it is in
      kernel mode, this works because __kernel_fpu_begin() does clts() and
      interrupted_kernel_fpu_idle() checks X86_CR0_TS.
      
      Add the per-cpu "bool in_kernel_fpu" variable, and change this code
      to check/set/clear it. This allows to do more cleanups and fixes, see
      the next changes.
      
      The patch also moves WARN_ON_ONCE() under preempt_disable() just to
      make this_cpu_read() look better, this is not really needed. And in
      fact I think we should move it into __kernel_fpu_begin().
      Signed-off-by: NOleg Nesterov <oleg@redhat.com>
      Reviewed-by: NRik van Riel <riel@redhat.com>
      Cc: matt.fleming@intel.com
      Cc: bp@suse.de
      Cc: pbonzini@redhat.com
      Cc: luto@amacapital.net
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Suresh Siddha <sbsiddha@gmail.com>
      Link: http://lkml.kernel.org/r/20150115191943.GB27332@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      14e153ef
  8. 30 5月, 2014 1 次提交
  9. 12 3月, 2014 1 次提交
    • S
      x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU · 731bd6a9
      Suresh Siddha 提交于
      For non-eager fpu mode, thread's fpu state is allocated during the first
      fpu usage (in the context of device not available exception). This
      (math_state_restore()) can be a blocking call and hence we enable
      interrupts (which were originally disabled when the exception happened),
      allocate memory and disable interrupts etc.
      
      But the eager-fpu mode, call's the same math_state_restore() from
      kernel_fpu_end(). The assumption being that tsk_used_math() is always
      set for the eager-fpu mode and thus avoid the code path of enabling
      interrupts, allocating fpu state using blocking call and disable
      interrupts etc.
      
      But the below issue was noticed by Maarten Baert, Nate Eldredge and
      few others:
      
      If a user process dumps core on an ecrypt fs while aesni-intel is loaded,
      we get a BUG() in __find_get_block() complaining that it was called with
      interrupts disabled; then all further accesses to our ecrypt fs hang
      and we have to reboot.
      
      The aesni-intel code (encrypting the core file that we are writing) needs
      the FPU and quite properly wraps its code in kernel_fpu_{begin,end}(),
      the latter of which calls math_state_restore(). So after kernel_fpu_end(),
      interrupts may be disabled, which nobody seems to expect, and they stay
      that way until we eventually get to __find_get_block() which barfs.
      
      For eager fpu, most the time, tsk_used_math() is true. At few instances
      during thread exit, signal return handling etc, tsk_used_math() might
      be false.
      
      In kernel_fpu_end(), for eager-fpu, call math_state_restore()
      only if tsk_used_math() is set. Otherwise, don't bother. Kernel code
      path which cleared tsk_used_math() knows what needs to be done
      with the fpu state.
      Reported-by: NMaarten Baert <maarten-baert@hotmail.com>
      Reported-by: NNate Eldredge <nate@thatsmathematics.com>
      Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSuresh Siddha <sbsiddha@gmail.com>
      Link: http://lkml.kernel.org/r/1391410583.3801.6.camel@europa
      Cc: George Spelvin <linux@horizon.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      731bd6a9
  10. 13 11月, 2013 1 次提交
  11. 27 7月, 2013 1 次提交
    • H
      x86, fpu: correct the asm constraints for fxsave, unbreak mxcsr.daz · eaa5a990
      H.J. Lu 提交于
      GCC will optimize mxcsr_feature_mask_init in arch/x86/kernel/i387.c:
      
      		memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
      		asm volatile("fxsave %0" : : "m" (fx_scratch));
      		mask = fx_scratch.mxcsr_mask;
      		if (mask == 0)
      			mask = 0x0000ffbf;
      
      to
      
      		memset(&fx_scratch, 0, sizeof(struct i387_fxsave_struct));
      		asm volatile("fxsave %0" : : "m" (fx_scratch));
      		mask = 0x0000ffbf;
      
      since asm statement doesn’t say it will update fx_scratch.  As the
      result, the DAZ bit will be cleared.  This patch fixes it. This bug
      dates back to at least kernel 2.6.12.
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      eaa5a990
  12. 15 7月, 2013 1 次提交
    • P
      x86: delete __cpuinit usage from all x86 files · 148f9bb8
      Paul Gortmaker 提交于
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      are flagged as __cpuinit  -- so if we remove the __cpuinit from
      arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      content into no-ops as early as possible, since that will get rid
      of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the arch/x86 uses of the __cpuinit macros from
      all C files.  x86 only had the one __CPUINIT used in assembly files,
      and it wasn't paired off with a .previous or a __FINIT, so we can
      delete it directly w/o any corresponding additional change there.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NH. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      148f9bb8
  13. 07 6月, 2013 1 次提交
  14. 31 5月, 2013 1 次提交
    • P
      x86: Allow FPU to be used at interrupt time even with eagerfpu · 5187b28f
      Pekka Riikonen 提交于
      With the addition of eagerfpu the irq_fpu_usable() now returns false
      negatives especially in the case of ksoftirqd and interrupted idle task,
      two common cases for FPU use for example in networking/crypto.  With
      eagerfpu=off FPU use is possible in those contexts.  This is because of
      the eagerfpu check in interrupted_kernel_fpu_idle():
      
      ...
        * For now, with eagerfpu we will return interrupted kernel FPU
        * state as not-idle. TBD: Ideally we can change the return value
        * to something like __thread_has_fpu(current). But we need to
        * be careful of doing __thread_clear_has_fpu() before saving
        * the FPU etc for supporting nested uses etc. For now, take
        * the simple route!
      ...
       	if (use_eager_fpu())
       		return 0;
      
      As eagerfpu is automatically "on" on those CPUs that also have the
      features like AES-NI this patch changes the eagerfpu check to return 1 in
      case the kernel_fpu_begin() has not been said yet.  Once it has been the
      __thread_has_fpu() will start returning 0.
      
      Notice that with eagerfpu the __thread_has_fpu is always true initially.
      FPU use is thus always possible no matter what task is under us, unless
      the state has already been saved with kernel_fpu_begin().
      
      [ hpa: this is a performance regression, not a correctness regression,
        but since it can be quite serious on CPUs which need encryption at
        interrupt time I am marking this for urgent/stable. ]
      Signed-off-by: NPekka Riikonen <priikone@iki.fi>
      Link: http://lkml.kernel.org/r/alpine.GSO.2.00.1305131356320.18@git.silcnet.org
      Cc: <stable@vger.kernel.org> v3.7+
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5187b28f
  15. 15 11月, 2012 1 次提交
  16. 22 9月, 2012 1 次提交
    • S
      x86, kvm: fix kvm's usage of kernel_fpu_begin/end() · b1a74bf8
      Suresh Siddha 提交于
      Preemption is disabled between kernel_fpu_begin/end() and as such
      it is not a good idea to use these routines in kvm_load/put_guest_fpu()
      which can be very far apart.
      
      kvm_load/put_guest_fpu() routines are already called with
      preemption disabled and KVM already uses the preempt notifier to save
      the guest fpu state using kvm_put_guest_fpu().
      
      So introduce __kernel_fpu_begin/end() routines which don't touch
      preemption and use them instead of kernel_fpu_begin/end()
      for KVM's use model of saving/restoring guest FPU state.
      
      Also with this change (and with eagerFPU model), fix the host cr0.TS vm-exit
      state in the case of VMX. For eagerFPU case, host cr0.TS is always clear.
      So no need to worry about it. For the traditional lazyFPU restore case,
      change the cr0.TS bit for the host state during vm-exit to be always clear
      and cr0.TS bit is set in the __vmx_load_host_state() when the FPU
      (guest FPU or the host task's FPU) state is not active. This ensures
      that the host/guest FPU state is properly saved, restored
      during context-switch and with interrupts (using irq_fpu_usable()) not
      stomping on the active FPU state.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1348164109.26695.338.camel@sbsiddha-desk.sc.intel.com
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      b1a74bf8
  17. 19 9月, 2012 3 次提交
    • S
      x86, fpu: decouple non-lazy/eager fpu restore from xsave · 5d2bd700
      Suresh Siddha 提交于
      Decouple non-lazy/eager fpu restore policy from the existence of the xsave
      feature. Introduce a synthetic CPUID flag to represent the eagerfpu
      policy. "eagerfpu=on" boot paramter will enable the policy.
      Requested-by: NH. Peter Anvin <hpa@zytor.com>
      Requested-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      5d2bd700
    • S
      x86, fpu: use non-lazy fpu restore for processors supporting xsave · 304bceda
      Suresh Siddha 提交于
      Fundamental model of the current Linux kernel is to lazily init and
      restore FPU instead of restoring the task state during context switch.
      This changes that fundamental lazy model to the non-lazy model for
      the processors supporting xsave feature.
      
      Reasons driving this model change are:
      
      i. Newer processors support optimized state save/restore using xsaveopt and
      xrstor by tracking the INIT state and MODIFIED state during context-switch.
      This is faster than modifying the cr0.TS bit which has serializing semantics.
      
      ii. Newer glibc versions use SSE for some of the optimized copy/clear routines.
      With certain workloads (like boot, kernel-compilation etc), application
      completes its work with in the first 5 task switches, thus taking upto 5 #DNA
      traps with the kernel not getting a chance to apply the above mentioned
      pre-load heuristic.
      
      iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit
      and thus will not work correctly in the presence of lazy restore. Non-lazy
      state restore is needed for enabling such features.
      
      Some data on a two socket SNB system:
       * Saved 20K DNA exceptions during boot on a two socket SNB system.
       * Saved 50K DNA exceptions during kernel-compilation workload.
       * Improved throughput of the AVX based checksumming function inside the
         kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts
         pair.
      
      Also now kernel_fpu_begin/end() relies on the patched
      alternative instructions. So move check_fpu() which uses the
      kernel_fpu_begin/end() after alternative_instructions().
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com
      Merge 32-bit boot fix from,
      Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com
      Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Avi Kivity <avi@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      304bceda
    • S
      x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels · 72a671ce
      Suresh Siddha 提交于
      Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied
      to/from the fpstate in the task struct.
      
      And in the case of signal delivery for x86_64 binaries, if the fpstate is live
      in the CPU registers, then the live state is copied directly to the user
      sigframe. Otherwise  fpstate in the task struct is copied to the user sigframe.
      During restore, fpstate in the user sigframe is restored directly to the live
      CPU registers.
      
      Historically, different code paths led to different bugs. For example,
      x86_64 code path was not preemption safe till recently. Also there is lot
      of code duplication for support of new features like xsave etc.
      
      Unify signal handling code paths for x86 and x86_64 kernels.
      
      New strategy is as follows:
      
      Signal delivery: Both for 32/64-bit frames, align the core math frame area to
      64bytes as needed by xsave (this where the main fpu/extended state gets copied
      to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave
      frames). If the state is live, copy the register state directly to the user
      frame. If not live, copy the state in the thread struct to the user frame. And
      for 32-bit [f]xsave frames, construct the fsave header separately before
      the actual [f]xsave area.
      
      Signal return: As the 32-bit frames with [f]xstate has an additional
      'fsave' header, copy everything back from the user sigframe to the
      fpstate in the task structure and reconstruct the fxstate from the 'fsave'
      header (Also user passed pointers may not be correctly aligned for
      any attempt to directly restore any partial state). At the next fpstate usage,
      everything will be restored to the live CPU registers.
      For all the 64-bit frames and the 32-bit fsave frame, restore the state from
      the user sigframe directly to the live CPU registers. 64-bit signals always
      restored the math frame directly, so we can expect the math frame pointer
      to be correctly aligned. For 32-bit fsave frames, there are no alignment
      requirements, so we can restore the state directly.
      
      "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are
      with in the noise range with this change.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com
      [ Merged in compilation fix ]
      Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      72a671ce
  18. 15 5月, 2012 1 次提交
  19. 17 4月, 2012 1 次提交
  20. 22 2月, 2012 2 次提交
  21. 21 7月, 2011 1 次提交
    • P
      treewide: fix potentially dangerous trailing ';' in #defined values/expressions · 497888cf
      Phil Carmody 提交于
      All these are instances of
        #define NAME value;
      or
        #define NAME(params_opt) value;
      
      These of course fail to build when used in contexts like
        if(foo $OP NAME)
        while(bar $OP NAME)
      and may silently generate the wrong code in contexts such as
        foo = NAME + 1;    /* foo = value; + 1; */
        bar = NAME - 1;    /* bar = value; - 1; */
        baz = NAME & quux; /* baz = value; & quux; */
      
      Reported on comp.lang.c,
      Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
      Initial analysis of the dangers provided by Keith Thompson in that thread.
      
      There are many more instances of more complicated macros having unnecessary
      trailing semicolons, but this pile seems to be all of the cases of simple
      values suffering from the problem. (Thus things that are likely to be found
      in one of the contexts above, more complicated ones aren't.)
      Signed-off-by: NPhil Carmody <ext-phil.2.carmody@nokia.com>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      497888cf
  22. 18 3月, 2011 1 次提交
  23. 12 1月, 2011 1 次提交
  24. 10 9月, 2010 3 次提交
  25. 15 8月, 2010 1 次提交
    • X
      KVM: fix poison overwritten caused by using wrong xstate size · f45755b8
      Xiaotian Feng 提交于
      fpu.state is allocated from task_xstate_cachep, the size of task_xstate_cachep
      is xstate_size. xstate_size is set from cpuid instruction, which is often
      smaller than sizeof(struct xsave_struct). kvm is using sizeof(struct xsave_struct)
      to fill in/out fpu.state.xsave, as what we allocated for fpu.state is
      xstate_size, kernel will write out of memory and caused poison/redzone/padding
      overwritten warnings.
      Signed-off-by: NXiaotian Feng <dfeng@redhat.com>
      Reviewed-by: NSheng Yang <sheng@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      f45755b8
  26. 13 8月, 2010 1 次提交
  27. 01 8月, 2010 1 次提交
  28. 22 7月, 2010 1 次提交