1. 19 9月, 2012 8 次提交
  2. 05 9月, 2012 1 次提交
  3. 06 6月, 2012 1 次提交
  4. 17 5月, 2012 1 次提交
    • S
      x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state() · d75f1b39
      Suresh Siddha 提交于
      Code paths like fork(), exit() and signal handling flush the fpu
      state explicitly to the structures in memory.
      
      BUG_ON() in __sanitize_i387_state() is checking that the fpu state
      is not live any more. But for preempt kernels, task can be scheduled
      out and in at any place and the preload_fpu logic during context switch
      can make the fpu registers live again.
      
      For example, consider a 64-bit Task which uses fpu frequently and as such
      you will find its fpu_counter mostly non-zero. During its time slice, kernel
      used fpu by doing kernel_fpu_begin/kernel_fpu_end(). After this, in the same
      scheduling slice, task-A got a signal to handle. Then during the signal
      setup path we got preempted when we are just before the sanitize_i387_state()
      in arch/x86/kernel/xsave.c:save_i387_xstate(). And when we come back we
      will have the fpu registers live that can hit the bug_on.
      
      Similarly during core dump, other threads can context-switch in and out
      (because of spurious wakeups while waiting for the coredump to finish in
       kernel/exit.c:exit_mm()) and the main thread dumping core can run into this
      bug when it finds some other thread with its fpu registers live on some other cpu.
      
      So remove the paranoid check for now, even though it caught a bug in the
      multi-threaded core dump case (fixed in the previous patch).
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Link: http://lkml.kernel.org/r/1336692811-30576-3-git-send-email-suresh.b.siddha@intel.com
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      d75f1b39
  5. 22 2月, 2012 1 次提交
    • L
      i387: Split up <asm/i387.h> into exported and internal interfaces · 1361b83a
      Linus Torvalds 提交于
      While various modules include <asm/i387.h> to get access to things we
      actually *intend* for them to use, most of that header file was really
      pretty low-level internal stuff that we really don't want to expose to
      others.
      
      So split the header file into two: the small exported interfaces remain
      in <asm/i387.h>, while the internal definitions that are only used by
      core architecture code are now in <asm/fpu-internal.h>.
      
      The guiding principle for this was to expose functions that we export to
      modules, and leave them in <asm/i387.h>, while stuff that is used by
      task switching or was marked GPL-only is in <asm/fpu-internal.h>.
      
      The fpu-internal.h file could be further split up too, especially since
      arch/x86/kvm/ uses some of the remaining stuff for its module.  But that
      kvm usage should probably be abstracted out a bit, and at least now the
      internal FPU accessor functions are much more contained.  Even if it
      isn't perhaps as contained as it _could_ be.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      1361b83a
  6. 19 2月, 2012 1 次提交
    • L
      i387: move TS_USEDFPU flag from thread_info to task_struct · f94edacf
      Linus Torvalds 提交于
      This moves the bit that indicates whether a thread has ownership of the
      FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
      (called 'has_fpu') in task_struct->thread.has_fpu.
      
      This fixes two independent bugs at the same time:
      
       - changing 'thread_info->status' from the scheduler causes nasty
         problems for the other users of that variable, since it is defined to
         be thread-synchronous (that's what the "TS_" part of the naming was
         supposed to indicate).
      
         So perfectly valid code could (and did) do
      
      	ti->status |= TS_RESTORE_SIGMASK;
      
         and the compiler was free to do that as separate load, or and store
         instructions.  Which can cause problems with preemption, since a task
         switch could happen in between, and change the TS_USEDFPU bit. The
         change to TS_USEDFPU would be overwritten by the final store.
      
         In practice, this seldom happened, though, because the 'status' field
         was seldom used more than once, so gcc would generally tend to
         generate code that used a read-modify-write instruction and thus
         happened to avoid this problem - RMW instructions are naturally low
         fat and preemption-safe.
      
       - On x86-32, the current_thread_info() pointer would, during interrupts
         and softirqs, point to a *copy* of the real thread_info, because
         x86-32 uses %esp to calculate the thread_info address, and thus the
         separate irq (and softirq) stacks would cause these kinds of odd
         thread_info copy aliases.
      
         This is normally not a problem, since interrupts aren't supposed to
         look at thread information anyway (what thread is running at
         interrupt time really isn't very well-defined), but it confused the
         heck out of irq_fpu_usable() and the code that tried to squirrel
         away the FPU state.
      
         (It also caused untold confusion for us poor kernel developers).
      
      It also turns out that using 'task_struct' is actually much more natural
      for most of the call sites that care about the FPU state, since they
      tend to work with the task struct for other reasons anyway (ie
      scheduling).  And the FPU data that we are going to save/restore is
      found there too.
      
      Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to
      the %esp issue.
      
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Reported-and-tested-by: NRaphael Prevost <raphael@buro.asia>
      Acked-and-tested-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Tested-by: NPeter Anvin <hpa@zytor.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      f94edacf
  7. 17 2月, 2012 2 次提交
    • L
      i387: don't ever touch TS_USEDFPU directly, use helper functions · 6d59d7a9
      Linus Torvalds 提交于
      This creates three helper functions that do the TS_USEDFPU accesses, and
      makes everybody that used to do it by hand use those helpers instead.
      
      In addition, there's a couple of helper functions for the "change both
      CR0.TS and TS_USEDFPU at the same time" case, and the places that do
      that together have been changed to use those.  That means that we have
      fewer random places that open-code this situation.
      
      The intent is partly to clarify the code without actually changing any
      semantics yet (since we clearly still have some hard to reproduce bug in
      this area), but also to make it much easier to use another approach
      entirely to caching the CR0.TS bit for software accesses.
      
      Right now we use a bit in the thread-info 'status' variable (this patch
      does not change that), but we might want to make it a full field of its
      own or even make it a per-cpu variable.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6d59d7a9
    • L
      i387: fix x86-64 preemption-unsafe user stack save/restore · 15d8791c
      Linus Torvalds 提交于
      Commit 5b1cbac3 ("i387: make irq_fpu_usable() tests more robust")
      added a sanity check to the #NM handler to verify that we never cause
      the "Device Not Available" exception in kernel mode.
      
      However, that check actually pinpointed a (fundamental) race where we do
      cause that exception as part of the signal stack FPU state save/restore
      code.
      
      Because we use the floating point instructions themselves to save and
      restore state directly from user mode, we cannot do that atomically with
      testing the TS_USEDFPU bit: the user mode access itself may cause a page
      fault, which causes a task switch, which saves and restores the FP/MMX
      state from the kernel buffers.
      
      This kind of "recursive" FP state save is fine per se, but it means that
      when the signal stack save/restore gets restarted, it will now take the
      '#NM' exception we originally tried to avoid.  With preemption this can
      happen even without the page fault - but because of the user access, we
      cannot just disable preemption around the save/restore instruction.
      
      There are various ways to solve this, including using the
      "enable/disable_page_fault()" helpers to not allow page faults at all
      during the sequence, and fall back to copying things by hand without the
      use of the native FP state save/restore instructions.
      
      However, the simplest thing to do is to just allow the #NM from kernel
      space, but fix the race in setting and clearing CR0.TS that this all
      exposed: the TS bit changes and the TS_USEDFPU bit absolutely have to be
      atomic wrt scheduling, so while the actual state save/restore can be
      interrupted and restarted, the act of actually clearing/setting CR0.TS
      and the TS_USEDFPU bit together must not.
      
      Instead of just adding random "preempt_disable/enable()" calls to what
      is already excessively ugly code, this introduces some helper functions
      that mostly mirror the "kernel_fpu_begin/end()" functionality, just for
      the user state instead.
      
      Those helper functions should probably eventually replace the other
      ad-hoc CR0.TS and TS_USEDFPU tests too, but I'll need to think about it
      some more: the task switching functionality in particular needs to
      expose the difference between the 'prev' and 'next' threads, while the
      new helper functions intentionally were written to only work with
      'current'.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      15d8791c
  8. 18 3月, 2011 1 次提交
  9. 14 12月, 2010 1 次提交
  10. 22 7月, 2010 6 次提交
  11. 21 7月, 2010 1 次提交
  12. 20 7月, 2010 2 次提交
  13. 07 7月, 2010 1 次提交
    • S
      x86: Avoid unnecessary __clear_user() and xrstor in signal handling · 8e221b6d
      Suresh Siddha 提交于
      fxsave/xsave doesn't touch all the bytes in the memory layout used by
      these instructions. Specifically SW reserved (bytes 464..511) fields
      in the fxsave frame and the reserved fields in the xsave header.
      
      To present a clean context for the signal handling, just clear these fields
      instead of clearing the complete fxsave/xsave memory layout, when we dump these
      registers directly to the user signal frame.
      
      Also avoid the call to second xrstor (which inits the state not passed
      in the signal frame) in restore_user_xstate() if all the state has already
      been restored by the first xrstor.
      
      These changes improve the performance of signal handling(by ~3-5% as measured
      by the lat_sig).
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1277249017.2847.85.camel@sbs-t61.sc.intel.com>
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      8e221b6d
  14. 10 6月, 2010 1 次提交
  15. 11 5月, 2010 2 次提交
    • A
      x86: Introduce 'struct fpu' and related API · 86603283
      Avi Kivity 提交于
      Currently all fpu state access is through tsk->thread.xstate.  Since we wish
      to generalize fpu access to non-task contexts, wrap the state in a new
      'struct fpu' and convert existing access to use an fpu API.
      
      Signal frame handlers are not converted to the API since they will remain
      task context only things.
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      86603283
    • A
      x86: Eliminate TS_XSAVE · c9ad4882
      Avi Kivity 提交于
      The fpu code currently uses current->thread_info->status & TS_XSAVE as
      a way to distinguish between XSAVE capable processors and older processors.
      The decision is not really task specific; instead we use the task status to
      avoid a global memory reference - the value should be the same across all
      threads.
      
      Eliminate this tie-in into the task structure by using an alternative
      instruction keyed off the XSAVE cpu feature; this results in shorter and
      faster code, without introducing a global memory reference.
      
      [ hpa: in the future, this probably should use an asm jmp ]
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      c9ad4882
  16. 12 2月, 2010 1 次提交
    • S
      x86, ptrace: regset extensions to support xstate · 5b3efd50
      Suresh Siddha 提交于
      Add the xstate regset support which helps extend the kernel ptrace and the
      core-dump interfaces to support AVX state etc.
      
      This regset interface is designed to support all the future state that gets
      supported using xsave/xrstor infrastructure.
      
      Looking at the memory layout saved by "xsave", one can't say which state
      is represented in the memory layout. This is because if a particular state is
      in init state, in the xsave hdr it can be represented by bit '0'. And hence
      we can't really say by the xsave header wether a state is in init state or
      the state is not saved in the memory layout.
      
      And hence the xsave memory layout available through this regset
      interface uses SW usable bytes [464..511] to convey what state is represented
      in the memory layout.
      
      First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate
      mask(which is same as the 64bit mask returned by the xgetbv's xCR0).
      
      The note NT_X86_XSTATE represents the extended state information in the
      core file, using the above mentioned memory layout.
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com>
      Signed-off-by: NHongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      5b3efd50
  17. 21 4月, 2009 1 次提交
    • S
      x86-64: fix FPU corruption with signals and preemption · 06c38d5e
      Suresh Siddha 提交于
      In 64bit signal delivery path, clear_used_math() was happening before saving
      the current active FPU state on to the user stack for signal handling. Between
      clear_used_math() and the state store on to the user stack, potentially we
      can get a page fault for the user address and can block. Infact, while testing
      we were hitting the might_fault() in __clear_user() which can do a schedule().
      
      At a later point in time, we will schedule back into this process and
      resume the save state (using "xsave/fxsave" instruction) which can lead
      to DNA fault. And as used_math was cleared before, we will reinit the FP state
      in the DNA fault and continue. This reinit will result in loosing the
      FPU state of the process.
      
      Move clear_used_math() to a point after the FPU state has been stored
      onto the user stack.
      
      This issue is present from a long time (even before the xsave changes
      and the x86 merge). But it can easily be exposed in 2.6.28.x and 2.6.29.x
      series because of the __clear_user() in this path, which has an explicit
      __cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY.
      
      [ Impact: fix FPU state corruption ]
      Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org>			[2.6.28.x, 2.6.29.x]
      Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
      06c38d5e
  18. 12 4月, 2009 1 次提交
  19. 31 12月, 2008 1 次提交
  20. 20 11月, 2008 1 次提交
  21. 22 10月, 2008 1 次提交
  22. 12 10月, 2008 1 次提交
    • I
      x86, fpu: check __clear_user() return value · 9f482807
      Ingo Molnar 提交于
      fix warning:
      
        arch/x86/kernel/xsave.c: In function ‘save_i387_xstate’:
        arch/x86/kernel/xsave.c:98: warning: ignoring return value of ‘__clear_user’, declared with attribute warn_unused_result
      
      check the return value and act on it. We should not be ignoring faults
      at this point.
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      9f482807
  23. 08 10月, 2008 2 次提交
  24. 07 9月, 2008 1 次提交