1. 20 Jul, 2010 2 commits
    • x86, xsave: Use xsaveopt in context-switch path when supported · 6bad06b7
      Suresh Siddha authored
      xsaveopt is a more optimized form of xsave specifically designed
      for the context switch usage. xsaveopt doesn't save the state that's not
      modified from the prior xrstor. And if a specific feature state gets
      modified to the init state, then xsaveopt just updates the header bit
      in the xsave memory layout without updating the corresponding memory
      layout.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100719230205.604014179@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      6bad06b7
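The two optimizations described in this commit message can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: `toy_cpu`, `toy_xsave_area`, `toy_xsaveopt`, the component count, and the "all zero means init state" rule are hypothetical names and simplifications, not the hardware behavior or any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NCOMP     3   /* hypothetical number of state components */
#define TOY_COMP_SIZE 16  /* hypothetical size of each component */

/* Toy save area: a header bitmap plus one slot per component. */
struct toy_xsave_area {
    uint64_t xstate_bv;                       /* bit i set: slot i holds live data */
    uint8_t  data[TOY_NCOMP][TOY_COMP_SIZE];
};

/* Toy register file with a per-component "modified since last restore" flag. */
struct toy_cpu {
    uint8_t regs[TOY_NCOMP][TOY_COMP_SIZE];
    bool    modified[TOY_NCOMP];
};

/* Assumption for this toy: the init state of every component is all zeros. */
static bool toy_in_init_state(const uint8_t *comp)
{
    for (int i = 0; i < TOY_COMP_SIZE; i++)
        if (comp[i] != 0)
            return false;
    return true;
}

/* Returns how many components were actually written to memory. */
static int toy_xsaveopt(const struct toy_cpu *cpu, struct toy_xsave_area *area)
{
    int written = 0;

    assert(cpu != NULL && area != NULL);
    for (int i = 0; i < TOY_NCOMP; i++) {
        if (toy_in_init_state(cpu->regs[i])) {
            /* Init state: clear the header bit only, leave the slot untouched. */
            area->xstate_bv &= ~(1ULL << i);
            continue;
        }
        if (!cpu->modified[i])
            continue;  /* unchanged since the last restore: memory is current */
        memcpy(area->data[i], cpu->regs[i], TOY_COMP_SIZE);
        area->xstate_bv |= 1ULL << i;
        written++;
    }
    return written;
}
```

Note that a component in the init state leaves whatever bytes were previously in its slot, which is exactly the stale-data situation the following commit addresses.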
    • x86, xsave: Sync xsave memory layout with its header for user handling · 29104e10
      Suresh Siddha authored
      With xsaveopt, if a processor implementation discerns that a processor state
      component is in its initialized state, it may set the corresponding bit in
      xsave_hdr.xstate_bv to '0' without modifying the corresponding memory
      layout. Hence, while presenting the xstate information to the user, we always
      ensure that the memory layout of a feature will be in the init state if the
      corresponding header bit is zero. This ensures consistency and avoids the
      condition of the user seeing some stale state in the memory layout during
      signal handling, debugging, etc.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100719230205.351459480@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      29104e10
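The synchronization step described above can be sketched as a small user-space function. The names `toy_xsave_area` and `toy_sync_with_header`, the component sizes, and the all-zero init state are assumptions made for illustration; real init states are not all zeros for every component, and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_NCOMP     3   /* hypothetical number of state components */
#define TOY_COMP_SIZE 16  /* hypothetical size of each component */

struct toy_xsave_area {
    uint64_t xstate_bv;                       /* bit i clear: component i is in init state */
    uint8_t  data[TOY_NCOMP][TOY_COMP_SIZE];
};

/*
 * Before handing the area to user space, force every component whose
 * header bit is zero back to its init state (all zeros in this toy),
 * so that no stale bytes are visible. Returns the number of slots reset.
 */
static int toy_sync_with_header(struct toy_xsave_area *area)
{
    int reset = 0;

    assert(area != NULL);
    for (int i = 0; i < TOY_NCOMP; i++) {
        if (area->xstate_bv & (1ULL << i))
            continue;  /* header says the slot is live: keep it */
        memset(area->data[i], 0, TOY_COMP_SIZE);
        reset++;
    }
    return reset;
}
```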
  2. 12 May, 2010 1 commit
  3. 11 May, 2010 3 commits
    • x86, fpu: Use the proper asm constraint in use_xsave() · dce8bf4e
      H. Peter Anvin authored
      The proper constraint for a receiving 8-bit variable is "=qm", not
      "=g" which equals "=rim"; even though the "i" will never match, bugs
      can and do happen due to the difference between "q" and "r".
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
      dce8bf4e
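The constraint issue can be shown with a small, self-contained example. On 32-bit x86 only the a, b, c and d registers have byte-sized subregisters, so an 8-bit output must use "q" (or memory), whereas "r" could pick a register with no 8-bit form. The function `add_u32_carry` is a hypothetical illustration, not the kernel's use_xsave(); it falls back to plain C on non-x86 compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Adds two 32-bit values, stores the wrapped sum in *sum and returns the
 * carry flag as an 8-bit value. The carry is received through "=qm":
 * a byte-addressable register or a memory location.
 */
static inline uint8_t add_u32_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
    uint8_t carry;

    assert(sum != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("addl %2, %1\n\t"   /* a += b, sets CF on unsigned overflow */
            "setc %0"           /* write CF into the 8-bit destination */
            : "=qm" (carry), "+r" (a)
            : "r" (b)
            : "cc");
#else
    a += b;                     /* portable fallback with the same result */
    carry = (uint8_t)(a < b);
#endif
    *sum = a;
    return carry;
}
```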
    • x86: Introduce 'struct fpu' and related API · 86603283
      Avi Kivity authored
      Currently all fpu state access is through tsk->thread.xstate.  Since we wish
      to generalize fpu access to non-task contexts, wrap the state in a new
      'struct fpu' and convert existing access to use an fpu API.
      
      Signal frame handlers are not converted to the API since they will remain
      task context only things.
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      86603283
    • x86: Eliminate TS_XSAVE · c9ad4882
      Avi Kivity authored
      The fpu code currently uses current->thread_info->status & TS_XSAVE as
      a way to distinguish between XSAVE capable processors and older processors.
      The decision is not really task specific; instead we use the task status to
      avoid a global memory reference - the value should be the same across all
      threads.
      
      Eliminate this tie-in into the task structure by using an alternative
      instruction keyed off the XSAVE cpu feature; this results in shorter and
      faster code, without introducing a global memory reference.
      
      [ hpa: in the future, this probably should use an asm jmp ]
      Signed-off-by: Avi Kivity <avi@redhat.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      c9ad4882
  4. 12 Feb, 2010 1 commit
    • x86, ptrace: regset extensions to support xstate · 5b3efd50
      Suresh Siddha authored
      Add the xstate regset support which helps extend the kernel ptrace and the
      core-dump interfaces to support AVX state etc.
      
      This regset interface is designed to support all the future state that gets
      supported using xsave/xrstor infrastructure.
      
      Looking at the memory layout saved by "xsave", one can't say which state
      is represented in the memory layout. This is because if a particular state is
      in init state, in the xsave hdr it can be represented by bit '0'. And hence
      we can't really say from the xsave header whether a state is in init state or
      the state is not saved in the memory layout.

      And hence the xsave memory layout available through this regset
      interface uses SW usable bytes [464..511] to convey what state is represented
      in the memory layout.

      The first 8 bytes of the sw_usable_bytes[464..471] will be set to the OS-enabled xstate
      mask (which is the same as the 64-bit mask returned by xgetbv for XCR0).
      
      The note NT_X86_XSTATE represents the extended state information in the
      core file, using the above mentioned memory layout.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com>
      Signed-off-by: Hongjiu Lu <hjl.tools@gmail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      5b3efd50
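A consumer of this regset or core-file note could read the enabled-state mask roughly as follows. This is a minimal sketch: the helper names `xstate_mask_from_layout` and `xstate_has_feature` are hypothetical, and the only facts taken from the commit message are the offset 464 and the 8-byte width of the mask. The bytes are assembled in little-endian order because x86 stores them that way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SW_USABLE_OFFSET 464  /* start of the SW usable bytes in the layout */

/*
 * Reads the 64-bit xstate mask stored in the first 8 SW usable bytes.
 * Returns 0 if the buffer is too short to contain the mask.
 */
static uint64_t xstate_mask_from_layout(const uint8_t *buf, size_t len)
{
    uint64_t mask = 0;

    assert(buf != NULL);
    if (len < SW_USABLE_OFFSET + 8)
        return 0;
    for (int i = 0; i < 8; i++)
        mask |= (uint64_t)buf[SW_USABLE_OFFSET + i] << (8 * i);
    return mask;
}

/* Tests one XCR0-style feature bit (bit 0: x87, bit 1: SSE, bit 2: AVX). */
static int xstate_has_feature(uint64_t mask, unsigned int bit)
{
    assert(bit < 64);
    return (int)((mask >> bit) & 1);
}
```

With this mask in hand, a debugger can tell a component that is absent from the layout apart from one that is present but in its init state.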
  5. 03 Nov, 2009 1 commit
  6. 02 Sep, 2009 1 commit
    • x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h · ae4b688d
      Huang Ying authored
      This function checks whether the FPU/SSE state can be touched in
      interrupt context. If the interrupted code is in user space or has no
      valid FPU/SSE context (CR0.TS == 1), FPU/SSE state can be used in IRQ
      or soft_irq context too.
      
      This is used by AES-NI accelerated AES implementation and PCLMULQDQ
      accelerated GHASH implementation.
      
      v3:
       - Renamed to irq_fpu_usable to reflect the purpose of the function.
      
      v2:
       - Renamed to irq_is_fpu_using to reflect the real situation.
      Signed-off-by: Huang Ying <ying.huang@intel.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      ae4b688d
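The decision described in this commit message can be written as a pure predicate. The sketch below is a user-space model, not the kernel function: `toy_irq_ctx` and `toy_irq_fpu_usable` are hypothetical names, the three booleans stand in for values the kernel reads from the CPU and the interrupted register frame, and treating "not in interrupt context" as always usable is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Snapshot of the facts the predicate depends on (all hypothetical fields). */
struct toy_irq_ctx {
    bool in_interrupt;           /* running in IRQ or softirq context */
    bool interrupted_user_mode;  /* the interrupted code was in user space */
    bool cr0_ts;                 /* CR0.TS == 1: no valid FPU/SSE context loaded */
};

/*
 * FPU/SSE state may be touched when we are not in interrupt context at
 * all (assumed), when user space was interrupted, or when CR0.TS is set.
 */
static bool toy_irq_fpu_usable(const struct toy_irq_ctx *ctx)
{
    assert(ctx != NULL);
    return !ctx->in_interrupt ||
           ctx->interrupted_user_mode ||
           ctx->cr0_ts;
}
```

The only unsafe combination in this model is an interrupt that arrives while kernel code has live FPU state loaded (CR0.TS clear), which is the case an accelerated crypto routine would have to avoid.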
  7. 18 Jun, 2009 1 commit
    • x86: split out core __math_state_restore · e6e9cac8
      Jeremy Fitzhardinge authored
      Split the core fpu state restoration out into __math_state_restore, which
      assumes that cr0.TS is clear and that the fpu context has been initialized.
      
      This will be used during context switch.  There are two reasons this is
      desirable:
      
      - There's a small clarification.  When __switch_to() calls math_state_restore,
        it relies on the fact that tsk_used_math() returns true, and so will
        never do a blocking init_fpu().  __math_state_restore() does not have
        (or need) that logic, so the question never arises.
      
      - It allows the clts() to be moved earlier in __switch_to() so it can be performed
        while cpu context updates are batched (will be done in a later patch).
      
      [ Impact: refactor code to make reuse cleaner; no functional change ]
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      e6e9cac8
  8. 09 Jun, 2009 1 commit
    • x86: Clear TS in irq_ts_save() when in an atomic section · 0b8c3d5a
      Chuck Ebbert authored
      The dynamic FPU context allocation changes caused the padlock driver
      to generate the below warning. Fix it by masking TS when doing padlock
      encryption operations in an atomic section.
      
      This solves:
      
      BUG: sleeping function called from invalid context at mm/slub.c:1602
      in_atomic(): 1, irqs_disabled(): 0, pid: 82, name: cryptomgr_test
      Pid: 82, comm: cryptomgr_test Not tainted 2.6.29.4-168.test7.fc11.x86_64 #1
      Call Trace:
      [<ffffffff8103ff16>] __might_sleep+0x10b/0x110
      [<ffffffff810cd3b2>] kmem_cache_alloc+0x37/0xf1
      [<ffffffff81018505>] init_fpu+0x49/0x8a
      [<ffffffff81012a83>] math_state_restore+0x3e/0xbc
      [<ffffffff813ac6d0>] do_device_not_available+0x9/0xb
      [<ffffffff810123ab>] device_not_available+0x1b/0x20
      [<ffffffffa001c066>] ? aes_crypt+0x66/0x74 [padlock_aes]
      [<ffffffff8119a51a>] ? blkcipher_walk_next+0x257/0x2e0
      [<ffffffff8119a731>] ? blkcipher_walk_first+0x18e/0x19d
      [<ffffffffa001c1fe>] aes_encrypt+0x9d/0xe5 [padlock_aes]
      [<ffffffffa0027253>] crypt+0x6b/0x114 [xts]
      [<ffffffffa001c161>] ? aes_encrypt+0x0/0xe5 [padlock_aes]
      [<ffffffffa001c161>] ? aes_encrypt+0x0/0xe5 [padlock_aes]
      [<ffffffffa0027390>] encrypt+0x49/0x4b [xts]
      [<ffffffff81199acc>] async_encrypt+0x3c/0x3e
      [<ffffffff8119dafc>] test_skcipher+0x1da/0x658
      [<ffffffff811979c3>] ? crypto_spawn_tfm+0x8e/0xb1
      [<ffffffff8119672d>] ? __crypto_alloc_tfm+0x11b/0x15f
      [<ffffffff811979c3>] ? crypto_spawn_tfm+0x8e/0xb1
      [<ffffffff81199dbe>] ? skcipher_geniv_init+0x2b/0x47
      [<ffffffff8119a905>] ? async_chainiv_init+0x5c/0x61
      [<ffffffff8119dfdd>] alg_test_skcipher+0x63/0x9b
      [<ffffffff8119e1bc>] alg_test+0x12d/0x175
      [<ffffffff8119c488>] cryptomgr_test+0x38/0x54
      [<ffffffff8119c450>] ? cryptomgr_test+0x0/0x54
      [<ffffffff8105c6c9>] kthread+0x4d/0x78
      [<ffffffff8101264a>] child_rip+0xa/0x20
      [<ffffffff81011f67>] ? restore_args+0x0/0x30
      [<ffffffff8105c67c>] ? kthread+0x0/0x78
      [<ffffffff81012640>] ? child_rip+0x0/0x20
      Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      LKML-Reference: <20090609104050.50158cfe@dhcp-100-2-144.bos.redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0b8c3d5a
  9. 08 Apr, 2009 3 commits
  10. 05 Mar, 2009 1 commit
    • x86, math-emu: fix init_fpu for task != current · ab9e1858
      Daniel Glöckner authored
      Impact: fix math-emu related crash while using GDB/ptrace
      
      init_fpu() calls finit to initialize a task's xstate, while finit always
      works on the current task. If we use PTRACE_GETFPREGS on another
      process and both processes did not already use floating point, we get
      a null pointer exception in finit.
      
      This patch creates a new function finit_task that takes a task_struct
      parameter. finit becomes a wrapper that simply calls finit_task with
      current. On the plus side this avoids many calls to get_current which
      would each resolve to an inline assembler mov instruction.
      
      An empty finit_task has been added to i387.h to avoid linker errors in
      case the compiler still emits the call in init_fpu when
      CONFIG_MATH_EMULATION is not defined.
      
      The declaration of finit in i387.h has been removed as the remaining
      code using this function gets its prototype from fpu_proto.h.
      Signed-off-by: Daniel Glöckner <dg@emlix.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Bill Metzenthen <billm@melbpc.org.au>
      LKML-Reference: <E1Lew31-0004il-Fg@mailer.emlix.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ab9e1858
  11. 23 Oct, 2008 2 commits
  12. 13 Aug, 2008 1 commit
    • crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore() · e4914012
      Suresh Siddha authored
      Wolfgang Walter reported this oops on his via C3 using padlock for
      AES-encryption:
      
      ##################################################################
      
      BUG: unable to handle kernel NULL pointer dereference at 000001f0
      IP: [<c01028c5>] __switch_to+0x30/0x117
      *pde = 00000000
      Oops: 0002 [#1] PREEMPT
      Modules linked in:
      
      Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
      EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
      EIP is at __switch_to+0x30/0x117
      EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
      ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
       DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
      Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
             c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
             c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
      Call Trace:
       [<c03b5b43>] ? schedule+0x285/0x2ff
       [<c0131856>] ? pm_qos_requirement+0x3c/0x53
       [<c0239f54>] ? acpi_processor_idle+0x0/0x434
       [<c01025fe>] ? cpu_idle+0x73/0x7f
       [<c03a4dcd>] ? rest_init+0x61/0x63
       =======================
      
      Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
      around the padlock instructions fixes the oops.
      
      Suresh wrote:
      
      Although these padlock instructions don't use/touch SSE registers, they behave
      similarly to other SSE instructions. For example, they might cause DNA faults
      when cr0.ts is set. While this is a spurious DNA trap, it might cause
      an oops with the recent fpu code changes.
      
      This is the code sequence that is probably causing this problem:
      
      a) new app is getting exec'd and it is somewhere in between
         start_thread() and flush_old_exec() in the load_xyz_binary()
      
      b) At point "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
         cleared.
      
      c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
         routines in the network stack. This generates a math fault (as
         cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
         in the task's xstate.
      
      d) Return to exec code path, which does start_thread() which does
         free_thread_xstate() and sets xstate pointer to NULL while
         the TS_USEDFPU is still set.
      
      e) At the next context switch from the new exec'd task to another task,
         we have a scenario where TS_USEDFPU is set but the xstate pointer is null.
         This can cause an oops during unlazy_fpu() in __switch_to()
      
      Now:
      
      1) This should happen with or without pre-emption. Viro also encountered
         a similar problem without CONFIG_PREEMPT.
      
      2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
         kernel_fpu_begin() will manually do a clts() and won't run in to the
         situation of setting TS_USEDFPU in step "c" above.
      
      3) This was working before the fpu changes, because it's a spurious
         math fault which doesn't corrupt any fpu/sse registers and the task's
         math state was always in an allocated state.
      
      Without the recent lazy fpu allocation changes, while we don't see an oops,
      there is a possible race still present in older kernels (for example,
      while the kernel is using kernel_fpu_begin() in some optimized clear/copy
      page and an interrupt/softirq happens which uses these padlock
      instructions, generating a DNA fault).
      
      This is the failing scenario that existed even before the lazy fpu allocation
      changes:
      
      0. CPU's TS flag is set
      
      1. The kernel is using the FPU in some optimized copy routine and, while doing
      kernel_fpu_begin(), takes an interrupt just before doing clts().
      
      2. In the interrupt, ipsec uses a padlock instruction, and we
      take a DNA fault as the TS flag is still set.
      
      3. We handle the DNA fault, set TS_USEDFPU, and clear cr0.ts.
      
      4. We complete the padlock routine.
      
      5. Go back to step 1, which resumes clts() in kernel_fpu_begin(), finishes
      the optimized copy routine, and does kernel_fpu_end(). At this point,
      we have cr0.ts again set to '1', but the task's TS_USEDFPU is still
      set and not cleared.
      
      6. Now the kernel resumes its user operation. At the next context
      switch, the kernel sees it has to do an FP save as TS_USEDFPU is still set,
      and then will do an unlazy_fpu() in __switch_to(). unlazy_fpu()
      will take a DNA fault, as cr0.ts is '1', and now, because we are
      in __switch_to(), math_state_restore() will get confused and will
      restore the next task's FP state and save it in the prev task's FP state.
      Remember, in __switch_to() we are already on the stack of the next task
      but take a DNA fault for the prev task.
      
      This causes the fpu leakage.
      
      Fix the padlock instruction usage by calling them inside the
      context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
      manually in the interrupt context. This will not generate a spurious DNA
      fault in the context of the interrupt, which fixes the oops encountered and
      the possible FPU leakage issue.
      Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      e4914012
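The oops scenario (steps a to e) and the effect of the fix can be replayed with a toy state model. Everything here is a simplification for illustration: `toy_cpu`, `toy_task` and the `toy_*` functions are hypothetical stand-ins, and the model only tracks the cr0.ts flag, the TS_USEDFPU flag and the xstate pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_cpu  { bool cr0_ts; };                    /* models the CR0.TS flag */
struct toy_task { bool ts_usedfpu; void *xstate; };  /* models per-task FPU bookkeeping */

/* Models the DNA (device-not-available) fault path: clear TS, mark the task. */
static void toy_dna_fault(struct toy_cpu *cpu, struct toy_task *task)
{
    cpu->cr0_ts = false;
    task->ts_usedfpu = true;
}

/* An unguarded padlock operation faults spuriously whenever TS is set. */
static void toy_padlock_op(struct toy_cpu *cpu, struct toy_task *task)
{
    assert(cpu != NULL && task != NULL);
    if (cpu->cr0_ts)
        toy_dna_fault(cpu, task);
}

/* Clears TS and reports whether it was set, so the caller can restore it. */
static bool toy_irq_ts_save(struct toy_cpu *cpu)
{
    bool was_set = cpu->cr0_ts;

    cpu->cr0_ts = false;
    return was_set;
}

static void toy_irq_ts_restore(struct toy_cpu *cpu, bool was_set)
{
    if (was_set)
        cpu->cr0_ts = true;
}

/* The guarded variant never faults, so the task's flags stay untouched. */
static void toy_padlock_op_guarded(struct toy_cpu *cpu, struct toy_task *task)
{
    bool ts = toy_irq_ts_save(cpu);

    toy_padlock_op(cpu, task);
    toy_irq_ts_restore(cpu, ts);
}

/* The state that leads to the oops: flag set but no state buffer. */
static bool toy_task_is_inconsistent(const struct toy_task *task)
{
    return task->ts_usedfpu && task->xstate == NULL;
}
```

Starting from a freshly exec'd task (flag clear, no xstate buffer) with cr0.ts set, the unguarded call leaves the task in the inconsistent state from step "e", while the guarded call leaves both the task and cr0.ts as they were.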
  13. 31 Jul, 2008 5 commits
  14. 26 Jul, 2008 1 commit
    • x64, fpu: fix possible FPU leakage in error conditions · 6ffac1e9
      Suresh Siddha authored
      On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote:
      > So how about this patch as a starting point? This is the RightThing(tm) to
      > do regardless, and if it then makes it easier to do some other cleanups,
      > we should do it first. What do you think?
      
      restore_fpu_checking() calls init_fpu() in error conditions.
      
      While this is wrong (as our main intention is to clear the fpu state of
      the thread), this was benign before commit 92d140e2 ("x86: fix taking
      DNA during 64bit sigreturn").
      
      Post commit 92d140e2, live FPU registers may not belong to this
      process at this error scenario.
      
      In the error condition for restore_fpu_checking() (especially during the
      64bit signal return), we are doing init_fpu(), which saves the live FPU
      register state (possibly belonging to some other process context) into
      the thread struct (through unlazy_fpu() in init_fpu()). This is wrong
      and can leak the FPU data.
      
      For the signal handler restore error condition in restore_i387(), clear
      the fpu state present in the thread struct (before ultimately sending a
      SIGSEGV for badframe).
      
      For the paranoid error condition check in math_state_restore(), send a
      SIGSEGV, if we fail to restore the state.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: <stable@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      6ffac1e9
  15. 25 Jul, 2008 1 commit
    • x86-64: Clean up 'save/restore_i387()' usage · b30f3ae5
      Linus Torvalds authored
      Suresh Siddha wants to fix a possible FPU leakage in error conditions,
      but the fact that save/restore_i387() are inlines in a header file makes
      that harder to do than necessary.  So start off with an obvious cleanup.
      
      This just moves the x86-64 version of save/restore_i387() out of the
      header file, and moves it to the only file that it is actually used in:
      arch/x86/kernel/signal_64.c.  So exposing it in a header file was wrong
      to begin with.
      
      [ Side note: I'd like to fix up some of the games we play with the
        32-bit version of these functions too, but that's a separate
        matter.  The 32-bit versions are shared - under different names
        at that! - by both the native x86-32 code and the x86-64 32-bit
        compatibility code ]
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b30f3ae5
  16. 23 Jul, 2008 1 commit
    • x86: consolidate header guards · 77ef50a5
      Vegard Nossum authored
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers is turned into a single
         underscore.
      Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
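The three formatting rules can be sketched as a small C function that derives a guard name from a header path. This is not the script used for the commit: `make_header_guard` is a hypothetical helper, upper-casing the result is an assumption (the rules above do not mention case), and each non-alphanumeric character is mapped to exactly one underscore without collapsing runs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Builds a header guard from a path such as "asm-x86/mm/types.h".
 * Returns 0 on success, -1 if the output buffer may be too small.
 */
static int make_header_guard(const char *path, char *out, size_t outsz)
{
    size_t n = 0;

    assert(path != NULL && out != NULL);
    /* Worst case: every character is '/', which doubles the length. */
    if (outsz < 2 * strlen(path) + 1)
        return -1;

    for (; *path != '\0'; path++) {
        unsigned char c = (unsigned char)*path;

        if (isalnum(c)) {
            out[n++] = (char)toupper(c);   /* assumption: guards are upper case */
        } else if (n == 0) {
            continue;                      /* rule 1: no leading underscore */
        } else if (c == '/') {
            out[n++] = '_';                /* rule 2: path separator becomes "__" */
            out[n++] = '_';
        } else {
            out[n++] = '_';                /* rule 3: anything else becomes "_" */
        }
    }
    out[n] = '\0';
    return 0;
}
```

Under these assumptions, "asm-x86/mm_types.h" and "asm-x86/mm/types.h" produce different guards (one underscore versus two between MM and TYPES), which is the distinction rule 2 is meant to preserve.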
  17. 22 Jul, 2008 1 commit
  18. 04 Jun, 2008 1 commit
  19. 11 May, 2008 1 commit
  20. 20 Apr, 2008 3 commits
  21. 17 Apr, 2008 1 commit
  22. 19 Feb, 2008 1 commit
  23. 04 Feb, 2008 1 commit
  24. 30 Jan, 2008 3 commits
  25. 11 Oct, 2007 1 commit