“934d44ae641318fea7c36d049742d9431f8903e1”上不存在“tools/manylinux1/Dockerfile.cuda9_cudnn7_gcc48_py35_centos6”
  1. 13 10月, 2010 1 次提交
  2. 13 1月, 2010 1 次提交
    • P
      sh: Move over to dynamically allocated FPU context. · 0ea820cf
      Paul Mundt 提交于
      This follows the x86 xstate changes and implements a task_xstate slab
      cache that is dynamically sized to match one of hard FP/soft FP/FPU-less.
      
      This also tidies up and consolidates some of the SH-2A/SH-4 FPU
      fragmentation. Now fpu state restorers are commonly defined, with the
      init_fpu()/fpu_init() mess reworked to follow the x86 convention.
      The fpu_init() register initialization has been replaced by xstate setup
      followed by writing out to hardware via the standard restore path.
      
      As init_fpu() now performs a slab allocation a secondary lighterweight
      restorer is also introduced for the context switch.
      
      In the future the DSP state will be rolled in here, too.
      
      More work remains for math emulation and the SH-5 FPU, which presently
      uses its own special (UP-only) interfaces.
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      0ea820cf
  3. 24 11月, 2009 1 次提交
    • S
      sh: Minor optimisations to FPU handling · d3ea9fa0
      Stuart Menefy 提交于
      A number of small optimisations to FPU handling, in particular:
      
       - move the task USEDFPU flag from the thread_info flags field (which
         is accessed asynchronously to the thread) to a new status field,
         which is only accessed by the thread itself. This allows locking to
         be removed in most cases, or can be reduced to a preempt_lock().
         This mimics the i386 behaviour.
      
       - move the modification of regs->sr and thread_info->status flags out
         of save_fpu() to __unlazy_fpu(). This gives the compiler a better
         chance to optimise things, as well as making save_fpu() symmetrical
         with restore_fpu() and init_fpu().
      
       - implement prepare_to_copy(), so that when creating a thread, we can
         unlazy the FPU prior to copying the thread data structures.
      
      Also make sure that the FPU is disabled while in the kernel, in
      particular while booting, and for newly created kernel threads,
      
      In a very artificial benchmark, the execution time for 2500000
      context switches was reduced from 50 to 45 seconds.
      Signed-off-by: NStuart Menefy <stuart.menefy@st.com>
      Signed-off-by: NPaul Mundt <lethal@linux-sh.org>
      d3ea9fa0
  4. 11 6月, 2007 1 次提交
  5. 21 5月, 2007 1 次提交
  6. 03 10月, 2006 1 次提交
  7. 27 9月, 2006 1 次提交