1. 19 5月, 2015 7 次提交
    • I
      x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again) · 7366ed77
      Ingo Molnar 提交于
      So 6 years ago we made the FPU fpstate dynamically allocated:
      
        aa283f49 ("x86, fpu: lazy allocation of FPU area - v5")
        61c4628b ("x86, fpu: split FPU state from task struct - v5")
      
      In hindsight this was a mistake:
      
         - it complicated context allocation failure handling, such as:
      
      		/* kthread execs. TODO: cleanup this horror. */
      		if (WARN_ON(fpstate_alloc_init(fpu)))
      			force_sig(SIGKILL, tsk);
      
         - it caused us to enable irqs in fpu__restore():
      
                      local_irq_enable();
                      /*
                       * does a slab alloc which can sleep
                       */
                      if (fpstate_alloc_init(fpu)) {
                              /*
                               * ran out of memory!
                               */
                              do_group_exit(SIGKILL);
                              return;
                      }
                      local_irq_disable();
      
         - it (slightly) slowed down task creation/destruction by adding
           slab allocation/free pattens.
      
         - it made access to context contents (slightly) slower by adding
           one more pointer dereference.
      
      The motivation for the dynamic allocation was two-fold:
      
         - reduce memory consumption by non-FPU tasks
      
         - allocate and handle only the necessary amount of context for
           various XSAVE processors that have varying hardware frame
           sizes.
      
      These days, with glibc using SSE memcpy by default and GCC optimizing
      for SSE/AVX by default, the scope of FPU using apps on an x86 system is
      much larger than it was 6 years ago.
      
      For example on a freshly installed Fedora 21 desktop system, with a
      recent kernel, all non-kthread tasks have used the FPU shortly after
      bootup.
      
      Also, even modern embedded x86 CPUs try to support the latest vector
      instruction set - so they'll too often use the larger xstate frame
      sizes.
      
      So remove the dynamic allocation complication by embedding the FPU
      fpstate in task_struct again. This should make the FPU a lot more
      accessible to all sorts of atomic contexts.
      
      We could still optimize for the xstate frame size in the future,
      by moving the state structure to the last element of task_struct,
      and allocating only a part of that.
      
      This change is kept minimal by still keeping the ctx_alloc()/free()
      routines (that now do nothing substantial) - we'll remove them in
      the following patches.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      7366ed77
    • I
      x86/fpu: Move various internal function prototypes to fpu/internal.h · 952f07ec
      Ingo Molnar 提交于
      There are a number of FPU internal function prototypes and an inline function
      in fpu/api.h, mostly placed so historically as the code grew over the years.
      
      Move them over into fpu/internal.h where they belong. (Add sched.h include
      to stackprotector.h which incorrectly relied on getting it from fpu/api.h.)
      
      fpu/api.h is now a pure file that only contains FPU APIs intended for driver
      use.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      952f07ec
    • I
      x86/fpu: Rename i387.h to fpu/api.h · df6b35f4
      Ingo Molnar 提交于
      We already have fpu/types.h, move i387.h to fpu/api.h.
      
      The file name has become a misnomer anyway: it offers generic FPU APIs,
      but is not limited to i387 functionality.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      df6b35f4
    • I
      x86/fpu: Use 'struct fpu' in fpstate_alloc_init() · db2b1d3a
      Ingo Molnar 提交于
      Migrate this function to pure 'struct fpu' usage.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      db2b1d3a
    • I
      x86/fpu: Get rid of PF_USED_MATH usage, convert it to fpu->fpstate_active · c5bedc68
      Ingo Molnar 提交于
      Introduce a simple fpu->fpstate_active flag in the fpu context data structure
      and use that instead of PF_USED_MATH in task->flags.
      
      Testing for this flag byte should be slightly more efficient than
      testing a bit in a bitmask, but the main advantage is that most
      FPU functions can now be performed on a 'struct fpu' alone, they
      don't need access to 'struct task_struct' anymore.
      
      There's a slight linecount increase, mostly due to the 'fpu' local
      variables and due to extra comments. The local variables will go away
      once we move most of the FPU methods to pure 'struct fpu' parameters.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      c5bedc68
    • I
      x86/fpu: Open code PF_USED_MATH usages · 4c138410
      Ingo Molnar 提交于
      PF_USED_MATH is used directly, but also in a handful of helper inlines.
      
      To ease the elimination of PF_USED_MATH, convert all inline helpers
      to open-coded PF_USED_MATH usage.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      4c138410
    • I
      x86/fpu: Split an fpstate_alloc_init() function out of init_fpu() · 97185c95
      Ingo Molnar 提交于
      Most init_fpu() users don't want the register-saving aspect of the
      function, they are calling it for 'current' and when FPU registers
      are not allocated and initialized yet.
      
      Split out a simplified API that does just that (and add debug-checks
      for these conditions): fpstate_alloc_init().
      
      Use it where appropriate.
      Reviewed-by: NBorislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NIngo Molnar <mingo@kernel.org>
      97185c95
  2. 06 5月, 2014 1 次提交
  3. 28 1月, 2014 1 次提交
  4. 13 3月, 2012 1 次提交
  5. 11 5月, 2010 2 次提交
  6. 05 3月, 2009 1 次提交
    • D
      x86, math-emu: fix init_fpu for task != current · ab9e1858
      Daniel Glöckner 提交于
      Impact: fix math-emu related crash while using GDB/ptrace
      
      init_fpu() calls finit to initialize a task's xstate, while finit always
      works on the current task. If we use PTRACE_GETFPREGS on another
      process and both processes did not already use floating point, we get
      a null pointer exception in finit.
      
      This patch creates a new function finit_task that takes a task_struct
      parameter. finit becomes a wrapper that simply calls finit_task with
      current. On the plus side this avoids many calls to get_current which
      would each resolve to an inline assembler mov instruction.
      
      An empty finit_task has been added to i387.h to avoid linker errors in
      case the compiler still emits the call in init_fpu when
      CONFIG_MATH_EMULATION is not defined.
      
      The declaration of finit in i387.h has been removed as the remaining
      code using this function gets its prototype from fpu_proto.h.
      Signed-off-by: NDaniel Glöckner <dg@emlix.com>
      Cc: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Bill Metzenthen <billm@melbpc.org.au>
      LKML-Reference: <E1Lew31-0004il-Fg@mailer.emlix.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      ab9e1858
  7. 10 2月, 2009 2 次提交
    • T
      x86: add %gs accessors for x86_32 · d9a89a26
      Tejun Heo 提交于
      Impact: cleanup
      
      On x86_32, %gs is handled lazily.  It's not saved and restored on
      kernel entry/exit but only when necessary which usually is during task
      switch but there are few other places.  Currently, it's done by
      calling savesegment() and loadsegment() explicitly.  Define
      get_user_gs(), set_user_gs() and task_user_gs() and use them instead.
      
      While at it, clean up register access macros in signal.c.
      
      This cleans up code a bit and will help future changes.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d9a89a26
    • T
      x86: fix math_emu register frame access · d315760f
      Tejun Heo 提交于
      do_device_not_available() is the handler for #NM and it declares that
      it takes a unsigned long and calls math_emu(), which takes a long
      argument and surprisingly expects the stack frame starting at the zero
      argument would match struct math_emu_info, which isn't true regardless
      of configuration in the current code.
      
      This patch makes do_device_not_available() take struct pt_regs like
      other exception handlers and initialize struct math_emu_info with
      pointer to it and pass pointer to the math_emu_info to math_emulate()
      like normal C functions do.  This way, unless gcc makes a copy of
      struct pt_regs in do_device_not_available(), the register frame is
      correctly accessed regardless of kernel configuration or compiler
      used.
      
      This doesn't fix all math_emu problems but it at least gets it
      somewhat working.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      d315760f
  8. 09 2月, 2009 1 次提交
  9. 18 6月, 2008 1 次提交
    • P
      x86: coding style fixes to arch/x86/math-emu/reg_constant · f016e15c
      Paolo Ciarrocchi 提交于
      Before:
      total: 6 errors, 1 warnings, 117 lines checked
      
      After:
      total: 0 errors, 1 warnings, 117 lines checked
      
      paolo@paolo-desktop:~/linux.trees.git$ md5sum /tmp/reg_constant.o.*
      780388a3056d58fb759efaf190d5d3d1  /tmp/reg_constant.o.after
      780388a3056d58fb759efaf190d5d3d1  /tmp/reg_constant.o.before
      
      paolo@paolo-desktop:~/linux.trees.git$ size /tmp/reg_constant.o.*
         text    data     bss     dec     hex filename
          457       0       0     457     1c9 /tmp/reg_constant.o.after
          457       0       0     457     1c9 /tmp/reg_constant.o.before
      Signed-off-by: NPaolo Ciarrocchi <paolo.ciarrocchi@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f016e15c
  10. 04 6月, 2008 1 次提交
  11. 20 4月, 2008 1 次提交
  12. 17 4月, 2008 2 次提交
  13. 30 1月, 2008 5 次提交
  14. 15 10月, 2007 1 次提交
  15. 11 10月, 2007 1 次提交