提交 · 7366ed771f6ed95e4c4525c335722888a83b4b6c · OpenHarmony / kernel_linux

19 5月, 2015 1 次提交

x86/fpu: Simplify FPU handling by embedding the fpstate in task_struct (again) · 7366ed77

由 Ingo Molnar 提交于 4月 27, 2015

So 6 years ago we made the FPU fpstate dynamically allocated:

  aa283f49 ("x86, fpu: lazy allocation of FPU area - v5")
  61c4628b ("x86, fpu: split FPU state from task struct - v5")

In hindsight this was a mistake:

   - it complicated context allocation failure handling, such as:

		/* kthread execs. TODO: cleanup this horror. */
		if (WARN_ON(fpstate_alloc_init(fpu)))
			force_sig(SIGKILL, tsk);

   - it caused us to enable irqs in fpu__restore():

                local_irq_enable();
                /*
                 * does a slab alloc which can sleep
                 */
                if (fpstate_alloc_init(fpu)) {
                        /*
                         * ran out of memory!
                         */
                        do_group_exit(SIGKILL);
                        return;
                }
                local_irq_disable();

   - it (slightly) slowed down task creation/destruction by adding
     slab allocation/free pattens.

   - it made access to context contents (slightly) slower by adding
     one more pointer dereference.

The motivation for the dynamic allocation was two-fold:

   - reduce memory consumption by non-FPU tasks

   - allocate and handle only the necessary amount of context for
     various XSAVE processors that have varying hardware frame
     sizes.

These days, with glibc using SSE memcpy by default and GCC optimizing
for SSE/AVX by default, the scope of FPU using apps on an x86 system is
much larger than it was 6 years ago.

For example on a freshly installed Fedora 21 desktop system, with a
recent kernel, all non-kthread tasks have used the FPU shortly after
bootup.

Also, even modern embedded x86 CPUs try to support the latest vector
instruction set - so they'll too often use the larger xstate frame
sizes.

So remove the dynamic allocation complication by embedding the FPU
fpstate in task_struct again. This should make the FPU a lot more
accessible to all sorts of atomic contexts.

We could still optimize for the xstate frame size in the future,
by moving the state structure to the last element of task_struct,
and allocating only a part of that.

This change is kept minimal by still keeping the ctx_alloc()/free()
routines (that now do nothing substantial) - we'll remove them in
the following patches.
Reviewed-by: NBorislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

7366ed77

11 5月, 2010 1 次提交

x86, fpu: Unbreak FPU emulation · c3f8978e

由 H. Peter Anvin 提交于 5月 10, 2010

Unbreak FPU emulation, broken by checkin
86603283:
x86: Introduce 'struct fpu' and related API
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com>

c3f8978e

10 2月, 2009 1 次提交

x86: fix math_emu register frame access · d315760f

由 Tejun Heo 提交于 2月 09, 2009

do_device_not_available() is the handler for #NM and it declares that
it takes a unsigned long and calls math_emu(), which takes a long
argument and surprisingly expects the stack frame starting at the zero
argument would match struct math_emu_info, which isn't true regardless
of configuration in the current code.

This patch makes do_device_not_available() take struct pt_regs like
other exception handlers and initialize struct math_emu_info with
pointer to it and pass pointer to the math_emu_info to math_emulate()
like normal C functions do.  This way, unless gcc makes a copy of
struct pt_regs in do_device_not_available(), the register frame is
correctly accessed regardless of kernel configuration or compiler
used.

This doesn't fix all math_emu problems but it at least gets it
somewhat working.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

d315760f

09 2月, 2009 1 次提交

x86: math_emu info cleanup · ae6af41f

由 Tejun Heo 提交于 2月 09, 2009

Impact: cleanup

* Come on, struct info?  s/struct info/struct math_emu_info/

* Use struct pt_regs and kernel_vm86_regs instead of defining its own
  register frame structure.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ae6af41f

20 4月, 2008 1 次提交

x86, fpu: split FPU state from task struct - v5 · 61c4628b

由 Suresh Siddha 提交于 3月 10, 2008

Split the FPU save area from the task struct. This allows easy migration
of FPU context, and it's generally cleaner. It also allows the following
two optimizations:

1) only allocate when the application actually uses FPU, so in the first
lazy FPU trap. This could save memory for non-fpu using apps. Next patch
does this lazy allocation.

2) allocate the right size for the actual cpu rather than 512 bytes always.
Patches enabling xsave/xrstor support (coming shortly) will take advantage
of this.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: NArjan van de Ven <arjan@linux.intel.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

61c4628b

11 10月, 2007 1 次提交

i386: move math-emu · da957e11

由 Thomas Gleixner 提交于 10月 11, 2007

Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

da957e11

07 12月, 2006 1 次提交

[PATCH] i386: fix must_checks · d606f88f

由 Randy Dunlap 提交于 12月 07, 2006

Fix __must_check warnings in i386/math-emu.
Signed-off-by: NRandy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NAndi Kleen <ak@suse.de>

d606f88f

17 4月, 2005 1 次提交

Linux-2.6.12-rc2 · 1da177e4

由 Linus Torvalds 提交于 4月 16, 2005

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

1da177e4

OpenHarmony / kernel_linux 上一次同步 3 年多

OpenHarmony / kernel_linux
上一次同步 3 年多