提交 · d2d0ac9a4644e00120bb9b7427a512a99d2cacc5 · openeuler / Kernel

23 3月, 2015 3 次提交

x86/fpu: Fold __drop_fpu() into its sole user · d2d0ac9a

由 Borislav Petkov 提交于 3月 14, 2015

Fold it into drop_fpu(). Phew, one less FPU function to pay attention
to.

No functionality change.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NOleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

d2d0ac9a

x86/fpu: Introduce restore_init_xstate() · 8f4d8186

由 Oleg Nesterov 提交于 3月 11, 2015

Extract the "use_eager_fpu()" code from drop_init_fpu() into a new,
simple helper restore_init_xstate(). The next patch adds another user.

- It is not clear why we do not check use_fxsr() like fpu_restore_checking()
  does. eager_fpu_init_bp() calls setup_init_fpu_buf() too, and we have the
  "eagerfpu=on" kernel option.

- Ignoring the fact that init_xstate_buf is "struct xsave_struct *", not
  "union thread_xstate *", it is not clear why we can not simply use
  fpu_restore_checking() and avoid the code duplication.

- It is not clear why we can't call setup_init_fpu_buf() unconditionally
  to always create init_xstate_buf(). Then do_device_not_available() path
  (at least) could use restore_init_xstate() too. It doesn't need to init
  fpu->state, its content doesn't matter until unlazy_fpu()/__switch_to()/etc
  which overwrites this memory anyway.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/20150311173429.GD5032@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

8f4d8186

x86/fpu: Document user_fpu_begin() · fb14b4ea

由 Oleg Nesterov 提交于 3月 11, 2015

Currently, user_fpu_begin() has a single caller and it is not clear why
do we actually need it and why we should not worry about preemption
right after preempt_enable().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/20150311173409.GC5032@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fb14b4ea

13 3月, 2015 1 次提交

x86/fpu: Drop_fpu() should not assume that tsk equals current · f4c36863

由 Oleg Nesterov 提交于 3月 13, 2015

drop_fpu() does clear_used_math() and usually this is correct
because tsk == current.

However switch_fpu_finish()->restore_fpu_checking() is called before
__switch_to() updates the "current_task" variable. If it fails,
we will wrongly clear the PF_USED_MATH flag of the previous task.

So use clear_stopped_child_used_math() instead.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NRik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

f4c36863

10 3月, 2015 1 次提交

x86/fpu: Factor out memset(xstate, 0) in fpu_finit() paths · 1d23c451

由 Oleg Nesterov 提交于 3月 10, 2015

fx_finit() has two users but only fpu_finit() needs to clear
xstate, alloc_bootmem_align() in setup_init_fpu_buf() returns
zero-filled memory.

And note that both memset()'s look confusing. Yes, offsetof() is
0 for ->fxsave or ->fsave, but it would be cleaner to turn
them into a single memset() which zeroes fpu->state.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Acked-by: NRik van Riel <riel@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1425967585-4725-2-git-send-email-bp@alien8.de
Link: http://lkml.kernel.org/r/20150302183257.GC23085@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

1d23c451

19 2月, 2015 6 次提交

x86/fpu: Also check fpu_lazy_restore() when use_eager_fpu() · 728e53fe

由 Rik van Riel 提交于 2月 06, 2015

With Oleg's patch:

  33a3ebdc ("x86, fpu: Don't abuse has_fpu in __kernel_fpu_begin/end()")

kernel threads no longer have an FPU state even on systems with
use_eager_fpu().

That in turn means that a task may still have its FPU state
loaded in the FPU registers, if the task only got interrupted by
kernel threads from when it went to sleep, to when it woke up
again.

In that case, there is no need to restore the FPU state for
this task, since it is still in the registers.

The kernel can simply use the same logic to determine this as
is used for !use_eager_fpu() systems.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/1423252925-14451-9-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>

728e53fe

x86/fpu: Use task_disable_lazy_fpu_restore() helper · 6a5fe895

由 Rik van Riel 提交于 2月 06, 2015

Replace magic assignments of fpu.last_cpu = ~0 with more explicit
task_disable_lazy_fpu_restore() calls.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1423252925-14451-8-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>

6a5fe895

x86/fpu: Use an explicit if/else in switch_fpu_prepare() · 1361ef29

由 Rik van Riel 提交于 2月 06, 2015

Use an explicit if/else branch after __save_init_fpu(old) in
switch_fpu_prepare(). This makes substituting the assignment with a call
in task_disable_lazy_fpu_restore() in the next patch easier to review.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/1423252925-14451-7-git-send-email-riel@redhat.com
[ Space out stuff for more readability. ]
Signed-off-by: NBorislav Petkov <bp@suse.de>

1361ef29

x86/fpu: Introduce task_disable_lazy_fpu_restore() helper · 33e03ded

由 Rik van Riel 提交于 2月 06, 2015

Currently there are a few magic assignments sprinkled through the
code that disable lazy FPU state restoring, some more effective than
others, and all equally mystifying.

It would be easier to have a helper to explicitly disable lazy
FPU state restoring for a task.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/1423252925-14451-6-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>

33e03ded

x86/fpu: Move lazy restore functions up a few lines · 1c927eea

由 Rik van Riel 提交于 2月 06, 2015

We need another lazy restore related function, that will be called
from a function that is above where the lazy restore functions are
now. It would be nice to keep all three functions grouped together.
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/1423252925-14451-5-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>

1c927eea

x86/fpu: Change math_error() to use unlazy_fpu(), kill (now) unused save_init_fpu() · 08a744c6

由 Oleg Nesterov 提交于 2月 06, 2015

math_error() calls save_init_fpu() after conditional_sti(), this means
that the caller can be preempted. If !use_eager_fpu() we can hit the
WARN_ON_ONCE(!__thread_has_fpu(tsk)) and/or save the wrong FPU state.

Change math_error() to use unlazy_fpu() and kill save_init_fpu().
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Signed-off-by: NRik van Riel <riel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1423252925-14451-4-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>

08a744c6

23 12月, 2014 1 次提交

x86/fpu: Use a symbolic name for asm operand · 6ca7a8a1

由 Borislav Petkov 提交于 12月 21, 2014

Fix up the else-case in fpu_fxsave() which seems like it has
been overlooked. Correct comment style in restore_fpu_checking()
while at it.
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1419170543-11393-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

6ca7a8a1

03 9月, 2014 1 次提交

x86, fpu: Change __thread_fpu_begin() to use use_eager_fpu() · 31d96338

由 Oleg Nesterov 提交于 9月 02, 2014

__thread_fpu_begin() checks X86_FEATURE_EAGER_FPU by hand, we have
a helper for that.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20140902175720.GA21656@redhat.comReviewed-by: NSuresh Siddha <sbsiddha@gmail.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

31d96338

19 6月, 2014 1 次提交

x86, cpufeature: Convert more "features" to bugs · 9b13a93d

由 Borislav Petkov 提交于 6月 18, 2014

X86_FEATURE_FXSAVE_LEAK, X86_FEATURE_11AP and
X86_FEATURE_CLFLUSH_MONITOR are not really features but synthetic bits
we use for applying different bug workarounds. Call them what they
really are, and make sure they get the proper cross-CPU behavior (OR
rather than AND).
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1403042783-23278-1-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

9b13a93d

30 5月, 2014 1 次提交

x86/xsaves: Save xstate to task's xsave area in __save_fpu during booting time · f41d830f

由 Fenghua Yu 提交于 5月 29, 2014

__save_fpu() can be called during early booting time when cpu caps are not
enabled and alternative can not be used yet. Therefore, it calls
xsave_state_booting() during booting time to save xstate to task's xsave area.
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1401387164-43416-14-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

f41d830f

17 4月, 2014 1 次提交

x86, fpu: Extend the use of static_cpu_has_safe · c6b40691

由 Matt Fleming 提交于 3月 27, 2014

It may be necessary to save and restore the FPU context during EFI runtime
system services calls. However, this may happen during boot and before
alternatives have run. Thus, we need to use static_cpu_has_safe instead.

The rationale behind the use of static_cpu_has_safe is the same as in
commit 5f8c4218 ("x86, fpu: Use static_cpu_has_safe
before alternatives") by Borislav Petkov.
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>

c6b40691

12 1月, 2014 1 次提交

x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround · 26bef131

由 Linus Torvalds 提交于 1月 11, 2014

Before we do an EMMS in the AMD FXSAVE information leak workaround we
need to clear any pending exceptions, otherwise we trap with a
floating-point exception inside this code.
Reported-by: Nhalfdog <me@halfdog.net>
Tested-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

26bef131

13 11月, 2013 1 次提交

x86: move fpu_counter into ARCH specific thread_struct · c375f15a

由 Vineet Gupta 提交于 11月 12, 2013

Only a couple of arches (sh/x86) use fpu_counter in task_struct so it can
be moved out into ARCH specific thread_struct, reducing the size of
task_struct for other arches.

Compile tested i386_defconfig + gcc 4.7.3
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Cc: Paul Mundt <paul.mundt@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c375f15a

21 6月, 2013 1 次提交

x86, fpu: Use static_cpu_has_safe before alternatives · 5f8c4218

由 Borislav Petkov 提交于 6月 09, 2013

The call stack below shows how this happens: basically eager_fpu_init()
calls __thread_fpu_begin(current) which then does if (!use_eager_fpu()),
which, in turn, uses static_cpu_has.

And we're executing before alternatives so static_cpu_has doesn't work
there yet.

Use the safe variant in this path which becomes optimal after
alternatives have run.

WARNING: at arch/x86/kernel/cpu/common.c:1368 warn_pre_alternatives+0x1e/0x20()
You're using static_cpu_has before alternatives have run!
Modules linked in:
Pid: 0, comm: swapper Not tainted 3.9.0-rc8+ #1
Call Trace:
 warn_slowpath_common
 warn_slowpath_fmt
 ? fpu_finit
 warn_pre_alternatives
 eager_fpu_init
 fpu_init
 cpu_init
 trap_init
 start_kernel
 ? repair_env_string
 x86_64_start_reservations
 x86_64_start_kernel
Signed-off-by: NBorislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1370772454-6106-6-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

5f8c4218

07 6月, 2013 1 次提交

x86: Get rid of ->hard_math and all the FPU asm fu · 60e019eb

由 H. Peter Anvin 提交于 4月 29, 2013

Reimplement FPU detection code in C and drop old, not-so-recommended
detection method in asm. Move all the relevant stuff into i387.c where
it conceptually belongs. Finally drop cpuinfo_x86.hard_math.

[ hpa: huge thanks to Borislav for taking my original concept patch
  and productizing it ]

[ Boris, note to self: do not use static_cpu_has before alternatives! ]
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1367244262-29511-2-git-send-email-bp@alien8.de
Link: http://lkml.kernel.org/r/1365436666-9837-2-git-send-email-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

60e019eb

14 2月, 2013 1 次提交
- A
  x86: convert to ksignal · 235b8022
  由 Al Viro 提交于 11月 09, 2012
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  235b8022
01 12月, 2012 1 次提交

x86, fpu: Avoid FPU lazy restore after suspend · 644c1541

由 Vincent Palatin 提交于 11月 30, 2012

When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.

Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
so nobody will try to lazy restore a state which doesn't exist in the hardware.

Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.

Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: NVincent Palatin <vpalatin@chromium.org>
Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

644c1541

26 9月, 2012 1 次提交

x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space · e139e955

由 H. Peter Anvin 提交于 9月 25, 2012

With SMAP, the [f][x]rstor_checking() functions are no longer usable
for user-space pointers by applying a simple __force cast.  Instead,
create new [f][x]rstor_user() functions which do the proper SMAP
magic.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.com

e139e955

22 9月, 2012 1 次提交

x86, smap: Add STAC and CLAC instructions to control user space access · 63bcff2a

由 H. Peter Anvin 提交于 9月 21, 2012

When Supervisor Mode Access Prevention (SMAP) is enabled, access to
userspace from the kernel is controlled by the AC flag.  To make the
performance of manipulating that flag acceptable, there are two new
instructions, STAC and CLAC, to set and clear it.

This patch adds those instructions, via alternative(), when the SMAP
feature is enabled.  It also adds X86_EFLAGS_AC unconditionally to the
SYSCALL entry mask; there is simply no reason to make that one
conditional.
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com

63bcff2a

19 9月, 2012 8 次提交

x86, fpu: remove cpu_has_xmm check in the fx_finit() · a8615af4

由 Suresh Siddha 提交于 9月 10, 2012

CPUs with FXSAVE but no XMM/MXCSR (Pentium II from Intel,
Crusoe/TM-3xxx/5xxx from Transmeta, and presumably some of the K6
generation from AMD) ever looked at the mxcsr field during
fxrstor/fxsave. So remove the cpu_has_xmm check in the fx_finit()
Reported-by: NAl Viro <viro@zeniv.linux.org.uk>
Acked-by: NH. Peter Anvin <hpa@zytor.com>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1347300665-6209-6-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

a8615af4

x86, fpu: decouple non-lazy/eager fpu restore from xsave · 5d2bd700

由 Suresh Siddha 提交于 9月 06, 2012

Decouple non-lazy/eager fpu restore policy from the existence of the xsave
feature. Introduce a synthetic CPUID flag to represent the eagerfpu
policy. "eagerfpu=on" boot paramter will enable the policy.
Requested-by: NH. Peter Anvin <hpa@zytor.com>
Requested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

5d2bd700

x86, fpu: use non-lazy fpu restore for processors supporting xsave · 304bceda

由 Suresh Siddha 提交于 8月 24, 2012

Fundamental model of the current Linux kernel is to lazily init and
restore FPU instead of restoring the task state during context switch.
This changes that fundamental lazy model to the non-lazy model for
the processors supporting xsave feature.

Reasons driving this model change are:

i. Newer processors support optimized state save/restore using xsaveopt and
xrstor by tracking the INIT state and MODIFIED state during context-switch.
This is faster than modifying the cr0.TS bit which has serializing semantics.

ii. Newer glibc versions use SSE for some of the optimized copy/clear routines.
With certain workloads (like boot, kernel-compilation etc), application
completes its work with in the first 5 task switches, thus taking upto 5 #DNA
traps with the kernel not getting a chance to apply the above mentioned
pre-load heuristic.

iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit
and thus will not work correctly in the presence of lazy restore. Non-lazy
state restore is needed for enabling such features.

Some data on a two socket SNB system:
 * Saved 20K DNA exceptions during boot on a two socket SNB system.
 * Saved 50K DNA exceptions during kernel-compilation workload.
 * Improved throughput of the AVX based checksumming function inside the
   kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts
   pair.

Also now kernel_fpu_begin/end() relies on the patched
alternative instructions. So move check_fpu() which uses the
kernel_fpu_begin/end() after alternative_instructions().
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com
Merge 32-bit boot fix from,
Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

304bceda

x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig() · 377ffbcc

由 Suresh Siddha 提交于 8月 24, 2012

Few lines below we do drop_fpu() which is more safer. Remove the
unnecessary user_fpu_end() in save_xstate_sig(), which allows
the drop_fpu() to ignore any pending exceptions from the user-space
and drop the current fpu.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

377ffbcc

x86, fpu: drop_fpu() before restoring new state from sigframe · e9625917

由 Suresh Siddha 提交于 8月 24, 2012

No need to save the state with unlazy_fpu(), that is about to get overwritten
by the state from the signal frame. Instead use drop_fpu() and continue
to restore the new state.

Also fold the stop_fpu_preload() into drop_fpu().
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

e9625917

x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels · 72a671ce

由 Suresh Siddha 提交于 7月 24, 2012

Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied
to/from the fpstate in the task struct.

And in the case of signal delivery for x86_64 binaries, if the fpstate is live
in the CPU registers, then the live state is copied directly to the user
sigframe. Otherwise fpstate in the task struct is copied to the user sigframe.
During restore, fpstate in the user sigframe is restored directly to the live
CPU registers.

Historically, different code paths led to different bugs. For example,
x86_64 code path was not preemption safe till recently. Also there is lot
of code duplication for support of new features like xsave etc.

Unify signal handling code paths for x86 and x86_64 kernels.

New strategy is as follows:

Signal delivery: Both for 32/64-bit frames, align the core math frame area to
64bytes as needed by xsave (this where the main fpu/extended state gets copied
to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave
frames). If the state is live, copy the register state directly to the user
frame. If not live, copy the state in the thread struct to the user frame. And
for 32-bit [f]xsave frames, construct the fsave header separately before
the actual [f]xsave area.

Signal return: As the 32-bit frames with [f]xstate has an additional
'fsave' header, copy everything back from the user sigframe to the
fpstate in the task structure and reconstruct the fxstate from the 'fsave'
header (Also user passed pointers may not be correctly aligned for
any attempt to directly restore any partial state). At the next fpstate usage,
everything will be restored to the live CPU registers.
For all the 64-bit frames and the 32-bit fsave frame, restore the state from
the user sigframe directly to the live CPU registers. 64-bit signals always
restored the math frame directly, so we can expect the math frame pointer
to be correctly aligned. For 32-bit fsave frames, there are no alignment
requirements, so we can restore the state directly.

"lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are
with in the noise range with this change.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com
[ Merged in compilation fix ]
Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

72a671ce

x86, fpu: Consolidate inline asm routines for saving/restoring fpu state · 0ca5bd0d

由 Suresh Siddha 提交于 7月 24, 2012

Consolidate x86, x86_64 inline asm routines saving/restoring fpu state
using config_enabled().
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

0ca5bd0d

x86, signal: Cleanup ifdefs and is_ia32, is_x32 · 050902c0

由 Suresh Siddha 提交于 7月 24, 2012

Use config_enabled() to cleanup the definitions of is_ia32/is_x32. Move
the function prototypes to the header file to cleanup ifdefs,
and move the x32_setup_rt_frame() code around.
Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1343171129-2747-2-git-send-email-suresh.b.siddha@intel.com
Merged in compilation fix from,
Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

050902c0

15 5月, 2012 1 次提交

x86: replace percpu_xxx funcs with this_cpu_xxx · c6ae41e7

由 Alex Shi 提交于 5月 11, 2012

Since percpu_xxx() serial functions are duplicated with this_cpu_xxx().
Removing percpu_xxx() definition and replacing them by this_cpu_xxx()
in code. There is no function change in this patch, just preparation for
later percpu_xxx serial function removing.

On x86 machine the this_cpu_xxx() serial functions are same as
__this_cpu_xxx() without no unnecessary premmpt enable/disable.

Thanks for Stephen Rothwell, he found and fixed a i386 build error in
the patch.

Also thanks for Andrew Morton, he kept updating the patchset in Linus'
tree.
Signed-off-by: NAlex Shi <alex.shi@intel.com>
Acked-by: NChristoph Lameter <cl@gentwo.org>
Acked-by: NTejun Heo <tj@kernel.org>
Acked-by: N"H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NTejun Heo <tj@kernel.org>

c6ae41e7

22 2月, 2012 2 次提交

i387: Split up <asm/i387.h> into exported and internal interfaces · 1361b83a

由 Linus Torvalds 提交于 2月 21, 2012

While various modules include <asm/i387.h> to get access to things we
actually *intend* for them to use, most of that header file was really
pretty low-level internal stuff that we really don't want to expose to
others.

So split the header file into two: the small exported interfaces remain
in <asm/i387.h>, while the internal definitions that are only used by
core architecture code are now in <asm/fpu-internal.h>.

The guiding principle for this was to expose functions that we export to
modules, and leave them in <asm/i387.h>, while stuff that is used by
task switching or was marked GPL-only is in <asm/fpu-internal.h>.

The fpu-internal.h file could be further split up too, especially since
arch/x86/kvm/ uses some of the remaining stuff for its module. But that
kvm usage should probably be abstracted out a bit, and at least now the
internal FPU accessor functions are much more contained. Even if it
isn't perhaps as contained as it _could_ be.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

1361b83a

i387: Uninline the generic FP helpers that we expose to kernel modules · 8546c008

由 Linus Torvalds 提交于 2月 21, 2012

Instead of exporting the very low-level internals of the FPU state
save/restore code (ie things like 'fpu_owner_task'), we should export
the higher-level interfaces.

Inlining these things is pointless anyway: sure, sometimes the end
result is small, but while 'stts()' can result in just three x86
instructions, those are not cheap instructions (writing %cr0 is a
serializing instruction and a very slow one at that).

So the overhead of a function call is not noticeable, and we really
don't want random modules mucking about with our internal state save
logic anyway.

So this unexports 'fpu_owner_task', and instead uninlines and exports
the actual functions that modules can use: fpu_kernel_begin/end() and
unlazy_fpu().
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211339590.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>

8546c008

21 2月, 2012 3 次提交

i387: support lazy restore of FPU state · 7e16838d

由 Linus Torvalds 提交于 2月 19, 2012

This makes us recognize when we try to restore FPU state that matches
what we already have in the FPU on this CPU, and avoids the restore
entirely if so.

To do this, we add two new data fields:

 - a percpu 'fpu_owner_task' variable that gets written any time we
   update the "has_fpu" field, and thus acts as a kind of back-pointer
   to the task that owns the CPU.  The exception is when we save the FPU
   state as part of a context switch - if the save can keep the FPU
   state around, we leave the 'fpu_owner_task' variable pointing at the
   task whose FP state still remains on the CPU.

 - a per-thread 'last_cpu' field, that indicates which CPU that thread
   used its FPU on last.  We update this on every context switch
   (writing an invalid CPU number if the last context switch didn't
   leave the FPU in a lazily usable state), so we know that *that*
   thread has done nothing else with the FPU since.

These two fields together can be used when next switching back to the
task to see if the CPU still matches: if 'fpu_owner_task' matches the
task we are switching to, we know that no other task (or kernel FPU
usage) touched the FPU on this CPU in the meantime, and if the current
CPU number matches the 'last_cpu' field, we know that this thread did no
other FP work on any other CPU, so the FPU state on the CPU must match
what was saved on last context switch.

In that case, we can avoid the 'f[x]rstor' entirely, and just clear the
CR0.TS bit.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7e16838d

i387: use 'restore_fpu_checking()' directly in task switching code · 80ab6f1e

由 Linus Torvalds 提交于 2月 19, 2012

This inlines what is usually just a couple of instructions, but more
importantly it also fixes the theoretical error case (can that FPU
restore really ever fail? Maybe we should remove the checking).

We can't start sending signals from within the scheduler, we're much too
deep in the kernel and are holding the runqueue lock etc.  So don't
bother even trying.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

80ab6f1e

i387: fix up some fpu_counter confusion · cea20ca3

由 Linus Torvalds 提交于 2月 20, 2012

This makes sure we clear the FPU usage counter for newly created tasks,
just so that we start off in a known state (for example, don't try to
preload the FPU state on the first task switch etc).

It also fixes a thinko in when we increment the fpu_counter at task
switch time, introduced by commit 34ddc81a ("i387: re-introduce FPU
state preloading at context switch time").  We should increment the
*new* task fpu_counter, not the old task, and only if we decide to use
that state (whether lazily or preloaded).
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cea20ca3

19 2月, 2012 2 次提交

i387: re-introduce FPU state preloading at context switch time · 34ddc81a

由 Linus Torvalds 提交于 2月 18, 2012

After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870e ("i387:
do not preload FPU state at task switch time").

However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably

 - properly abstracted as a true FPU state switch, rather than as
   open-coded save and restore with various hacks.

   In particular, implementing it as a proper FPU state switch allows us
   to optimize the CR0.TS flag accesses: there is no reason to set the
   TS bit only to then almost immediately clear it again.  CR0 accesses
   are quite slow and expensive, don't flip the bit back and forth for
   no good reason.

 - Make sure that the same model works for both x86-32 and x86-64, so
   that there are no gratuitous differences between the two due to the
   way they save and restore segment state differently due to
   architectural differences that really don't matter to the FPU state.

 - Avoid exposing the "preload" state to the context switch routines,
   and in particular allow the concept of lazy state restore: if nothing
   else has used the FPU in the meantime, and the process is still on
   the same CPU, we can avoid restoring state from memory entirely, just
   re-expose the state that is still in the FPU unit.

   That optimized lazy restore isn't actually implemented here, but the
   infrastructure is set up for it.  Of course, older CPU's that use
   'fnsave' to save the state cannot take advantage of this, since the
   state saving also trashes the state.

In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage.  Hopefully it's easier to
follow as a result.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

34ddc81a

i387: move TS_USEDFPU flag from thread_info to task_struct · f94edacf

由 Linus Torvalds 提交于 2月 17, 2012

This moves the bit that indicates whether a thread has ownership of the
FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
(called 'has_fpu') in task_struct->thread.has_fpu.

This fixes two independent bugs at the same time:

 - changing 'thread_info->status' from the scheduler causes nasty
   problems for the other users of that variable, since it is defined to
   be thread-synchronous (that's what the "TS_" part of the naming was
   supposed to indicate).

   So perfectly valid code could (and did) do

	ti->status |= TS_RESTORE_SIGMASK;

   and the compiler was free to do that as separate load, or and store
   instructions.  Which can cause problems with preemption, since a task
   switch could happen in between, and change the TS_USEDFPU bit. The
   change to TS_USEDFPU would be overwritten by the final store.

   In practice, this seldom happened, though, because the 'status' field
   was seldom used more than once, so gcc would generally tend to
   generate code that used a read-modify-write instruction and thus
   happened to avoid this problem - RMW instructions are naturally low
   fat and preemption-safe.

 - On x86-32, the current_thread_info() pointer would, during interrupts
   and softirqs, point to a *copy* of the real thread_info, because
   x86-32 uses %esp to calculate the thread_info address, and thus the
   separate irq (and softirq) stacks would cause these kinds of odd
   thread_info copy aliases.

   This is normally not a problem, since interrupts aren't supposed to
   look at thread information anyway (what thread is running at
   interrupt time really isn't very well-defined), but it confused the
   heck out of irq_fpu_usable() and the code that tried to squirrel
   away the FPU state.

   (It also caused untold confusion for us poor kernel developers).

It also turns out that using 'task_struct' is actually much more natural
for most of the call sites that care about the FPU state, since they
tend to work with the task struct for other reasons anyway (ie
scheduling).  And the FPU data that we are going to save/restore is
found there too.

Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to
the %esp issue.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Reported-and-tested-by: NRaphael Prevost <raphael@buro.asia>
Acked-and-tested-by: NSuresh Siddha <suresh.b.siddha@intel.com>
Tested-by: NPeter Anvin <hpa@zytor.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f94edacf

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功