- 23 3月, 2015 3 次提交
-
-
由 Borislav Petkov 提交于
Fold it into drop_fpu(). Phew, one less FPU function to pay attention to. No functionality change. Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NOleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Extract the "use_eager_fpu()" code from drop_init_fpu() into a new, simple helper restore_init_xstate(). The next patch adds another user. - It is not clear why we do not check use_fxsr() like fpu_restore_checking() does. eager_fpu_init_bp() calls setup_init_fpu_buf() too, and we have the "eagerfpu=on" kernel option. - Ignoring the fact that init_xstate_buf is "struct xsave_struct *", not "union thread_xstate *", it is not clear why we can not simply use fpu_restore_checking() and avoid the code duplication. - It is not clear why we can't call setup_init_fpu_buf() unconditionally to always create init_xstate_buf(). Then do_device_not_available() path (at least) could use restore_init_xstate() too. It doesn't need to init fpu->state, its content doesn't matter until unlazy_fpu()/__switch_to()/etc which overwrites this memory anyway. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150311173429.GD5032@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Oleg Nesterov 提交于
Currently, user_fpu_begin() has a single caller and it is not clear why do we actually need it and why we should not worry about preemption right after preempt_enable(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Link: http://lkml.kernel.org/r/20150311173409.GC5032@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 3月, 2015 1 次提交
-
-
由 Oleg Nesterov 提交于
drop_fpu() does clear_used_math() and usually this is correct because tsk == current. However switch_fpu_finish()->restore_fpu_checking() is called before __switch_to() updates the "current_task" variable. If it fails, we will wrongly clear the PF_USED_MATH flag of the previous task. So use clear_stopped_child_used_math() instead. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Reviewed-by: NRik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 10 3月, 2015 1 次提交
-
-
由 Oleg Nesterov 提交于
fx_finit() has two users but only fpu_finit() needs to clear xstate, alloc_bootmem_align() in setup_init_fpu_buf() returns zero-filled memory. And note that both memset()'s look confusing. Yes, offsetof() is 0 for ->fxsave or ->fsave, but it would be cleaner to turn them into a single memset() which zeroes fpu->state. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NBorislav Petkov <bp@suse.de> Acked-by: NRik van Riel <riel@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tavis Ormandy <taviso@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425967585-4725-2-git-send-email-bp@alien8.de Link: http://lkml.kernel.org/r/20150302183257.GC23085@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 2月, 2015 6 次提交
-
-
由 Rik van Riel 提交于
With Oleg's patch: 33a3ebdc ("x86, fpu: Don't abuse has_fpu in __kernel_fpu_begin/end()") kernel threads no longer have an FPU state even on systems with use_eager_fpu(). That in turn means that a task may still have its FPU state loaded in the FPU registers, if the task only got interrupted by kernel threads from when it went to sleep, to when it woke up again. In that case, there is no need to restore the FPU state for this task, since it is still in the registers. The kernel can simply use the same logic to determine this as is used for !use_eager_fpu() systems. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/1423252925-14451-9-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
由 Rik van Riel 提交于
Replace magic assignments of fpu.last_cpu = ~0 with more explicit task_disable_lazy_fpu_restore() calls. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1423252925-14451-8-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
由 Rik van Riel 提交于
Use an explicit if/else branch after __save_init_fpu(old) in switch_fpu_prepare(). This makes substituting the assignment with a call in task_disable_lazy_fpu_restore() in the next patch easier to review. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/1423252925-14451-7-git-send-email-riel@redhat.com [ Space out stuff for more readability. ] Signed-off-by: NBorislav Petkov <bp@suse.de>
-
由 Rik van Riel 提交于
Currently there are a few magic assignments sprinkled through the code that disable lazy FPU state restoring, some more effective than others, and all equally mystifying. It would be easier to have a helper to explicitly disable lazy FPU state restoring for a task. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/1423252925-14451-6-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
由 Rik van Riel 提交于
We need another lazy restore related function, that will be called from a function that is above where the lazy restore functions are now. It would be nice to keep all three functions grouped together. Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/1423252925-14451-5-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
由 Oleg Nesterov 提交于
math_error() calls save_init_fpu() after conditional_sti(), this means that the caller can be preempted. If !use_eager_fpu() we can hit the WARN_ON_ONCE(!__thread_has_fpu(tsk)) and/or save the wrong FPU state. Change math_error() to use unlazy_fpu() and kill save_init_fpu(). Signed-off-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NRik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1423252925-14451-4-git-send-email-riel@redhat.comSigned-off-by: NBorislav Petkov <bp@suse.de>
-
- 23 12月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
Fix up the else-case in fpu_fxsave() which seems like it has been overlooked. Correct comment style in restore_fpu_checking() while at it. Signed-off-by: NBorislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1419170543-11393-1-git-send-email-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 03 9月, 2014 1 次提交
-
-
由 Oleg Nesterov 提交于
__thread_fpu_begin() checks X86_FEATURE_EAGER_FPU by hand, we have a helper for that. Signed-off-by: NOleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175720.GA21656@redhat.comReviewed-by: NSuresh Siddha <sbsiddha@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 19 6月, 2014 1 次提交
-
-
由 Borislav Petkov 提交于
X86_FEATURE_FXSAVE_LEAK, X86_FEATURE_11AP and X86_FEATURE_CLFLUSH_MONITOR are not really features but synthetic bits we use for applying different bug workarounds. Call them what they really are, and make sure they get the proper cross-CPU behavior (OR rather than AND). Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1403042783-23278-1-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 30 5月, 2014 1 次提交
-
-
由 Fenghua Yu 提交于
__save_fpu() can be called during early booting time when cpu caps are not enabled and alternative can not be used yet. Therefore, it calls xsave_state_booting() during booting time to save xstate to task's xsave area. Signed-off-by: NFenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-14-git-send-email-fenghua.yu@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 17 4月, 2014 1 次提交
-
-
由 Matt Fleming 提交于
It may be necessary to save and restore the FPU context during EFI runtime system services calls. However, this may happen during boot and before alternatives have run. Thus, we need to use static_cpu_has_safe instead. The rationale behind the use of static_cpu_has_safe is the same as in commit 5f8c4218 ("x86, fpu: Use static_cpu_has_safe before alternatives") by Borislav Petkov. Signed-off-by: NMatt Fleming <matt.fleming@intel.com> Signed-off-by: NRicardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Borislav Petkov <bp@suse.de>
-
- 12 1月, 2014 1 次提交
-
-
由 Linus Torvalds 提交于
Before we do an EMMS in the AMD FXSAVE information leak workaround we need to clear any pending exceptions, otherwise we trap with a floating-point exception inside this code. Reported-by: Nhalfdog <me@halfdog.net> Tested-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.comSigned-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 13 11月, 2013 1 次提交
-
-
由 Vineet Gupta 提交于
Only a couple of arches (sh/x86) use fpu_counter in task_struct so it can be moved out into ARCH specific thread_struct, reducing the size of task_struct for other arches. Compile tested i386_defconfig + gcc 4.7.3 Signed-off-by: NVineet Gupta <vgupta@synopsys.com> Acked-by: NIngo Molnar <mingo@kernel.org> Cc: Paul Mundt <paul.mundt@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 21 6月, 2013 1 次提交
-
-
由 Borislav Petkov 提交于
The call stack below shows how this happens: basically eager_fpu_init() calls __thread_fpu_begin(current) which then does if (!use_eager_fpu()), which, in turn, uses static_cpu_has. And we're executing before alternatives so static_cpu_has doesn't work there yet. Use the safe variant in this path which becomes optimal after alternatives have run. WARNING: at arch/x86/kernel/cpu/common.c:1368 warn_pre_alternatives+0x1e/0x20() You're using static_cpu_has before alternatives have run! Modules linked in: Pid: 0, comm: swapper Not tainted 3.9.0-rc8+ #1 Call Trace: warn_slowpath_common warn_slowpath_fmt ? fpu_finit warn_pre_alternatives eager_fpu_init fpu_init cpu_init trap_init start_kernel ? repair_env_string x86_64_start_reservations x86_64_start_kernel Signed-off-by: NBorislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/1370772454-6106-6-git-send-email-bp@alien8.deSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 07 6月, 2013 1 次提交
-
-
由 H. Peter Anvin 提交于
Reimplement FPU detection code in C and drop old, not-so-recommended detection method in asm. Move all the relevant stuff into i387.c where it conceptually belongs. Finally drop cpuinfo_x86.hard_math. [ hpa: huge thanks to Borislav for taking my original concept patch and productizing it ] [ Boris, note to self: do not use static_cpu_has before alternatives! ] Signed-off-by: NH. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1367244262-29511-2-git-send-email-bp@alien8.de Link: http://lkml.kernel.org/r/1365436666-9837-2-git-send-email-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 14 2月, 2013 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 01 12月, 2012 1 次提交
-
-
由 Vincent Palatin 提交于
When a cpu enters S3 state, the FPU state is lost. After resuming for S3, if we try to lazy restore the FPU for a process running on the same CPU, this will result in a corrupted FPU context. Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU, so nobody will try to lazy restore a state which doesn't exist in the hardware. Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off, by doing thousands of suspend/resume cycles with 4 processes doing FPU operations running. Without the patch, a process is killed after a few hundreds cycles by a SIGFPE. Cc: Duncan Laurie <dlaurie@chromium.org> Cc: Olof Johansson <olofj@chromium.org> Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write Signed-off-by: NVincent Palatin <vpalatin@chromium.org> Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 26 9月, 2012 1 次提交
-
-
由 H. Peter Anvin 提交于
With SMAP, the [f][x]rstor_checking() functions are no longer usable for user-space pointers by applying a simple __force cast. Instead, create new [f][x]rstor_user() functions which do the proper SMAP magic. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.com
-
- 22 9月, 2012 1 次提交
-
-
由 H. Peter Anvin 提交于
When Supervisor Mode Access Prevention (SMAP) is enabled, access to userspace from the kernel is controlled by the AC flag. To make the performance of manipulating that flag acceptable, there are two new instructions, STAC and CLAC, to set and clear it. This patch adds those instructions, via alternative(), when the SMAP feature is enabled. It also adds X86_EFLAGS_AC unconditionally to the SYSCALL entry mask; there is simply no reason to make that one conditional. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
-
- 19 9月, 2012 8 次提交
-
-
由 Suresh Siddha 提交于
CPUs with FXSAVE but no XMM/MXCSR (Pentium II from Intel, Crusoe/TM-3xxx/5xxx from Transmeta, and presumably some of the K6 generation from AMD) ever looked at the mxcsr field during fxrstor/fxsave. So remove the cpu_has_xmm check in the fx_finit() Reported-by: NAl Viro <viro@zeniv.linux.org.uk> Acked-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-6-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Decouple non-lazy/eager fpu restore policy from the existence of the xsave feature. Introduce a synthetic CPUID flag to represent the eagerfpu policy. "eagerfpu=on" boot paramter will enable the policy. Requested-by: NH. Peter Anvin <hpa@zytor.com> Requested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Fundamental model of the current Linux kernel is to lazily init and restore FPU instead of restoring the task state during context switch. This changes that fundamental lazy model to the non-lazy model for the processors supporting xsave feature. Reasons driving this model change are: i. Newer processors support optimized state save/restore using xsaveopt and xrstor by tracking the INIT state and MODIFIED state during context-switch. This is faster than modifying the cr0.TS bit which has serializing semantics. ii. Newer glibc versions use SSE for some of the optimized copy/clear routines. With certain workloads (like boot, kernel-compilation etc), application completes its work with in the first 5 task switches, thus taking upto 5 #DNA traps with the kernel not getting a chance to apply the above mentioned pre-load heuristic. iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit and thus will not work correctly in the presence of lazy restore. Non-lazy state restore is needed for enabling such features. Some data on a two socket SNB system: * Saved 20K DNA exceptions during boot on a two socket SNB system. * Saved 50K DNA exceptions during kernel-compilation workload. * Improved throughput of the AVX based checksumming function inside the kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts pair. Also now kernel_fpu_begin/end() relies on the patched alternative instructions. So move check_fpu() which uses the kernel_fpu_begin/end() after alternative_instructions(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com Merge 32-bit boot fix from, Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Few lines below we do drop_fpu() which is more safer. Remove the unnecessary user_fpu_end() in save_xstate_sig(), which allows the drop_fpu() to ignore any pending exceptions from the user-space and drop the current fpu. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
No need to save the state with unlazy_fpu(), that is about to get overwritten by the state from the signal frame. Instead use drop_fpu() and continue to restore the new state. Also fold the stop_fpu_preload() into drop_fpu(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Consolidate x86, x86_64 inline asm routines saving/restoring fpu state using config_enabled(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Use config_enabled() to cleanup the definitions of is_ia32/is_x32. Move the function prototypes to the header file to cleanup ifdefs, and move the x32_setup_rt_frame() code around. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-2-git-send-email-suresh.b.siddha@intel.com Merged in compilation fix from, Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 15 5月, 2012 1 次提交
-
-
由 Alex Shi 提交于
Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: NAlex Shi <alex.shi@intel.com> Acked-by: NChristoph Lameter <cl@gentwo.org> Acked-by: NTejun Heo <tj@kernel.org> Acked-by: N"H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 22 2月, 2012 2 次提交
-
-
由 Linus Torvalds 提交于
While various modules include <asm/i387.h> to get access to things we actually *intend* for them to use, most of that header file was really pretty low-level internal stuff that we really don't want to expose to others. So split the header file into two: the small exported interfaces remain in <asm/i387.h>, while the internal definitions that are only used by core architecture code are now in <asm/fpu-internal.h>. The guiding principle for this was to expose functions that we export to modules, and leave them in <asm/i387.h>, while stuff that is used by task switching or was marked GPL-only is in <asm/fpu-internal.h>. The fpu-internal.h file could be further split up too, especially since arch/x86/kvm/ uses some of the remaining stuff for its module. But that kvm usage should probably be abstracted out a bit, and at least now the internal FPU accessor functions are much more contained. Even if it isn't perhaps as contained as it _could_ be. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Linus Torvalds 提交于
Instead of exporting the very low-level internals of the FPU state save/restore code (ie things like 'fpu_owner_task'), we should export the higher-level interfaces. Inlining these things is pointless anyway: sure, sometimes the end result is small, but while 'stts()' can result in just three x86 instructions, those are not cheap instructions (writing %cr0 is a serializing instruction and a very slow one at that). So the overhead of a function call is not noticeable, and we really don't want random modules mucking about with our internal state save logic anyway. So this unexports 'fpu_owner_task', and instead uninlines and exports the actual functions that modules can use: fpu_kernel_begin/end() and unlazy_fpu(). Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211339590.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 21 2月, 2012 3 次提交
-
-
由 Linus Torvalds 提交于
This makes us recognize when we try to restore FPU state that matches what we already have in the FPU on this CPU, and avoids the restore entirely if so. To do this, we add two new data fields: - a percpu 'fpu_owner_task' variable that gets written any time we update the "has_fpu" field, and thus acts as a kind of back-pointer to the task that owns the CPU. The exception is when we save the FPU state as part of a context switch - if the save can keep the FPU state around, we leave the 'fpu_owner_task' variable pointing at the task whose FP state still remains on the CPU. - a per-thread 'last_cpu' field, that indicates which CPU that thread used its FPU on last. We update this on every context switch (writing an invalid CPU number if the last context switch didn't leave the FPU in a lazily usable state), so we know that *that* thread has done nothing else with the FPU since. These two fields together can be used when next switching back to the task to see if the CPU still matches: if 'fpu_owner_task' matches the task we are switching to, we know that no other task (or kernel FPU usage) touched the FPU on this CPU in the meantime, and if the current CPU number matches the 'last_cpu' field, we know that this thread did no other FP work on any other CPU, so the FPU state on the CPU must match what was saved on last context switch. In that case, we can avoid the 'f[x]rstor' entirely, and just clear the CR0.TS bit. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
This inlines what is usually just a couple of instructions, but more importantly it also fixes the theoretical error case (can that FPU restore really ever fail? Maybe we should remove the checking). We can't start sending signals from within the scheduler, we're much too deep in the kernel and are holding the runqueue lock etc. So don't bother even trying. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
This makes sure we clear the FPU usage counter for newly created tasks, just so that we start off in a known state (for example, don't try to preload the FPU state on the first task switch etc). It also fixes a thinko in when we increment the fpu_counter at task switch time, introduced by commit 34ddc81a ("i387: re-introduce FPU state preloading at context switch time"). We should increment the *new* task fpu_counter, not the old task, and only if we decide to use that state (whether lazily or preloaded). Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 2月, 2012 2 次提交
-
-
由 Linus Torvalds 提交于
After all the FPU state cleanups and finally finding the problem that caused all our FPU save/restore problems, this re-introduces the preloading of FPU state that was removed in commit b3b0870e ("i387: do not preload FPU state at task switch time"). However, instead of simply reverting the removal, this reimplements preloading with several fixes, most notably - properly abstracted as a true FPU state switch, rather than as open-coded save and restore with various hacks. In particular, implementing it as a proper FPU state switch allows us to optimize the CR0.TS flag accesses: there is no reason to set the TS bit only to then almost immediately clear it again. CR0 accesses are quite slow and expensive, don't flip the bit back and forth for no good reason. - Make sure that the same model works for both x86-32 and x86-64, so that there are no gratuitous differences between the two due to the way they save and restore segment state differently due to architectural differences that really don't matter to the FPU state. - Avoid exposing the "preload" state to the context switch routines, and in particular allow the concept of lazy state restore: if nothing else has used the FPU in the meantime, and the process is still on the same CPU, we can avoid restoring state from memory entirely, just re-expose the state that is still in the FPU unit. That optimized lazy restore isn't actually implemented here, but the infrastructure is set up for it. Of course, older CPU's that use 'fnsave' to save the state cannot take advantage of this, since the state saving also trashes the state. In other words, there is now an actual _design_ to the FPU state saving, rather than just random historical baggage. Hopefully it's easier to follow as a result. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
This moves the bit that indicates whether a thread has ownership of the FPU from the TS_USEDFPU bit in thread_info->status to a word of its own (called 'has_fpu') in task_struct->thread.has_fpu. This fixes two independent bugs at the same time: - changing 'thread_info->status' from the scheduler causes nasty problems for the other users of that variable, since it is defined to be thread-synchronous (that's what the "TS_" part of the naming was supposed to indicate). So perfectly valid code could (and did) do ti->status |= TS_RESTORE_SIGMASK; and the compiler was free to do that as separate load, or and store instructions. Which can cause problems with preemption, since a task switch could happen in between, and change the TS_USEDFPU bit. The change to TS_USEDFPU would be overwritten by the final store. In practice, this seldom happened, though, because the 'status' field was seldom used more than once, so gcc would generally tend to generate code that used a read-modify-write instruction and thus happened to avoid this problem - RMW instructions are naturally low fat and preemption-safe. - On x86-32, the current_thread_info() pointer would, during interrupts and softirqs, point to a *copy* of the real thread_info, because x86-32 uses %esp to calculate the thread_info address, and thus the separate irq (and softirq) stacks would cause these kinds of odd thread_info copy aliases. This is normally not a problem, since interrupts aren't supposed to look at thread information anyway (what thread is running at interrupt time really isn't very well-defined), but it confused the heck out of irq_fpu_usable() and the code that tried to squirrel away the FPU state. (It also caused untold confusion for us poor kernel developers). It also turns out that using 'task_struct' is actually much more natural for most of the call sites that care about the FPU state, since they tend to work with the task struct for other reasons anyway (ie scheduling). And the FPU data that we are going to save/restore is found there too. Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to the %esp issue. Cc: Arjan van de Ven <arjan@linux.intel.com> Reported-and-tested-by: NRaphael Prevost <raphael@buro.asia> Acked-and-tested-by: NSuresh Siddha <suresh.b.siddha@intel.com> Tested-by: NPeter Anvin <hpa@zytor.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-