- 19 9月, 2012 8 次提交
-
-
由 Suresh Siddha 提交于
Add the "eagerfpu=auto" (that selects the default scheme in enabling eagerfpu) which can override compiled-in boot parameters like "eagerfpu=on/off" (that force enable/disable eagerfpu). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-5-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
xsaveopt/xrstor support optimized state save/restore by tracking the INIT state and MODIFIED state during context-switch. Enable eagerfpu by default for processors supporting xsaveopt. Can be disabled by passing "eagerfpu=off" boot parameter. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Decouple non-lazy/eager fpu restore policy from the existence of the xsave feature. Introduce a synthetic CPUID flag to represent the eagerfpu policy. "eagerfpu=on" boot paramter will enable the policy. Requested-by: NH. Peter Anvin <hpa@zytor.com> Requested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Fundamental model of the current Linux kernel is to lazily init and restore FPU instead of restoring the task state during context switch. This changes that fundamental lazy model to the non-lazy model for the processors supporting xsave feature. Reasons driving this model change are: i. Newer processors support optimized state save/restore using xsaveopt and xrstor by tracking the INIT state and MODIFIED state during context-switch. This is faster than modifying the cr0.TS bit which has serializing semantics. ii. Newer glibc versions use SSE for some of the optimized copy/clear routines. With certain workloads (like boot, kernel-compilation etc), application completes its work with in the first 5 task switches, thus taking upto 5 #DNA traps with the kernel not getting a chance to apply the above mentioned pre-load heuristic. iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit and thus will not work correctly in the presence of lazy restore. Non-lazy state restore is needed for enabling such features. Some data on a two socket SNB system: * Saved 20K DNA exceptions during boot on a two socket SNB system. * Saved 50K DNA exceptions during kernel-compilation workload. * Improved throughput of the AVX based checksumming function inside the kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts pair. Also now kernel_fpu_begin/end() relies on the patched alternative instructions. So move check_fpu() which uses the kernel_fpu_begin/end() after alternative_instructions(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com Merge 32-bit boot fix from, Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Few lines below we do drop_fpu() which is more safer. Remove the unnecessary user_fpu_end() in save_xstate_sig(), which allows the drop_fpu() to ignore any pending exceptions from the user-space and drop the current fpu. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
No need to save the state with unlazy_fpu(), that is about to get overwritten by the state from the signal frame. Instead use drop_fpu() and continue to restore the new state. Also fold the stop_fpu_preload() into drop_fpu(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-2-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Consolidate x86, x86_64 inline asm routines saving/restoring fpu state using config_enabled(). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-3-git-send-email-suresh.b.siddha@intel.comSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 05 9月, 2012 1 次提交
-
-
由 Mathias Krause 提交于
Don't remove the __user annotation of the fpstate pointer, but drop the superfluous void * cast instead. This fixes the following sparse warnings: xsave.c:135:15: warning: cast removes address space of expression xsave.c:135:15: warning: incorrect type in argument 1 (different address spaces) xsave.c:135:15: expected void const volatile [noderef] <asn:1>*<noident> [...] Signed-off-by: NMathias Krause <minipli@googlemail.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1346621506-30857-6-git-send-email-minipli@googlemail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 6月, 2012 1 次提交
-
-
由 Joe Perches 提交于
Use a more current logging style: - Bare printks should have a KERN_<LEVEL> for consistency's sake - Add pr_fmt where appropriate - Neaten some macro definitions - Convert some Ok output to OK - Use "%s: ", __func__ in pr_fmt for summit - Convert some printks to pr_<level> Message output is not identical in all cases. Signed-off-by: NJoe Perches <joe@perches.com> Cc: levinsasha928@gmail.com Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop [ merged two similar patches, tidied up the changelog ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 17 5月, 2012 1 次提交
-
-
由 Suresh Siddha 提交于
Code paths like fork(), exit() and signal handling flush the fpu state explicitly to the structures in memory. BUG_ON() in __sanitize_i387_state() is checking that the fpu state is not live any more. But for preempt kernels, task can be scheduled out and in at any place and the preload_fpu logic during context switch can make the fpu registers live again. For example, consider a 64-bit Task which uses fpu frequently and as such you will find its fpu_counter mostly non-zero. During its time slice, kernel used fpu by doing kernel_fpu_begin/kernel_fpu_end(). After this, in the same scheduling slice, task-A got a signal to handle. Then during the signal setup path we got preempted when we are just before the sanitize_i387_state() in arch/x86/kernel/xsave.c:save_i387_xstate(). And when we come back we will have the fpu registers live that can hit the bug_on. Similarly during core dump, other threads can context-switch in and out (because of spurious wakeups while waiting for the coredump to finish in kernel/exit.c:exit_mm()) and the main thread dumping core can run into this bug when it finds some other thread with its fpu registers live on some other cpu. So remove the paranoid check for now, even though it caught a bug in the multi-threaded core dump case (fixed in the previous patch). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-3-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 22 2月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
While various modules include <asm/i387.h> to get access to things we actually *intend* for them to use, most of that header file was really pretty low-level internal stuff that we really don't want to expose to others. So split the header file into two: the small exported interfaces remain in <asm/i387.h>, while the internal definitions that are only used by core architecture code are now in <asm/fpu-internal.h>. The guiding principle for this was to expose functions that we export to modules, and leave them in <asm/i387.h>, while stuff that is used by task switching or was marked GPL-only is in <asm/fpu-internal.h>. The fpu-internal.h file could be further split up too, especially since arch/x86/kvm/ uses some of the remaining stuff for its module. But that kvm usage should probably be abstracted out a bit, and at least now the internal FPU accessor functions are much more contained. Even if it isn't perhaps as contained as it _could_ be. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.orgSigned-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 19 2月, 2012 1 次提交
-
-
由 Linus Torvalds 提交于
This moves the bit that indicates whether a thread has ownership of the FPU from the TS_USEDFPU bit in thread_info->status to a word of its own (called 'has_fpu') in task_struct->thread.has_fpu. This fixes two independent bugs at the same time: - changing 'thread_info->status' from the scheduler causes nasty problems for the other users of that variable, since it is defined to be thread-synchronous (that's what the "TS_" part of the naming was supposed to indicate). So perfectly valid code could (and did) do ti->status |= TS_RESTORE_SIGMASK; and the compiler was free to do that as separate load, or and store instructions. Which can cause problems with preemption, since a task switch could happen in between, and change the TS_USEDFPU bit. The change to TS_USEDFPU would be overwritten by the final store. In practice, this seldom happened, though, because the 'status' field was seldom used more than once, so gcc would generally tend to generate code that used a read-modify-write instruction and thus happened to avoid this problem - RMW instructions are naturally low fat and preemption-safe. - On x86-32, the current_thread_info() pointer would, during interrupts and softirqs, point to a *copy* of the real thread_info, because x86-32 uses %esp to calculate the thread_info address, and thus the separate irq (and softirq) stacks would cause these kinds of odd thread_info copy aliases. This is normally not a problem, since interrupts aren't supposed to look at thread information anyway (what thread is running at interrupt time really isn't very well-defined), but it confused the heck out of irq_fpu_usable() and the code that tried to squirrel away the FPU state. (It also caused untold confusion for us poor kernel developers). It also turns out that using 'task_struct' is actually much more natural for most of the call sites that care about the FPU state, since they tend to work with the task struct for other reasons anyway (ie scheduling). And the FPU data that we are going to save/restore is found there too. Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to the %esp issue. Cc: Arjan van de Ven <arjan@linux.intel.com> Reported-and-tested-by: NRaphael Prevost <raphael@buro.asia> Acked-and-tested-by: NSuresh Siddha <suresh.b.siddha@intel.com> Tested-by: NPeter Anvin <hpa@zytor.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 2月, 2012 2 次提交
-
-
由 Linus Torvalds 提交于
This creates three helper functions that do the TS_USEDFPU accesses, and makes everybody that used to do it by hand use those helpers instead. In addition, there's a couple of helper functions for the "change both CR0.TS and TS_USEDFPU at the same time" case, and the places that do that together have been changed to use those. That means that we have fewer random places that open-code this situation. The intent is partly to clarify the code without actually changing any semantics yet (since we clearly still have some hard to reproduce bug in this area), but also to make it much easier to use another approach entirely to caching the CR0.TS bit for software accesses. Right now we use a bit in the thread-info 'status' variable (this patch does not change that), but we might want to make it a full field of its own or even make it a per-cpu variable. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
Commit 5b1cbac3 ("i387: make irq_fpu_usable() tests more robust") added a sanity check to the #NM handler to verify that we never cause the "Device Not Available" exception in kernel mode. However, that check actually pinpointed a (fundamental) race where we do cause that exception as part of the signal stack FPU state save/restore code. Because we use the floating point instructions themselves to save and restore state directly from user mode, we cannot do that atomically with testing the TS_USEDFPU bit: the user mode access itself may cause a page fault, which causes a task switch, which saves and restores the FP/MMX state from the kernel buffers. This kind of "recursive" FP state save is fine per se, but it means that when the signal stack save/restore gets restarted, it will now take the '#NM' exception we originally tried to avoid. With preemption this can happen even without the page fault - but because of the user access, we cannot just disable preemption around the save/restore instruction. There are various ways to solve this, including using the "enable/disable_page_fault()" helpers to not allow page faults at all during the sequence, and fall back to copying things by hand without the use of the native FP state save/restore instructions. However, the simplest thing to do is to just allow the #NM from kernel space, but fix the race in setting and clearing CR0.TS that this all exposed: the TS bit changes and the TS_USEDFPU bit absolutely have to be atomic wrt scheduling, so while the actual state save/restore can be interrupted and restarted, the act of actually clearing/setting CR0.TS and the TS_USEDFPU bit together must not. Instead of just adding random "preempt_disable/enable()" calls to what is already excessively ugly code, this introduces some helper functions that mostly mirror the "kernel_fpu_begin/end()" functionality, just for the user state instead. Those helper functions should probably eventually replace the other ad-hoc CR0.TS and TS_USEDFPU tests too, but I'll need to think about it some more: the task switching functionality in particular needs to expose the difference between the 'prev' and 'next' threads, while the new helper functions intentionally were written to only work with 'current'. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 18 3月, 2011 1 次提交
-
-
由 Lucas De Marchi 提交于
They were generated by 'codespell' and then manually reviewed. Signed-off-by: NLucas De Marchi <lucas.demarchi@profusion.mobi> Cc: trivial@kernel.org LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 14 12月, 2010 1 次提交
-
-
由 Suresh Siddha 提交于
Alignment of alloc_bootmem() depends on the value of L1_CACHE_SHIFT. What we need here, however, is 64 byte alignment. Use alloc_bootmem_align() and explicitly specify the alignment instead. This fixes a kernel boot crash reported by Jody when the cpu in .config is set to MPENTIUMII but the kernel is booted on a xsave-capable CPU. Reported-by: NJody Bruchon <jody@nctritech.com> Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20101116212442.059967454@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org>
-
- 22 7月, 2010 6 次提交
-
-
由 H. Peter Anvin 提交于
xstate_enable_boot_cpu() is, as the name implies, only used on the boot CPU; furthermore, it invokes alloc_bootmem(), which is __init; hence it needs to be tagged __init rather than __cpuinit. Furthermore, it is *not* safe in the long run to rely on CPU 0 only coming online during the early boot -- at some point we're going to support offlining (and re-onlining) the boot CPU, and at that point we must not call xstate_enable_boot_cpu() again. The code is a fair bit more obscure than one would like, because the __ref overrides aren't quite powerful enough. Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <4C476236.1020302@zytor.com>
-
由 Robert Richter 提交于
This is called only from initialization code. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-6-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Robert Richter 提交于
The pointer is only used in xsave.c. Making it static. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-5-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Robert Richter 提交于
The patch introduces the XSTATE_CPUID macro and adds a check that tests if XSTATE_CPUID exists. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-4-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Robert Richter 提交于
The patch renames xsave_cntxt_init() and __xsave_init() into xstate_enable_boot_cpu() and xstate_enable() as this names are more meaningful. It also removes the duplicate xcr setup for the boot cpu. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-3-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Robert Richter 提交于
As xsave also supports other than fpu features, it should be initialized independently of the fpu. This patch moves this out of fpu initialization. There is also a lot of cross referencing between fpu and xsave code. This patch reduces this by making xsave_cntxt_init() and init_thread_xstate() static functions. The patch moves the cpu_has_xsave check at the beginning of xsave_init(). All other checks may removed then. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279731838-1522-2-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 21 7月, 2010 1 次提交
-
-
由 Robert Richter 提交于
This patch moves boot cpu initialization to xsave_init(). Now all cpus are initialized in one single function. Signed-off-by: NRobert Richter <robert.richter@amd.com> LKML-Reference: <1279651857-24639-5-git-send-email-robert.richter@amd.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 20 7月, 2010 2 次提交
-
-
由 Suresh Siddha 提交于
With xsaveopt, if a processor implementation discern that a processor state component is in its initialized state it may modify the corresponding bit in the xsave_hdr.xstate_bv as '0', with out modifying the corresponding memory layout. Hence wHile presenting the xstate information to the user, we always ensure that the memory layout of a feature will be in the init state if the corresponding header bit is zero. This ensures the consistency and avoids the condition of the user seeing some some stale state in the memory layout during signal handling, debugging etc. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.351459480@sbs-t61.sc.intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
由 Suresh Siddha 提交于
Subleaves of the cpuid vector 0xd provides the offset and size of different feature state that are managed by the xsave/xrstor. Track this for the upcoming usage during signal handling. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100719230205.262987929@sbs-t61.sc.intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 07 7月, 2010 1 次提交
-
-
由 Suresh Siddha 提交于
fxsave/xsave doesn't touch all the bytes in the memory layout used by these instructions. Specifically SW reserved (bytes 464..511) fields in the fxsave frame and the reserved fields in the xsave header. To present a clean context for the signal handling, just clear these fields instead of clearing the complete fxsave/xsave memory layout, when we dump these registers directly to the user signal frame. Also avoid the call to second xrstor (which inits the state not passed in the signal frame) in restore_user_xstate() if all the state has already been restored by the first xrstor. These changes improve the performance of signal handling(by ~3-5% as measured by the lat_sig). Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1277249017.2847.85.camel@sbs-t61.sc.intel.com> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 10 6月, 2010 1 次提交
-
-
由 Dan Carpenter 提交于
The places which call check_for_xstate() only care about zero or non-zero so this patch doesn't change how the code runs, but it's a cleanup. The main reason for this patch is that I'm looking for places which don't return -EFAULT for copy_from_user() failures. Signed-off-by: NDan Carpenter <error27@gmail.com> LKML-Reference: <20100603100746.GU5483@bicker> Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
-
- 11 5月, 2010 2 次提交
-
-
由 Avi Kivity 提交于
Currently all fpu state access is through tsk->thread.xstate. Since we wish to generalize fpu access to non-task contexts, wrap the state in a new 'struct fpu' and convert existing access to use an fpu API. Signal frame handlers are not converted to the API since they will remain task context only things. Signed-off-by: NAvi Kivity <avi@redhat.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1273135546-29690-3-git-send-email-avi@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Avi Kivity 提交于
The fpu code currently uses current->thread_info->status & TS_XSAVE as a way to distinguish between XSAVE capable processors and older processors. The decision is not really task specific; instead we use the task status to avoid a global memory reference - the value should be the same across all threads. Eliminate this tie-in into the task structure by using an alternative instruction keyed off the XSAVE cpu feature; this results in shorter and faster code, without introducing a global memory reference. [ hpa: in the future, this probably should use an asm jmp ] Signed-off-by: NAvi Kivity <avi@redhat.com> Acked-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1273135546-29690-2-git-send-email-avi@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 12 2月, 2010 1 次提交
-
-
由 Suresh Siddha 提交于
Add the xstate regset support which helps extend the kernel ptrace and the core-dump interfaces to support AVX state etc. This regset interface is designed to support all the future state that gets supported using xsave/xrstor infrastructure. Looking at the memory layout saved by "xsave", one can't say which state is represented in the memory layout. This is because if a particular state is in init state, in the xsave hdr it can be represented by bit '0'. And hence we can't really say by the xsave header wether a state is in init state or the state is not saved in the memory layout. And hence the xsave memory layout available through this regset interface uses SW usable bytes [464..511] to convey what state is represented in the memory layout. First 8 bytes of the sw_usable_bytes[464..467] will be set to OS enabled xstate mask(which is same as the 64bit mask returned by the xgetbv's xCR0). The note NT_X86_XSTATE represents the extended state information in the core file, using the above mentioned memory layout. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <20100211195614.802495327@sbs-t61.sc.intel.com> Signed-off-by: NHongjiu Lu <hjl.tools@gmail.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 21 4月, 2009 1 次提交
-
-
由 Suresh Siddha 提交于
In 64bit signal delivery path, clear_used_math() was happening before saving the current active FPU state on to the user stack for signal handling. Between clear_used_math() and the state store on to the user stack, potentially we can get a page fault for the user address and can block. Infact, while testing we were hitting the might_fault() in __clear_user() which can do a schedule(). At a later point in time, we will schedule back into this process and resume the save state (using "xsave/fxsave" instruction) which can lead to DNA fault. And as used_math was cleared before, we will reinit the FP state in the DNA fault and continue. This reinit will result in loosing the FPU state of the process. Move clear_used_math() to a point after the FPU state has been stored onto the user stack. This issue is present from a long time (even before the xsave changes and the x86 merge). But it can easily be exposed in 2.6.28.x and 2.6.29.x series because of the __clear_user() in this path, which has an explicit __cond_resched() leading to a context switch with CONFIG_PREEMPT_VOLUNTARY. [ Impact: fix FPU state corruption ] Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x] Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>
-
- 12 4月, 2009 1 次提交
-
-
由 Suresh Siddha 提交于
Impact: save/restore Intel-AVX state properly between tasks Intel Advanced Vector Extensions (AVX) introduce 256-bit vector processing capability. More about AVX at http://software.intel.com/sites/avx Add OS support for YMM state management using xsave/xrstor infrastructure to support AVX. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <1239402084.27006.8057.camel@localhost.localdomain> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 31 12月, 2008 1 次提交
-
-
由 Jaswinder Singh Rajput 提交于
Impact: cleanup, reduce kernel size a bit, avoid sparse warning Fixes sparse warning: arch/x86/kernel/xsave.c:162:5: warning: symbol 'restore_user_xstate' was not declared. Should it be static? Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 20 11月, 2008 1 次提交
-
-
由 Rakib Mullick 提交于
Annotate xsave_cntxt_init() as "can be called outside of __init". Signed-off-by: NRakib Mullick <rakib.mullick@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 22 10月, 2008 1 次提交
-
-
由 roel kluin 提交于
These variables are only used in their source files, so make them static. Signed-off-by: NRoel Kluin <roel.kluin@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 12 10月, 2008 1 次提交
-
-
由 Ingo Molnar 提交于
fix warning: arch/x86/kernel/xsave.c: In function ‘save_i387_xstate’: arch/x86/kernel/xsave.c:98: warning: ignoring return value of ‘__clear_user’, declared with attribute warn_unused_result check the return value and act on it. We should not be ignoring faults at this point. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 10月, 2008 2 次提交
-
-
由 Suresh Siddha 提交于
If a processor implementation discern that a processor state component is in its initialized state, it may modify the corresponding bit in the xsave header.xstate_bv as '0'. State in the memory layout setup by 'xsave' will be consistent with the bit values in the header. During signal handling, legacy applications may change the FP/SSE bits in the sigcontext memory layout without touching the FP/SSE header bits in the xsave header. So always set FP/SSE bits in the xsave header while saving the sigcontext state to the user space. During signal return, this will enable the kernel to capture any changes to the FP/SSE bits by the legacy applications which don't touch xsave headers. xsave aware apps can change the xstate_bv in the xsave header aswell as change any contents in the memory layout. xrestor as part of sigreturn will capture all the changes. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
由 Suresh Siddha 提交于
Actually return failure on error. Signed-off-by: NSuresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
-
- 07 9月, 2008 1 次提交
-
-
由 Alexey Dobriyan 提交于
WARNING: vmlinux.o(.text+0x22453): Section mismatch in reference from the function setup_xstate_init() to the function .init.text:__alloc_bootmem() The function setup_xstate_init() references the function __init __alloc_bootmem(). This is often because setup_xstate_init lacks a __init annotation or the annotation of __alloc_bootmem is wrong. Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-