- 03 11月, 2017 2 次提交
-
-
由 Dave Martin 提交于
This patch adds the core support for switching and managing the SVE architectural state of user tasks. Calls to the existing FPSIMD low-level save/restore functions are factored out as new functions task_fpsimd_{save,load}(), since SVE now dynamically may or may not need to be handled at these points depending on the kernel configuration, hardware features discovered at boot, and the runtime state of the task. To make these decisions as fast as possible, const cpucaps are used where feasible, via the system_supports_sve() helper. The SVE registers are only tracked for threads that have explicitly used SVE, indicated by the new thread flag TIF_SVE. Otherwise, the FPSIMD view of the architectural state is stored in thread.fpsimd_state as usual. When in use, the SVE registers are not stored directly in thread_struct due to their potentially large and variable size. Because the task_struct slab allocator must be configured very early during kernel boot, it is also tricky to configure it correctly to match the maximum vector length provided by the hardware, since this depends on examining secondary CPUs as well as the primary. Instead, a pointer sve_state in thread_struct points to a dynamically allocated buffer containing the SVE register data, and code is added to allocate and free this buffer at appropriate times. TIF_SVE is set when taking an SVE access trap from userspace, if suitable hardware support has been detected. This enables SVE for the thread: a subsequent return to userspace will disable the trap accordingly. If such a trap is taken without sufficient system- wide hardware support, SIGILL is sent to the thread instead as if an undefined instruction had been executed: this may happen if userspace tries to use SVE in a system where not all CPUs support it for example. The kernel will clear TIF_SVE and disable SVE for the thread whenever an explicit syscall is made by userspace. For backwards compatibility reasons and conformance with the spirit of the base AArch64 procedure call standard, the subset of the SVE register state that aliases the FPSIMD registers is still preserved across a syscall even if this happens. The remainder of the SVE register state logically becomes zero at syscall entry, though the actual zeroing work is currently deferred until the thread next tries to use SVE, causing another trap to the kernel. This implementation is suboptimal: in the future, the fastpath case may be optimised to zero the registers in-place and leave SVE enabled for the task, where beneficial. TIF_SVE is also cleared in the following slowpath cases, which are taken as reasonable hints that the task may no longer use SVE: * exec * fork and clone Code is added to sync data between thread.fpsimd_state and thread.sve_state whenever enabling/disabling SVE, in a manner consistent with the SVE architectural programmer's model. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Alex Bennée <alex.bennee@linaro.org> [will: added #include to fix allnoconfig build] [will: use enable_daif in do_sve_acc] Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
由 Dave Martin 提交于
Manipulating the SVE architectural state, including the vector and predicate registers, first-fault register and the vector length, requires the use of dedicated instructions added by SVE. This patch adds suitable assembly functions for saving and restoring the SVE registers and querying the vector length. Setting of the vector length is done as part of register restore. Since people building kernels may not all get an SVE-enabled toolchain for a while, this patch uses macros that generate explicit opcodes in place of assembler mnemonics. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NAlex Bennée <alex.bennee@linaro.org> Acked-by: NCatalin Marinas <catalin.marinas@arm.com> Signed-off-by: NWill Deacon <will.deacon@arm.com>
-
- 04 8月, 2017 2 次提交
-
-
由 Dave Martin 提交于
Support for kernel-mode NEON to be nested and/or used in hardirq context adds significant complexity, and the benefits may be marginal. In practice, kernel-mode NEON is not used in hardirq context, and is rarely used in softirq context (by certain mac80211 drivers). This patch implements an arm64 may_use_simd() function to allow clients to check whether kernel-mode NEON is usable in the current context, and simplifies kernel_neon_{begin,end}() to handle only saving of the task FPSIMD state (if any). Without nesting, there is no other state to save. The partial fpsimd save/restore functions become redundant as a result of these changes, so they are removed too. The save/restore model is changed to operate directly on task_struct without additional percpu storage. This simplifies the code and saves a bit of memory, but means that softirqs must now be disabled when manipulating the task fpsimd state from task context: correspondingly, preempt_{en,dis}sable() calls are upgraded to local_bh_{en,dis}able() as appropriate. fpsimd_thread_switch() already runs with hardirqs disabled and so is already protected from softirqs. These changes should make it easier to support kernel-mode NEON in the presence of the Scalable Vector extension in the future. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
由 Dave Martin 提交于
In order to be able to cope with kernel-mode NEON being unavailable in hardirq/nmi context and non-nestable, we need special handling for EFI runtime service calls that may be made during an interrupt that interrupted a kernel_neon_begin()..._end() block. This will occur if the kernel tries to write diagnostic data to EFI persistent storage during a panic triggered by an NMI for example. EFI runtime services specify an ABI that clobbers the FPSIMD state, rather than being able to use it optionally as an accelerator. This means that EFI is really a special case and can be handled specially. To enable EFI calls from interrupts, this patch creates dedicated __efi_fpsimd_{begin,end}() helpers solely for this purpose, which save/restore to a separate percpu buffer if called in a context where kernel_neon_begin() is not usable. Signed-off-by: NDave Martin <Dave.Martin@arm.com> Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 08 5月, 2014 3 次提交
-
-
由 Ard Biesheuvel 提交于
This patch modifies kernel_neon_begin() and kernel_neon_end(), so they may be called from any context. To address the case where only a couple of registers are needed, kernel_neon_begin_partial(u32) is introduced which takes as a parameter the number of bottom 'n' NEON q-registers required. To mark the end of such a partial section, the regular kernel_neon_end() should be used. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
-
由 Ard Biesheuvel 提交于
If a task gets scheduled out and back in again and nothing has touched its FPSIMD state in the mean time, there is really no reason to reload it from memory. Similarly, repeated calls to kernel_neon_begin() and kernel_neon_end() will preserve and restore the FPSIMD state every time. This patch defers the FPSIMD state restore to the last possible moment, i.e., right before the task returns to userland. If a task does not return to userland at all (for any reason), the existing FPSIMD state is preserved and may be reused by the owning task if it gets scheduled in again on the same CPU. This patch adds two more functions to abstract away from straight FPSIMD register file saves and restores: - fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded - fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
-
由 Ard Biesheuvel 提交于
There are two tacit assumptions in the FPSIMD handling code that will no longer hold after the next patch that optimizes away some FPSIMD state restores: . the FPSIMD registers of this CPU contain the userland FPSIMD state of task 'current'; . when switching to a task, its FPSIMD state will always be restored from memory. This patch adds the following functions to abstract away from straight FPSIMD register file saves and restores: - fpsimd_preserve_current_state -> ensure current's FPSIMD state is saved - fpsimd_update_current_state -> replace current's FPSIMD state Where necessary, the signal handling and fork code are updated to use the above wrappers instead of poking into the FPSIMD registers directly. Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
-
- 09 11月, 2012 1 次提交
-
-
由 Will Deacon 提交于
struct user_fp does not exist for arm64, so use struct user_fpsimd_state instead for the ELF core dumping definitions. Furthermore, since we use regset-based core dumping, we do not need definitions for dump_task_regs and dump_fpu. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 17 9月, 2012 1 次提交
-
-
由 Catalin Marinas 提交于
This patch adds support for FP/ASIMD register bank saving and restoring during context switch and FP exception handling to generate SIGFPE. There are 32 128-bit registers and the context switching is currently done non-lazily. Benchmarks on real hardware are required before implementing lazy FP state saving/restoring. Signed-off-by: NWill Deacon <will.deacon@arm.com> Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com> Acked-by: NTony Lindgren <tony@atomide.com> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NNicolas Pitre <nico@linaro.org> Acked-by: NOlof Johansson <olof@lixom.net> Acked-by: NSantosh Shilimkar <santosh.shilimkar@ti.com>
-