1. 12 7月, 2018 1 次提交
    • M
      arm64: move sve_user_{enable,disable} to <asm/fpsimd.h> · f9209e26
      Mark Rutland 提交于
      In subsequent patches, we'll want to make use of sve_user_enable() and
      sve_user_disable() outside of kernel/fpsimd.c. Let's move these to
      <asm/fpsimd.h> where we can make use of them.
      
      To avoid ifdeffery in sequences like:
      
      if (system_supports_sve() && some_condition)
      	sve_user_disable();
      
      ... empty stubs are provided when support for SVE is not enabled. Note
      that system_supports_sve() contains as IS_ENABLED(CONFIG_ARM64_SVE), so
      the sve_user_disable() call should be optimized away entirely when
      CONFIG_ARM64_SVE is not selected.
      
      To ensure that this is the case, the stub definitions contain a
      BUILD_BUG(), as we do for other stubs for which calls should always be
      optimized away when the relevant config option is not selected.
      
      At the same time, the include list of <asm/fpsimd.h> is sorted while
      adding <asm/sysreg.h>.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NDave Martin <dave.martin@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f9209e26
  2. 25 5月, 2018 3 次提交
    • D
      arm64/sve: Move sve_pffr() to fpsimd.h and make inline · 9a6e5948
      Dave Martin 提交于
      In order to make sve_save_state()/sve_load_state() more easily
      reusable and to get rid of a potential branch on context switch
      critical paths, this patch makes sve_pffr() inline and moves it to
      fpsimd.h.
      
      <asm/processor.h> must be included in fpsimd.h in order to make
      this work, and this creates an #include cycle that is tricky to
      avoid without modifying core code, due to the way the PR_SVE_*()
      prctl helpers are included in the core prctl implementation.
      
      Instead of breaking the cycle, this patch defers inclusion of
      <asm/fpsimd.h> in <asm/processor.h> until the point where it is
      actually needed: i.e., immediately before the prctl definitions.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      9a6e5948
    • D
      arm64/sve: Move read_zcr_features() out of cpufeature.h · 31dc52b3
      Dave Martin 提交于
      Having read_zcr_features() inline in cpufeature.h results in that
      header requiring #includes which make it hard to include
      <asm/fpsimd.h> elsewhere without triggering header inclusion
      cycles.
      
      This is not a hot-path function and arguably should not be in
      cpufeature.h in the first place, so this patch moves it to
      fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y.
      
      This allows some SVE-related #includes to be dropped from
      cpufeature.h, which will ease future maintenance.
      
      A couple of missing #includes of <asm/fpsimd.h> are exposed by this
      change under arch/arm64/.  This patch adds the missing #includes as
      necessary.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      31dc52b3
    • D
      KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing · e6b673b7
      Dave Martin 提交于
      This patch refactors KVM to align the host and guest FPSIMD
      save/restore logic with each other for arm64.  This reduces the
      number of redundant save/restore operations that must occur, and
      reduces the common-case IRQ blackout time during guest exit storms
      by saving the host state lazily and optimising away the need to
      restore the host state before returning to the run loop.
      
      Four hooks are defined in order to enable this:
      
       * kvm_arch_vcpu_run_map_fp():
         Called on PID change to map necessary bits of current to Hyp.
      
       * kvm_arch_vcpu_load_fp():
         Set up FP/SIMD for entering the KVM run loop (parse as
         "vcpu_load fp").
      
       * kvm_arch_vcpu_ctxsync_fp():
         Get FP/SIMD into a safe state for re-enabling interrupts after a
         guest exit back to the run loop.
      
         For arm64 specifically, this involves updating the host kernel's
         FPSIMD context tracking metadata so that kernel-mode NEON use
         will cause the vcpu's FPSIMD state to be saved back correctly
         into the vcpu struct.  This must be done before re-enabling
         interrupts because kernel-mode NEON may be used by softirqs.
      
       * kvm_arch_vcpu_put_fp():
         Save guest FP/SIMD state back to memory and dissociate from the
         CPU ("vcpu_put fp").
      
      Also, the arm64 FPSIMD context switch code is updated to enable it
      to save back FPSIMD state for a vcpu, not just current.  A few
      helpers drive this:
      
       * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
         mark this CPU as having context fp (which may belong to a vcpu)
         currently loaded in its registers.  This is the non-task
         equivalent of the static function fpsimd_bind_to_cpu() in
         fpsimd.c.
      
       * task_fpsimd_save():
         exported to allow KVM to save the guest's FPSIMD state back to
         memory on exit from the run loop.
      
       * fpsimd_flush_state():
         invalidate any context's FPSIMD state that is currently loaded.
         Used to disassociate the vcpu from the CPU regs on run loop exit.
      
      These changes allow the run loop to enable interrupts (and thus
      softirqs that may use kernel-mode NEON) without having to save the
      guest's FPSIMD state eagerly.
      
      Some new vcpu_arch fields are added to make all this work.  Because
      host FPSIMD state can now be saved back directly into current's
      thread_struct as appropriate, host_cpu_context is no longer used
      for preserving the FPSIMD state.  However, it is still needed for
      preserving other things such as the host's system registers.  To
      avoid ABI churn, the redundant storage space in host_cpu_context is
      not removed for now.
      
      arch/arm is not addressed by this patch and continues to use its
      current save/restore logic.  It could provide implementations of
      the helpers later if desired.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      e6b673b7
  3. 28 3月, 2018 1 次提交
    • D
      arm64: fpsimd: Split cpu field out from struct fpsimd_state · 20b85472
      Dave Martin 提交于
      In preparation for using a common representation of the FPSIMD
      state for tasks and KVM vcpus, this patch separates out the "cpu"
      field that is used to track the cpu on which the state was most
      recently loaded.
      
      This will allow common code to operate on task and vcpu contexts
      without requiring the cpu field to be stored at the same offset
      from the FPSIMD register data in both cases.  This should avoid the
      need for messing with the definition of those parts of struct
      vcpu_arch that are exposed in the KVM user ABI.
      
      The resulting change is also convenient for grouping and defining
      the set of thread_struct fields that are supposed to be accessible
      to copy_{to,from}_user(), which includes user_fpsimd_state but
      should exclude the cpu field.  This patch does not amend the
      usercopy whitelist to match: that will be addressed in a subsequent
      patch.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      [will: inline fpsimd_flush_state for now]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      20b85472
  4. 27 3月, 2018 2 次提交
  5. 16 1月, 2018 1 次提交
    • D
      arm64: fpsimd: Fix state leakage when migrating after sigreturn · 0abdeff5
      Dave Martin 提交于
      When refactoring the sigreturn code to handle SVE, I changed the
      sigreturn implementation to store the new FPSIMD state from the
      user sigframe into task_struct before reloading the state into the
      CPU regs.  This makes it easier to convert the data for SVE when
      needed.
      
      However, it turns out that the fpsimd_state structure passed into
      fpsimd_update_current_state is not fully initialised, so assigning
      the structure as a whole corrupts current->thread.fpsimd_state.cpu
      with uninitialised data.
      
      This means that if the garbage data written to .cpu happens to be a
      valid cpu number, and the task is subsequently migrated to the cpu
      identified by the that number, and then tries to enter userspace,
      the CPU FPSIMD regs will be assumed to be correct for the task and
      not reloaded as they should be.  This can result in returning to
      userspace with the FPSIMD registers containing data that is stale or
      that belongs to another task or to the kernel.
      
      Knowingly handing around a kernel structure that is incompletely
      initialised with user data is a potential source of mistakes,
      especially across source file boundaries.  To help avoid a repeat
      of this issue, this patch adapts the relevant internal API to hand
      around the user-accessible subset only: struct user_fpsimd_state.
      
      To avoid future surprises, this patch also converts all uses of
      struct fpsimd_state that really only access the user subset, to use
      struct user_fpsimd_state.  A few missing consts are added to
      function prototypes for good measure.
      
      Thanks to Will for spotting the cause of the bug here.
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      0abdeff5
  6. 03 11月, 2017 8 次提交
    • D
      arm64/sve: KVM: Prevent guests from using SVE · 17eed27b
      Dave Martin 提交于
      Until KVM has full SVE support, guests must not be allowed to
      execute SVE instructions.
      
      This patch enables the necessary traps, and also ensures that the
      traps are disabled again on exit from the guest so that the host
      can still use SVE if it wants to.
      
      On guest exit, high bits of the SVE Zn registers may have been
      clobbered as a side-effect the execution of FPSIMD instructions in
      the guest.  The existing KVM host FPSIMD restore code is not
      sufficient to restore these bits, so this patch explicitly marks
      the CPU as not containing cached vector state for any task, thus
      forcing a reload on the next return to userspace.  This is an
      interim measure, in advance of adding full SVE awareness to KVM.
      
      This marking of cached vector state in the CPU as invalid is done
      using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
      to the repeated use of this rather obscure operation, it makes
      sense to factor it out as a separate helper with a clearer name.
      This patch factors it out as fpsimd_flush_cpu_state(), and ports
      all callers to use it.
      
      As a side effect of this refactoring, a this_cpu_write() in
      fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
      should be fine, since cpu_pm_enter() is supposed to be called only
      with interrupts disabled.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      17eed27b
    • D
      arm64/sve: Add prctl controls for userspace vector length management · 2d2123bc
      Dave Martin 提交于
      This patch adds two arm64-specific prctls, to permit userspace to
      control its vector length:
      
       * PR_SVE_SET_VL: set the thread's SVE vector length and vector
         length inheritance mode.
      
       * PR_SVE_GET_VL: get the same information.
      
      Although these prctls resemble instruction set features in the SVE
      architecture, they provide additional control: the vector length
      inheritance mode is Linux-specific and nothing to do with the
      architecture, and the architecture does not permit EL0 to set its
      own vector length directly.  Both can be used in portable tools
      without requiring the use of SVE instructions.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: Fixed up prctl constants to avoid clash with PDEATHSIG]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2d2123bc
    • D
      arm64/sve: ptrace and ELF coredump support · 43d4da2c
      Dave Martin 提交于
      This patch defines and implements a new regset NT_ARM_SVE, which
      describes a thread's SVE register state.  This allows a debugger to
      manipulate the SVE state, as well as being included in ELF
      coredumps for post-mortem debugging.
      
      Because the regset size and layout are dependent on the thread's
      current vector length, it is not possible to define a C struct to
      describe the regset contents as is done for existing regsets.
      Instead, and for the same reasons, NT_ARM_SVE is based on the
      freeform variable-layout approach used for the SVE signal frame.
      
      Additionally, to reduce debug overhead when debugging threads that
      might or might not have live SVE register state, NT_ARM_SVE may be
      presented in one of two different formats: the old struct
      user_fpsimd_state format is embedded for describing the state of a
      thread with no live SVE state, whereas a new variable-layout
      structure is embedded for describing live SVE state.  This avoids a
      debugger needing to poll NT_PRFPREG in addition to NT_ARM_SVE, and
      allows existing userspace code to handle the non-SVE case without
      too much modification.
      
      For this to work, NT_ARM_SVE is defined with a fixed-format header
      of type struct user_sve_header, which the recipient can use to
      figure out the content, size and layout of the reset of the regset.
      Accessor macros are defined to allow the vector-length-dependent
      parts of the regset to be manipulated.
      Signed-off-by: NAlan Hayward <alan.hayward@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Cc: Okamoto Takayuki <tokamoto@jp.fujitsu.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      43d4da2c
    • D
      arm64/sve: Probe SVE capabilities and usable vector lengths · 2e0f2478
      Dave Martin 提交于
      This patch uses the cpufeatures framework to determine common SVE
      capabilities and vector lengths, and configures the runtime SVE
      support code appropriately.
      
      ZCR_ELx is not really a feature register, but it is convenient to
      use it as a template for recording the maximum vector length
      supported by a CPU, using the LEN field.  This field is similar to
      a feature field in that it is a contiguous bitfield for which we
      want to determine the minimum system-wide value.  This patch adds
      ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
      custom code to populate it.  Finding the minimum supported value of
      the LEN field is left to the cpufeatures framework in the usual
      way.
      
      The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
      so for now we just require it to be zero.
      
      Note that much of this code is dormant and SVE still won't be used
      yet, since system_supports_sve() remains hardwired to false.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2e0f2478
    • D
      arm64/sve: Backend logic for setting the vector length · 7582e220
      Dave Martin 提交于
      This patch implements the core logic for changing a task's vector
      length on request from userspace.  This will be used by the ptrace
      and prctl frontends that are implemented in later patches.
      
      The SVE architecture permits, but does not require, implementations
      to support vector lengths that are not a power of two.  To handle
      this, logic is added to check a requested vector length against a
      possibly sparse bitmap of available vector lengths at runtime, so
      that the best supported value can be chosen.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7582e220
    • D
      arm64/sve: Signal handling support · 8cd969d2
      Dave Martin 提交于
      This patch implements support for saving and restoring the SVE
      registers around signals.
      
      A fixed-size header struct sve_context is always included in the
      signal frame encoding the thread's vector length at the time of
      signal delivery, optionally followed by a variable-layout structure
      encoding the SVE registers.
      
      Because of the need to preserve backwards compatibility, the FPSIMD
      view of the SVE registers is always dumped as a struct
      fpsimd_context in the usual way, in addition to any sve_context.
      
      The SVE vector registers are dumped in full, including bits 127:0
      of each register which alias the corresponding FPSIMD vector
      registers in the hardware.  To avoid any ambiguity about which
      alias to restore during sigreturn, the kernel always restores bits
      127:0 of each SVE vector register from the fpsimd_context in the
      signal frame (which must be present): userspace needs to take this
      into account if it wants to modify the SVE vector register contents
      on return from a signal.
      
      FPSR and FPCR, which are used by both FPSIMD and SVE, are not
      included in sve_context because they are always present in
      fpsimd_context anyway.
      
      For signal delivery, a new helper
      fpsimd_signal_preserve_current_state() is added to update _both_
      the FPSIMD and SVE views in the task struct, to make it easier to
      populate this information into the signal frame.  Because of the
      redundancy between the two views of the state, only one is updated
      otherwise.
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8cd969d2
    • D
      arm64/sve: Core task context handling · bc0ee476
      Dave Martin 提交于
      This patch adds the core support for switching and managing the SVE
      architectural state of user tasks.
      
      Calls to the existing FPSIMD low-level save/restore functions are
      factored out as new functions task_fpsimd_{save,load}(), since SVE
      now dynamically may or may not need to be handled at these points
      depending on the kernel configuration, hardware features discovered
      at boot, and the runtime state of the task.  To make these
      decisions as fast as possible, const cpucaps are used where
      feasible, via the system_supports_sve() helper.
      
      The SVE registers are only tracked for threads that have explicitly
      used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
      FPSIMD view of the architectural state is stored in
      thread.fpsimd_state as usual.
      
      When in use, the SVE registers are not stored directly in
      thread_struct due to their potentially large and variable size.
      Because the task_struct slab allocator must be configured very
      early during kernel boot, it is also tricky to configure it
      correctly to match the maximum vector length provided by the
      hardware, since this depends on examining secondary CPUs as well as
      the primary.  Instead, a pointer sve_state in thread_struct points
      to a dynamically allocated buffer containing the SVE register data,
      and code is added to allocate and free this buffer at appropriate
      times.
      
      TIF_SVE is set when taking an SVE access trap from userspace, if
      suitable hardware support has been detected.  This enables SVE for
      the thread: a subsequent return to userspace will disable the trap
      accordingly.  If such a trap is taken without sufficient system-
      wide hardware support, SIGILL is sent to the thread instead as if
      an undefined instruction had been executed: this may happen if
      userspace tries to use SVE in a system where not all CPUs support
      it for example.
      
      The kernel will clear TIF_SVE and disable SVE for the thread
      whenever an explicit syscall is made by userspace.  For backwards
      compatibility reasons and conformance with the spirit of the base
      AArch64 procedure call standard, the subset of the SVE register
      state that aliases the FPSIMD registers is still preserved across a
      syscall even if this happens.  The remainder of the SVE register
      state logically becomes zero at syscall entry, though the actual
      zeroing work is currently deferred until the thread next tries to
      use SVE, causing another trap to the kernel.  This implementation
      is suboptimal: in the future, the fastpath case may be optimised
      to zero the registers in-place and leave SVE enabled for the task,
      where beneficial.
      
      TIF_SVE is also cleared in the following slowpath cases, which are
      taken as reasonable hints that the task may no longer use SVE:
       * exec
       * fork and clone
      
      Code is added to sync data between thread.fpsimd_state and
      thread.sve_state whenever enabling/disabling SVE, in a manner
      consistent with the SVE architectural programmer's model.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: added #include to fix allnoconfig build]
      [will: use enable_daif in do_sve_acc]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bc0ee476
    • D
      arm64/sve: Low-level SVE architectural state manipulation functions · 1fc5dce7
      Dave Martin 提交于
      Manipulating the SVE architectural state, including the vector and
      predicate registers, first-fault register and the vector length,
      requires the use of dedicated instructions added by SVE.
      
      This patch adds suitable assembly functions for saving and
      restoring the SVE registers and querying the vector length.
      Setting of the vector length is done as part of register restore.
      
      Since people building kernels may not all get an SVE-enabled
      toolchain for a while, this patch uses macros that generate
      explicit opcodes in place of assembler mnemonics.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1fc5dce7
  7. 04 8月, 2017 2 次提交
    • D
      arm64: neon: Remove support for nested or hardirq kernel-mode NEON · cb84d11e
      Dave Martin 提交于
      Support for kernel-mode NEON to be nested and/or used in hardirq
      context adds significant complexity, and the benefits may be
      marginal.  In practice, kernel-mode NEON is not used in hardirq
      context, and is rarely used in softirq context (by certain mac80211
      drivers).
      
      This patch implements an arm64 may_use_simd() function to allow
      clients to check whether kernel-mode NEON is usable in the current
      context, and simplifies kernel_neon_{begin,end}() to handle only
      saving of the task FPSIMD state (if any).  Without nesting, there
      is no other state to save.
      
      The partial fpsimd save/restore functions become redundant as a
      result of these changes, so they are removed too.
      
      The save/restore model is changed to operate directly on
      task_struct without additional percpu storage.  This simplifies the
      code and saves a bit of memory, but means that softirqs must now be
      disabled when manipulating the task fpsimd state from task context:
      correspondingly, preempt_{en,dis}sable() calls are upgraded to
      local_bh_{en,dis}able() as appropriate.  fpsimd_thread_switch()
      already runs with hardirqs disabled and so is already protected
      from softirqs.
      
      These changes should make it easier to support kernel-mode NEON in
      the presence of the Scalable Vector extension in the future.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      cb84d11e
    • D
      arm64: neon: Allow EFI runtime services to use FPSIMD in irq context · 4328825d
      Dave Martin 提交于
      In order to be able to cope with kernel-mode NEON being unavailable
      in hardirq/nmi context and non-nestable, we need special handling
      for EFI runtime service calls that may be made during an interrupt
      that interrupted a kernel_neon_begin()..._end() block.  This will
      occur if the kernel tries to write diagnostic data to EFI
      persistent storage during a panic triggered by an NMI for example.
      
      EFI runtime services specify an ABI that clobbers the FPSIMD state,
      rather than being able to use it optionally as an accelerator.
      This means that EFI is really a special case and can be handled
      specially.
      
      To enable EFI calls from interrupts, this patch creates dedicated
      __efi_fpsimd_{begin,end}() helpers solely for this purpose, which
      save/restore to a separate percpu buffer if called in a context
      where kernel_neon_begin() is not usable.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      4328825d
  8. 08 5月, 2014 3 次提交
    • A
      arm64: add support for kernel mode NEON in interrupt context · 190f1ca8
      Ard Biesheuvel 提交于
      This patch modifies kernel_neon_begin() and kernel_neon_end(), so
      they may be called from any context. To address the case where only
      a couple of registers are needed, kernel_neon_begin_partial(u32) is
      introduced which takes as a parameter the number of bottom 'n' NEON
      q-registers required. To mark the end of such a partial section, the
      regular kernel_neon_end() should be used.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      190f1ca8
    • A
      arm64: defer reloading a task's FPSIMD state to userland resume · 005f78cd
      Ard Biesheuvel 提交于
      If a task gets scheduled out and back in again and nothing has touched
      its FPSIMD state in the mean time, there is really no reason to reload
      it from memory. Similarly, repeated calls to kernel_neon_begin() and
      kernel_neon_end() will preserve and restore the FPSIMD state every time.
      
      This patch defers the FPSIMD state restore to the last possible moment,
      i.e., right before the task returns to userland. If a task does not return to
      userland at all (for any reason), the existing FPSIMD state is preserved
      and may be reused by the owning task if it gets scheduled in again on the
      same CPU.
      
      This patch adds two more functions to abstract away from straight FPSIMD
      register file saves and restores:
      - fpsimd_restore_current_state -> ensure current's FPSIMD state is loaded
      - fpsimd_flush_task_state -> invalidate live copies of a task's FPSIMD state
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      005f78cd
    • A
      arm64: add abstractions for FPSIMD state manipulation · c51f9269
      Ard Biesheuvel 提交于
      There are two tacit assumptions in the FPSIMD handling code that will no longer
      hold after the next patch that optimizes away some FPSIMD state restores:
      . the FPSIMD registers of this CPU contain the userland FPSIMD state of
        task 'current';
      . when switching to a task, its FPSIMD state will always be restored from
        memory.
      
      This patch adds the following functions to abstract away from straight FPSIMD
      register file saves and restores:
      - fpsimd_preserve_current_state -> ensure current's FPSIMD state is saved
      - fpsimd_update_current_state -> replace current's FPSIMD state
      
      Where necessary, the signal handling and fork code are updated to use the above
      wrappers instead of poking into the FPSIMD registers directly.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      c51f9269
  9. 09 11月, 2012 1 次提交
  10. 17 9月, 2012 1 次提交