1. 12 12月, 2017 1 次提交
    • S
      arm64: Add software workaround for Falkor erratum 1041 · 932b50c7
      Shanker Donthineni 提交于
      The ARM architecture defines the memory locations that are permitted
      to be accessed as the result of a speculative instruction fetch from
      an exception level for which all stages of translation are disabled.
      Specifically, the core is permitted to speculatively fetch from the
      4KB region containing the current program counter 4K and next 4K.
      
      When translation is changed from enabled to disabled for the running
      exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
      Falkor core may errantly speculatively access memory locations outside
      of the 4KB region permitted by the architecture. The errant memory
      access may lead to one of the following unexpected behaviors.
      
      1) A System Error Interrupt (SEI) being raised by the Falkor core due
         to the errant memory access attempting to access a region of memory
         that is protected by a slave-side memory protection unit.
      2) Unpredictable device behavior due to a speculative read from device
         memory. This behavior may only occur if the instruction cache is
         disabled prior to or coincident with translation being changed from
         enabled to disabled.
      
      The conditions leading to this erratum will not occur when either of the
      following occur:
       1) A higher exception level disables translation of a lower exception level
         (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
       2) An exception level disabling its stage-1 translation if its stage-2
          translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
          to 0 when HCR_EL2[VM] has a value of 1).
      
      To avoid the errant behavior, software must execute an ISB immediately
      prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.
      Signed-off-by: NShanker Donthineni <shankerd@codeaurora.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      932b50c7
  2. 07 12月, 2017 3 次提交
    • D
      arm64/sve: Avoid dereference of dead task_struct in KVM guest entry · cb968afc
      Dave Martin 提交于
      When deciding whether to invalidate FPSIMD state cached in the cpu,
      the backend function sve_flush_cpu_state() attempts to dereference
      __this_cpu_read(fpsimd_last_state).  However, this is not safe:
      there is no guarantee that this task_struct pointer is still valid,
      because the task could have exited in the meantime.
      
      This means that we need another means to get the appropriate value
      of TIF_SVE for the associated task.
      
      This patch solves this issue by adding a cached copy of the TIF_SVE
      flag in fpsimd_last_state, which we can check without dereferencing
      the task pointer.
      
      In particular, although this patch is not a KVM fix per se, this
      means that this check is now done safely in the KVM world switch
      path (which is currently the only user of this code).
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      cb968afc
    • D
      arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu. · 8884b7bd
      Dave Martin 提交于
      There is currently some duplicate logic to associate current's
      FPSIMD context with the cpu when loading FPSIMD state into the cpu
      regs.
      
      Subsequent patches will update that logic, so in order to ensure it
      only needs to be done in one place, this patch factors the relevant
      code out into a new function fpsimd_bind_to_cpu().
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8884b7bd
    • D
      arm64: fpsimd: Prevent registers leaking from dead tasks · 071b6d4a
      Dave Martin 提交于
      Currently, loading of a task's fpsimd state into the CPU registers
      is skipped if that task's state is already present in the registers
      of that CPU.
      
      However, the code relies on the struct fpsimd_state * (and by
      extension struct task_struct *) to unambiguously identify a task.
      
      There is a particular case in which this doesn't work reliably:
      when a task exits, its task_struct may be recycled to describe a
      new task.
      
      Consider the following scenario:
      
       1) Task P loads its fpsimd state onto cpu C.
              per_cpu(fpsimd_last_state, C) := P;
              P->thread.fpsimd_state.cpu := C;
      
       2) Task X is scheduled onto C and loads its fpsimd state on C.
              per_cpu(fpsimd_last_state, C) := X;
              X->thread.fpsimd_state.cpu := C;
      
       3) X exits, causing X's task_struct to be freed.
      
       4) P forks a new child T, which obtains X's recycled task_struct.
      	T == X.
      	T->thread.fpsimd_state.cpu == C (inherited from P).
      
       5) T is scheduled on C.
      	T's fpsimd state is not loaded, because
      	per_cpu(fpsimd_last_state, C) == T (== X) &&
      	T->thread.fpsimd_state.cpu == C.
      
              (This is the check performed by fpsimd_thread_switch().)
      
      So, T gets X's registers because the last registers loaded onto C
      were those of X, in (2).
      
      This patch fixes the problem by ensuring that the sched-in check
      fails in (5): fpsimd_flush_task_state(T) is called when T is
      forked, so that T->thread.fpsimd_state.cpu == C cannot be true.
      This relies on the fact that T is not schedulable until after
      copy_thread() completes.
      
      Once T's fpsimd state has been loaded on some CPU C there may still
      be other cpus D for which per_cpu(fpsimd_last_state, D) ==
      &X->thread.fpsimd_state.  But D is necessarily != C in this case,
      and the check in (5) must fail.
      
      An alternative fix would be to do refcounting on task_struct.  This
      would result in each CPU holding a reference to the last task whose
      fpsimd state was loaded there.  It's not clear whether this is
      preferable, and it involves higher overhead than the fix proposed
      in this patch.  It would also move all the task_struct freeing
      work into the context switch critical section, or otherwise some
      deferred cleanup mechanism would need to be introduced, neither of
      which seems obviously justified.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 005f78cd ("arm64: defer reloading a task's FPSIMD state to userland resume")
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [will: word-smithed the comment so it makes more sense]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      071b6d4a
  3. 01 12月, 2017 5 次提交
    • Y
      arm64: cpu_ops: Add missing 'const' qualifiers · 770ba060
      Yury Norov 提交于
      Building the kernel with an LTO-enabled GCC spits out the following "const"
      warning for the cpu_ops code:
      
        mm/percpu.c:2168:20: error: pcpu_fc_names causes a section type conflict
        with dt_supported_cpu_ops
        const char * const pcpu_fc_names[PCPU_FC_NR] __initconst = {
                ^
        arch/arm64/kernel/cpu_ops.c:34:37: note: ‘dt_supported_cpu_ops’ was declared here
        static const struct cpu_operations *dt_supported_cpu_ops[] __initconst = {
      
      Fix it by adding missed const qualifiers.
      Signed-off-by: NYury Norov <ynorov@caviumnetworks.com>
      Reviewed-by: NNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      770ba060
    • X
      arm64: perf: remove unsupported events for Cortex-A73 · f8ada189
      Xu YiPing 提交于
      bus access read/write events are not supported in A73, based on the
      Cortex-A73 TRM r0p2, section 11.9 Events (pages 11-457 to 11-460).
      
      Fixes: 5561b6c5 "arm64: perf: add support for Cortex-A73"
      Acked-by: NJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: NXu YiPing <xuyiping@hisilicon.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      f8ada189
    • D
      arm64: fpsimd: Fix failure to restore FPSIMD state after signals · 9de52a75
      Dave Martin 提交于
      The fpsimd_update_current_state() function is responsible for
      loading the FPSIMD state from the user signal frame into the
      current task during sigreturn.  When implementing support for SVE,
      conditional code was added to this function in order to handle the
      case where SVE state need to be loaded for the task and merged with
      the FPSIMD data from the signal frame; however, the FPSIMD-only
      case was unintentionally dropped.
      
      As a result of this, sigreturn does not currently restore the
      FPSIMD state of the task, except in the case where the system
      supports SVE and the signal frame contains SVE state in addition to
      FPSIMD state.
      
      This patch fixes this bug by making the copy-in of the FPSIMD data
      from the signal frame to thread_struct unconditional.
      
      This remains a performance regression from v4.14, since the FPSIMD
      state is now copied into thread_struct and then loaded back,
      instead of _only_ being loaded into the CPU FPSIMD registers.
      However, it is essential to call task_fpsimd_load() here anyway in
      order to ensure that the SVE enable bit in CPACR_EL1 is set
      correctly before returning to userspace.  This could use some
      refactoring, but since sigreturn is not a fast path I have kept
      this patch as a pure fix and left the refactoring for later.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Fixes: 8cd969d2 ("arm64/sve: Signal handling support")
      Reported-by: NAlex Bennée <alex.bennee@linaro.org>
      Tested-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      9de52a75
    • A
      arm64: ftrace: emit ftrace-mod.o contents through code · be0f272b
      Ard Biesheuvel 提交于
      When building the arm64 kernel with both CONFIG_ARM64_MODULE_PLTS and
      CONFIG_DYNAMIC_FTRACE enabled, the ftrace-mod.o object file is built
      with the kernel and contains a trampoline that is linked into each
      module, so that modules can be loaded far away from the kernel and
      still reach the ftrace entry point in the core kernel with an ordinary
      relative branch, as is emitted by the compiler instrumentation code
      dynamic ftrace relies on.
      
      In order to be able to build out of tree modules, this object file
      needs to be included into the linux-headers or linux-devel packages,
      which is undesirable, as it makes arm64 a special case (although a
      precedent does exist for 32-bit PPC).
      
      Given that the trampoline essentially consists of a PLT entry, let's
      not bother with a source or object file for it, and simply patch it
      in whenever the trampoline is being populated, using the existing
      PLT support routines.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      be0f272b
    • A
      arm64: module-plts: factor out PLT generation code for ftrace · 7e8b9c1d
      Ard Biesheuvel 提交于
      To allow the ftrace trampoline code to reuse the PLT entry routines,
      factor it out and move it into asm/module.h.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7e8b9c1d
  4. 03 11月, 2017 22 次提交
    • D
      arm64/sve: Detect SVE and activate runtime support · 43994d82
      Dave Martin 提交于
      This patch enables detection of hardware SVE support via the
      cpufeatures framework, and reports its presence to the kernel and
      userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap
      respectively.
      
      Userspace can also detect SVE using ID_AA64PFR0_EL1, using the
      cpufeatures MRS emulation.
      
      When running on hardware that supports SVE, this enables runtime
      kernel support for SVE, and allows user tasks to execute SVE
      instructions and make of the of the SVE-specific user/kernel
      interface extensions implemented by this series.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      43994d82
    • D
      arm64/sve: KVM: Prevent guests from using SVE · 17eed27b
      Dave Martin 提交于
      Until KVM has full SVE support, guests must not be allowed to
      execute SVE instructions.
      
      This patch enables the necessary traps, and also ensures that the
      traps are disabled again on exit from the guest so that the host
      can still use SVE if it wants to.
      
      On guest exit, high bits of the SVE Zn registers may have been
      clobbered as a side-effect the execution of FPSIMD instructions in
      the guest.  The existing KVM host FPSIMD restore code is not
      sufficient to restore these bits, so this patch explicitly marks
      the CPU as not containing cached vector state for any task, thus
      forcing a reload on the next return to userspace.  This is an
      interim measure, in advance of adding full SVE awareness to KVM.
      
      This marking of cached vector state in the CPU as invalid is done
      using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
      to the repeated use of this rather obscure operation, it makes
      sense to factor it out as a separate helper with a clearer name.
      This patch factors it out as fpsimd_flush_cpu_state(), and ports
      all callers to use it.
      
      As a side effect of this refactoring, a this_cpu_write() in
      fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
      should be fine, since cpu_pm_enter() is supposed to be called only
      with interrupts disabled.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      17eed27b
    • D
      arm64/sve: Add sysctl to set the default vector length for new processes · 4ffa09a9
      Dave Martin 提交于
      Because of the effect of SVE on the size of the signal frame, the
      default vector length used for new processes involves a tradeoff
      between performance of SVE-enabled software on the one hand, and
      reliability of non-SVE-aware software on the other hand.
      
      For this reason, the best choice depends on the repertoire of
      userspace software in use and is thus best left up to distro
      maintainers, sysadmins and developers.
      
      If CONFIG_SYSCTL and CONFIG_PROC_SYSCTL are enabled, this patch
      exposes the default vector length in
      /proc/sys/abi/sve_default_vector_length, where boot scripts or the
      adventurous can poke it.
      
      In common with other arm64 ABI sysctls, this control is currently
      global: setting it requires CAP_SYS_ADMIN in the root user
      namespace, but the value set is effective for subsequent execs in
      all namespaces.  The control only affects _new_ processes, however:
      changing it does not affect the vector length of any existing
      process.
      
      The intended usage model is that if userspace is known to be fully
      SVE-tolerant (or a developer is curious to find out) then this
      parameter can be cranked up during system startup.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      4ffa09a9
    • D
      arm64/sve: Add prctl controls for userspace vector length management · 2d2123bc
      Dave Martin 提交于
      This patch adds two arm64-specific prctls, to permit userspace to
      control its vector length:
      
       * PR_SVE_SET_VL: set the thread's SVE vector length and vector
         length inheritance mode.
      
       * PR_SVE_GET_VL: get the same information.
      
      Although these prctls resemble instruction set features in the SVE
      architecture, they provide additional control: the vector length
      inheritance mode is Linux-specific and nothing to do with the
      architecture, and the architecture does not permit EL0 to set its
      own vector length directly.  Both can be used in portable tools
      without requiring the use of SVE instructions.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: Fixed up prctl constants to avoid clash with PDEATHSIG]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2d2123bc
    • D
      arm64/sve: ptrace and ELF coredump support · 43d4da2c
      Dave Martin 提交于
      This patch defines and implements a new regset NT_ARM_SVE, which
      describes a thread's SVE register state.  This allows a debugger to
      manipulate the SVE state, as well as being included in ELF
      coredumps for post-mortem debugging.
      
      Because the regset size and layout are dependent on the thread's
      current vector length, it is not possible to define a C struct to
      describe the regset contents as is done for existing regsets.
      Instead, and for the same reasons, NT_ARM_SVE is based on the
      freeform variable-layout approach used for the SVE signal frame.
      
      Additionally, to reduce debug overhead when debugging threads that
      might or might not have live SVE register state, NT_ARM_SVE may be
      presented in one of two different formats: the old struct
      user_fpsimd_state format is embedded for describing the state of a
      thread with no live SVE state, whereas a new variable-layout
      structure is embedded for describing live SVE state.  This avoids a
      debugger needing to poll NT_PRFPREG in addition to NT_ARM_SVE, and
      allows existing userspace code to handle the non-SVE case without
      too much modification.
      
      For this to work, NT_ARM_SVE is defined with a fixed-format header
      of type struct user_sve_header, which the recipient can use to
      figure out the content, size and layout of the reset of the regset.
      Accessor macros are defined to allow the vector-length-dependent
      parts of the regset to be manipulated.
      Signed-off-by: NAlan Hayward <alan.hayward@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Cc: Okamoto Takayuki <tokamoto@jp.fujitsu.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      43d4da2c
    • D
      arm64/sve: Preserve SVE registers around EFI runtime service calls · fdfa976c
      Dave Martin 提交于
      The EFI runtime services ABI allows EFI to make free use of the
      FPSIMD registers during EFI runtime service calls, subject to the
      callee-save requirements of the AArch64 procedure call standard.
      
      However, the SVE architecture allows upper bits of the SVE vector
      registers to be zeroed as a side-effect of FPSIMD V-register
      writes.  This means that the SVE vector registers must be saved in
      their entirety in order to avoid data loss: non-SVE-aware EFI
      implementations cannot restore them correctly.
      
      The non-IRQ case is already handled gracefully by
      kernel_neon_begin().  For the IRQ case, this patch allocates a
      suitable per-CPU stash buffer for the full SVE register state and
      uses it to preserve the affected registers around EFI calls.  It is
      currently unclear how the EFI runtime services ABI will be
      clarified with respect to SVE, so it safest to assume that the
      predicate registers and FFR must be saved and restored too.
      
      No attempt is made to restore the restore the vector length after
      a call, for now.  It is deemed rather insane for EFI to change it,
      and contemporary EFI implementations certainly won't.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      fdfa976c
    • D
      arm64/sve: Preserve SVE registers around kernel-mode NEON use · 1bd3f936
      Dave Martin 提交于
      Kernel-mode NEON will corrupt the SVE vector registers, due to the
      way they alias the FPSIMD vector registers in the hardware.
      
      This patch ensures that any live SVE register content for the task
      is saved by kernel_neon_begin().  The data will be restored in the
      usual way on return to userspace.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1bd3f936
    • D
      arm64/sve: Probe SVE capabilities and usable vector lengths · 2e0f2478
      Dave Martin 提交于
      This patch uses the cpufeatures framework to determine common SVE
      capabilities and vector lengths, and configures the runtime SVE
      support code appropriately.
      
      ZCR_ELx is not really a feature register, but it is convenient to
      use it as a template for recording the maximum vector length
      supported by a CPU, using the LEN field.  This field is similar to
      a feature field in that it is a contiguous bitfield for which we
      want to determine the minimum system-wide value.  This patch adds
      ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
      custom code to populate it.  Finding the minimum supported value of
      the LEN field is left to the cpufeatures framework in the usual
      way.
      
      The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
      so for now we just require it to be zero.
      
      Note that much of this code is dormant and SVE still won't be used
      yet, since system_supports_sve() remains hardwired to false.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2e0f2478
    • D
      arm64: cpufeature: Move sys_caps_initialised declarations · 8f1eec57
      Dave Martin 提交于
      update_cpu_features() currently cannot tell whether it is being
      called during early or late secondary boot.  This doesn't
      desperately matter for anything it currently does.
      
      However, SVE will need to know here whether the set of available
      vector lengths is known or still to be determined when booting a
      CPU, so that it can be updated appropriately.
      
      This patch simply moves the sys_caps_initialised stuff to the top
      of the file so that it can be used more widely.  There doesn't seem
      to be a more obvious place to put it.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8f1eec57
    • D
      arm64/sve: Backend logic for setting the vector length · 7582e220
      Dave Martin 提交于
      This patch implements the core logic for changing a task's vector
      length on request from userspace.  This will be used by the ptrace
      and prctl frontends that are implemented in later patches.
      
      The SVE architecture permits, but does not require, implementations
      to support vector lengths that are not a power of two.  To handle
      this, logic is added to check a requested vector length against a
      possibly sparse bitmap of available vector lengths at runtime, so
      that the best supported value can be chosen.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      7582e220
    • D
      arm64/sve: Signal handling support · 8cd969d2
      Dave Martin 提交于
      This patch implements support for saving and restoring the SVE
      registers around signals.
      
      A fixed-size header struct sve_context is always included in the
      signal frame encoding the thread's vector length at the time of
      signal delivery, optionally followed by a variable-layout structure
      encoding the SVE registers.
      
      Because of the need to preserve backwards compatibility, the FPSIMD
      view of the SVE registers is always dumped as a struct
      fpsimd_context in the usual way, in addition to any sve_context.
      
      The SVE vector registers are dumped in full, including bits 127:0
      of each register which alias the corresponding FPSIMD vector
      registers in the hardware.  To avoid any ambiguity about which
      alias to restore during sigreturn, the kernel always restores bits
      127:0 of each SVE vector register from the fpsimd_context in the
      signal frame (which must be present): userspace needs to take this
      into account if it wants to modify the SVE vector register contents
      on return from a signal.
      
      FPSR and FPCR, which are used by both FPSIMD and SVE, are not
      included in sve_context because they are always present in
      fpsimd_context anyway.
      
      For signal delivery, a new helper
      fpsimd_signal_preserve_current_state() is added to update _both_
      the FPSIMD and SVE views in the task struct, to make it easier to
      populate this information into the signal frame.  Because of the
      redundancy between the two views of the state, only one is updated
      otherwise.
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8cd969d2
    • D
      arm64/sve: Support vector length resetting for new processes · 79ab047c
      Dave Martin 提交于
      It's desirable to be able to reset the vector length to some sane
      default for new processes, since the new binary and its libraries
      may or may not be SVE-aware.
      
      This patch tracks the desired post-exec vector length (if any) in a
      new thread member sve_vl_onexec, and adds a new thread flag
      TIF_SVE_VL_INHERIT to control whether to inherit or reset the
      vector length.  Currently these are inactive.  Subsequent patches
      will provide the capability to configure them.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      79ab047c
    • D
      arm64/sve: Core task context handling · bc0ee476
      Dave Martin 提交于
      This patch adds the core support for switching and managing the SVE
      architectural state of user tasks.
      
      Calls to the existing FPSIMD low-level save/restore functions are
      factored out as new functions task_fpsimd_{save,load}(), since SVE
      now dynamically may or may not need to be handled at these points
      depending on the kernel configuration, hardware features discovered
      at boot, and the runtime state of the task.  To make these
      decisions as fast as possible, const cpucaps are used where
      feasible, via the system_supports_sve() helper.
      
      The SVE registers are only tracked for threads that have explicitly
      used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
      FPSIMD view of the architectural state is stored in
      thread.fpsimd_state as usual.
      
      When in use, the SVE registers are not stored directly in
      thread_struct due to their potentially large and variable size.
      Because the task_struct slab allocator must be configured very
      early during kernel boot, it is also tricky to configure it
      correctly to match the maximum vector length provided by the
      hardware, since this depends on examining secondary CPUs as well as
      the primary.  Instead, a pointer sve_state in thread_struct points
      to a dynamically allocated buffer containing the SVE register data,
      and code is added to allocate and free this buffer at appropriate
      times.
      
      TIF_SVE is set when taking an SVE access trap from userspace, if
      suitable hardware support has been detected.  This enables SVE for
      the thread: a subsequent return to userspace will disable the trap
      accordingly.  If such a trap is taken without sufficient system-
      wide hardware support, SIGILL is sent to the thread instead as if
      an undefined instruction had been executed: this may happen if
      userspace tries to use SVE in a system where not all CPUs support
      it for example.
      
      The kernel will clear TIF_SVE and disable SVE for the thread
      whenever an explicit syscall is made by userspace.  For backwards
      compatibility reasons and conformance with the spirit of the base
      AArch64 procedure call standard, the subset of the SVE register
      state that aliases the FPSIMD registers is still preserved across a
      syscall even if this happens.  The remainder of the SVE register
      state logically becomes zero at syscall entry, though the actual
      zeroing work is currently deferred until the thread next tries to
      use SVE, causing another trap to the kernel.  This implementation
      is suboptimal: in the future, the fastpath case may be optimised
      to zero the registers in-place and leave SVE enabled for the task,
      where beneficial.
      
      TIF_SVE is also cleared in the following slowpath cases, which are
      taken as reasonable hints that the task may no longer use SVE:
       * exec
       * fork and clone
      
      Code is added to sync data between thread.fpsimd_state and
      thread.sve_state whenever enabling/disabling SVE, in a manner
      consistent with the SVE architectural programmer's model.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: added #include to fix allnoconfig build]
      [will: use enable_daif in do_sve_acc]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bc0ee476
    • D
      arm64/sve: Low-level CPU setup · 22043a3c
      Dave Martin 提交于
      To enable the kernel to use SVE, SVE traps from EL1 to EL2 must be
      disabled.  To take maximum advantage of the hardware, the full
      available vector length also needs to be enabled for EL1 by
      programming ZCR_EL2.LEN.  (The kernel will program ZCR_EL1.LEN as
      required, but this cannot override the limit set by ZCR_EL2.)
      
      This patch makes the appropriate changes to the EL2 early setup
      code.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      22043a3c
    • D
      arm64/sve: Low-level SVE architectural state manipulation functions · 1fc5dce7
      Dave Martin 提交于
      Manipulating the SVE architectural state, including the vector and
      predicate registers, first-fault register and the vector length,
      requires the use of dedicated instructions added by SVE.
      
      This patch adds suitable assembly functions for saving and
      restoring the SVE registers and querying the vector length.
      Setting of the vector length is done as part of register restore.
      
      Since people building kernels may not all get an SVE-enabled
      toolchain for a while, this patch uses macros that generate
      explicit opcodes in place of assembler mnemonics.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1fc5dce7
    • D
      arm64/sve: System register and exception syndrome definitions · 67236564
      Dave Martin 提交于
      The SVE architecture adds some system registers, ID register fields
      and a dedicated ESR exception class.
      
      This patch adds the appropriate definitions that will be needed by
      the kernel.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      67236564
    • D
      arm64: fpsimd: Simplify uses of {set,clear}_ti_thread_flag() · 9cf5b54f
      Dave Martin 提交于
      The existing FPSIMD context switch code contains a couple of
      instances of {set,clear}_ti_thread(task_thread_info(task)).  Since
      there are thread flag manipulators that operate directly on
      task_struct, this verbosity isn't strictly needed.
      
      For consistency, this patch simplifies the affected calls.  This
      should have no impact on behaviour.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      9cf5b54f
    • D
      arm64: Port deprecated instruction emulation to new sysctl interface · 38b9aeb3
      Dave Martin 提交于
      Currently, armv8_deprected.c takes charge of the "abi" sysctl
      directory, which makes life difficult for other code that wants to
      register sysctls in the same directory.
      
      There is a "new" [1] sysctl registration interface that removes the
      need to define ctl_tables for parent directories explicitly, which
      is ideal here.
      
      This patch ports register_insn_emulation_sysctl() over to the
      register_sysctl() interface and removes the redundant ctl_table for
      "abi".
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      
      [1] fea478d4 (sysctl: Add register_sysctl for normal sysctl
      users)
      The commit message notes an intent to port users of the
      pre-existing interfaces over to register_sysctl(), though the
      number of users of the new interface currently appears negligible.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      38b9aeb3
    • D
      arm64: signal: Verify extra data is user-readable in sys_rt_sigreturn · abf73988
      Dave Martin 提交于
      Currently sys_rt_sigreturn() verifies that the base sigframe is
      readable, but no similar check is performed on the extra data to
      which an extra_context record points.
      
      This matters because the extra data will be read with the
      unprotected user accessors.  However, this is not a problem at
      present because the extra data base address is required to be
      exactly at the end of the base sigframe.  So, there would need to
      be a non-user-readable kernel address within about 59K
      (SIGFRAME_MAXSZ - sizeof(struct rt_sigframe)) of some address for
      which access_ok(VERIFY_READ) returns true, in order for sigreturn
      to be able to read kernel memory that should be inaccessible to the
      user task.  This is currently impossible due to the untranslatable
      address hole between the TTBR0 and TTBR1 address ranges.
      
      Disappearance of the hole between the TTBR0 and TTBR1 mapping
      ranges would require the VA size for TTBR0 and TTBR1 to grow to at
      least 55 bits, and either the disabling of tagged pointers for
      userspace or enabling of tagged pointers for kernel space; none of
      which is currently envisaged.
      
      Even so, it is wrong to use the unprotected user accessors without
      an accompanying access_ok() check.
      
      To avoid the potential for future surprises, this patch does an
      explicit access_ok() check on the extra data space when parsing an
      extra_context record.
      
      Fixes: 33f08261 ("arm64: signal: Allow expansion of the signal frame")
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      abf73988
    • D
      arm64: fpsimd: Correctly annotate exception helpers called from asm · 94ef7ecb
      Dave Martin 提交于
      A couple of FPSIMD exception handling functions that are called
      from entry.S are currently not annotated as such.
      
      This is not a big deal since asmlinkage does nothing on arm/arm64,
      but fixing the annotations is more consistent and may help avoid
      future surprises.
      
      This patch adds appropriate asmlinkage annotations for
      do_fpsimd_acc() and do_fpsimd_exc().
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      94ef7ecb
    • J
      arm64: Fix static use of function graph · d125bffc
      Julien Thierry 提交于
      Function graph does not work currently when CONFIG_DYNAMIC_TRACE is not
      set. This is because ftrace_function_trace is not always set to ftrace_stub
      when function_graph is in use.
      
      Do not skip checking of graph tracer functions when ftrace_function_trace
      is set.
      Signed-off-by: NJulien Thierry <julien.thierry@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Reviewed-by: NAKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d125bffc
    • M
      arm64: ensure __dump_instr() checks addr_limit · 7a7003b1
      Mark Rutland 提交于
      It's possible for a user to deliberately trigger __dump_instr with a
      chosen kernel address.
      
      Let's avoid problems resulting from this by using get_user() rather than
      __get_user(), ensuring that we don't erroneously access kernel memory.
      
      Where we use __dump_instr() on kernel text, we already switch to
      KERNEL_DS, so this shouldn't adversely affect those cases.
      
      Fixes: 60ffc30d ("arm64: Exception handling")
      Cc: stable@vger.kernel.org
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      7a7003b1
  5. 02 11月, 2017 9 次提交