1. 11 12月, 2018 3 次提交
    • S
      arm64: mm: introduce 52-bit userspace support · 67e7fdfc
      Steve Capper 提交于
      On arm64 there is optional support for a 52-bit virtual address space.
      To exploit this one has to be running with a 64KB page size and be
      running on hardware that supports this.
      
      For an arm64 kernel supporting a 48 bit VA with a 64KB page size,
      some changes are needed to support a 52-bit userspace:
       * TCR_EL1.T0SZ needs to be 12 instead of 16,
       * TASK_SIZE needs to reflect the new size.
      
      This patch implements the above when the support for 52-bit VAs is
      detected at early boot time.
      
      On arm64 userspace addresses translation is controlled by TTBR0_EL1. As
      well as userspace, TTBR0_EL1 controls:
       * The identity mapping,
       * EFI runtime code.
      
      It is possible to run a kernel with an identity mapping that has a
      larger VA size than userspace (and for this case __cpu_set_tcr_t0sz()
      would set TCR_EL1.T0SZ as appropriate). However, when the conditions for
      52-bit userspace are met; it is possible to keep TCR_EL1.T0SZ fixed at
      12. Thus in this patch, the TCR_EL1.T0SZ size changing logic is
      disabled.
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NSteve Capper <steve.capper@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      67e7fdfc
    • S
      arm64: mm: Define arch_get_mmap_end, arch_get_mmap_base · e5d99157
      Steve Capper 提交于
      Now that we have DEFAULT_MAP_WINDOW defined, we can arch_get_mmap_end
      and arch_get_mmap_base helpers to allow for high addresses in mmap.
      Signed-off-by: NSteve Capper <steve.capper@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e5d99157
    • S
      arm64: mm: Introduce DEFAULT_MAP_WINDOW · 363524d2
      Steve Capper 提交于
      We wish to introduce a 52-bit virtual address space for userspace but
      maintain compatibility with software that assumes the maximum VA space
      size is 48 bit.
      
      In order to achieve this, on 52-bit VA systems, we make mmap behave as
      if it were running on a 48-bit VA system (unless userspace explicitly
      requests a VA where addr[51:48] != 0).
      
      On a system running a 52-bit userspace we need TASK_SIZE to represent
      the 52-bit limit as it is used in various places to distinguish between
      kernelspace and userspace addresses.
      
      Thus we need a new limit for mmap, stack, ELF loader and EFI (which uses
      TTBR0) to represent the non-extended VA space.
      
      This patch introduces DEFAULT_MAP_WINDOW and DEFAULT_MAP_WINDOW_64 and
      switches the appropriate logic to use that instead of TASK_SIZE.
      Signed-off-by: NSteve Capper <steve.capper@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      363524d2
  2. 09 11月, 2018 1 次提交
  3. 31 10月, 2018 1 次提交
  4. 15 9月, 2018 2 次提交
  5. 26 7月, 2018 1 次提交
  6. 06 7月, 2018 1 次提交
    • M
      arm64: use PSR_AA32 definitions · d64567f6
      Mark Rutland 提交于
      Some code cares about the SPSR_ELx format for exceptions taken from
      AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
      the SPSR_ELx format, and not in the AArch32 PSR format.
      
      To separate these from cases where we care about the AArch32 PSR format,
      migrate these cases to use the PSR_AA32_* definitions rather than
      COMPAT_PSR_*.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      d64567f6
  7. 01 6月, 2018 1 次提交
    • D
      arm64: signal: Report signal frame size to userspace via auxv · 94b07c1f
      Dave Martin 提交于
      Stateful CPU architecture extensions may require the signal frame
      to grow to a size that exceeds the arch's MINSIGSTKSZ #define.
      However, changing this #define is an ABI break.
      
      To allow userspace the option of determining the signal frame size
      in a more forwards-compatible way, this patch adds a new auxv entry
      tagged with AT_MINSIGSTKSZ, which provides the maximum signal frame
      size that the process can observe during its lifetime.
      
      If AT_MINSIGSTKSZ is absent from the aux vector, the caller can
      assume that the MINSIGSTKSZ #define is sufficient.  This allows for
      a consistent interface with older kernels that do not provide
      AT_MINSIGSTKSZ.
      
      The idea is that libc could expose this via sysconf() or some
      similar mechanism.
      
      There is deliberately no AT_SIGSTKSZ.  The kernel knows nothing
      about userspace's own stack overheads and should not pretend to
      know.
      
      For arm64:
      
      The primary motivation for this interface is the Scalable Vector
      Extension, which can require at least 4KB or so of extra space
      in the signal frame for the largest hardware implementations.
      
      To determine the correct value, a "Christmas tree" mode (via the
      add_all argument) is added to setup_sigframe_layout(), to simulate
      addition of all possible records to the signal frame at maximum
      possible size.
      
      If this procedure goes wrong somehow, resulting in a stupidly large
      frame layout and hence failure of sigframe_alloc() to allocate a
      record to the frame, then this is indicative of a kernel bug.  In
      this case, we WARN() and no attempt is made to populate
      AT_MINSIGSTKSZ for userspace.
      
      For arm64 SVE:
      
      The SVE context block in the signal frame needs to be considered
      too when computing the maximum possible signal frame size.
      
      Because the size of this block depends on the vector length, this
      patch computes the size based not on the thread's current vector
      length but instead on the maximum possible vector length: this
      determines the maximum size of SVE context block that can be
      observed in any signal frame for the lifetime of the process.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      94b07c1f
  8. 25 5月, 2018 3 次提交
    • D
      arm64/sve: Move sve_pffr() to fpsimd.h and make inline · 9a6e5948
      Dave Martin 提交于
      In order to make sve_save_state()/sve_load_state() more easily
      reusable and to get rid of a potential branch on context switch
      critical paths, this patch makes sve_pffr() inline and moves it to
      fpsimd.h.
      
      <asm/processor.h> must be included in fpsimd.h in order to make
      this work, and this creates an #include cycle that is tricky to
      avoid without modifying core code, due to the way the PR_SVE_*()
      prctl helpers are included in the core prctl implementation.
      
      Instead of breaking the cycle, this patch defers inclusion of
      <asm/fpsimd.h> in <asm/processor.h> until the point where it is
      actually needed: i.e., immediately before the prctl definitions.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      9a6e5948
    • D
      arm64/sve: Move read_zcr_features() out of cpufeature.h · 31dc52b3
      Dave Martin 提交于
      Having read_zcr_features() inline in cpufeature.h results in that
      header requiring #includes which make it hard to include
      <asm/fpsimd.h> elsewhere without triggering header inclusion
      cycles.
      
      This is not a hot-path function and arguably should not be in
      cpufeature.h in the first place, so this patch moves it to
      fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y.
      
      This allows some SVE-related #includes to be dropped from
      cpufeature.h, which will ease future maintenance.
      
      A couple of missing #includes of <asm/fpsimd.h> are exposed by this
      change under arch/arm64/.  This patch adds the missing #includes as
      necessary.
      
      No functional change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      31dc52b3
    • D
      arm64: fpsimd: Eliminate task->mm checks · df3fb968
      Dave Martin 提交于
      Currently the FPSIMD handling code uses the condition task->mm ==
      NULL as a hint that task has no FPSIMD register context.
      
      The ->mm check is only there to filter out tasks that cannot
      possibly have FPSIMD context loaded, for optimisation purposes.
      Also, TIF_FOREIGN_FPSTATE must always be checked anyway before
      saving FPSIMD context back to memory.  For these reasons, the ->mm
      checks are not useful, providing that TIF_FOREIGN_FPSTATE is
      maintained in a consistent way for all threads.
      
      The context switch logic is already deliberately optimised to defer
      reloads of the regs until ret_to_user (or sigreturn as a special
      case), and save them only if they have been previously loaded.
      These paths are the only places where the wrong_task and wrong_cpu
      conditions can be made false, by calling fpsimd_bind_task_to_cpu().
      Kernel threads by definition never reach these paths.  As a result,
      the wrong_task and wrong_cpu tests in fpsimd_thread_switch() will
      always yield true for kernel threads.
      
      This patch removes the redundant checks and special-case code,
      ensuring that TIF_FOREIGN_FPSTATE is set whenever a kernel thread
      is scheduled in, and ensures that this flag is set for the init
      task.  The fpsimd_flush_task_state() call already present in
      copy_thread() ensures the same for any new task.
      
      With TIF_FOREIGN_FPSTATE always set for kernel threads, this patch
      ensures that no extra context save work is added for kernel
      threads, and eliminates the redundant context saving that may
      currently occur for kernel threads that have acquired an mm via
      use_mm().
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NChristoffer Dall <christoffer.dall@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      df3fb968
  9. 28 3月, 2018 2 次提交
    • D
      arm64: uaccess: Fix omissions from usercopy whitelist · 65896545
      Dave Martin 提交于
      When the hardend usercopy support was added for arm64, it was
      concluded that all cases of usercopy into and out of thread_struct
      were statically sized and so didn't require explicit whitelisting
      of the appropriate fields in thread_struct.
      
      Testing with usercopy hardening enabled has revealed that this is
      not the case for certain ptrace regset manipulation calls on arm64.
      This occurs because the sizes of usercopies associated with the
      regset API are dynamic by construction, and because arm64 does not
      always stage such copies via the stack: indeed the regset API is
      designed to avoid the need for that by adding some bounds checking.
      
      This is currently believed to affect only the fpsimd and TLS
      registers.
      
      Because the whitelisted fields in thread_struct must be contiguous,
      this patch groups them together in a nested struct.  It is also
      necessary to be able to determine the location and size of that
      struct, so rather than making the struct anonymous (which would
      save on edits elsewhere) or adding an anonymous union containing
      named and unnamed instances of the same struct (gross), this patch
      gives the struct a name and makes the necessary edits to code that
      references it (noisy but simple).
      
      Care is needed to ensure that the new struct does not contain
      padding (which the usercopy hardening would fail to protect).
      
      For this reason, the presence of tp2_value is made unconditional,
      since a padding field would be needed there in any case.  This pads
      up to the 16-byte alignment required by struct user_fpsimd_state.
      Acked-by: NKees Cook <keescook@chromium.org>
      Reported-by: NMark Rutland <mark.rutland@arm.com>
      Fixes: 9e8084d3 ("arm64: Implement thread_struct whitelist for hardened usercopy")
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      65896545
    • D
      arm64: fpsimd: Split cpu field out from struct fpsimd_state · 20b85472
      Dave Martin 提交于
      In preparation for using a common representation of the FPSIMD
      state for tasks and KVM vcpus, this patch separates out the "cpu"
      field that is used to track the cpu on which the state was most
      recently loaded.
      
      This will allow common code to operate on task and vcpu contexts
      without requiring the cpu field to be stored at the same offset
      from the FPSIMD register data in both cases.  This should avoid the
      need for messing with the definition of those parts of struct
      vcpu_arch that are exposed in the KVM user ABI.
      
      The resulting change is also convenient for grouping and defining
      the set of thread_struct fields that are supposed to be accessible
      to copy_{to,from}_user(), which includes user_fpsimd_state but
      should exclude the cpu field.  This patch does not amend the
      usercopy whitelist to match: that will be addressed in a subsequent
      patch.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      [will: inline fpsimd_flush_state for now]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      20b85472
  10. 27 3月, 2018 1 次提交
    • D
      arm64: capabilities: Update prototype for enable call back · c0cda3b8
      Dave Martin 提交于
      We issue the enable() call back for all CPU hwcaps capabilities
      available on the system, on all the CPUs. So far we have ignored
      the argument passed to the call back, which had a prototype to
      accept a "void *" for use with on_each_cpu() and later with
      stop_machine(). However, with commit 0a0d111d
      ("arm64: cpufeature: Pass capability structure to ->enable callback"),
      there are some users of the argument who wants the matching capability
      struct pointer where there are multiple matching criteria for a single
      capability. Clean up the declaration of the call back to make it clear.
      
       1) Renamed to cpu_enable(), to imply taking necessary actions on the
          called CPU for the entry.
       2) Pass const pointer to the capability, to allow the call back to
          check the entry. (e.,g to check if any action is needed on the CPU)
       3) We don't care about the result of the call back, turning this to
          a void.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Reviewed-by: NJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: NDave Martin <dave.martin@arm.com>
      [suzuki: convert more users, rename call back and drop results]
      Signed-off-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      c0cda3b8
  11. 07 2月, 2018 1 次提交
    • R
      arm64: Make USER_DS an inclusive limit · 51369e39
      Robin Murphy 提交于
      Currently, USER_DS represents an exclusive limit while KERNEL_DS is
      inclusive. In order to do some clever trickery for speculation-safe
      masking, we need them both to behave equivalently - there aren't enough
      bits to make KERNEL_DS exclusive, so we have precisely one option. This
      also happens to correct a longstanding false negative for a range
      ending on the very top byte of kernel memory.
      
      Mark Rutland points out that we've actually got the semantics of
      addresses vs. segments muddled up in most of the places we need to
      amend, so shuffle the {USER,KERNEL}_DS definitions around such that we
      can correct those properly instead of just pasting "-1"s everywhere.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      51369e39
  12. 16 1月, 2018 2 次提交
    • J
      arm64: kernel: Prepare for a DISR user · 68ddbf09
      James Morse 提交于
      KVM would like to consume any pending SError (or RAS error) after guest
      exit. Today it has to unmask SError and use dsb+isb to synchronise the
      CPU. With the RAS extensions we can use ESB to synchronise any pending
      SError.
      
      Add the necessary macros to allow DISR to be read and converted to an
      ESR.
      
      We clear the DISR register when we enable the RAS cpufeature, and the
      kernel has not executed any ESB instructions. Any value we find in DISR
      must have belonged to firmware. Executing an ESB instruction is the
      only way to update DISR, so we can expect firmware to have handled
      any deferred SError. By the same logic we clear DISR in the idle path.
      Reviewed-by: NSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      68ddbf09
    • K
      arm64: Implement thread_struct whitelist for hardened usercopy · 9e8084d3
      Kees Cook 提交于
      While ARM64 carries FPU state in the thread structure that is saved and
      restored during signal handling, it doesn't need to declare a usercopy
      whitelist, since existing accessors are all either using a bounce buffer
      (for which whitelisting isn't checking the slab), are statically sized
      (which will bypass the hardened usercopy check), or both.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: Dave Martin <Dave.Martin@arm.com>
      Cc: zijun_hu <zijun_hu@htc.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: NKees Cook <keescook@chromium.org>
      9e8084d3
  13. 03 11月, 2017 3 次提交
    • D
      arm64/sve: Add prctl controls for userspace vector length management · 2d2123bc
      Dave Martin 提交于
      This patch adds two arm64-specific prctls, to permit userspace to
      control its vector length:
      
       * PR_SVE_SET_VL: set the thread's SVE vector length and vector
         length inheritance mode.
      
       * PR_SVE_GET_VL: get the same information.
      
      Although these prctls resemble instruction set features in the SVE
      architecture, they provide additional control: the vector length
      inheritance mode is Linux-specific and nothing to do with the
      architecture, and the architecture does not permit EL0 to set its
      own vector length directly.  Both can be used in portable tools
      without requiring the use of SVE instructions.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: Fixed up prctl constants to avoid clash with PDEATHSIG]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2d2123bc
    • D
      arm64/sve: Support vector length resetting for new processes · 79ab047c
      Dave Martin 提交于
      It's desirable to be able to reset the vector length to some sane
      default for new processes, since the new binary and its libraries
      may or may not be SVE-aware.
      
      This patch tracks the desired post-exec vector length (if any) in a
      new thread member sve_vl_onexec, and adds a new thread flag
      TIF_SVE_VL_INHERIT to control whether to inherit or reset the
      vector length.  Currently these are inactive.  Subsequent patches
      will provide the capability to configure them.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NAlex Bennée <alex.bennee@linaro.org>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      79ab047c
    • D
      arm64/sve: Core task context handling · bc0ee476
      Dave Martin 提交于
      This patch adds the core support for switching and managing the SVE
      architectural state of user tasks.
      
      Calls to the existing FPSIMD low-level save/restore functions are
      factored out as new functions task_fpsimd_{save,load}(), since SVE
      now dynamically may or may not need to be handled at these points
      depending on the kernel configuration, hardware features discovered
      at boot, and the runtime state of the task.  To make these
      decisions as fast as possible, const cpucaps are used where
      feasible, via the system_supports_sve() helper.
      
      The SVE registers are only tracked for threads that have explicitly
      used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
      FPSIMD view of the architectural state is stored in
      thread.fpsimd_state as usual.
      
      When in use, the SVE registers are not stored directly in
      thread_struct due to their potentially large and variable size.
      Because the task_struct slab allocator must be configured very
      early during kernel boot, it is also tricky to configure it
      correctly to match the maximum vector length provided by the
      hardware, since this depends on examining secondary CPUs as well as
      the primary.  Instead, a pointer sve_state in thread_struct points
      to a dynamically allocated buffer containing the SVE register data,
      and code is added to allocate and free this buffer at appropriate
      times.
      
      TIF_SVE is set when taking an SVE access trap from userspace, if
      suitable hardware support has been detected.  This enables SVE for
      the thread: a subsequent return to userspace will disable the trap
      accordingly.  If such a trap is taken without sufficient system-
      wide hardware support, SIGILL is sent to the thread instead as if
      an undefined instruction had been executed: this may happen if
      userspace tries to use SVE in a system where not all CPUs support
      it for example.
      
      The kernel will clear TIF_SVE and disable SVE for the thread
      whenever an explicit syscall is made by userspace.  For backwards
      compatibility reasons and conformance with the spirit of the base
      AArch64 procedure call standard, the subset of the SVE register
      state that aliases the FPSIMD registers is still preserved across a
      syscall even if this happens.  The remainder of the SVE register
      state logically becomes zero at syscall entry, though the actual
      zeroing work is currently deferred until the thread next tries to
      use SVE, causing another trap to the kernel.  This implementation
      is suboptimal: in the future, the fastpath case may be optimised
      to zero the registers in-place and leave SVE enabled for the task,
      where beneficial.
      
      TIF_SVE is also cleared in the following slowpath cases, which are
      taken as reasonable hints that the task may no longer use SVE:
       * exec
       * fork and clone
      
      Code is added to sync data between thread.fpsimd_state and
      thread.sve_state whenever enabling/disabling SVE, in a manner
      consistent with the SVE architectural programmer's model.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Alex Bennée <alex.bennee@linaro.org>
      [will: added #include to fix allnoconfig build]
      [will: use enable_daif in do_sve_acc]
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bc0ee476
  14. 02 10月, 2017 1 次提交
  15. 16 8月, 2017 1 次提交
    • A
      arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP · 34be98f4
      Ard Biesheuvel 提交于
      For historical reasons, we leave the top 16 bytes of our task and IRQ
      stacks unused, a practice used to ensure that the SP can always be
      masked to find the base of the current stack (historically, where
      thread_info could be found).
      
      However, this is not necessary, as:
      
      * When an exception is taken from a task stack, we decrement the SP by
        S_FRAME_SIZE and stash the exception registers before we compare the
        SP against the task stack. In such cases, the SP must be at least
        S_FRAME_SIZE below the limit, and can be safely masked to determine
        whether the task stack is in use.
      
      * When transitioning to an IRQ stack, we'll place a dummy frame onto the
        IRQ stack before enabling asynchronous exceptions, or executing code
        we expect to trigger faults. Thus, if an exception is taken from the
        IRQ stack, the SP must be at least 16 bytes below the limit.
      
      * We no longer mask the SP to find the thread_info, which is now found
        via sp_el0. Note that historically, the offset was critical to ensure
        that cpu_switch_to() found the correct stack for new threads that
        hadn't yet executed ret_from_fork().
      
      Given that, this initial offset serves no purpose, and can be removed.
      This brings us in-line with other architectures (e.g. x86) which do not
      rely on this masking.
      Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: rebase, kill THREAD_START_SP, commit msg additions]
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      34be98f4
  16. 07 8月, 2017 2 次提交
    • D
      arm64: Abstract syscallno manipulation · 17c28958
      Dave Martin 提交于
      The -1 "no syscall" value is written in various ways, shared with
      the user ABI in some places, and generally obscure.
      
      This patch attempts to make things a little more consistent and
      readable by replacing all these uses with a single #define.  A
      couple of symbolic helpers are provided to clarify the intent
      further.
      
      Because the in-syscall check in do_signal() is changed from >= 0 to
      != NO_SYSCALL by this patch, different behaviour may be observable
      if syscallno is set to values less than -1 by a tracer.  However,
      this is not different from the behaviour that is already observable
      if a tracer sets syscallno to a value >= __NR_(compat_)syscalls.
      
      It appears that this can cause spurious syscall restarting, but
      that is not a new behaviour either, and does not appear harmful.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      17c28958
    • D
      arm64: syscallno is secretly an int, make it official · 35d0e6fb
      Dave Martin 提交于
      The upper 32 bits of the syscallno field in thread_struct are
      handled inconsistently, being sometimes zero extended and sometimes
      sign-extended.  In fact, only the lower 32 bits seem to have any
      real significance for the behaviour of the code: it's been OK to
      handle the upper bits inconsistently because they don't matter.
      
      Currently, the only place I can find where those bits are
      significant is in calling trace_sys_enter(), which may be
      unintentional: for example, if a compat tracer attempts to cancel a
      syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the
      syscall-enter-stop, it will be traced as syscall 4294967295
      rather than -1 as might be expected (and as occurs for a native
      tracer doing the same thing).  Elsewhere, reads of syscallno cast
      it to an int or truncate it.
      
      There's also a conspicuous amount of code and casting to bodge
      around the fact that although semantically an int, syscallno is
      stored as a u64.
      
      Let's not pretend any more.
      
      In order to preserve the stp x instruction that stores the syscall
      number in entry.S, this patch special-cases the layout of struct
      pt_regs for big endian so that the newly 32-bit syscallno field
      maps onto the low bits of the stored value.  This is not beautiful,
      but benchmarking of the getpid syscall on Juno suggests indicates a
      minor slowdown if the stp is split into an stp x and stp w.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      35d0e6fb
  17. 22 6月, 2017 1 次提交
    • D
      arm64: ptrace: Flush user-RW TLS reg to thread_struct before reading · 936eb65c
      Dave Martin 提交于
      When reading current's user-writable TLS register (which occurs
      when dumping core for native tasks), it is possible that userspace
      has modified it since the time the task was last scheduled out.
      The new TLS register value is not guaranteed to have been written
      immediately back to thread_struct in this case.
      
      As a result, a coredump can capture stale data for this register.
      Reading the register for a stopped task via ptrace is unaffected.
      
      For native tasks, this patch explicitly flushes the TPIDR_EL0
      register back to thread_struct before dumping when operating on
      current, thus ensuring that coredump contents are up to date.  For
      compat tasks, the TLS register is not user-writable and so cannot
      be out of sync, so no flush is required in compat_tls_get().
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      936eb65c
  18. 23 3月, 2017 1 次提交
  19. 10 1月, 2017 1 次提交
    • J
      arm64: Remove useless UAO IPI and describe how this gets enabled · c8b06e3f
      James Morse 提交于
      Since its introduction, the UAO enable call was broken, and useless.
      commit 2a6dcb2b ("arm64: cpufeature: Schedule enable() calls instead
      of calling them via IPI"), fixed the framework so that these calls
      are scheduled, so that they can modify PSTATE.
      
      Now it is just useless. Remove it. UAO is enabled by the code patching
      which causes get_user() and friends to use the 'ldtr' family of
      instructions. This relies on the PSTATE.UAO bit being set to match
      addr_limit, which we do in uao_thread_switch() called via __switch_to().
      
      All that is needed to enable UAO is patch the code, and call schedule().
      __apply_alternatives_multi_stop() calls stop_machine() when it modifies
      the kernel text to enable the alternatives, (including the UAO code in
      uao_thread_switch()). Once stop_machine() has finished __switch_to() is
      called to reschedule the original task, this causes PSTATE.UAO to be set
      appropriately. An explicit enable() call is not needed.
      Reported-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      c8b06e3f
  20. 17 11月, 2016 1 次提交
  21. 16 11月, 2016 2 次提交
    • C
      locking/core, arch: Remove cpu_relax_lowlatency() · 5bd0b85b
      Christian Borntraeger 提交于
      As there are no users left, we can remove cpu_relax_lowlatency()
      implementations from every architecture.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Cc: <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      5bd0b85b
    • C
      locking/core: Introduce cpu_relax_yield() · 79ab11cd
      Christian Borntraeger 提交于
      For spinning loops people do often use barrier() or cpu_relax().
      For most architectures cpu_relax and barrier are the same, but on
      some architectures cpu_relax can add some latency.
      For example on power,sparc64 and arc, cpu_relax can shift the CPU
      towards other hardware threads in an SMT environment.
      On s390 cpu_relax does even more, it uses an hypercall to the
      hypervisor to give up the timeslice.
      In contrast to the SMT yielding this can result in larger latencies.
      In some places this latency is unwanted, so another variant
      "cpu_relax_lowlatency" was introduced. Before this is used in more
      and more places, lets revert the logic and provide a cpu_relax_yield
      that can be called in places where yielding is more important than
      latency. By default this is the same as cpu_relax on all architectures.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Noam Camus <noamc@ezchip.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      79ab11cd
  22. 20 10月, 2016 1 次提交
    • J
      arm64: cpufeature: Schedule enable() calls instead of calling them via IPI · 2a6dcb2b
      James Morse 提交于
      The enable() call for a cpufeature/errata is called using on_each_cpu().
      This issues a cross-call IPI to get the work done. Implicitly, this
      stashes the running PSTATE in SPSR when the CPU receives the IPI, and
      restores it when we return. This means an enable() call can never modify
      PSTATE.
      
      To allow PAN to do this, change the on_each_cpu() call to use
      stop_machine(). This schedules the work on each CPU which allows
      us to modify PSTATE.
      
      This involves changing the protype of all the enable() functions.
      
      enable_cpu_capabilities() is called during boot and enables the feature
      on all online CPUs. This path now uses stop_machine(). CPU features for
      hotplug'd CPUs are enabled by verify_local_cpu_features() which only
      acts on the local CPU, and can already modify the running PSTATE as it
      is called from secondary_start_kernel().
      Reported-by: NTony Thompson <anthony.thompson@arm.com>
      Reported-by: NVladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2a6dcb2b
  23. 01 9月, 2016 1 次提交
  24. 01 7月, 2016 1 次提交
    • A
      arm64: trap userspace "dc cvau" cache operation on errata-affected core · 7dd01aef
      Andre Przywara 提交于
      The ARM errata 819472, 826319, 827319 and 824069 for affected
      Cortex-A53 cores demand to promote "dc cvau" instructions to
      "dc civac". Since we allow userspace to also emit those instructions,
      we should make sure that "dc cvau" gets promoted there too.
      So lets grasp the nettle here and actually trap every userland cache
      maintenance instruction once we detect at least one affected core in
      the system.
      We then emulate the instruction by executing it on behalf of userland,
      promoting "dc cvau" to "dc civac" on the way and injecting access
      fault back into userspace.
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      [catalin.marinas@arm.com: s/set_segfault/arm64_notify_segfault/]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      7dd01aef
  25. 19 2月, 2016 1 次提交
    • J
      arm64: kernel: Add support for User Access Override · 57f4959b
      James Morse 提交于
      'User Access Override' is a new ARMv8.2 feature which allows the
      unprivileged load and store instructions to be overridden to behave in
      the normal way.
      
      This patch converts {get,put}_user() and friends to use ldtr*/sttr*
      instructions - so that they can only access EL0 memory, then enables
      UAO when fs==KERNEL_DS so that these functions can access kernel memory.
      
      This allows user space's read/write permissions to be checked against the
      page tables, instead of testing addr<USER_DS, then using the kernel's
      read/write permissions.
      Signed-off-by: NJames Morse <james.morse@arm.com>
      [catalin.marinas@arm.com: move uao_thread_switch() above dsb()]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      57f4959b
  26. 16 2月, 2016 2 次提交
  27. 21 10月, 2015 1 次提交
    • S
      arm64: Delay cpu feature capability checks · dbb4e152
      Suzuki K. Poulose 提交于
      At the moment we run through the arm64_features capability list for
      each CPU and set the capability if one of the CPU supports it. This
      could be problematic in a heterogeneous system with differing capabilities.
      Delay the CPU feature checks until all the enabled CPUs are up(i.e,
      smp_cpus_done(), so that we can make better decisions based on the
      overall system capability. Once we decide and advertise the capabilities
      the alternatives can be applied. From this state, we cannot roll back
      a feature to disabled based on the values from a new hotplugged CPU,
      due to the runtime patching and other reasons. So, for all new CPUs,
      we need to make sure that they have the established system capabilities.
      Failing which, we bring the CPU down, preventing it from turning online.
      Once the capabilities are decided, any new CPU booting up goes through
      verification to ensure that it has all the enabled capabilities and also
      invokes the respective enable() method on the CPU.
      
      The CPU errata checks are not delayed and is still executed per-CPU
      to detect the respective capabilities. If we ever come across a non-errata
      capability that needs to be checked on each-CPU, we could introduce them via
      a new capability table(or introduce a flag), which can be processed per CPU.
      
      The next patch will make the feature checks use the system wide
      safe value of a feature register.
      
      NOTE: The enable() methods associated with the capability is scheduled
      on all the CPUs (which is the only use case at the moment). If we need
      a different type of 'enable()' which only needs to be run once on any CPU,
      we should be able to handle that when needed.
      Signed-off-by: NSuzuki K. Poulose <suzuki.poulose@arm.com>
      Tested-by: NDave Martin <Dave.Martin@arm.com>
      [catalin.marinas@arm.com: static variable and coding style fixes]
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      dbb4e152
  28. 27 7月, 2015 1 次提交