1. 20 October 2016, 1 commit
    • arm64: Cortex-A53 errata workaround: check for kernel addresses · 87261d19
      Authored by Andre Przywara
      Commit 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on
      errata-affected core") adds code to execute cache maintenance instructions
      in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
      It turns out that the address hasn't been checked to be a valid user
      space address, allowing userland to clean cache lines in kernel space.
      Fix this by introducing an address check before executing the
      instructions on behalf of userland.
      
      Since the address doesn't come via a syscall parameter, we can't just
      reject tagged pointers and instead have to remove the tag when checking
      against the user address limit.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      [will: rework commit message + replace access_ok with max_user_addr()]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
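      The fix boils down to "untag, then range-check". A minimal sketch of that idea in plain C,
      assuming a 48-bit user address limit and using hypothetical helper names (this is not the
      kernel's actual implementation):

      #include <stdbool.h>
      #include <stdint.h>

      /* Assumed user VA limit for the example (48-bit user address space). */
      #define EXAMPLE_USER_ADDR_LIMIT (1UL << 48)

      /* Clear a potential top-byte tag (bits 63:56) before range-checking. */
      static inline uint64_t untag_addr(uint64_t addr)
      {
              return addr & ~(0xffUL << 56);
      }

      /* Only perform the maintenance if the untagged address is a user address. */
      static inline bool user_cache_maint_allowed(uint64_t addr)
      {
              return untag_addr(addr) < EXAMPLE_USER_ADDR_LIMIT;
      }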
  2. 19 October 2016, 1 commit
    • arm64: percpu: rewrite ll/sc loops in assembly · 1e6e57d9
      Authored by Will Deacon
      Writing the outer loop of an LL/SC sequence using do {...} while
      constructs potentially allows the compiler to hoist memory accesses
      between the STXR and the branch back to the LDXR. On CPUs that do not
      guarantee forward progress of LL/SC loops when faced with memory
      accesses to the same ERG (up to 2k) between the failed STXR and the
      branch back, we may end up livelocking.
      
      This patch avoids this issue in our percpu atomics by rewriting the
      outer loop as part of the LL/SC inline assembly block.
      
      Cc: <stable@vger.kernel.org>
      Fixes: f97fc810 ("arm64: percpu: Implement this_cpu operations")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
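      For illustration, here is a sketch of a complete LL/SC retry loop kept entirely inside one
      inline assembly block (a plain 64-bit atomic add, AArch64 only; not the kernel's actual
      percpu macros):

      #include <stdint.h>

      /* The branch back to ldxr stays inside the asm block, so the compiler
       * cannot hoist other memory accesses between stxr and the retry. */
      static inline void llsc_add64(uint64_t *ptr, uint64_t val)
      {
              uint64_t tmp;
              uint32_t fail;

              asm volatile(
              "1:     ldxr    %[tmp], %[p]\n"
              "       add     %[tmp], %[tmp], %[v]\n"
              "       stxr    %w[fail], %[tmp], %[p]\n"
              "       cbnz    %w[fail], 1b\n"
              : [tmp] "=&r" (tmp), [fail] "=&r" (fail), [p] "+Q" (*ptr)
              : [v] "r" (val)
              : "memory");
      }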
  3. 18 October 2016, 1 commit
    • arm64: sysreg: Fix use of XZR in write_sysreg_s · 91cb163e
      Authored by Will Deacon
      Commit 8a71f0c6 ("arm64: sysreg: replace open-coded mrs_s/msr_s with
      {read,write}_sysreg_s") introduced a write_sysreg_s macro for writing
      to system registers that are not supported by binutils.
      
      Unfortunately, this was implemented with the wrong template (%0 vs %x0),
      so in the case that we are writing a constant 0, we will generate
      invalid instruction syntax and bail with a cryptic assembler error:
      
        | Error: constant expression required
      
      This patch fixes the template.
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
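      To illustrate the difference, here is a simplified sketch of such a macro (using a plain msr
      on a binutils-supported register; the real write_sysreg_s goes through an msr_s assembler
      macro): with the "rZ" constraint a constant 0 selects the zero register, and the %x0 operand
      modifier prints it as xzr instead of the literal 0 that broke the assembler.

      /* Simplified example, not the kernel macro itself. */
      #define example_write_sysreg(v, r) do {                          \
              unsigned long __val = (unsigned long)(v);                \
              asm volatile("msr " #r ", %x0" : : "rZ" (__val));        \
      } while (0)

      /* example_write_sysreg(0, tpidr_el0) assembles to "msr tpidr_el0, xzr";
       * with %0 it would have emitted "msr tpidr_el0, 0", which is not valid
       * msr syntax. */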
  4. 17 October 2016, 1 commit
  5. 12 October 2016, 1 commit
  6. 28 September 2016, 1 commit
  7. 24 September 2016, 2 commits
  8. 23 September 2016, 1 commit
  9. 22 September 2016, 2 commits
  10. 16 September 2016, 1 commit
  11. 13 September 2016, 1 commit
  12. 12 September 2016, 2 commits
    • arm64/kvm: use alternative auto-nop · e506236a
      Authored by Mark Rutland
      Make use of the new alternative_if and alternative_else_nop_endif and
      get rid of our open-coded NOP sleds, making the code simpler to read.
      
      Note that for __kvm_call_hyp the branch to __vhe_hyp_call has been moved
      out of the alternative sequence, and in the default case there will be
      four additional NOPs executed.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: alternative: add auto-nop infrastructure · 792d4737
      Authored by Mark Rutland
      In some cases, one side of an alternative sequence is simply a number of
      NOPs used to balance the other side. Keeping track of this manually is
      tedious, and the presence of large chains of NOPs makes the code more
      painful to read than necessary.
      
      To ameliorate matters, this patch adds a new alternative_else_nop_endif,
      which automatically balances an alternative sequence with a trivial NOP
      sled.
      
      In many cases, we would like a NOP-sled in the default case, and
      instructions patched in in the presence of a feature. To enable the NOPs
      to be generated automatically for this case, this patch also adds a new
      alternative_if, and updates alternative_else and alternative_endif to
      work with either alternative_if or alternative_if_not.
      
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      [will: use new nops macro to generate nop sequences]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  13. 10 September 2016, 3 commits
  14. 09 September 2016, 16 commits
    • arm64: Remove shadowed asm-generic headers · 0e27a7fc
      Authored by Robin Murphy
      We've grown our own versions of bug.h, ftrace.h, pci.h and topology.h,
      so generating the generic ones as well is unnecessary and a potential
      source of build hiccups. At the very least, having them present has
      confused my source-indexing tool, and that simply will not do.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Work around systems with mismatched cache line sizes · 116c81f4
      Authored by Suzuki K Poulose
      Systems with differing CPU i-cache/d-cache line sizes can cause
      problems with software cache management when execution is migrated
      from one CPU to another. Usually, an application reads the cache
      line size on one CPU and then uses that length to perform cache
      operations. However, if it gets migrated to another CPU with a smaller
      cache line size, things could go completely wrong. To prevent such
      cases, always use the smallest cache line size among the CPUs. The
      kernel CPU feature infrastructure already keeps track of the safe
      value for all CPUID registers, including CTR. This patch works around
      the problem as follows:
      
      For the kernel, dynamically patch it to read the cache size
      from the system wide copy of CTR_EL0.
      
      For applications, trap read accesses to CTR_EL0 (by clearing SCTLR.UCT)
      and emulate the mrs instruction to return the system wide safe value
      of CTR_EL0.
      
      For faster access (i.e., to avoid looking up the system wide value
      of CTR_EL0 via read_system_reg), we keep track of the pointer to the
      table entry for CTR_EL0 in the CPU feature infrastructure.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
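      For reference, a small sketch of how a line size in bytes is derived from a CTR_EL0 value
      (the field layout is architectural; the function names are just for the example). The patch
      feeds the system wide safe CTR_EL0 value into computations like these in the kernel, and
      returns the same safe value when emulating trapped userspace reads:

      #include <stdint.h>

      /* CTR_EL0.DminLine (bits [19:16]) and IminLine (bits [3:0]) hold
       * log2 of the minimum line size in 4-byte words. */
      static inline unsigned int ctr_dcache_line_size(uint64_t ctr)
      {
              return 4U << ((ctr >> 16) & 0xf);
      }

      static inline unsigned int ctr_icache_line_size(uint64_t ctr)
      {
              return 4U << (ctr & 0xf);
      }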
    • arm64: Refactor sysinstr exception handling · 9dbd5bb2
      Authored by Suzuki K Poulose
      Right now we trap some of the user space data cache operations
      based on a few errata (ARM errata 819472, 826319, 827319 and 824069).
      We need to trap userspace access to CTR_EL0 if we detect a mismatched
      cache line size. Since both these traps share the same EC, refactor
      the handler a little bit to make it more reader friendly.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Introduce raw_{d,i}cache_line_size · 072f0a63
      Authored by Suzuki K Poulose
      On systems with mismatched i/d-cache minimum line sizes, we need to use
      the smallest size possible across all CPUs. This will be done by fetching
      the system wide safe value from the CPU feature infrastructure.
      However, some special users (e.g. kexec, hibernate) need the line
      size of the current CPU (rather than the system wide value), either
      because the system wide value may not be accessible yet or because the
      caller is guaranteed to execute without being migrated.
      Provide another helper which fetches the cache line size on the current CPU.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Reviewed-by: Geoff Levand <geoff@infradead.org>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
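      A sketch of the "current CPU" variant described above (kernel-context inline asm; the
      function name is illustrative, not the helper being introduced):

      /* Reads this CPU's own CTR_EL0 rather than the system wide safe copy,
       * so it is only meaningful when the caller cannot be migrated. */
      static inline unsigned int example_raw_icache_line_size(void)
      {
              unsigned long ctr;

              asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
              return 4U << (ctr & 0xf);       /* IminLine: log2(words) */
      }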
    • arm64: insn: Add helpers for adrp offsets · 46084bc2
      Authored by Suzuki K Poulose
      Add helpers for decoding/encoding the PC-relative addresses for adrp.
      These will be used for handling dynamic patching of 'adrp' instructions
      in alternative code patching.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
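      As a sketch of what such a decode helper involves (the encoding is architectural: immlo in
      bits [30:29], immhi in bits [23:5], forming a signed 21-bit page offset; the function name
      is illustrative):

      #include <stdint.h>

      /* Decode the byte offset encoded in an ADRP instruction word. */
      static inline int64_t example_adrp_decode_offset(uint32_t insn)
      {
              uint32_t immlo = (insn >> 29) & 0x3;
              uint32_t immhi = (insn >> 5) & 0x7ffff;
              int32_t imm21 = (int32_t)(((immhi << 2) | immlo) << 11) >> 11;

              return (int64_t)imm21 << 12;    /* ADRP works in 4 KiB pages */
      }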
    • arm64: Rearrange CPU errata workaround checks · c47a1900
      Authored by Suzuki K Poulose
      Right now we run through the workaround checks on a CPU
      from __cpuinfo_store_cpu. There are some problems with that:
      
      1) We initialise the system wide CPU feature registers only after the
      boot CPU updates its cpuinfo. Now, if a workaround depends on the
      variance of a CPU ID feature (e.g., a check for cache line size mismatch),
      we have no way of performing it cleanly for the boot CPU.
      
      2) It is out of place: it is invoked from __cpuinfo_store_cpu() in
      cpuinfo.c, which is not an obvious place for it.
      
      This patch rearranges the CPU-specific capability (aka workaround) checks.
      
      1) At the moment we use verify_local_cpu_capabilities() to check if a new
      CPU has all the system advertised features. Use this for the secondary CPUs
      to perform the workaround check as well. For that we rename
        verify_local_cpu_capabilities() => check_local_cpu_capabilities()
      which:
      
         If the system wide capabilities haven't been initialised (i.e., the CPU
         is brought up during boot), updates the system wide detected workarounds.
      
         Otherwise (i.e., a CPU hotplugged in later), verifies that this CPU
         conforms to the system wide capabilities.
      
      2) The boot CPU updates the workarounds from smp_prepare_boot_cpu() after
      we have initialised the system wide CPU feature values.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Use consistent naming for errata handling · 89ba2645
      Authored by Suzuki K Poulose
      This is a cosmetic change to rename the functions dealing with
      the errata workarounds so that their names are more consistent.
      
      1) check_local_cpu_errata() => update_cpu_errata_workarounds()
      check_local_cpu_errata() actually updates the system's errata
      workarounds, so rename it to reflect that.
      
      2) verify_local_cpu_errata() => verify_local_cpu_errata_workarounds()
      Use errata_workarounds instead of _errata.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Set the safe value for L1 icache policy · ee7bc638
      Authored by Suzuki K Poulose
      Right now we use 0 as the safe value for CTR_EL0:L1Ip, which is
      not defined at the moment. The safer value for the L1Ip should be
      the weakest of the policies, which happens to be AIVIVT. While at it,
      fix the comment about safe_val.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
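      For context, a tiny sketch of where the field lives and the value chosen as safe (the field
      position is architectural; the macro names are illustrative):

      /* CTR_EL0.L1Ip occupies bits [15:14]; AIVIVT (0b01) is the weakest
       * i-cache policy, so it is the safe system wide value to advertise. */
      #define EXAMPLE_CTR_L1IP(ctr)           (((ctr) >> 14) & 0x3)
      #define EXAMPLE_ICACHE_POLICY_AIVIVT    1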
    • arm64: use preempt_disable_notrace in _percpu_read/write · 2b974344
      Authored by Chunyan Zhang
      When debug preempt or the preempt tracer is enabled, preempt_count_add/sub()
      can be traced by function and function graph tracing, and
      preempt_disable/enable() would call preempt_count_add/sub(), so inside the
      ftrace subsystem we should use preempt_disable/enable_notrace instead.
      
      In commit 345ddcc8 ("ftrace: Have set_ftrace_pid use the bitmap
      like events do"), this_cpu_read() was added to trace_graph_entry(),
      and if this_cpu_read() calls preempt_disable(), the graph tracer will
      go into a recursive loop, even if tracing_on is disabled.
      
      So this patch changes this_cpu_read() to use preempt_disable/enable_notrace
      instead.
      
      Since Yonghui Yang helped a lot in finding the root cause of this problem,
      his SOB is added as well.
      Signed-off-by: Yonghui Yang <mark.yang@spreadtrum.com>
      Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
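      A kernel-context sketch of the resulting pattern (the macro name is invented for the
      example; preempt_disable_notrace()/preempt_enable_notrace() and raw_cpu_read() are the real
      kernel primitives):

      /* A percpu read that is safe to call from ftrace callbacks: the
       * _notrace variants are themselves never traced, so the graph
       * tracer cannot recurse through them. */
      #define example_percpu_read(pcp)                        \
      ({                                                      \
              typeof(pcp) __val;                              \
              preempt_disable_notrace();                      \
              __val = raw_cpu_read(pcp);                      \
              preempt_enable_notrace();                       \
              __val;                                          \
      })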
    • arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb() · 872c63fb
      Authored by Will Deacon
      smp_mb__before_spinlock() is intended to upgrade a spin_lock() operation
      to a full barrier, such that prior stores are ordered with respect to
      loads and stores occurring inside the critical section.
      
      Unfortunately, the core code defines the barrier as smp_wmb(), which
      is insufficient to provide the required ordering guarantees when used in
      conjunction with our load-acquire-based spinlock implementation.
      
      This patch overrides the arm64 definition of smp_mb__before_spinlock()
      to map to a full smp_mb().
      
      Cc: <stable@vger.kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reported-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
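      The override itself is a one-liner in the arm64 barrier header; sketched (kernel context):

      /* Upgrade the pre-spin_lock barrier to a full barrier: smp_wmb() is
       * not enough in front of a load-acquire based spin_lock(). */
      #define smp_mb__before_spinlock()       smp_mb()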
    • arm64: simplify contextidr_thread_switch · d3ea42aa
      Authored by Mark Rutland
      When CONFIG_PID_IN_CONTEXTIDR is not selected, we use an empty stub
      definition of contextidr_thread_switch(). As everything we rely upon
      exists regardless of CONFIG_PID_IN_CONTEXTIDR, we don't strictly require
      an empty stub.
      
      By using IS_ENABLED() rather than ifdeffery, we avoid duplication, and
      get compiler coverage on all the code even when CONFIG_PID_IN_CONTEXTIDR
      is not selected and the code is optimised away.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
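      Roughly, the IS_ENABLED() form described above looks like this (kernel-context sketch):

      static inline void contextidr_thread_switch(struct task_struct *next)
      {
              /* Parsed and type-checked even when the option is off; the
               * compiler then optimises the whole body away. */
              if (!IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
                      return;

              write_sysreg(task_pid_nr(next), contextidr_el1);
              isb();
      }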
    • arm64: simplify sysreg manipulation · adf75899
      Authored by Mark Rutland
      A while back we added {read,write}_sysreg accessors to handle accesses
      to system registers, without the usual boilerplate asm volatile,
      temporary variable, etc.
      
      This patch makes use of these across arm64 to make code shorter and
      clearer. For sequences with a trailing ISB, the existing isb() macro is
      also used so that asm blocks can be removed entirely.
      
      A few uses of inline assembly for msr/mrs are left as-is. Those
      manipulating sp_el0 for the current thread_info value have special
      clobber requirements.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
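      For readers unfamiliar with the accessors, a before/after sketch of the kind of change this
      makes (kernel context; the register is chosen arbitrarily):

      static inline void example_sysreg_roundtrip(void)
      {
              /* Before: open-coded inline assembly with a temporary. */
              unsigned long tcr;
              asm volatile("mrs %0, tcr_el1" : "=r" (tcr));
              asm volatile("msr tcr_el1, %0" : : "r" (tcr));

              /* After: the read_sysreg()/write_sysreg() accessors. */
              write_sysreg(read_sysreg(tcr_el1), tcr_el1);
      }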
    • arm64/kvm: use {read,write}_sysreg() · 1f3d8699
      Authored by Mark Rutland
      A while back we added {read,write}_sysreg accessors to handle accesses
      to system registers, without the usual boilerplate asm volatile,
      temporary variable, etc.
      
      This patch makes use of these in the arm64 KVM code to make the code
      shorter and clearer.
      
      At the same time, a comment style violation next to a system register
      access is fixed up in reset_pmcr, and comments describing whether
      operations are reads or writes are removed as this is now painfully
      obvious.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: dcc: simplify accessors · d0a69d9f
      Authored by Mark Rutland
      A while back we added {read,write}_sysreg accessors to handle accesses
      to system registers, without the usual boilerplate asm volatile,
      temporary variable, etc.
      
      This patch makes use of these in the arm64 DCC accessors to make the
      code shorter and clearer.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: arch_timer: simplify accessors · cd5f22d7
      Authored by Mark Rutland
      A while back we added {read,write}_sysreg accessors to handle accesses
      to system registers, without the usual boilerplate asm volatile,
      temporary variable, etc.
      
      This patch makes use of these in the arm64 arch timer accessors to make
      the code shorter and clearer.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: sysreg: allow write_sysreg to use XZR · 7aff4a2d
      Authored by Mark Rutland
      Currently write_sysreg has to allocate a temporary register to write
      zero to a system register, which is unfortunate given that the MSR
      instruction accepts XZR as an operand.
      
      Allow XZR to be used when appropriate by fiddling with the assembly
      constraints.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  15. 08 September 2016, 6 commits