1. 12 Nov 2016, 4 commits
    • arm64: make cpu number a percpu variable · 57c82954
      Mark Rutland authored
      In the absence of CONFIG_THREAD_INFO_IN_TASK, core code maintains
      thread_info::cpu, and low-level architecture code can access this to
      build raw_smp_processor_id(). With CONFIG_THREAD_INFO_IN_TASK, core code
      maintains task_struct::cpu, which for reasons of header soup is not
      accessible to low-level arch code.
      
      Instead, we can maintain a percpu variable containing the cpu number.
      
      For both the old and new implementation of raw_smp_processor_id(), we
      read a sysreg into a GPR, add an offset, and load the result. As the
      offset is now larger, it may not be folded into the load, but otherwise
      the assembly shouldn't change much.
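      
      As a rough C sketch of the new arrangement (the variable and helper
      names here are assumptions, not necessarily the exact ones in the patch):
      
        #include <linux/percpu.h>
        
        /* One copy per CPU, written once during early bring-up. */
        DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
        
        #define raw_smp_processor_id() (*raw_cpu_ptr(&cpu_number))
        
        /* Record this CPU's number, e.g. from the secondary boot path. */
        static inline void set_my_cpu_number(int cpu)
        {
            per_cpu(cpu_number, cpu) = cpu;
        }
      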
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      57c82954
    • arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx · 623b476f
      Mark Rutland authored
      When returning from idle, we rely on the fact that thread_info lives at
      the end of the kernel stack, and restore this by masking the saved stack
      pointer. Subsequent patches will sever the relationship between the
      stack and thread_info, and to cater for this we must save/restore sp_el0
      explicitly, storing it in cpu_suspend_ctx.
      
      As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
      extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
      in the same way, which simplifies the code, avoiding pointer chasing on
      the restore path (as we no longer need to load thread_info::cpu followed
      by the relevant slot in __per_cpu_offset based on this).
      
      This patch stashes both registers in cpu_suspend_ctx.
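      
      A conceptual C sketch of the extra state (illustrative only; the real
      save/restore happens in the cpu_do_suspend/cpu_do_resume assembly, and
      the struct below is not the actual cpu_suspend_ctx layout):
      
        #include <linux/types.h>
        #include <asm/sysreg.h>
        
        struct suspend_ctx_extra {          /* the two spare doubleword slots */
            u64 sp_el0;
            u64 tpidr_el1;
        };
        
        static void save_extra_ctx(struct suspend_ctx_extra *ctx)
        {
            ctx->sp_el0    = read_sysreg(sp_el0);
            ctx->tpidr_el1 = read_sysreg(tpidr_el1);
        }
        
        static void restore_extra_ctx(const struct suspend_ctx_extra *ctx)
        {
            write_sysreg(ctx->sp_el0, sp_el0);        /* no stack masking */
            write_sysreg(ctx->tpidr_el1, tpidr_el1);  /* no percpu pointer chase */
        }
      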
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      623b476f
    • arm64: factor out current_stack_pointer · a9ea0017
      Mark Rutland authored
      We define current_stack_pointer in <asm/thread_info.h>, though other
      files and headers relying upon it do not have this necessary include, and
      are thus fragile to changes in the header soup.
      
      Subsequent patches will affect the header soup such that directly
      including <asm/thread_info.h> may result in a circular header include in
      some of these cases, so we can't simply include <asm/thread_info.h>.
      
      Instead, factor current_stack_pointer into its own header, and have all
      existing users include this explicitly.
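      
      The resulting header is tiny; a sketch of what it looks like (the exact
      file name may differ):
      
        #ifndef __ASM_STACK_POINTER_H
        #define __ASM_STACK_POINTER_H
        
        /* Expose the stack pointer to C code as a global register variable. */
        register unsigned long current_stack_pointer asm ("sp");
        
        #endif /* __ASM_STACK_POINTER_H */
      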
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      a9ea0017
    • arm64: thread_info: remove stale items · dcbe0285
      Mark Rutland authored
      We have a comment claiming __switch_to() cares about where cpu_context
      is located relative to cpu_domain in thread_info. However arm64 has
      never had a thread_info::cpu_domain field, and neither __switch_to nor
      cpu_switch_to care where the cpu_context field is relative to others.
      
      Additionally, the init_thread_info alias is never used anywhere in the
      kernel, and will shortly become problematic when thread_info is moved
      into task_struct.
      
      This patch removes both.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      dcbe0285
  2. 10 Nov 2016, 1 commit
  3. 08 Nov 2016, 7 commits
  4. 27 Oct 2016, 1 commit
  5. 22 Oct 2016, 1 commit
    • arm64: KVM: Take S1 walks into account when determining S2 write faults · 60e21a0e
      Will Deacon authored
      The WnR bit in the HSR/ESR_EL2 indicates whether a data abort was
      generated by a read or a write instruction. For stage 2 data aborts
      generated by a stage 1 translation table walk (i.e. the actual page
      table access faults at EL2), the WnR bit therefore reports whether the
      instruction generating the walk was a load or a store, *not* whether the
      page table walker was reading or writing the entry.
      
      For page tables marked as read-only at stage 2 (e.g. due to KSM merging
      them with the tables from another guest), this could result in livelock,
      where a page table walk generated by a load instruction attempts to
      set the access flag in the stage 1 descriptor, but fails to trigger
      CoW in the host since only a read fault is reported.
      
      This patch modifies the arm64 kvm_vcpu_dabt_iswrite function to
      take into account stage 2 faults in stage 1 walks. Since DBM cannot be
      disabled at EL2 for CPUs that implement it, we assume that these faults
      are always caused by writes, avoiding the livelock situation at the
      expense of occasional, spurious CoWs.
      
      We could, in theory, do a bit better by checking the guest TCR
      configuration and inspecting the page table to see why the PTE faulted.
      However, I doubt this is measurable in practice, and the threat of
      livelock is real.
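      
      A sketch of the resulting check, assuming the existing ESR helpers
      (kvm_vcpu_dabt_iss1tw() reports a stage 2 fault on a stage 1 walk); the
      exact form in the patch may differ:
      
        #include <asm/kvm_emulate.h>
        
        static inline bool dabt_is_write_sketch(const struct kvm_vcpu *vcpu)
        {
            /* S1 walk faults may be AF/DBM updates: treat them as writes. */
            if (kvm_vcpu_dabt_iss1tw(vcpu))
                return true;
        
            return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR);
        }
      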
      
      Cc: <stable@vger.kernel.org>
      Cc: Julien Grall <julien.grall@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      60e21a0e
  6. 20 Oct 2016, 3 commits
    • arm64: suspend: Reconfigure PSTATE after resume from idle · d0854412
      James Morse authored
      The suspend/resume path in kernel/sleep.S, as used by cpu-idle, does not
      save/restore PSTATE. As a result, cpufeatures that were detected and
      have bits in PSTATE get lost when we resume from idle.
      
      UAO gets set appropriately on the next context switch. PAN will be
      re-enabled next time we return from user-space, but on a preemptible
      kernel we may run work accessing user space before this point.
      
      Add code to re-enable these two features in __cpu_suspend_exit().
      We re-use uao_thread_switch() passing current.
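      
      A sketch of the resume-side fixup (capability and macro names are taken
      from the existing PAN/UAO support and assumed here; the exact code in
      the patch may differ):
      
        static void cpu_suspend_exit_fixup_pstate(void)
        {
            /* UAO: recompute PSTATE.UAO from current's flags, as on a
             * normal context switch. */
            uao_thread_switch(current);
        
            /* PAN: set PSTATE.PAN again if this kernel/CPU uses it. */
            if (IS_ENABLED(CONFIG_ARM64_PAN) && cpus_have_cap(ARM64_HAS_PAN))
                asm volatile(SET_PSTATE_PAN(1));
        }
      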
      Signed-off-by: James Morse <james.morse@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d0854412
    • arm64: cpufeature: Schedule enable() calls instead of calling them via IPI · 2a6dcb2b
      James Morse authored
      The enable() call for a cpufeature/errata is called using on_each_cpu().
      This issues a cross-call IPI to get the work done. Implicitly, this
      stashes the running PSTATE in SPSR when the CPU receives the IPI, and
      restores it when we return. This means an enable() call can never modify
      PSTATE.
      
      To allow PAN to do this, change the on_each_cpu() call to use
      stop_machine(). This schedules the work on each CPU which allows
      us to modify PSTATE.
      
      This involves changing the prototype of all the enable() functions.
      
      enable_cpu_capabilities() is called during boot and enables the feature
      on all online CPUs. This path now uses stop_machine(). CPU features for
      hotplug'd CPUs are enabled by verify_local_cpu_features() which only
      acts on the local CPU, and can already modify the running PSTATE as it
      is called from secondary_start_kernel().
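      
      A sketch of the boot-time path (assuming the arm64_cpu_capabilities
      layout; the real enable_cpu_capabilities() may differ in detail):
      
        #include <linux/stop_machine.h>
        #include <asm/cpufeature.h>
        
        static void enable_caps_sketch(const struct arm64_cpu_capabilities *caps)
        {
            for (; caps->matches; caps++) {
                if (!cpus_have_cap(caps->capability) || !caps->enable)
                    continue;
                /* enable() now matches int (*)(void *) and runs in thread
                 * context on every online CPU, so it may modify PSTATE. */
                stop_machine(caps->enable, NULL, cpu_online_mask);
            }
        }
      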
      Reported-by: Tony Thompson <anthony.thompson@arm.com>
      Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      2a6dcb2b
    • arm64: Cortex-A53 errata workaround: check for kernel addresses · 87261d19
      Andre Przywara authored
      Commit 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on
      errata-affected core") adds code to execute cache maintenance instructions
      in the kernel on behalf of userland on CPUs with certain ARM CPU errata.
      It turns out that the address hasn't been checked to be a valid user
      space address, allowing userland to clean cache lines in kernel space.
      Fix this by introducing an address check before executing the
      instructions on behalf of userland.
      
      Since the address doesn't come via a syscall parameter, we can't just
      reject tagged pointers and instead have to remove the tag when checking
      against the user address limit.
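      
      A self-contained sketch of the address sanitisation (the helper names
      and the user_limit parameter are illustrative, not the kernel's API):
      
        #include <linux/errno.h>
        #include <linux/types.h>
        
        static inline u64 untag_addr_sketch(u64 addr)
        {
            /* Sign-extend from bit 55, discarding the tag in bits [63:56]. */
            return (u64)((s64)(addr << 8) >> 8);
        }
        
        static int check_cache_maint_addr(u64 addr, u64 user_limit)
        {
            if (untag_addr_sketch(addr) >= user_limit)
                return -EFAULT;     /* kernel address: refuse to operate */
            return 0;
        }
      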
      
      Cc: <stable@vger.kernel.org>
      Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
      Reported-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      [will: rework commit message + replace access_ok with max_user_addr()]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      87261d19
  7. 19 Oct 2016, 1 commit
    • arm64: percpu: rewrite ll/sc loops in assembly · 1e6e57d9
      Will Deacon authored
      Writing the outer loop of an LL/SC sequence using do {...} while
      constructs potentially allows the compiler to hoist memory accesses
      between the STXR and the branch back to the LDXR. On CPUs that do not
      guarantee forward progress of LL/SC loops when faced with memory
      accesses to the same ERG (up to 2k) between the failed STXR and the
      branch back, we may end up livelocking.
      
      This patch avoids this issue in our percpu atomics by rewriting the
      outer loop as part of the LL/SC inline assembly block.
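      
      A sketch of the pattern, modelled on a 64-bit percpu add; keeping the
      retry branch inside the asm block means the compiler cannot schedule
      memory accesses between the stxr and the branch back:
      
        static inline void percpu_add_sketch(unsigned long *ptr, unsigned long val)
        {
            unsigned long tmp;
            int loop;
        
            asm volatile(
            "1: ldxr    %[tmp], %[p]\n"
            "   add     %[tmp], %[tmp], %[v]\n"
            "   stxr    %w[loop], %[tmp], %[p]\n"
            "   cbnz    %w[loop], 1b"
            : [loop] "=&r" (loop), [tmp] "=&r" (tmp), [p] "+Q" (*ptr)
            : [v] "r" (val));
        }
      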
      
      Cc: <stable@vger.kernel.org>
      Fixes: f97fc810 ("arm64: percpu: Implement this_cpu operations")
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      1e6e57d9
  8. 18 Oct 2016, 1 commit
    • arm64: sysreg: Fix use of XZR in write_sysreg_s · 91cb163e
      Will Deacon authored
      Commit 8a71f0c6 ("arm64: sysreg: replace open-coded mrs_s/msr_s with
      {read,write}_sysreg_s") introduced a write_sysreg_s macro for writing
      to system registers that are not supported by binutils.
      
      Unfortunately, this was implemented with the wrong template (%0 vs %x0),
      so in the case that we are writing a constant 0, we will generate
      invalid instruction syntax and bail with a cryptic assembler error:
      
        | Error: constant expression required
      
      This patch fixes the template.
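      
      A sketch of the corrected macro shape (msr_s being arm64's assembler
      helper for sysregs binutils does not know about); with the %x0 template
      and the "rZ" constraint, a constant 0 is emitted as xzr:
      
        #include <linux/stringify.h>
        
        #define write_sysreg_s_sketch(v, r) do {                        \
            u64 __val = (u64)(v);                                       \
            asm volatile("msr_s " __stringify(r) ", %x0"                \
                         : : "rZ" (__val));                             \
        } while (0)
      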
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      91cb163e
  9. 17 Oct 2016, 1 commit
  10. 12 Oct 2016, 1 commit
  11. 28 Sep 2016, 1 commit
  12. 24 Sep 2016, 2 commits
  13. 23 Sep 2016, 1 commit
  14. 22 Sep 2016, 2 commits
  15. 16 Sep 2016, 1 commit
  16. 13 Sep 2016, 1 commit
  17. 12 Sep 2016, 2 commits
    • arm64/kvm: use alternative auto-nop · e506236a
      Mark Rutland authored
      Make use of the new alternative_if and alternative_else_nop_endif and
      get rid of our open-coded NOP sleds, making the code simpler to read.
      
      Note that for __kvm_call_hyp the branch to __vhe_hyp_call has been moved
      out of the alternative sequence, and in the default case there will be
      four additional NOPs executed.
      
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      e506236a
    • arm64: alternative: add auto-nop infrastructure · 792d4737
      Mark Rutland authored
      In some cases, one side of an alternative sequence is simply a number of
      NOPs used to balance the other side. Keeping track of this manually is
      tedious, and the presence of large chains of NOPs makes the code more
      painful to read than necessary.
      
      To ameliorate matters, this patch adds a new alternative_else_nop_endif,
      which automatically balances an alternative sequence with a trivial NOP
      sled.
      
      In many cases, we would like a NOP-sled in the default case, and
      instructions patched in when a feature is present. To enable the NOPs
      to be generated automatically for this case, this patch also adds a new
      alternative_if, and updates alternative_else and alternative_endif to
      work with either alternative_if or alternative_if_not.
      
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Martin <dave.martin@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      [will: use new nops macro to generate nop sequences]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      792d4737
  18. 10 Sep 2016, 3 commits
  19. 09 Sep 2016, 6 commits
    • arm64: Remove shadowed asm-generic headers · 0e27a7fc
      Robin Murphy authored
      We've grown our own versions of bug.h, ftrace.h, pci.h and topology.h,
      so generating the generic ones as well is unnecessary and a potential
      source of build hiccups. At the very least, having them present has
      confused my source-indexing tool, and that simply will not do.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      0e27a7fc
    • arm64: Work around systems with mismatched cache line sizes · 116c81f4
      Suzuki K Poulose authored
      Systems with differing CPU i-cache/d-cache line sizes can cause
      problems with the cache management by software when the execution
      is migrated from one to another. Usually, the application reads
      the cache size on a CPU and then uses that length to perform cache
      operations. However, if it gets migrated to another CPU with a smaller
      cache line size, things could go completely wrong. To prevent such
      cases, always use the smallest cache line size among the CPUs. The
      kernel CPU feature infrastructure already keeps track of the safe
      value for all CPUID registers including CTR. This patch works around
      the problem by:
      
      For the kernel, dynamically patch it to read the cache size
      from the system wide copy of CTR_EL0.
      
      For applications, trap read accesses to CTR_EL0 (by clearing the SCTLR.UCT)
      and emulate the mrs instruction to return the system wide safe value
      of CTR_EL0.
      
      For faster access (i.e. to avoid looking up the system wide value of CTR_EL0
      via read_system_reg), we keep track of a pointer to the table entry for
      CTR_EL0 in the CPU feature infrastructure.
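      
      A C sketch of the kernel side (the patch actually does this by patching
      the assembly; read_system_reg() is the accessor named above, and the
      SYS_CTR_EL0 identifier is assumed):
      
        static inline u32 safe_dcache_line_size(void)
        {
            u64 ctr = read_system_reg(SYS_CTR_EL0); /* system wide safe CTR_EL0 */
        
            return 4 << ((ctr >> 16) & 0xf);        /* DminLine, in bytes */
        }
      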
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      116c81f4
    • arm64: Refactor sysinstr exception handling · 9dbd5bb2
      Suzuki K Poulose authored
      Right now we trap some of the user space data cache operations
      based on a few errata (ARM 819472, 826319, 827319 and 824069).
      We need to trap userspace access to CTR_EL0 if we detect mismatched
      cache line sizes. Since both these traps share the EC, refactor
      the handler a little to make it more reader friendly.
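      
      A sketch of the refactored dispatch, with one hook table shared by the
      errata DC trap and the new CTR_EL0 trap (structure and field names are
      illustrative):
      
        #include <asm/ptrace.h>
        
        struct sys64_hook_sketch {
            unsigned int esr_mask;  /* ISS bits to compare */
            unsigned int esr_val;   /* expected value under the mask */
            void (*handler)(unsigned int esr, struct pt_regs *regs);
        };
        
        static void do_sysinstr_sketch(unsigned int esr, struct pt_regs *regs,
                                       const struct sys64_hook_sketch *hooks)
        {
            for (; hooks->handler; hooks++) {
                if ((esr & hooks->esr_mask) == hooks->esr_val) {
                    hooks->handler(esr, regs);
                    return;
                }
            }
            /* No hook matched: fall back to the undefined instruction path. */
        }
      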
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      9dbd5bb2
    • arm64: Introduce raw_{d,i}cache_line_size · 072f0a63
      Suzuki K Poulose authored
      On systems with mismatched i/d cache min line sizes, we need to use
      the smallest size possible across all CPUs. This will be done by fetching
      the system wide safe value from the CPU feature infrastructure.
      However, some special users (e.g. kexec, hibernate) need the line
      size of the current CPU (rather than the system wide value), either when
      the system wide value may not be accessible or when the caller is
      guaranteed to run without being migrated.
      Provide another helper which fetches the cache line size of the current CPU.
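      
      A C sketch of what the raw helpers compute (the real helpers are
      assembler macros): line sizes taken straight from this CPU's CTR_EL0
      rather than from the system wide safe copy:
      
        #include <linux/types.h>
        #include <asm/sysreg.h>
        
        static inline u32 raw_dcache_line_size_sketch(void)
        {
            u64 ctr = read_sysreg(ctr_el0);     /* this CPU, not the safe copy */
        
            return 4 << ((ctr >> 16) & 0xf);    /* CTR_EL0.DminLine */
        }
        
        static inline u32 raw_icache_line_size_sketch(void)
        {
            return 4 << (read_sysreg(ctr_el0) & 0xf);   /* CTR_EL0.IminLine */
        }
      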
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Reviewed-by: Geoff Levand <geoff@infradead.org>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      072f0a63
    • arm64: insn: Add helpers for adrp offsets · 46084bc2
      Suzuki K Poulose authored
      Adds helpers for decoding/encoding the PC relative addresses for adrp.
      This will be used for handling dynamic patching of 'adrp' instructions
      in alternative code patching.
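      
      A sketch of the decode arithmetic (encode is the inverse): the adrp
      immediate is immhi:immlo = insn[23:5]:insn[30:29], sign-extended to 21
      bits and scaled by the 4KB page granule:
      
        #include <linux/types.h>
        
        static inline s64 adrp_decode_offset_sketch(u32 insn)
        {
            u32 immlo = (insn >> 29) & 0x3;
            u32 immhi = (insn >> 5) & 0x7ffff;
            s64 imm21 = ((s64)(((u64)immhi << 2) | immlo) << 43) >> 43;
        
            return imm21 << 12;     /* offset from the 4KB-aligned adrp PC */
        }
      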
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      46084bc2
    • arm64: Rearrange CPU errata workaround checks · c47a1900
      Suzuki K Poulose authored
      Right now we run through the work around checks on a CPU
      from __cpuinfo_store_cpu. There are some problems with that:
      
      1) We initialise the system wide CPU feature registers only after the
      Boot CPU updates its cpuinfo. Now, if a work around depends on the
      variance of a CPU ID feature (e.g. a check for cache line size mismatch),
      we have no way of performing it cleanly for the boot CPU.
      
      2) It is out of place, invoked from __cpuinfo_store_cpu() in cpuinfo.c. It
      is not an obvious place for that.
      
      This patch rearranges the CPU specific capability (aka work around) checks.
      
      1) At the moment we use verify_local_cpu_capabilities() to check if a new
      CPU has all the system advertised features. Use this for the secondary CPUs
      to perform the work around check. For that we rename
        verify_local_cpu_capabilities() => check_local_cpu_capabilities()
      which:
      
         If the system wide capabilities haven't been initialised (i.e. the CPU
         is brought up at boot), update the system wide detected work arounds.

         Otherwise (i.e. a CPU hotplugged in later), verify that this CPU conforms
         to the system wide capabilities.
      
      2) Boot CPU updates the work arounds from smp_prepare_boot_cpu() after we have
      initialised the system wide CPU feature values.
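      
      A sketch of the resulting flow (function names follow the description
      above; sys_caps_initialised is the assumed "baseline is known" flag):
      
        extern bool sys_caps_initialised;
        extern void update_cpu_errata_workarounds(void);
        extern void verify_local_cpu_capabilities(void);
        
        void check_local_cpu_capabilities_sketch(void)
        {
            if (!sys_caps_initialised) {
                /* Boot path: record errata seen on this early CPU. */
                update_cpu_errata_workarounds();
                return;
            }
        
            /* Hotplug path: verify against the established baseline. */
            verify_local_cpu_capabilities();
        }
      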
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c47a1900