1. 25 November 2014 (9 commits)
    • arm64: sanity checks: ignore ID_MMFR0.AuxReg · 9760270c
      Committed by Mark Rutland
      It seems that Cortex-A53 r0p4 added support for AIFSR and ADFSR, and
      ID_MMFR0.AuxReg has been updated accordingly to report this fact. As
      Cortex-A53 could be paired with CPUs which do not implement these
      registers (e.g. all current revisions of Cortex-A57), this may trigger a
      sanity check failure at boot.
      
      The AuxReg value describes the availability of the ACTLR, AIFSR, and
      ADFSR registers, which are only of use to 32-bit guest OSs, and have
      IMPLEMENTATION DEFINED contents. Given the nature of these registers it
      is likely that KVM will need to trap accesses regardless of whether the
      CPUs are heterogeneous.
      
      This patch masks out the ID_MMFR0.AuxReg value from the sanity checks,
      preventing spurious warnings at boot time.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reported-by: Andre Przywara <andre.przywara@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      9760270c
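
      As an illustration, a standalone sketch of the masked comparison (not
      the kernel code itself; AuxReg does live in ID_MMFR0[23:20], but the
      sample register values below are made up):

          #include <stdint.h>
          #include <stdio.h>

          /* ID_MMFR0.AuxReg occupies bits [23:20]; mask it out before comparing. */
          #define ID_MMFR0_AUXREG_MASK   (0xfu << 20)

          static int id_mmfr0_mismatch(uint32_t boot, uint32_t cur)
          {
              uint32_t mask = ~ID_MMFR0_AUXREG_MASK;

              return (boot & mask) != (cur & mask);
          }

          int main(void)
          {
              uint32_t a53 = 0x10201105;  /* made-up value, AuxReg = 2 */
              uint32_t a57 = 0x10001105;  /* made-up value, AuxReg = 0 */

              /* The values differ only in AuxReg, so no warning is raised. */
              printf("mismatch: %d\n", id_mmfr0_mismatch(a53, a57));
              return 0;
          }
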
    • arm64: topology: Fix handling of multi-level cluster MPIDR-based detection · 1cefdaea
      Committed by Mark Brown
      The only requirement the scheduler has on cluster IDs is that they must
      be unique. When enumerating the topology based on MPIDR information, the
      kernel currently generates cluster IDs from the first level of affinity
      above the core ID (either level one or two, depending on whether the
      core has multiple threads). However, the ARMv8 architecture allows for
      up to three levels of affinity, so an ARMv8 system may contain cores
      whose MPIDRs are identical except for affinity level three. With the
      current code this causes us to report multiple cores with the same
      identification to the scheduler, violating its uniqueness requirement.
      
      Ensure that we do not violate the scheduler's requirements on systems
      that use all the affinity levels by incorporating both affinity levels
      two and three into the cluster ID when the cores are not threaded.
      
      While no currently known hardware uses multi-level clusters, it is
      better to program defensively: this will help ease bring-up of systems
      that have them and will ensure that things like distribution install
      media do not need to be respun with replacement kernels in order to
      deploy such systems. In the worst case the system will work but perform
      suboptimally until a kernel modified to handle the new topology better
      is installed; in the best case this will be an adequate description of
      such topologies for the scheduler to perform well.
      Signed-off-by: Mark Brown <broonie@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      1cefdaea
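
      A simplified standalone sketch of the decoding this patch moves to
      (close in shape to the kernel change, though not the literal code):

          #include <stdint.h>

          #define MPIDR_MT_BIT            (1ULL << 24)
          #define MPIDR_AFF(mpidr, lvl)   \
              ((unsigned int)(((mpidr) >> ((lvl) == 3 ? 32 : (lvl) * 8)) & 0xff))

          struct cpu_topo {
              int thread_id;
              int core_id;
              int cluster_id;
          };

          /* Fold the upper affinity levels into the cluster ID so it stays
           * unique even when affinity level three differs between cores. */
          static void decode_mpidr(uint64_t mpidr, struct cpu_topo *t)
          {
              if (mpidr & MPIDR_MT_BIT) {
                  /* Multi-threaded cores: Aff0 = thread, Aff1 = core. */
                  t->thread_id  = MPIDR_AFF(mpidr, 0);
                  t->core_id    = MPIDR_AFF(mpidr, 1);
                  t->cluster_id = MPIDR_AFF(mpidr, 2) |
                                  MPIDR_AFF(mpidr, 3) << 8;
              } else {
                  /* Single-threaded cores: Aff0 = core. */
                  t->thread_id  = -1;
                  t->core_id    = MPIDR_AFF(mpidr, 0);
                  t->cluster_id = MPIDR_AFF(mpidr, 1) |
                                  MPIDR_AFF(mpidr, 2) << 8 |
                                  MPIDR_AFF(mpidr, 3) << 16;
              }
          }
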
    • arm64: protect alternatives workarounds with Kconfig options · c0a01b84
      Committed by Andre Przywara
      Not all of the errata we have workarounds for necessarily apply to all
      SoCs, so people compiling a kernel for one very specific SoC may not
      need to patch the kernel.
      Introduce a new submenu in the "Platform selection" menu to allow
      people to turn off certain bugs if they are not affected. By default
      all of them are enabled.
      Normal users or distribution kernels shouldn't bother to deselect any
      bugs here, since the alternatives framework will take care of
      patching them in only if needed.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      [will: moved kconfig menu under `Kernel Features']
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      c0a01b84
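
      A rough sketch of how such an option gates a workaround description at
      build time (the CONFIG_ symbol is real; the structure and values here
      are illustrative, not the kernel's):

          /* Sketch only: field names are illustrative. */
          struct errata_entry {
              const char   *desc;
              unsigned int  capability;   /* bit set when a matching CPU is found */
              unsigned int  midr_part;    /* implementer/part to match            */
          };

          static const struct errata_entry errata_table[] = {
          #ifdef CONFIG_ARM64_ERRATUM_832075
              {
                  .desc       = "ARM erratum 832075",
                  .capability = 0,        /* hypothetical capability number */
                  .midr_part  = 0xd07,    /* Cortex-A57 part number         */
              },
          #endif
              /* further entries, one per erratum ... */
          };
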
    • arm64: add Cortex-A57 erratum 832075 workaround · 5afaa1fc
      Committed by Andre Przywara
      ARM erratum 832075 applies to certain revisions of Cortex-A57; one of
      the workarounds is to perform device loads with load-acquire semantics.
      This is achieved using the alternatives framework.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      5afaa1fc
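
      Conceptually, the I/O accessor is written with a plain load that gets
      rewritten to a load-acquire at boot on affected parts. A kernel-style
      sketch, assuming the ALTERNATIVE() macro and the workaround capability
      bit added by this series (not the exact kernel diff):

          /* Assumes <asm/alternative.h> and the new capability bit. */
          static inline u32 device_read32(const volatile void __iomem *addr)
          {
              u32 val;

              /* "ldr" is the default; it is patched to "ldar" (load-acquire)
               * at boot if the workaround capability is set for this system. */
              asm volatile(ALTERNATIVE("ldr %w0, [%1]",
                                       "ldar %w0, [%1]",
                                       ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
                           : "=r" (val) : "r" (addr));
              return val;
          }
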
    • arm64: add Cortex-A53 cache errata workaround · 301bcfac
      Committed by Andre Przywara
      The ARM errata 819472, 826319, 827319 and 824069 define the same
      workaround for these hardware issues in certain Cortex-A53 parts.
      Use the new alternatives framework and the CPU MIDR detection to
      patch "cache clean" into "cache clean and invalidate" instructions if
      an affected CPU is detected at runtime.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      [will: add __maybe_unused to squash gcc warning]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      301bcfac
    • arm64: detect silicon revisions and set cap bits accordingly · e116a375
      Committed by Andre Przywara
      After each CPU has been started, we iterate through a list of
      CPU features or bugs to detect CPUs which need (or could benefit
      from) kernel code patches.
      For each feature/bug there is a function which checks if that
      particular CPU is affected. We will later provide some more generic
      functions for common things like testing for certain MIDR ranges.
      We do this for every CPU to cover big.LITTLE systems properly as
      well.
      If a certain feature/bug has been detected, the capability bit will
      be set, so that later the call to apply_alternatives() will trigger
      the actual code patching.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      e116a375
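
      A standalone sketch of the shape of that detection pass (simplified;
      the real structures, masks and helpers in the kernel differ):

          #include <stdint.h>
          #include <stdbool.h>

          struct cpu_quirk {
              const char   *desc;
              unsigned int  cap;            /* capability bit to set when matched */
              uint32_t      model;          /* MIDR with variant/revision masked  */
              uint32_t      rv_min, rv_max; /* (variant << 4) | revision range    */
          };

          /* Provided by the capability bitmap introduced below (sketched there). */
          extern void cpus_set_cap(unsigned int cap);

          /* MIDR fields: implementer [31:24], variant [23:20], arch [19:16],
           * part number [15:4], revision [3:0]. */
          static bool midr_in_range(const struct cpu_quirk *q, uint32_t midr)
          {
              uint32_t model = midr & 0xff00fff0u;
              uint32_t rv = ((midr >> 16) & 0xf0u) | (midr & 0xfu);

              return model == q->model && rv >= q->rv_min && rv <= q->rv_max;
          }

          /* Run on every CPU as it comes up, so big.LITTLE pairings are covered. */
          static void check_local_cpu_quirks(uint32_t midr,
                                             const struct cpu_quirk *table, int n)
          {
              for (int i = 0; i < n; i++)
                  if (midr_in_range(&table[i], midr))
                      cpus_set_cap(table[i].cap);
          }
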
    • arm64: add alternative runtime patching · e039ee4e
      Committed by Andre Przywara
      With a blatant copy of some x86 bits we introduce the alternative
      runtime patching "framework" to arm64.
      This is quite basic for now and we only provide the functions we need
      at this time.
      This is connected to the newly introduced feature bits.
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      e039ee4e
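
      A simplified standalone sketch of the idea; the field names are
      illustrative rather than the exact arm64 struct alt_instr layout:

          #include <stdint.h>
          #include <stdbool.h>
          #include <string.h>

          /* One record per ALTERNATIVE site, collected by the linker. */
          struct alt_entry {
              int32_t  orig_offset;   /* where the original instructions live */
              int32_t  alt_offset;    /* where the replacement bytes live     */
              uint16_t cpu_feature;   /* capability bit guarding the patch    */
              uint8_t  len;           /* length of the sequence, in bytes     */
          };

          extern struct alt_entry __alt_start[], __alt_end[];   /* linker-provided */
          extern bool cpus_have_cap(unsigned int cap);

          void apply_alternatives_sketch(void)
          {
              for (struct alt_entry *a = __alt_start; a < __alt_end; a++) {
                  if (!cpus_have_cap(a->cpu_feature))
                      continue;
                  /* Overwrite the original instructions with the replacement;
                   * a real implementation must also flush the instruction cache. */
                  memcpy((char *)a + a->orig_offset,
                         (char *)a + a->alt_offset, a->len);
              }
          }
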
    • arm64: add cpu_capabilities bitmap · 930da09f
      Committed by Andre Przywara
      To take note of whether at least one CPU in the system needs a bug
      workaround or would benefit from a code optimization, we create a new
      bitmap to hold (artificial) feature bits.
      Since elf_hwcap is part of the userland ABI, we leave it alone and
      introduce a new data structure for this purpose (along with some
      accessors).
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      930da09f
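
      A standalone sketch of such a bitmap and its accessors (the kernel uses
      DECLARE_BITMAP with test_bit/__set_bit; names and sizes here are
      illustrative):

          #include <stdbool.h>
          #include <limits.h>

          #define NCAPS          64
          #define BITS_PER_LONG  (sizeof(long) * CHAR_BIT)

          static unsigned long cpu_caps[(NCAPS + BITS_PER_LONG - 1) / BITS_PER_LONG];

          bool cpus_have_cap(unsigned int num)
          {
              if (num >= NCAPS)
                  return false;
              return cpu_caps[num / BITS_PER_LONG] & (1UL << (num % BITS_PER_LONG));
          }

          void cpus_set_cap(unsigned int num)
          {
              if (num < NCAPS)
                  cpu_caps[num / BITS_PER_LONG] |= 1UL << (num % BITS_PER_LONG);
          }
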
    • arm64: fix return code check when changing emulation handler · 90963395
      Committed by Will Deacon
      update_insn_emulation_mode() returns 0 on success, so we should be
      treating any non-zero values as failure, rather than the other way
      around. Otherwise, writes to the sysctl file controlling the emulation
      are ignored and immediately rolled back.
      Reported-by: Gene Hackmann <ghackmann@google.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      90963395
  2. 21 November 2014 (8 commits)
  3. 20 November 2014 (2 commits)
    • arm64: percpu: Implement this_cpu operations · f97fc810
      Committed by Steve Capper
      The generic this_cpu operations disable interrupts to ensure that the
      requested operation is protected from pre-emption. For arm64, this is
      overkill and can hurt throughput and latency.
      
      This patch provides arm64 specific implementations for the this_cpu
      operations. Rather than disable interrupts, we use the exclusive
      monitor or atomic operations as appropriate.
      
      The following operations are implemented: add, add_return, and, or,
      read, write, xchg. We also wire up a cmpxchg implementation from
      cmpxchg.h.
      
      Testing was performed using the percpu_test module and hackbench on a
      Juno board running 3.18-rc4.
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      f97fc810
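
      The contrast can be sketched in portable C (illustrative only: the real
      arm64 code uses LDXR/STXR exclusives and per-cpu offsets rather than a
      C11 atomic, and the interrupt helpers below are hypothetical):

          #include <stdatomic.h>

          extern unsigned long arch_irq_save(void);           /* hypothetical */
          extern void arch_irq_restore(unsigned long flags);  /* hypothetical */

          /* Generic style: mask interrupts around a plain read-modify-write
           * so the update cannot be split by an interrupt or preemption. */
          static inline void this_cpu_add_generic(unsigned long *pcp,
                                                  unsigned long val)
          {
              unsigned long flags = arch_irq_save();
              *pcp += val;
              arch_irq_restore(flags);
          }

          /* arm64 style: one atomic read-modify-write, no interrupt masking.
           * The kernel uses an LDXR/STXR retry loop; a C11 atomic stands in. */
          static inline void this_cpu_add_atomic(_Atomic unsigned long *pcp,
                                                 unsigned long val)
          {
              atomic_fetch_add_explicit(pcp, val, memory_order_relaxed);
          }
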
    • arm64: pgalloc: consistently use PGALLOC_GFP · 15670ef1
      Committed by Mark Rutland
      We currently allocate different levels of page tables with a variety of
      differing flags, and the PGALLOC_GFP flags, intended for use when
      allocating any level of page table, are only used for ptes in
      pte_alloc_one. On x86, PGALLOC_GFP is used for all page table
      allocations.
      
      Currently the major differences are:
      
      * __GFP_NOTRACK -- Needed to ensure page tables are always accessible in
        the presence of kmemcheck to prevent recursive faults. Currently
        kmemcheck cannot be selected for arm64.
      
      * __GFP_REPEAT -- Causes the allocator to try to reclaim pages and retry
        upon a failure to allocate.
      
      * __GFP_ZERO -- Sometimes passed explicitly, sometimes zalloc variants
        are used.
      
      While we've not encountered issues so far, it would be preferable to be
      consistent. This patch ensures all levels of table are allocated in the
      same manner, with PGALLOC_GFP.
      
      Cc: Steve Capper <steve.capper@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      15670ef1
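
      The resulting arrangement is roughly the following kernel-style sketch,
      assuming the PGALLOC_GFP definition arm64 used at the time:

          #define PGALLOC_GFP  (GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)

          /* Every table level now takes the same flags instead of ad-hoc
           * mixes of GFP_KERNEL, __GFP_REPEAT and explicit zeroing. */
          static pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
          {
              return (pmd_t *)__get_free_page(PGALLOC_GFP);
          }
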
  4. 19 November 2014 (1 commit)
    • arm64/mm: Remove hack in mmap randomize layout · d6c763af
      Committed by Yann Droneaud
      Since commit 8a0a9bd4 ('random: make get_random_int() more
      random'), get_random_int() returns a random value for each call,
      so the comment and hack introduced in mmap_rnd() as part of commit
      1d18c47c ('arm64: MMU fault handling and page table management')
      are incorrect.
      
      Commit 1d18c47c seems to use the same hack introduced by
      commit a5adc91a ('powerpc: Ensure random space between stack
      and mmaps'), which was later copied in commit 5a0efea0 ('sparc64:
      Sharpen address space randomization calculations.').
      
      Both architectures were cleaned up as part of commit
      fa8cbaaf ('powerpc+sparc64/mm: Remove hack in mmap randomize
      layout'), as the hack is no longer needed since commit 8a0a9bd4.
      
      So the present patch removes the comment and the hack around
      get_random_int() from AArch64's mmap_rnd().
      
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Dan McGee <dpmcgee@gmail.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d6c763af
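
      The change amounts to approximately the following (a reconstruction for
      illustration, not the literal diff):

          /* Before (approximate): compensate for get_random_int() repeating
           * within a jiffy by halving the mask and shifting one extra bit. */
          rnd = (long)get_random_int() & (STACK_RND_MASK >> 1);
          return rnd << (PAGE_SHIFT + 1);

          /* After: get_random_int() is random per call, so use it directly. */
          rnd = (long)get_random_int() & STACK_RND_MASK;
          return rnd << PAGE_SHIFT;
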
  5. 17 November 2014 (2 commits)
    • arm64: Add COMPAT_HWCAP_LPAE · 7d57511d
      Committed by Catalin Marinas
      Commit a469abd0 (ARM: elf: add new hwcap for identifying atomic
      ldrd/strd instructions) introduced HWCAP_LPAE for 32-bit ARM
      applications. As LPAE is always present on arm64, report the
      corresponding compat HWCAP to user space.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: <stable@vger.kernel.org> # 3.11+
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      7d57511d
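
      The addition boils down to something like this (sketch; the bit value
      mirrors the 32-bit port's HWCAP_LPAE and the default list is
      abbreviated):

          #define COMPAT_HWCAP_LPAE  (1 << 20)   /* same bit as arm's HWCAP_LPAE */

          /* ...and OR it into the default compat hwcaps shown to 32-bit tasks: */
          #define COMPAT_ELF_HWCAP_DEFAULT \
              (COMPAT_HWCAP_HALF | COMPAT_HWCAP_THUMB | /* ... */ COMPAT_HWCAP_LPAE)
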
    • mmu_gather: move minimal range calculations into generic code · fb7332a9
      Committed by Will Deacon
      On architectures with hardware broadcasting of TLB invalidation
      messages, it makes sense to reduce the range of the mmu_gather
      structure when unmapping page ranges based on the dirty address
      information passed to tlb_remove_tlb_entry.
      
      arm64 already does this by directly manipulating the start/end fields
      of the gather structure, but this confuses the generic code which
      does not expect these fields to change and can end up calculating
      invalid, negative ranges when forcing a flush in zap_pte_range.
      
      This patch moves the minimal range calculation out of the arm64 code
      and into the generic implementation, simplifying zap_pte_range in the
      process (which no longer needs to care about start/end, since they will
      point to the appropriate ranges already). With the range being tracked
      by core code, the need_flush flag is dropped in favour of checking that
      the end of the range has actually been set.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fb7332a9
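
      The core of the generic change can be sketched like this (simplified;
      the real mmu_gather has more fields and the helpers live in
      asm-generic/tlb.h):

          struct mmu_gather_sketch {
              unsigned long start;  /* TASK_SIZE initially, shrunk by unmaps  */
              unsigned long end;    /* 0 initially; non-zero = "flush needed" */
          };

          /* Called for every PTE removed: grow the pending-invalidate range. */
          static inline void tlb_track_range(struct mmu_gather_sketch *tlb,
                                             unsigned long addr,
                                             unsigned long page_size)
          {
              if (addr < tlb->start)
                  tlb->start = addr;
              if (addr + page_size > tlb->end)
                  tlb->end = addr + page_size;
          }

          /* Flush time: the old need_flush flag becomes "is end set?". */
          static inline int tlb_needs_flush(const struct mmu_gather_sketch *tlb)
          {
              return tlb->end != 0;
          }
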
  6. 14 November 2014 (3 commits)
    • arm64: entry: use ldp/stp instead of push/pop when saving/restoring regs · 63648dd2
      Committed by Will Deacon
      The push/pop instructions can be suboptimal when saving/restoring large
      amounts of data to/from the stack, for example on entry/exit from the
      kernel. This is because:
      
        (1) They act on descending addresses (i.e. the newly decremented sp),
            which may defeat some hardware prefetchers
      
        (2) They introduce an implicit dependency between each instruction, as
            the sp has to be updated in order to resolve the address of the
            next access.
      
      This patch removes the push/pop instructions from our kernel entry/exit
      macros in favour of ldp/stp plus offset.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      63648dd2
    • arm64: entry: avoid writing lr explicitly for constructing return paths · d54e81f9
      Committed by Will Deacon
      Using an explicit adr instruction to set the link register to point at
      ret_fast_syscall/ret_to_user can defeat branch and return stack predictors.
      
      Instead, use the standard calling instructions (bl, blr) and have an
      unconditional branch as the following instruction.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d54e81f9
    • arm64: Fix up /proc/cpuinfo · 44b82b77
      Committed by Mark Rutland
      Commit d7a49086 (arm64: cpuinfo: print info for all CPUs)
      attempted to clean up /proc/cpuinfo, but due to concerns regarding
      further changes was reverted in commit 5e39977e (Revert "arm64:
      cpuinfo: print info for all CPUs").
      
      There are two major issues with the arm64 /proc/cpuinfo format
      currently:
      
      * The "Features" line describes (only) the 64-bit hwcaps, which is
        problematic for some 32-bit applications which attempt to parse it. As
        the same names are used for analogous ISA features (e.g. aes) despite
        these generally being architecturally unrelated, it is not possible to
        simply append the 64-bit and 32-bit hwcaps in a manner that might not
        be misleading to some applications.
      
        Various potential solutions have appeared in vendor kernels. Typically
        the format of the Features line varies depending on whether the task
        is 32-bit.
      
      * Information is only printed regarding a single CPU. This does not
        match the ARM format, and does not provide sufficient information in
        big.LITTLE systems where CPUs are heterogeneous. The CPU information
        printed is queried from the current CPU's registers, which is racy
        w.r.t. cross-cpu migration.
      
      This patch attempts to solve these issues. The following changes are
      made:
      
      * When a task with a LINUX32 personality attempts to read /proc/cpuinfo,
        the "Features" line contains the decoded 32-bit hwcaps, as with the
        arm port. Otherwise, the decoded 64-bit hwcaps are shown. This aligns
        with the behaviour of COMPAT_UTS_MACHINE and COMPAT_ELF_PLATFORM. In
        the absence of compat support, the Features line is empty.
      
        The set of hwcaps injected into a task's auxval are unaffected.
      
      * Properties are printed per-cpu, as with the ARM port. The per-cpu
        information is queried from pre-recorded cpu information (as used by
        the sanity checks).
      
      * As with the previous attempt at fixing up /proc/cpuinfo, the hardware
        field is removed. The only users so far are 32-bit applications tied
        to particular boards, so no portable applications should be affected,
        and this should prevent future tying to particular boards.
      
      The following differences remain:
      
      * No model_name is printed, as this cannot be queried from the hardware
        and cannot be provided in a stable fashion. Use of the CPU
        {implementor,variant,part,revision} fields is sufficient to identify a
        CPU and is portable across arm and arm64.
      
      * The following system-wide properties are not provided, as they are not
        possible to provide generally. Programs relying on these are already
        tied to particular (32-bit only) boards:
        - Hardware
        - Revision
        - Serial
      
      No software has yet been identified for which these remaining
      differences are problematic.
      
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Ian Campbell <ijc@hellion.org.uk>
      Cc: Serban Constantinescu <serban.constantinescu@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: cross-distro@lists.linaro.org
      Cc: linux-api@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      44b82b77
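
      The resulting output logic is roughly as follows (condensed sketch of
      the seq_file handler; identifiers such as hwcap_str, compat_elf_hwcap
      and the MIDR_* macros are from the arm64 port, the rest is simplified):

          static int c_show(struct seq_file *m, void *v)
          {
              int i, j;

              for_each_online_cpu(i) {
                  struct cpuinfo_arm64 *info = &per_cpu(cpu_data, i);

                  seq_printf(m, "processor\t: %d\n", i);

                  /* 32-bit tasks see the decoded compat hwcaps, 64-bit tasks
                   * the native ones; the auxval hwcaps are unaffected. */
                  seq_puts(m, "Features\t:");
                  if (personality(current->personality) == PER_LINUX32) {
                      for (j = 0; compat_hwcap_str[j]; j++)
                          if (compat_elf_hwcap & (1 << j))
                              seq_printf(m, " %s", compat_hwcap_str[j]);
                  } else {
                      for (j = 0; hwcap_str[j]; j++)
                          if (elf_hwcap & (1 << j))
                              seq_printf(m, " %s", hwcap_str[j]);
                  }
                  seq_puts(m, "\n");

                  /* Per-cpu identification from the recorded register values. */
                  seq_printf(m, "CPU implementer\t: 0x%02x\n",
                             MIDR_IMPLEMENTOR(info->reg_midr));
                  seq_printf(m, "CPU part\t: 0x%03x\n",
                             MIDR_PARTNUM(info->reg_midr));
              }
              return 0;
          }
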
  7. 07 November 2014 (9 commits)
  8. 05 November 2014 (6 commits)