1. 20 Nov 2014 (2 commits)
    • arm64: percpu: Implement this_cpu operations · f97fc810
      Steve Capper authored
      The generic this_cpu operations disable interrupts to ensure that the
      requested operation is protected from pre-emption. For arm64, this is
      overkill and can hurt throughput and latency.
      
      This patch provides arm64 specific implementations for the this_cpu
      operations. Rather than disable interrupts, we use the exclusive
      monitor or atomic operations as appropriate.
      
      The following operations are implemented: add, add_return, and, or,
      read, write, xchg. We also wire up a cmpxchg implementation from
      cmpxchg.h.
      
      Testing was performed using the percpu_test module and hackbench on a
      Juno board running 3.18-rc4.
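      
      As an illustration of the technique (not the exact kernel code, which
      generates variants for several operand sizes via macros), a per-cpu
      add built on the exclusive monitor might look like the sketch below;
      the function name and fixed 64-bit width are assumptions for the
      example.
      
      	static inline void percpu_add_64(unsigned long *ptr, unsigned long val)
      	{
      		unsigned long tmp;
      		int loop;
      
      		asm volatile(
      		/* Load-exclusive, add, then store-exclusive; retry if the
      		 * exclusive monitor was lost between the two accesses. */
      		"1:	ldxr	%[tmp], %[addr]\n"
      		"	add	%[tmp], %[tmp], %[val]\n"
      		"	stxr	%w[loop], %[tmp], %[addr]\n"
      		"	cbnz	%w[loop], 1b\n"
      		: [loop] "=&r" (loop), [tmp] "=&r" (tmp), [addr] "+Q" (*ptr)
      		: [val] "r" (val));
      	}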
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      f97fc810
    • arm64: pgalloc: consistently use PGALLOC_GFP · 15670ef1
      Mark Rutland authored
      We currently allocate different levels of page tables with a variety of
      differing flags, and the PGALLOC_GFP flags, intended for use when
      allocating any level of page table, are only used for ptes in
      pte_alloc_one. On x86, PGALLOC_GFP is used for all page table
      allocations.
      
      Currently the major differences are:
      
      * __GFP_NOTRACK -- Needed to ensure page tables are always accessible in
        the presence of kmemcheck to prevent recursive faults. Currently
        kmemcheck cannot be selected for arm64.
      
      * __GFP_REPEAT -- Causes the allocator to try to reclaim pages and retry
        upon a failure to allocate.
      
      * __GFP_ZERO -- Sometimes passed explicitly, sometimes zalloc variants
        are used.
      
      While we've not encountered issues so far, it would be preferable to be
      consistent. This patch ensures all levels of table are allocated in the
      same manner, with PGALLOC_GFP.
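      
      A minimal sketch of the resulting pattern (flag names taken from the
      list above; the exact arm64 definitions and function bodies may
      differ, and the *_example suffixes are only for illustration):
      
      	#define PGALLOC_GFP	(GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)
      
      	/* Every level of table is now allocated with the same mask. */
      	static pmd_t *pmd_alloc_one_example(struct mm_struct *mm, unsigned long addr)
      	{
      		return (pmd_t *)__get_free_page(PGALLOC_GFP);
      	}
      
      	static pud_t *pud_alloc_one_example(struct mm_struct *mm, unsigned long addr)
      	{
      		return (pud_t *)__get_free_page(PGALLOC_GFP);
      	}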
      
      Cc: Steve Capper <steve.capper@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      15670ef1
  2. 17 Nov 2014 (2 commits)
    • arm64: Add COMPAT_HWCAP_LPAE · 7d57511d
      Catalin Marinas authored
      Commit a469abd0 (ARM: elf: add new hwcap for identifying atomic
      ldrd/strd instructions) introduces HWCAP_LPAE for 32-bit ARM
      applications. As LPAE is always present on arm64, report the
      corresponding compat HWCAP to user space.
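      
      A sketch of the idea (the bit position follows the 32-bit ARM hwcap
      numbering; the default-mask definition shown is abridged and assumed):
      
      	#define COMPAT_HWCAP_LPAE	(1 << 20)
      
      	/* LPAE is architecturally guaranteed on arm64, so the bit can be
      	 * part of the compat hwcap mask reported to every 32-bit task. */
      	#define COMPAT_ELF_HWCAP_DEFAULT	(COMPAT_HWCAP_LPAE /* | other always-present features */)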
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: <stable@vger.kernel.org> # 3.11+
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      7d57511d
    • mmu_gather: move minimal range calculations into generic code · fb7332a9
      Will Deacon authored
      On architectures with hardware broadcasting of TLB invalidation
      messages, it makes sense to reduce the range of the mmu_gather
      structure when unmapping page ranges based on the dirty address
      information passed to tlb_remove_tlb_entry.
      
      arm64 already does this by directly manipulating the start/end fields
      of the gather structure, but this confuses the generic code which
      does not expect these fields to change and can end up calculating
      invalid, negative ranges when forcing a flush in zap_pte_range.
      
      This patch moves the minimal range calculation out of the arm64 code
      and into the generic implementation, simplifying zap_pte_range in the
      process (which no longer needs to care about start/end, since they will
      point to the appropriate ranges already). With the range being tracked
      by core code, the need_flush flag is dropped in favour of checking that
      the end of the range has actually been set.
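      
      A sketch of the generic-side bookkeeping described above (field and
      helper names are illustrative, not the actual mmu_gather layout):
      
      	struct gather_range {
      		unsigned long start;	/* lowest address seen so far */
      		unsigned long end;	/* highest address seen; 0 means nothing to flush */
      	};
      
      	static inline void range_reset(struct gather_range *r)
      	{
      		r->start = ~0UL;
      		r->end = 0;
      	}
      
      	/* Called for each unmapped entry, e.g. from tlb_remove_tlb_entry(). */
      	static inline void range_track(struct gather_range *r,
      				       unsigned long addr, unsigned long size)
      	{
      		if (addr < r->start)
      			r->start = addr;
      		if (addr + size > r->end)
      			r->end = addr + size;
      	}
      
      	static inline bool range_needs_flush(const struct gather_range *r)
      	{
      		return r->end != 0;	/* replaces the old need_flush flag */
      	}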
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fb7332a9
  3. 07 Nov 2014 (2 commits)
    • arm64/kvm: Fix assembler compatibility of macros · 286fb1cc
      Geoff Levand authored
      Some of the macros defined in kvm_arm.h are useful in assembly files, but are
      not compatible with the assembler.  Change any C language integer constant
      definitions using appended U, UL, or ULL to the UL() preprocessor macro.  Also,
      add a preprocessor include of the asm/memory.h file which defines the UL()
      macro.
      
      Fixes build errors like these when using kvm_arm.h in assembly
      source files:
      
        Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)'
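      
      A sketch of the UL() approach (the real macro comes from asm/memory.h
      via the generic const.h helpers; EXAMPLE_MASK is a made-up name):
      
      	#ifdef __ASSEMBLY__
      	#define UL(x)	x		/* GNU as has no C integer suffixes */
      	#else
      	#define UL(x)	x##UL
      	#endif
      
      	/* Before: ((1U << 25) - 1) breaks when the header is pulled into a .S file. */
      	#define EXAMPLE_MASK	((UL(1) << 25) - 1)	/* usable from both C and assembly */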
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Geoff Levand <geoff@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      286fb1cc
    • arm64: xchg: Implement cmpxchg_double · 5284e1b4
      Steve Capper authored
      The arm64 architecture has the ability to exclusively load and store
      a pair of registers from an address (ldxp/stxp). Also, the SLUB
      allocator can take advantage of a cmpxchg_double implementation to
      avoid taking some locks.
      
      This patch provides an implementation of cmpxchg_double for 64-bit
      pairs, and activates the logic required for the SLUB to use these
      functions (HAVE_ALIGNED_STRUCT_PAGE and HAVE_CMPXCHG_DOUBLE).
      
      Also definitions of this_cpu_cmpxchg_8 and this_cpu_cmpxchg_double_8
      are wired up to cmpxchg_local and cmpxchg_double_local (rather than the
      stock implementations that perform non-atomic operations with
      interrupts disabled) as they are used by the SLUB.
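      
      To illustrate the ldxp/stxp technique (an illustrative sketch only;
      the real kernel helper is size- and alignment-checked and has a
      different interface), a 128-bit compare-and-exchange over two
      adjacent 64-bit words might look like:
      
      	/* ptr must point to a naturally aligned, contiguous pair of words.
      	 * Returns 1 if the pair matched (old1, old2) and was replaced by
      	 * (new1, new2), 0 otherwise. */
      	static inline int cmpxchg_double_sketch(unsigned long *ptr,
      						unsigned long old1, unsigned long old2,
      						unsigned long new1, unsigned long new2)
      	{
      		unsigned long tmp, mismatch;
      
      		asm volatile(
      		"1:	ldxp	%0, %1, %2\n"		/* load the pair exclusively */
      		"	eor	%0, %0, %3\n"
      		"	eor	%1, %1, %4\n"
      		"	orr	%1, %0, %1\n"		/* non-zero => values differ */
      		"	cbnz	%1, 2f\n"
      		"	stxp	%w0, %5, %6, %2\n"	/* attempt to store the new pair */
      		"	cbnz	%w0, 1b\n"		/* exclusive monitor lost: retry */
      		"2:"
      		: "=&r" (tmp), "=&r" (mismatch), "+Q" (*ptr)
      		: "r" (old1), "r" (old2), "r" (new1), "r" (new2)
      		: "memory");
      
      		return !mismatch;
      	}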
      
      On a Juno platform running on only the A57s I get quite a noticeable
      performance improvement with 5 runs of hackbench on v3.17:
      
               Baseline | With Patch
       -----------------+-----------
       Mean    119.2312 | 106.1782
       StdDev    0.4919 |   0.4494
      
      (times taken to complete `./hackbench 100 process 1000', in seconds)
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      5284e1b4
  4. 05 Nov 2014 (1 commit)
  5. 24 Oct 2014 (1 commit)
  6. 22 Oct 2014 (1 commit)
  7. 21 Oct 2014 (1 commit)
  8. 14 Oct 2014 (1 commit)
    • arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2 · 38f791a4
      Christoffer Dall authored
      This patch adds the necessary support for all host kernel PGSIZE and
      VA_SPACE configuration options for both EL2 and the Stage-2 page tables.
      
      However, for 40bit and 42bit PARange systems, the architecture mandates
      that VTCR_EL2.SL0 is at most 1, resulting in fewer levels of stage-2
      page tables than levels of host kernel page tables.  At the same time,
      for systems with a PARange > 42bit, we limit the IPA range by always
      setting VTCR_EL2.T0SZ to 24.
      
      To solve the situation with different levels of page tables for Stage-2
      translation than the host kernel page tables, we allocate a dummy PGD
      with pointers to our actual initial level Stage-2 page table, in order
      for us to reuse the kernel pgtable manipulation primitives.  Reproducing
      all these in KVM does not look pretty and unnecessarily complicates the
      32-bit side.
      
      Systems with a PARange < 40bits are not yet supported.
      
       [ I have reworked this patch from its original form submitted by
         Jungseok to take the architecture constraints into consideration.
         There were too many changes from the original patch for me to
         preserve the authorship.  Thanks to Catalin Marinas for his help in
         figuring out a good solution to this challenge.  I have also fixed
         various bugs and missing error code handling from the original
         patch. - Christoffer ]
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      38f791a4
  9. 10 Oct 2014 (4 commits)
  10. 03 Oct 2014 (3 commits)
  11. 01 Oct 2014 (1 commit)
  12. 29 Sep 2014 (2 commits)
  13. 26 Sep 2014 (2 commits)
  14. 25 Sep 2014 (2 commits)
    • arm64: Fix typos in KGDB macros · 7acf71d1
      Catalin Marinas authored
      Some of the KGDB macros used for generating the BRK instructions had the
      wrong spelling for DBG and KGDB abbreviations.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      7acf71d1
    • arm64: insn: Add return statements after BUG_ON() · a9ae04c9
      Mark Brown authored
      Following a recent series of enhancements to the insn code the ARMv8
      allnoconfig build has been generating a large number of warnings in the
      form of:
      
      arch/arm64/kernel/insn.c:689:8: warning: 'insn' may be used uninitialized in this function [-Wmaybe-uninitialized]
      
      This is because BUG() and related macros can be compiled out so we get
      execution paths which normally result in a panic compiling out to noops
      instead.
      
      I wasn't able to immediately identify a sensible return value to use in
      these cases so just return AARCH64_BREAK_FAULT - this is all "should
      never happen" code so hopefully it never has a practical impact.
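      
      The resulting pattern, sketched with a hypothetical encoder (the
      AARCH64_BREAK_FAULT value below is a placeholder, not the real
      definition):
      
      	#define AARCH64_BREAK_FAULT	0xdeadc0de	/* placeholder for the real BRK encoding */
      
      	static u32 encode_shift_type(u32 insn, int type)
      	{
      		switch (type) {
      		case 0 ... 3:
      			return insn | (type << 22);
      		default:
      			BUG_ON(1);
      			/* Reached only when BUG() is compiled out (eg allnoconfig);
      			 * returning a faulting encoding avoids -Wmaybe-uninitialized
      			 * and makes any misuse trap loudly. */
      			return AARCH64_BREAK_FAULT;
      		}
      	}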
      Signed-off-by: Mark Brown <broonie@kernel.org>
      [catalin.marinas@arm.com: AARCH64_BREAK_FAULT definition contributed by Daniel Borkmann]
      [catalin.marinas@arm.com: replace return 0 with AARCH64_BREAK_FAULT]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      a9ae04c9
  15. 24 Sep 2014 (2 commits)
    • kvm: Add arch specific mmu notifier for page invalidation · fe71557a
      Tang Chen authored
      This will be used to let the guest run while the APIC access page is
      not pinned.  Because subsequent patches will fill in the function
      for x86, place the (still empty) x86 implementation in the x86.c file
      instead of adding an inline function in kvm_host.h.
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fe71557a
    • kvm: Fix page ageing bugs · 57128468
      Andres Lagar-Cavilla authored
      1. We were calling clear_flush_young_notify in unmap_one, but we are
      within an mmu notifier invalidate range scope. The spte exists no more
      (due to range_start) and the accessed bit info has already been
      propagated (due to kvm_pfn_set_accessed). Simply call
      clear_flush_young.
      
      2. We clear_flush_young on a primary MMU PMD, but this may be mapped
      as a collection of PTEs by the secondary MMU (e.g. during log-dirty).
      This required expanding the interface of the clear_flush_young mmu
      notifier, so a lot of code has been trivially touched.
      
      3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate
      the access bit by blowing the spte. This requires proper synchronizing
      with MMU notifier consumers, like every other removal of spte's does.
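      
      For point 2, the widened notifier hook roughly takes the shape below
      (a sketch of the signature change only; the struct shown is abridged):
      
      	struct mmu_notifier_ops {
      		/* Asked about a [start, end) range rather than a single
      		 * address, since one primary-MMU PMD may be backed by many
      		 * secondary-MMU PTEs. */
      		int (*clear_flush_young)(struct mmu_notifier *mn,
      					 struct mm_struct *mm,
      					 unsigned long start,
      					 unsigned long end);
      		/* ... other callbacks elided ... */
      	};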
      Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com>
      Acked-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      57128468
  16. 23 Sep 2014 (1 commit)
    • Revert "arm64: dmi: Add SMBIOS/DMI support" · 6f325eaa
      Catalin Marinas authored
      This reverts commit 668ebd10.
      
      ... because of lots of warnings during boot if Linux isn't started as an EFI
      application:
      
      WARNING: CPU: 4 PID: 1 at
      /work/Linux/linux-2.6-aarch64/drivers/firmware/dmi_scan.c:591 dmi_matches+0x10c/0x110()
      dmi check: not initialized yet.
      Modules linked in:
      CPU: 4 PID: 1 Comm: swapper/0 Not tainted 3.17.0-rc4+ #606
      Call trace:
      [<ffffffc000087fb0>] dump_backtrace+0x0/0x124
      [<ffffffc0000880e4>] show_stack+0x10/0x1c
      [<ffffffc0004d58f8>] dump_stack+0x74/0xb8
      [<ffffffc0000ab640>] warn_slowpath_common+0x8c/0xb4
      [<ffffffc0000ab6b4>] warn_slowpath_fmt+0x4c/0x58
      [<ffffffc0003f2d7c>] dmi_matches+0x108/0x110
      [<ffffffc0003f2da8>] dmi_check_system+0x24/0x68
      [<ffffffc0006974c4>] atkbd_init+0x10/0x34
      [<ffffffc0000814ac>] do_one_initcall+0x88/0x1a0
      [<ffffffc00067aab4>] kernel_init_freeable+0x148/0x1e8
      [<ffffffc0004d2c64>] kernel_init+0x10/0xd4
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      6f325eaa
  17. 22 Sep 2014 (3 commits)
  18. 19 Sep 2014 (1 commit)
  19. 14 Sep 2014 (2 commits)
  20. 12 Sep 2014 (2 commits)
    • arm64: kernel: introduce cpu_init_idle CPU operation · d64f84f6
      Lorenzo Pieralisi authored
      The CPUidle subsystem on ARM64 machines requires the idle states
      implementation back-end to initialize idle state parameters at
      boot. This patch adds a hook in the CPU operations structure that
      should be initialized by the CPU operations back-end in order to
      provide a function that initializes cpu idle states.
      
      This patch also adds the infrastructure to arm64 kernel required
      to export the CPU operations based initialization interface, so
      that drivers (ie CPUidle) can use it when they are initialized
      at probe time.
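      
      A sketch of the hook and how a driver might reach it (structure
      abridged; the dispatch-function name is illustrative):
      
      	struct cpu_operations {
      		const char	*name;
      		int		(*cpu_init_idle)(struct device_node *cpu_node,
      						 unsigned int cpu);
      		/* ... other per-CPU operations elided ... */
      	};
      
      	/* Called by a CPUidle back-end at probe time. */
      	static int cpu_init_idle_example(unsigned int cpu, struct device_node *cpu_node)
      	{
      		const struct cpu_operations *ops = cpu_ops[cpu];
      
      		if (!ops || !ops->cpu_init_idle)
      			return -EOPNOTSUPP;
      
      		return ops->cpu_init_idle(cpu_node, cpu);
      	}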
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      d64f84f6
    • arm64: kernel: refactor the CPU suspend API for retention states · 714f5992
      Lorenzo Pieralisi authored
      CPU suspend is the standard kernel interface to be used to enter
      low-power states on ARM64 systems. The current cpu_suspend
      implementation by default assumes that all low-power states lose the
      CPU context, so the CPU registers must be saved and cleaned to DRAM
      upon state entry. Furthermore, the current cpu_suspend() implementation
      treats any return from the CPU suspend back-end method as an error,
      regardless of the return code (which may indicate success), since the
      CPU is only expected to come back through the cpu_resume code path,
      eg after returning from the reset vector.
      
      All in all, this means that the current API does not cope well with
      low-power states that preserve the CPU context when entered (ie
      retention states): the context is saved needlessly on entry to those
      states, and a successful state entry may return as a normal function
      return, which the current CPU suspend implementation treats as an
      error.
      
      This patch refactors the cpu_suspend() API so that it is split into
      two separate functionalities. The arm64 cpu_suspend API now just
      provides a wrapper around the CPU suspend operations hook. A new
      function is introduced (for architecture code use only) for states
      that require context saving upon entry:
      
      __cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
      
      __cpu_suspend() saves the context on function entry and calls the
      so-called suspend finisher (ie fn) to complete the suspend operation.
      The finisher is not expected to return, unless it fails, in which case
      the error is propagated back to the __cpu_suspend caller.
      
      The API refactoring results in the following pseudo code call sequence for a
      suspending CPU, when triggered from a kernel subsystem:
      
      /*
       * int cpu_suspend(unsigned long idx)
       * @idx: idle state index
       */
      {
      -> cpu_suspend(idx)
      	|---> CPU operations suspend hook called, if present
      		|--> if (retention_state)
      			|--> direct suspend back-end call (eg PSCI suspend)
      		     else
      			|--> __cpu_suspend(idx, &back_end_finisher);
      }
      
      By refactoring the cpu_suspend API this way, the CPU operations
      back-end can detect whether an idle state requires state saving and
      call the appropriate suspend operation accordingly: either a simple
      function call, or indirectly through __cpu_suspend(), which carries
      out state saving and suspend finisher dispatching to complete idle
      state entry.
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      714f5992
  21. 11 Sep 2014 (1 commit)
  22. 08 Sep 2014 (3 commits)