1. 19 7月, 2016 3 次提交
  2. 17 6月, 2016 1 次提交
    • D
      arm64: kgdb: Match pstate size with gdbserver protocol · 0d15ef67
      Daniel Thompson 提交于
      Current versions of gdb do not interoperate cleanly with kgdb on arm64
      systems because gdb and kgdb do not use the same register description.
      This patch modifies kgdb to work with recent releases of gdb (>= 7.8.1).
      
      Compatibility with gdb (after the patch is applied) is as follows:
      
        gdb-7.6 and earlier  Ok
        gdb-7.7 series       Works if user provides custom target description
        gdb-7.8(.0)          Works if user provides custom target description
        gdb-7.8.1 and later  Ok
      
      When commit 44679a4f ("arm64: KGDB: Add step debugging support") was
      introduced it was paired with a gdb patch that made an incompatible
      change to the gdbserver protocol. This patch was eventually merged into
      the gdb sources:
      https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=a4d9ba85ec5597a6a556afe26b712e878374b9dd
      
      The change to the protocol was mostly made to simplify big-endian support
      inside the kernel gdb stub. Unfortunately the gdb project released
      gdb-7.7.x and gdb-7.8.0 before the protocol incompatibility was identified
      and reversed:
      https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=bdc144174bcb11e808b4e73089b850cf9620a7ee
      
      This leaves us in a position where kgdb still uses the no-longer-used
      protocol; gdb-7.8.1, which restored the original behaviour, was
      released on 2014-10-29.
      
      I don't believe it is possible to detect/correct the protocol
      incompatiblity which means the kernel must take a view about which
      version of the gdb remote protocol is "correct". This patch takes the
      view that the original/current version of the protocol is correct
      and that version found in gdb-7.7.x and gdb-7.8.0 is anomalous.
      Signed-off-by: NDaniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      0d15ef67
  3. 15 6月, 2016 3 次提交
    • W
      arm64: spinlock: Ensure forward-progress in spin_unlock_wait · c56bdcac
      Will Deacon 提交于
      Rather than wait until we observe the lock being free (which might never
      happen), we can also return from spin_unlock_wait if we observe that the
      lock is now held by somebody else, which implies that it was unlocked
      but we just missed seeing it in that state.
      
      Furthermore, in such a scenario there is no longer a need to write back
      the value that we loaded, since we know that there has been a lock
      hand-off, which is sufficient to publish any stores prior to the
      unlock_wait because the ARm architecture ensures that a Store-Release
      instruction is multi-copy atomic when observed by a Load-Acquire
      instruction.
      
      The litmus test is something like:
      
      AArch64
      {
      0:X1=x; 0:X3=y;
      1:X1=y;
      2:X1=y; 2:X3=x;
      }
       P0          | P1           | P2           ;
       MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
       STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
       DMB SY      |              |              ;
       LDR W2,[X3] |              |              ;
      exists
      (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
      
      where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
      doing spin_lock.
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      c56bdcac
    • W
      arm64: spinlock: fix spin_unlock_wait for LSE atomics · 3a5facd0
      Will Deacon 提交于
      Commit d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against
      concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
      the premise that the LSE atomics (in particular, the LDADDA instruction)
      are indivisible.
      
      Unfortunately, these instructions are only indivisible when used with the
      -AL (full ordering) suffix and, consequently, the same issue can
      theoretically be observed with LSE atomics, where a later (in program
      order) load can be speculated before the write portion of the atomic
      operation.
      
      This patch fixes the issue by performing a CAS of the lock once we've
      established that it's unlocked, in much the same way as the LL/SC code.
      
      Fixes: d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3a5facd0
    • W
      arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks · 38b850a7
      Will Deacon 提交于
      spin_is_locked has grown two very different use-cases:
      
      (1) [The sane case] API functions may require a certain lock to be held
          by the caller and can therefore use spin_is_locked as part of an
          assert statement in order to verify that the lock is indeed held.
          For example, usage of assert_spin_locked.
      
      (2) [The insane case] There are two locks, where a CPU takes one of the
          locks and then checks whether or not the other one is held before
          accessing some shared state. For example, the "optimized locking" in
          ipc/sem.c.
      
      In the latter case, the sequence looks like:
      
        spin_lock(&sem->lock);
        if (!spin_is_locked(&sma->sem_perm.lock))
          /* Access shared state */
      
      and requires that the spin_is_locked check is ordered after taking the
      sem->lock. Unfortunately, since our spinlocks are implemented using a
      LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
      before the STXR and consequently return a stale value.
      
      Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
      the same issue in 51d7d520 ("powerpc: Add smp_mb() to
      arch_spin_is_locked()") and, although we did something similar for
      spin_unlock_wait in d86b8da0 ("arm64: spinlock: serialise
      spin_unlock_wait against concurrent lockers") that doesn't actually take
      care of ordering against local acquisition of a different lock.
      
      This patch adds an smp_mb() to the start of our arch_spin_is_locked and
      arch_spin_unlock_wait routines to ensure that the lock value is always
      loaded after any other locks have been taken by the current CPU.
      Reported-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      38b850a7
  4. 03 6月, 2016 2 次提交
    • M
      arm64: move {PAGE,CONT}_SHIFT into Kconfig · 030c4d24
      Mark Rutland 提交于
      In some cases (e.g. the awk for CONFIG_RANDOMIZE_TEXT_OFFSET) we would
      like to make use of PAGE_SHIFT outside of code that can include the
      usual header files.
      
      Add a new CONFIG_ARM64_PAGE_SHIFT for this, likewise with
      ARM64_CONT_SHIFT for consistency.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      030c4d24
    • M
      arm64: update stale PAGE_OFFSET comment · a13e3a5b
      Mark Rutland 提交于
      Commit ab893fb9 ("arm64: introduce KIMAGE_VADDR as the virtual
      base of the kernel region") logically split KIMAGE_VADDR from
      PAGE_OFFSET, and since commit f9040773 ("arm64: move kernel
      image to base of vmalloc area") the two have been distinct values.
      
      Unfortunately, neither commit updated the comment above these
      definitions, which now erroneously states that PAGE_OFFSET is the start
      of the kernel image rather than the start of the linear mapping.
      
      This patch fixes said comment, and introduces an explanation of
      KIMAGE_VADDR.
      Signed-off-by: NMark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      a13e3a5b
  5. 02 6月, 2016 1 次提交
  6. 01 6月, 2016 1 次提交
  7. 31 5月, 2016 1 次提交
  8. 20 5月, 2016 4 次提交
    • C
      KVM: arm/arm64: vgic-new: Synchronize changes to active state · 35a2d585
      Christoffer Dall 提交于
      When modifying the active state of an interrupt via the MMIO interface,
      we should ensure that the write has the intended effect.
      
      If a guest sets an interrupt to active, but that interrupt is already
      flushed into a list register on a running VCPU, then that VCPU will
      write the active state back into the struct vgic_irq upon returning from
      the guest and syncing its state.  This is a non-benign race, because the
      guest can observe that an interrupt is not active, and it can have a
      reasonable expectations that other VCPUs will not ack any IRQs, and then
      set the state to active, and expect it to stay that way.  Currently we
      are not honoring this case.
      
      Thefore, change both the SACTIVE and CACTIVE mmio handlers to stop the
      world, change the irq state, potentially queue the irq if we're setting
      it to active, and then continue.
      
      We take this chance to slightly optimize these functions by not stopping
      the world when touching private interrupts where there is inherently no
      possible race.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      35a2d585
    • C
      KVM: arm/arm64: Provide functionality to pause and resume a guest · b13216cf
      Christoffer Dall 提交于
      For some rare corner cases in our VGIC emulation later we have to stop
      the guest to make sure the VGIC state is consistent.
      Provide the necessary framework to pause and resume a guest.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      b13216cf
    • C
      KVM: arm/arm64: Export mmio_read/write_bus · d5a5a0ef
      Christoffer Dall 提交于
      Rename mmio_{read,write}_bus to kvm_mmio_{read,write}_bus and export
      them out of mmio.c.
      This will be needed later for the new VGIC implementation.
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NAndre Przywara <andre.przywara@arm.com>
      d5a5a0ef
    • H
      arch: fix has_transparent_hugepage() · fd8cfd30
      Hugh Dickins 提交于
      I've just discovered that the useful-sounding has_transparent_hugepage()
      is actually an architecture-dependent minefield: on some arches it only
      builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when
      not, but on some of those (arm and arm64) it then gives the wrong
      answer; and on mips alone it's marked __init, which would crash if
      called later (but so far it has not been called later).
      
      Straighten this out: make it available to all configs, with a sensible
      default in asm-generic/pgtable.h, removing its definitions from those
      arches (arc, arm, arm64, sparc, tile) which are served by the default,
      adding #define has_transparent_hugepage has_transparent_hugepage to
      those (mips, powerpc, s390, x86) which need to override the default at
      runtime, and removing the __init from mips (but maybe that kind of code
      should be avoided after init: set a static variable the first time it's
      called).
      Signed-off-by: NHugh Dickins <hughd@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andres Lagar-Cavilla <andreslc@google.com>
      Cc: Yang Shi <yang.shi@linaro.org>
      Cc: Ning Qu <quning@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Konstantin Khlebnikov <koct9i@gmail.com>
      Acked-by: NDavid S. Miller <davem@davemloft.net>
      Acked-by: Vineet Gupta <vgupta@synopsys.com>		[arch/arc]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[arch/s390]
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      fd8cfd30
  9. 13 5月, 2016 1 次提交
    • C
      KVM: halt_polling: provide a way to qualify wakeups during poll · 3491caf2
      Christian Borntraeger 提交于
      Some wakeups should not be considered a sucessful poll. For example on
      s390 I/O interrupts are usually floating, which means that _ALL_ CPUs
      would be considered runnable - letting all vCPUs poll all the time for
      transactional like workload, even if one vCPU would be enough.
      This can result in huge CPU usage for large guests.
      This patch lets architectures provide a way to qualify wakeups if they
      should be considered a good/bad wakeups in regard to polls.
      
      For s390 the implementation will fence of halt polling for anything but
      known good, single vCPU events. The s390 implementation for floating
      interrupts does a wakeup for one vCPU, but the interrupt will be delivered
      by whatever CPU checks first for a pending interrupt. We prefer the
      woken up CPU by marking the poll of this CPU as "good" poll.
      This code will also mark several other wakeup reasons like IPI or
      expired timers as "good". This will of course also mark some events as
      not sucessful. As  KVM on z runs always as a 2nd level hypervisor,
      we prefer to not poll, unless we are really sure, though.
      
      This patch successfully limits the CPU usage for cases like uperf 1byte
      transactional ping pong workload or wakeup heavy workload like OLTP
      while still providing a proper speedup.
      
      This also introduced a new vcpu stat "halt_poll_no_tuning" that marks
      wakeups that are considered not good for polling.
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Radim Krčmář <rkrcmar@redhat.com> (for an earlier version)
      Cc: David Matlack <dmatlack@google.com>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      [Rename config symbol. - Paolo]
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      3491caf2
  10. 10 5月, 2016 1 次提交
    • C
      kvm: arm64: Enable hardware updates of the Access Flag for Stage 2 page tables · 06485053
      Catalin Marinas 提交于
      The ARMv8.1 architecture extensions introduce support for hardware
      updates of the access and dirty information in page table entries. With
      VTCR_EL2.HA enabled (bit 21), when the CPU accesses an IPA with the
      PTE_AF bit cleared in the stage 2 page table, instead of raising an
      Access Flag fault to EL2 the CPU sets the actual page table entry bit
      (10). To ensure that kernel modifications to the page table do not
      inadvertently revert a bit set by hardware updates, certain Stage 2
      software pte/pmd operations must be performed atomically.
      
      The main user of the AF bit is the kvm_age_hva() mechanism. The
      kvm_age_hva_handler() function performs a "test and clear young" action
      on the pte/pmd. This needs to be atomic in respect of automatic hardware
      updates of the AF bit. Since the AF bit is in the same position for both
      Stage 1 and Stage 2, the patch reuses the existing
      ptep_test_and_clear_young() functionality if
      __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG is defined. Otherwise, the
      existing pte_young/pte_mkold mechanism is preserved.
      
      The kvm_set_s2pte_readonly() (and the corresponding pmd equivalent) have
      to perform atomic modifications in order to avoid a race with updates of
      the AF bit. The arm64 implementation has been re-written using
      exclusives.
      
      Currently, kvm_set_s2pte_writable() (and pmd equivalent) take a pointer
      argument and modify the pte/pmd in place. However, these functions are
      only used on local variables rather than actual page table entries, so
      it makes more sense to follow the pte_mkwrite() approach for stage 1
      attributes. The change to kvm_s2pte_mkwrite() makes it clear that these
      functions do not modify the actual page table entries.
      
      The (pte|pmd)_mkyoung() uses on Stage 2 entries (setting the AF bit
      explicitly) do not need to be modified since hardware updates of the
      dirty status are not supported by KVM, so there is no possibility of
      losing such information.
      Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: NChristoffer Dall <christoffer.dall@linaro.org>
      06485053
  11. 09 5月, 2016 1 次提交
  12. 06 5月, 2016 4 次提交
  13. 03 5月, 2016 2 次提交
    • Y
      arm64: always use STRICT_MM_TYPECHECKS · 2326df55
      Yang Shi 提交于
      Inspired by the counterpart of powerpc [1], which shows there is no negative
      effect on code generation from enabling STRICT_MM_TYPECHECKS with a modern
      compiler.
      
      And, Arnd's comment [2] about that patch says STRICT_MM_TYPECHECKS could
      be default as long as the architecture can pass structures in registers as
      function arguments. ARM64 can do it as long as the size of structure <= 16
      bytes. All the page table value types are u64 on ARM64.
      
      The below disassembly demonstrates it, entry is pte_t type:
      
                  entry = arch_make_huge_pte(entry, vma, page, writable);
         0xffff00000826fc38 <+80>:    and     x0, x0, #0xfffffffffffffffd
         0xffff00000826fc3c <+84>:    mov     w3, w21
         0xffff00000826fc40 <+88>:    mov     x2, x20
         0xffff00000826fc44 <+92>:    mov     x1, x19
         0xffff00000826fc48 <+96>:    orr     x0, x0, #0x400
         0xffff00000826fc4c <+100>:   bl      0xffff00000809bcc0 <arch_make_huge_pte>
      
      [1] http://www.spinics.net/lists/linux-mm/msg105951.html
      [2] http://www.spinics.net/lists/linux-mm/msg105969.html
      
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NYang Shi <yang.shi@linaro.org>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      2326df55
    • J
      arm64: kvm: Fix kvm teardown for systems using the extended idmap · c612505f
      James Morse 提交于
      If memory is located above 1<<VA_BITS, kvm adds an extra level to its page
      tables, merging the runtime tables and boot tables that contain the idmap.
      This lets us avoid the trampoline dance during initialisation.
      
      This also means there is no trampoline page mapped, so
      __cpu_reset_hyp_mode() can't call __kvm_hyp_reset() in this page. The good
      news is the idmap is still mapped, so we don't need the trampoline page.
      The bad news is we can't call it directly as the idmap is above
      HYP_PAGE_OFFSET, so its address is masked by kvm_call_hyp.
      
      Add a function __extended_idmap_trampoline which will branch into
      __kvm_hyp_reset in the idmap, change kvm_hyp_reset_entry() to return
      this address if __kvm_cpu_uses_extended_idmap(). In this case
      __kvm_hyp_reset() will still switch to the boot tables (which are the
      merged tables that were already in use), and branch into the idmap (where
      it already was).
      
      This fixes boot failures on these systems, where we fail to execute the
      missing trampoline page when tearing down kvm in init_subsystems():
      [    2.508922] kvm [1]: 8-bit VMID
      [    2.512057] kvm [1]: Hyp mode initialized successfully
      [    2.517242] kvm [1]: interrupt-controller@e1140000 IRQ13
      [    2.522622] kvm [1]: timer IRQ3
      [    2.525783] Kernel panic - not syncing: HYP panic:
      [    2.525783] PS:200003c9 PC:0000007ffffff820 ESR:86000005
      [    2.525783] FAR:0000007ffffff820 HPFAR:00000000003ffff0 PAR:0000000000000000
      [    2.525783] VCPU:          (null)
      [    2.525783]
      [    2.547667] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.6.0-rc5+ #1
      [    2.555137] Hardware name: Default string Default string/Default string, BIOS ROD0084E 09/03/2015
      [    2.563994] Call trace:
      [    2.566432] [<ffffff80080888d0>] dump_backtrace+0x0/0x240
      [    2.571818] [<ffffff8008088b24>] show_stack+0x14/0x20
      [    2.576858] [<ffffff80083423ac>] dump_stack+0x94/0xb8
      [    2.581899] [<ffffff8008152130>] panic+0x10c/0x250
      [    2.586677] [<ffffff8008152024>] panic+0x0/0x250
      [    2.591281] SMP: stopping secondary CPUs
      [    3.649692] SMP: failed to stop secondary CPUs 0-2,4-7
      [    3.654818] Kernel Offset: disabled
      [    3.658293] Memory Limit: none
      [    3.661337] ---[ end Kernel panic - not syncing: HYP panic:
      [    3.661337] PS:200003c9 PC:0000007ffffff820 ESR:86000005
      [    3.661337] FAR:0000007ffffff820 HPFAR:00000000003ffff0 PAR:0000000000000000
      [    3.661337] VCPU:          (null)
      [    3.661337]
      Reported-by: NWill Deacon <will.deacon@arm.com>
      Reviewed-by: NMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: NJames Morse <james.morse@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      c612505f
  14. 28 4月, 2016 15 次提交