1. 09 Dec 2015, 1 commit
    • arm64: irq: fix walking from irq stack to task stack · 7596abf2
      Committed by Will Deacon
      Running with CONFIG_DEBUG_SPINLOCK=y can trigger a BUG with the new IRQ
      stack code:
      
        BUG: spinlock lockup suspected on CPU#1
      
      This is due to the IRQ_STACK_TO_TASK_STACK macro incorrectly retrieving
      the task stack pointer stashed at the top of the IRQ stack.
      
      Sayeth James:
      
      | Yup, this is what is happening. It's an off-by-one due to broken
      | thinking about how the stack works. My broken thinking was:
      |
      | >   top ------------
      | >       | dummy_lr | <- irq_stack_ptr
      | >       ------------
      | >       |   x29    |
      | >       ------------
      | >       |   x19    | <- irq_stack_ptr - 0x10
      | >       ------------
      | >       |   xzr    |
      | >       ------------
      |
      | But the stack-pointer is decreased before use. So it actually looks
      | like this:
      |
      | >       ------------
      | >       |          |  <- irq_stack_ptr
      | >   top ------------
      | >       | dummy_lr |
      | >       ------------
      | >       |   x29    | <- irq_stack_ptr - 0x10
      | >       ------------
      | >       |   x19    |
      | >       ------------
      | >       |   xzr    | <- irq_stack_ptr - 0x20
      | >       ------------
      |
      | The value being used as the original stack is x29, which in all the
      | tests is sp but without the current frame's data, hence there are no
      | missing frames in the output.
      |
      | Jungseok Lee picked it up with a 32bit user space because aarch32
      | can't use x29, so it remains 0 forever. The fix he posted is correct.
      
      This patch fixes the macro and adds some of this wisdom to a comment,
      so that the layout of the IRQ stack is well understood.
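
      The off-by-one is easiest to see with a small, self-contained model of a
      full-descending stack. The sketch below is not the kernel macro; the array
      size, values and register names are purely illustrative. It shows that,
      because the stack pointer is decreased before each store, the stashed x19
      ends up two slots further down than the broken macro assumed:

        #include <stdint.h>
        #include <stdio.h>

        int main(void)
        {
            uint64_t stack[8];
            uint64_t *top = &stack[8];  /* plays the role of irq_stack_ptr */
            uint64_t *sp  = top;

            /* "stp x29, dummy_lr, [sp, #-16]!": SP moves down, then the pair is stored */
            sp -= 2;
            sp[0] = 0x29;               /* stand-in for x29 */
            sp[1] = 0xd1;               /* stand-in for dummy_lr */

            /* "stp xzr, x19, [sp, #-16]!" */
            sp -= 2;
            sp[0] = 0;                  /* xzr */
            sp[1] = 0x19;               /* stand-in for x19, the stashed task SP */

            /* The broken macro read at irq_stack_ptr - 0x10 and found x29 instead */
            printf("top - 0x10: %#llx (x29)\n", (unsigned long long)*(top - 2));
            /* x19 actually lives one slot further down */
            printf("top - 0x18: %#llx (x19)\n", (unsigned long long)*(top - 3));
            return 0;
        }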
      
      Cc: James Morse <james.morse@arm.com>
      Reported-by: Jungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      7596abf2
  2. 08 Dec 2015, 3 commits
  3. 04 Dec 2015, 1 commit
    • arm64: spinlock: serialise spin_unlock_wait against concurrent lockers · d86b8da0
      Committed by Will Deacon
      Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
      on architectures implementing spin_lock with LL/SC sequences and acquire
      semantics:
      
       | CPU 1                   CPU 2                     CPU 3
       | ==================      ====================      ==============
       |                                                   spin_unlock(&lock);
       |                         spin_lock(&lock):
       |                           r1 = *lock; // r1 == 0;
       |                         o = READ_ONCE(object); // reordered here
       | object = NULL;
       | smp_mb();
       | spin_unlock_wait(&lock);
       |                           *lock = 1;
       | smp_mb();
       | o->dead = true;
       |                         if (o) // true
       |                           BUG_ON(o->dead); // true!!
      
      The crux of the problem is that spin_unlock_wait(&lock) can return on
      CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
      resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
      to serialise against a concurrent locker and giving it acquire semantics
      in the process (although it is not at all clear whether this is needed -
      different callers seem to assume different things about the barrier
      semantics and architectures are similarly disjoint in their
      implementations of the macro).
      
      This patch implements spin_unlock_wait using an LL/SC sequence with
      acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
      exclusive writeback is omitted, since the spin_lock operation is
      indivisible and no intermediate state can be observed.
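
      The arm64 change itself is LL/SC (or LSE) assembly, but the idea of
      upgrading spin_unlock_wait to a LOCK operation can be sketched portably.
      The following is a minimal C11-atomics model of that idea only, with
      simplified names and lock layout, not the patch's implementation:

        #include <stdatomic.h>

        typedef struct { atomic_int locked; } sketch_spinlock_t;

        static void sketch_spin_lock(sketch_spinlock_t *l)
        {
            int expected = 0;

            /* acquire on success, so later loads cannot be hoisted above the lock */
            while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                          memory_order_acquire,
                                                          memory_order_relaxed))
                expected = 0;
        }

        static void sketch_spin_unlock(sketch_spinlock_t *l)
        {
            atomic_store_explicit(&l->locked, 0, memory_order_release);
        }

        /*
         * Upgraded spin_unlock_wait: instead of passively reading the lock word
         * (which lets CPU 2's load of 'object' in the litmus test above slip past
         * the wait), briefly behave like a locker.  This serialises against any
         * concurrent spin_lock() and gives the caller acquire semantics.
         */
        static void sketch_spin_unlock_wait(sketch_spinlock_t *l)
        {
            sketch_spin_lock(l);
            sketch_spin_unlock(l);
        }
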
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d86b8da0
  4. 02 Dec 2015, 1 commit
  5. 01 Dec 2015, 1 commit
  6. 27 Nov 2015, 4 commits
  7. 25 Nov 2015, 2 commits
    • arm64: KVM: Add workaround for Cortex-A57 erratum 834220 · 498cd5c3
      Committed by Marc Zyngier
      Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
      when a Stage 1 permission fault or device alignment fault should
      have been reported.
      
      This patch implements the workaround (which is to validate that the
      Stage-1 translation actually succeeds) by using code patching.
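
      The patched code is arm64 assembly selected at runtime, so the following
      is only a conceptual sketch of the decision it implements;
      stage1_translation_ok() is a hypothetical stand-in for re-checking the
      Stage-1 translation of the faulting address:

        #include <stdbool.h>

        enum fault_kind { FAULT_STAGE1, FAULT_STAGE2 };

        /* Hypothetical stub: would re-check the Stage-1 translation of the address */
        static bool stage1_translation_ok(unsigned long fault_addr)
        {
            (void)fault_addr;
            return true;
        }

        static enum fault_kind classify_reported_stage2_fault(unsigned long fault_addr)
        {
            /*
             * On affected Cortex-A57 parts a reported Stage-2 translation fault
             * may be bogus, so it is only believed once Stage 1 is known to
             * translate; otherwise the fault is treated as a Stage-1 problem.
             */
            if (!stage1_translation_ok(fault_addr))
                return FAULT_STAGE1;
            return FAULT_STAGE2;
        }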
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      498cd5c3
    • arm64: KVM: Fix AArch32 to AArch64 register mapping · c0f09634
      Committed by Marc Zyngier
      When running a 32bit guest under a 64bit hypervisor, the ARMv8
      architecture defines a mapping of the 32bit registers in the 64bit
      space. This includes banked registers that are being demultiplexed
      over the 64bit ones.
      
      On exceptions caused by an operation involving a 32bit register, the
      HW exposes the register number in the ESR_EL2 register. It was so
      far understood that SW had to distinguish between AArch32 and AArch64
      accesses (based on the current AArch32 mode and register number).
      
      It turns out that I misinterpreted the ARM ARM, and the clue is in
      D1.20.1: "For some exceptions, the exception syndrome given in the
      ESR_ELx identifies one or more register numbers from the issued
      instruction that generated the exception. Where the exception is
      taken from an Exception level using AArch32 these register numbers
      give the AArch64 view of the register."
      
      This means that the HW is already giving us the translated version,
      and that we shouldn't try to interpret it at all (for example, doing
      an MMIO operation from IRQ mode using the LR register leads to
      very unexpected behaviours).
      
      The fix is thus not to perform a call to vcpu_reg32() at all from
      vcpu_reg(), and use whatever register number is supplied directly.
      The only case we need to find out about the mapping is when we
      actively generate a register access, which only occurs when injecting
      a fault in a guest.
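
      A minimal sketch of the fixed lookup, using made-up structure names
      rather than the kernel's actual vcpu context layout: the register number
      reported in ESR_EL2 is used to index the 64-bit register file directly,
      with no AArch32 remapping on this path:

        /* Illustrative layout only -- not the kernel's kvm_vcpu structures */
        struct sketch_vcpu {
            unsigned long regs[31];     /* AArch64 view of the GP registers */
        };

        /*
         * Fixed behaviour: the register number reported in ESR_EL2 is already
         * the AArch64 view, so it is used as-is.  The AArch32 banked-register
         * mapping is only consulted when the kernel itself generates an access,
         * e.g. when injecting a fault into the guest.
         */
        static unsigned long *sketch_vcpu_reg(struct sketch_vcpu *vcpu,
                                              unsigned int reg_num)
        {
            return &vcpu->regs[reg_num];
        }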
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      c0f09634
  8. 19 Nov 2015, 1 commit
    • arm64: barriers: fix smp_load_acquire to work with const arguments · c139aa60
      Committed by Will Deacon
      A newly introduced function in include/net/sock.h passes a const
      argument to smp_load_acquire:
      
        static inline int sk_state_load(const struct sock *sk)
        {
      	return smp_load_acquire(&sk->sk_state);
        }
      
      This causes an allmodconfig build failure, since our underlying
      load-acquire implementation does not handle const types correctly:
      
        include/net/sock.h: In function 'sk_state_load':
        ./arch/arm64/include/asm/barrier.h:71:3: error: read-only variable '___p1' used as 'asm' output
           asm volatile ("ldarb %w0, %1"    \
      
      This patch fixes the problem by reusing the trick in READ_ONCE that
      loads via a non-const member of an anonymous union. This has the
      advantage of allowing us to use smp_load_acquire on packed structures
      (e.g. arch_spinlock_t) as well as primitive types.
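
      The union trick can be shown in a few lines of plain (GNU) C. This
      sketch assumes nothing beyond what the message states; the actual LDAR
      acquire load is replaced by a memcpy so the example stays self-contained:

        #include <string.h>

        /*
         * Load through a union temporary: __u.__c is writable even when *(p)
         * is const, which is what lets the real implementation use __u as an
         * asm output operand.  The acquire load itself is replaced by a memcpy
         * here so the sketch stays portable.
         */
        #define load_via_union_sketch(p)                                        \
        ({                                                                      \
            union { typeof(*(p)) __val; char __c[sizeof(*(p))]; } __u;          \
            memcpy(__u.__c, (const void *)(p), sizeof(*(p)));                   \
            __u.__val;                                                          \
        })

      With this shape, an expression such as load_via_union_sketch(&sk->sk_state)
      compiles even when sk is const, because the store lands in the union's char
      array rather than in a read-only variable.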
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      c139aa60
  9. 18 Nov 2015, 2 commits
  10. 17 Nov 2015, 2 commits
    • arm64: do not include ptrace.h from compat.h · adc235af
      Committed by Arnd Bergmann
      Including ptrace.h brings a definition of BITS_PER_PAGE into device
      drivers and causes a build warning in allmodconfig builds:
      
      drivers/block/drbd/drbd_bitmap.c:482:0: warning: "BITS_PER_PAGE" redefined
       #define BITS_PER_PAGE  (1UL << (PAGE_SHIFT + 3))
      
      This uses a slightly different way to express current_pt_regs()
      that avoids the use of that header and makes do with the already
      included asm/ptrace.h.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      adc235af
    • arm64: simplify dma_get_ops · 1dccb598
      Committed by Arnd Bergmann
      Including linux/acpi.h from asm/dma-mapping.h causes tons of compile-time
      warnings, e.g.
      
       drivers/isdn/mISDN/dsp_ecdis.h:43:0: warning: "FALSE" redefined
       drivers/isdn/mISDN/dsp_ecdis.h:44:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:62:0: warning: "TRUE" redefined
       drivers/net/fddi/skfp/h/targetos.h:63:0: warning: "FALSE" redefined
      
      However, it looks like the dependency should not even be there, as
      I do not see why __generic_dma_ops() cares about whether we have
      an ACPI based system or not.
      
      The current behavior is to fall back to the global dma_ops when
      a device has not set its own dma_ops, but only for DT based systems.
      This seems dangerous, as a random device might have different
      requirements regarding IOMMU or coherency, so we should really
      never have that fallback and just forbid DMA when we have not
      initialized DMA for a device.
      
      This removes the global dma_ops variable and the special-casing
      for ACPI, and just returns the dma ops that got set for the
      device, or the dummy_dma_ops if none were present.
      
      The original code has apparently been copied from arm32, where we
      rely on it for ISA devices such as the floppy controller, but
      we should have no such devices on ARM64.
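
      Under the behaviour described above the lookup collapses to a few lines.
      The sketch below uses placeholder types rather than the kernel's struct
      device and struct dma_map_ops; the point is only that there is no ACPI
      special case and no global fallback left:

        /* Placeholder types for the sketch -- not the kernel's definitions */
        struct sketch_dma_map_ops { int placeholder; };
        struct sketch_device      { const struct sketch_dma_map_ops *dma_ops; };

        static const struct sketch_dma_map_ops sketch_dummy_dma_ops;   /* rejects all DMA */

        static const struct sketch_dma_map_ops *
        sketch_get_dma_ops(const struct sketch_device *dev)
        {
            /*
             * Return the ops set up for this device, or the dummy ops.  There is
             * no silent fallback to a global default that might ignore a device's
             * IOMMU or coherency requirements.
             */
            if (dev && dev->dma_ops)
                return dev->dma_ops;
            return &sketch_dummy_dma_ops;
        }
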
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      [catalin.marinas@arm.com: removed acpi_disabled check in arch_setup_dma_ops()]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      1dccb598
  11. 12 Nov 2015, 1 commit
  12. 09 Nov 2015, 1 commit
    • arm64: fix R/O permissions of FDT mapping · fb226c3d
      Committed by Ard Biesheuvel
      The mapping permissions of the FDT are set to 'PAGE_KERNEL | PTE_RDONLY'
      in an attempt to map the FDT as read-only. However, not only does this
      break at build time under STRICT_MM_TYPECHECKS (since the two terms are
      of different types in that case), it also results in both the PTE_WRITE
      and PTE_RDONLY attributes being set, which means the region is still
      writable under ARMv8.1 DBM (and an attempted write will simply clear the
      PTE_RDONLY bit).
      
      So instead, define PAGE_KERNEL_RO (which already has an established
      meaning across architectures) and use that instead.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      fb226c3d
  13. 06 Nov 2015, 1 commit
    • arm64: cmpxchg_dbl: fix return value type · 57a65667
      Committed by Lorenzo Pieralisi
      The current arm64 __cmpxchg_double{_mb} implementations carry out the
      compare exchange by first comparing the old values passed in to the
      values read from the pointer provided and by stashing the cumulative
      bitwise difference in a 64-bit register.
      
      By comparing the register content against 0, it is possible to detect if
      the values read differ from the old values passed in, so that the compare
      exchange detects whether it has to bail out or carry on completing the
      operation with the exchange.
      
      Given the current implementation, to detect the cmpxchg operation
      status, the __cmpxchg_double{_mb} functions should return the 64-bit
      stashed bitwise difference so that the caller can detect cmpxchg failure
      by comparing the return value content against 0. The current implementation
      declares the return value as an int, which means that the 64-bit
      value stashing the bitwise difference is truncated before being
      returned to the __cmpxchg_double{_mb} callers, which means that
      any bitwise difference present in the top 32 bits goes undetected,
      triggering false positives and subsequent kernel failures.
      
      This patch fixes the issue by declaring the arm64 __cmpxchg_double{_mb}
      return values as a long, so that the bitwise difference is
      properly propagated on failure, restoring the expected behaviour.
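
      The failure mode is plain integer truncation and can be reproduced in
      isolation on a typical LP64 toolchain. This demo is not the arm64 routine;
      it only shows that a 64-bit bitwise difference confined to the upper 32
      bits reads back as 0 through an int return value:

        #include <stdint.h>
        #include <stdio.h>

        static int  status_as_int(uint64_t diff)  { return (int)diff; }   /* truncates */
        static long status_as_long(uint64_t diff) { return (long)diff; }  /* 64-bit on LP64 */

        int main(void)
        {
            /* old and observed values differ only in the upper 32 bits */
            uint64_t diff = 1ULL << 40;

            printf("int  status == 0? %s -> cmpxchg wrongly looks successful\n",
                   status_as_int(diff) == 0 ? "yes" : "no");
            printf("long status == 0? %s -> failure correctly reported\n",
                   status_as_long(diff) == 0 ? "yes" : "no");
            return 0;
        }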
      
      Fixes: e9a4b795 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
      Cc: <stable@vger.kernel.org> # 4.3+
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      57a65667
  14. 30 Oct 2015, 1 commit
  15. 29 Oct 2015, 3 commits
  16. 23 Oct 2015, 5 commits
    • arm/arm64: KVM: Improve kvm_exit tracepoint · b5905dc1
      Committed by Christoffer Dall
      The ARM architecture only saves the exit class to the HSR (ESR_EL2 for
      arm64) on synchronous exceptions, not on asynchronous exceptions like an
      IRQ.  However, we only report the exception class on kvm_exit, which is
      confusing because an IRQ looks like it exited at some PC with the same
      reason as the previous exit.  Add a lookup table for the exception index
      and prepend the kvm_exit tracepoint text with the exception type to
      clarify this situation.
      
      Also resolve the exception class (EC) to a human-friendly text version
      so the trace output becomes immediately usable for debugging this code.
      
      Cc: Wei Huang <wei@redhat.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      b5905dc1
    • KVM: arm/arm64: implement kvm_arm_[halt,resume]_guest · 3b92830a
      Committed by Eric Auger
      We introduce the kvm_arm_halt_guest and kvm_arm_resume_guest functions.
      They will be used for IRQ forwarding state changes.

      Halt is synchronous and prevents the guest from being re-entered.
      We reuse the mechanism that was put in place for the former PSCI pause,
      now renamed power_off. A new flag, pause, is introduced in the arch vcpu
      state and is only meant to be used by these functions.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      3b92830a
    • KVM: arm/arm64: rename pause into power_off · 3781528e
      Committed by Eric Auger
      The kvm_vcpu_arch pause field is renamed to power_off to prepare
      for the introduction of a new pause field. vcpu_pause is also renamed
      to vcpu_sleep, since we now sleep until both power_off and pause are
      false.
      Signed-off-by: Eric Auger <eric.auger@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      3781528e
    • arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block · d35268da
      Committed by Christoffer Dall
      We currently schedule a soft timer every time we exit the guest if the
      timer did not expire while running the guest.  This is really not
      necessary, because the only work we do in the timer work function is to
      kick the vcpu.
      
      Kicking the vcpu does two things:
      (1) If the vcpu thread is on a waitqueue, make it runnable and remove it
      from the waitqueue.
      (2) If the vcpu is running on a different physical CPU from the one
      doing the kick, it sends a reschedule IPI.
      
      The second case cannot happen, because the soft timer is only ever
      scheduled when the vcpu is not running.  The first case is only relevant
      when the vcpu thread is on a waitqueue, which is only the case when the
      vcpu thread has called kvm_vcpu_block().
      
      Therefore, we only need to make sure a timer is scheduled for
      kvm_vcpu_block(), which we do by encapsulating all calls to
      kvm_vcpu_block() with kvm_timer_{un}schedule calls.
      
      Additionally, we only schedule a soft timer if the timer is enabled and
      unmasked, since it is useless otherwise.
      
      Note that theoretically userspace can use the SET_ONE_REG interface to
      change registers that should cause the timer to fire, even if the vcpu
      is blocked without a scheduled timer, but this case was not supported
      before this patch and we leave it for future work for now.
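
      A compact sketch of the resulting shape, with stubbed types and helpers
      so it stands alone; kvm_timer_schedule/kvm_timer_unschedule are the names
      used by the patch, everything else is simplified:

        /* Stubbed types and helpers so the sketch stands alone */
        struct sketch_vcpu { int timer_enabled; int timer_masked; };

        static void arm_soft_timer(struct sketch_vcpu *vcpu)      { (void)vcpu; /* start hrtimer */ }
        static void cancel_soft_timer(struct sketch_vcpu *vcpu)   { (void)vcpu; /* cancel hrtimer */ }
        static void wait_until_runnable(struct sketch_vcpu *vcpu) { (void)vcpu; /* sleep */ }

        static void kvm_timer_schedule(struct sketch_vcpu *vcpu)
        {
            /* only worth arming if the guest timer can actually fire */
            if (vcpu->timer_enabled && !vcpu->timer_masked)
                arm_soft_timer(vcpu);
        }

        static void kvm_timer_unschedule(struct sketch_vcpu *vcpu)
        {
            cancel_soft_timer(vcpu);
        }

        /* the soft timer is now armed only around the block, not on every guest exit */
        static void kvm_vcpu_block_sketch(struct sketch_vcpu *vcpu)
        {
            kvm_timer_schedule(vcpu);
            wait_until_runnable(vcpu);
            kvm_timer_unschedule(vcpu);
        }
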
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      d35268da
    • KVM: Add kvm_arch_vcpu_{un}blocking callbacks · 3217f7c2
      Committed by Christoffer Dall
      Sometimes it is useful for architecture implementations of KVM to know
      when the VCPU thread is about to block or when it comes back from
      blocking (arm/arm64 needs to know this to properly implement timers, for
      example).
      
      Therefore provide a generic architecture callback function in line with
      what we do elsewhere for KVM generic-arch interactions.
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      3217f7c2
  17. 21 Oct 2015, 10 commits