1. 15 June 2016 (3 commits)
    • arm64: spinlock: Ensure forward-progress in spin_unlock_wait · c56bdcac
      Committed by Will Deacon
      Rather than wait until we observe the lock being free (which might never
      happen), we can also return from spin_unlock_wait if we observe that the
      lock is now held by somebody else, which implies that it was unlocked
      but we just missed seeing it in that state.
      
      Furthermore, in such a scenario there is no longer a need to write back
      the value that we loaded, since we know that there has been a lock
      hand-off, which is sufficient to publish any stores prior to the
      unlock_wait because the ARM architecture ensures that a Store-Release
      instruction is multi-copy atomic when observed by a Load-Acquire
      instruction.
      
      The litmus test is something like:
      
      AArch64
      {
      0:X1=x; 0:X3=y;
      1:X1=y;
      2:X1=y; 2:X3=x;
      }
       P0          | P1           | P2           ;
       MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
       STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
       DMB SY      |              |              ;
       LDR W2,[X3] |              |              ;
      exists
      (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
      
      where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
      doing spin_lock.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
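      A minimal C sketch of the exit conditions described above, assuming a
      ticket lock with 16-bit owner/next halves. The type and names are
      illustrative rather than the kernel's, and the store-back that the real
      code still performs when it sees the lock free is omitted here:

      #include <stdint.h>

      typedef struct {
              uint16_t owner;         /* ticket currently being served */
              uint16_t next;          /* next ticket to be handed out */
      } ticket_lock_t;

      /* Spin until we either observe the lock free, or observe that the owner
       * ticket has moved on from the value we first read. The latter implies
       * an unlock (a lock hand-off) happened, even though we never saw the
       * free state ourselves. */
      static inline void ticket_unlock_wait(const ticket_lock_t *lock)
      {
              uint16_t first_owner = __atomic_load_n(&lock->owner, __ATOMIC_ACQUIRE);

              for (;;) {
                      uint16_t owner = __atomic_load_n(&lock->owner, __ATOMIC_ACQUIRE);
                      uint16_t next  = __atomic_load_n(&lock->next, __ATOMIC_ACQUIRE);

                      if (owner == next)              /* lock observed free */
                              return;
                      if (owner != first_owner)       /* hand-off observed */
                              return;
              }
      }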
    • arm64: spinlock: fix spin_unlock_wait for LSE atomics · 3a5facd0
      Committed by Will Deacon
      Commit d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against
      concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
      the premise that the LSE atomics (in particular, the LDADDA instruction)
      are indivisible.
      
      Unfortunately, these instructions are only indivisible when used with the
      -AL (full ordering) suffix and, consequently, the same issue can
      theoretically be observed with LSE atomics, where a later (in program
      order) load can be speculated before the write portion of the atomic
      operation.
      
      This patch fixes the issue by performing a CAS of the lock once we've
      established that it's unlocked, in much the same way as the LL/SC code.
      
      Fixes: d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
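      A hedged sketch of the fix using GCC's __atomic builtins rather than the
      kernel's LSE asm. The 32-bit lock-word encoding (owner in the low
      halfword, next in the high halfword) is an assumption for illustration:

      #include <stdbool.h>
      #include <stdint.h>

      static inline void unlock_wait_with_cas(uint32_t *lock)
      {
              for (;;) {
                      uint32_t v = __atomic_load_n(lock, __ATOMIC_RELAXED);

                      if ((uint16_t)v != (uint16_t)(v >> 16))
                              continue;       /* owner != next: still held */

                      /* Write the unlocked value back over itself. A successful
                       * CAS means the read was performed as part of an atomic
                       * read-modify-write, so it cannot be a stale value
                       * speculated ahead of the write portion. */
                      if (__atomic_compare_exchange_n(lock, &v, v, false,
                                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
                              return;
              }
      }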
    • arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks · 38b850a7
      Committed by Will Deacon
      spin_is_locked has grown two very different use-cases:
      
      (1) [The sane case] API functions may require a certain lock to be held
          by the caller and can therefore use spin_is_locked as part of an
          assert statement in order to verify that the lock is indeed held.
          For example, usage of assert_spin_locked.
      
      (2) [The insane case] There are two locks, where a CPU takes one of the
          locks and then checks whether or not the other one is held before
          accessing some shared state. For example, the "optimized locking" in
          ipc/sem.c.
      
      In the latter case, the sequence looks like:
      
        spin_lock(&sem->lock);
        if (!spin_is_locked(&sma->sem_perm.lock))
          /* Access shared state */
      
      and requires that the spin_is_locked check is ordered after taking the
      sem->lock. Unfortunately, since our spinlocks are implemented using a
      LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
      before the STXR and consequently return a stale value.
      
      Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
      the same issue in 51d7d520 ("powerpc: Add smp_mb() to
      arch_spin_is_locked()") and, although we did something similar for
      spin_unlock_wait in d86b8da0 ("arm64: spinlock: serialise
      spin_unlock_wait against concurrent lockers"), that doesn't actually take
      care of ordering against local acquisition of a different lock.
      
      This patch adds an smp_mb() to the start of our arch_spin_is_locked and
      arch_spin_unlock_wait routines to ensure that the lock value is always
      loaded after any other locks have been taken by the current CPU.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
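      A rough C sketch of the fix, with a sequentially consistent fence
      standing in for smp_mb() and the same illustrative 32-bit lock-word
      encoding as above; this is not the kernel source:

      #include <stdbool.h>
      #include <stdint.h>

      static inline bool ticket_is_locked(const uint32_t *lock)
      {
              /* Full barrier first, so the load below cannot be satisfied
               * ahead of the store half of a lock acquisition already
               * performed by this CPU (the ipc/sem.c pattern in case (2)). */
              __atomic_thread_fence(__ATOMIC_SEQ_CST);

              uint32_t v = __atomic_load_n(lock, __ATOMIC_RELAXED);
              return (uint16_t)v != (uint16_t)(v >> 16);      /* owner != next */
      }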
  2. 04 December 2015 (1 commit)
    • arm64: spinlock: serialise spin_unlock_wait against concurrent lockers · d86b8da0
      Committed by Will Deacon
      Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
      on architectures implementing spin_lock with LL/SC sequences and acquire
      semantics:
      
       | CPU 1                   CPU 2                     CPU 3
       | ==================      ====================      ==============
       |                                                   spin_unlock(&lock);
       |                         spin_lock(&lock):
       |                           r1 = *lock; // r1 == 0;
       |                         o = READ_ONCE(object); // reordered here
       | object = NULL;
       | smp_mb();
       | spin_unlock_wait(&lock);
       |                           *lock = 1;
       | smp_mb();
       | o->dead = true;
       |                         if (o) // true
       |                           BUG_ON(o->dead); // true!!
      
      The crux of the problem is that spin_unlock_wait(&lock) can return on
      CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
      resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
      to serialise against a concurrent locker and giving it acquire semantics
      in the process (although it is not at all clear whether this is needed -
      different callers seem to assume different things about the barrier
      semantics and architectures are similarly disjoint in their
      implementations of the macro).
      
      This patch implements spin_unlock_wait using an LL/SC sequence with
      acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
      exclusive writeback is omitted, since the spin_lock operation is
      indivisible and no intermediate state can be observed.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
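      A simplified AArch64 inline-asm sketch of the LL/SC approach (no LSE
      alternative, no wfe-based waiting, illustrative names, not the exact
      kernel code): a load-acquire exclusive spins until owner == next, and a
      store-exclusive of the same value then serialises the wait against any
      concurrent locker:

      typedef struct {
              unsigned short owner;
              unsigned short next;
      } ticket_lock_t;

      static inline void unlock_wait_llsc(ticket_lock_t *lock)
      {
              unsigned int lockval, tmp;

              asm volatile(
              "1:     ldaxr   %w0, %2\n"                /* load-acquire exclusive of the lock word */
              "       eor     %w1, %w0, %w0, ror #16\n" /* zero iff owner == next */
              "       cbnz    %w1, 1b\n"                /* still locked: keep waiting */
              "       stxr    %w1, %w0, %2\n"           /* write the same value back */
              "       cbnz    %w1, 1b\n"                /* exclusive lost to a locker: go again */
              : "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
              :
              : "memory");
      }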
  3. 28 July 2015 (1 commit)
    • arm64: spinlock: fix ll/sc unlock on big-endian systems · c1d7cd22
      Committed by Will Deacon
      When unlocking a spinlock, we perform a read-modify-write on the owner
      ticket in order to increment it and store it back with release
      semantics.
      
      In the LL/SC case, we load the 16-bit ticket using a 32-bit load and
      therefore store back the wrong halfword on a big-endian system,
      corrupting the lock after the first unlock and killing the system dead.
      
      This patch fixes the unlock code to use 16-bit accessors consistently.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
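      A plain-C illustration of the fixed approach, using the same illustrative
      ticket-lock layout rather than the kernel's asm: only the 16-bit owner
      half is read and written, so the store lands on the correct halfword
      regardless of endianness:

      #include <stdint.h>

      typedef struct {
              uint16_t owner;
              uint16_t next;
      } ticket_lock_t;

      static inline void ticket_unlock(ticket_lock_t *lock)
      {
              /* 16-bit read-modify-write of the owner ticket. Loading the
               * whole 32-bit word and then storing a halfword back (the
               * original bug) picks up the wrong half on big-endian. */
              uint16_t owner = __atomic_load_n(&lock->owner, __ATOMIC_RELAXED);
              __atomic_store_n(&lock->owner, (uint16_t)(owner + 1), __ATOMIC_RELEASE);
      }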
  4. 27 July 2015 (2 commits)
  5. 18 December 2014 (1 commit)
  6. 08 February 2014 (1 commit)
  7. 24 October 2013 (2 commits)
  8. 08 June 2013 (1 commit)
  9. 12 February 2013 (1 commit)
    • arm64: atomics: fix grossly inconsistent asm constraints for exclusives · 3a0310eb
      Committed by Will Deacon
      Our uses of inline asm constraints for atomic operations are fairly
      wild and varied. We basically need to guarantee the following:
      
        1. Any instructions with barrier implications
           (load-acquire/store-release) have a "memory" clobber
      
        2. When performing exclusive accesses, the addressing mode is generated
           using the "Q" constraint
      
        3. Atomic blocks which use the condition flags have a "cc" clobber
      
      This patch addresses these concerns which, as well as fixing the
      semantics of the code, stops GCC complaining about impossible asm
      constraints.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
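      A self-contained sketch (AArch64 GCC inline asm, not the kernel's
      atomic.h) showing the first two rules in one routine: the exclusive
      access goes through a "Q"-constrained memory operand, and because the
      load-acquire/store-release pair has barrier semantics the asm carries a
      "memory" clobber. A variant built on flag-setting instructions (e.g.
      subs) would additionally need "cc" in the clobber list:

      static inline int example_add_return(int i, int *v)
      {
              unsigned int status;
              int result;

              asm volatile(
              "1:     ldaxr   %w0, %2\n"      /* load-acquire exclusive */
              "       add     %w0, %w0, %w3\n"
              "       stlxr   %w1, %w0, %2\n" /* store-release exclusive; status in %w1 */
              "       cbnz    %w1, 1b\n"
              : "=&r" (result), "=&r" (status), "+Q" (*v)     /* rule 2: "Q" for the exclusive address */
              : "r" (i)
              : "memory");                                    /* rule 1: barrier semantics need "memory" */

              return result;
      }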
  10. 17 September 2012 (1 commit)