1. 15 June 2016 (3 commits)
    • arm64: spinlock: Ensure forward-progress in spin_unlock_wait · c56bdcac
      Committed by Will Deacon
      Rather than wait until we observe the lock being free (which might never
      happen), we can also return from spin_unlock_wait if we observe that the
      lock is now held by somebody else, which implies that it was unlocked
      but we just missed seeing it in that state.
      
      Furthermore, in such a scenario there is no longer a need to write back
      the value that we loaded, since we know that there has been a lock
      hand-off, which is sufficient to publish any stores prior to the
      unlock_wait because the ARM architecture ensures that a Store-Release
      instruction is multi-copy atomic when observed by a Load-Acquire
      instruction.
      
      The litmus test is something like:
      
      AArch64
      {
      0:X1=x; 0:X3=y;
      1:X1=y;
      2:X1=y; 2:X3=x;
      }
       P0          | P1           | P2           ;
       MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
       STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
       DMB SY      |              |              ;
       LDR W2,[X3] |              |              ;
      exists
      (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
      
      where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
      doing spin_lock.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
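      A minimal C sketch of the exit conditions described above, assuming a
      ticket lock with 16-bit owner/next halves. The type and names are
      illustrative rather than the kernel's, and the store-back that the real
      code still performs when it sees the lock free is omitted here:

      #include <stdint.h>

      typedef struct {
              uint16_t owner;         /* ticket currently being served */
              uint16_t next;          /* next ticket to be handed out */
      } ticket_lock_t;

      /* Spin until we either observe the lock free, or observe that the owner
       * ticket has moved on from the value we first read. The latter implies
       * an unlock (a lock hand-off) happened, even though we never saw the
       * free state ourselves. */
      static inline void ticket_unlock_wait(const ticket_lock_t *lock)
      {
              uint16_t first_owner = __atomic_load_n(&lock->owner, __ATOMIC_ACQUIRE);

              for (;;) {
                      uint16_t owner = __atomic_load_n(&lock->owner, __ATOMIC_ACQUIRE);
                      uint16_t next  = __atomic_load_n(&lock->next, __ATOMIC_ACQUIRE);

                      if (owner == next)              /* lock observed free */
                              return;
                      if (owner != first_owner)       /* hand-off observed */
                              return;
              }
      }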
    • arm64: spinlock: fix spin_unlock_wait for LSE atomics · 3a5facd0
      Committed by Will Deacon
      Commit d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against
      concurrent lockers") fixed spin_unlock_wait for LL/SC-based atomics under
      the premise that the LSE atomics (in particular, the LDADDA instruction)
      are indivisible.
      
      Unfortunately, these instructions are only indivisible when used with the
      -AL (full ordering) suffix and, consequently, the same issue can
      theoretically be observed with LSE atomics, where a later (in program
      order) load can be speculated before the write portion of the atomic
      operation.
      
      This patch fixes the issue by performing a CAS of the lock once we've
      established that it's unlocked, in much the same way as the LL/SC code.
      
      Fixes: d86b8da0 ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
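      A hedged sketch of the fix using GCC's __atomic builtins rather than the
      kernel's LSE asm. The 32-bit lock-word encoding (owner in the low
      halfword, next in the high halfword) is an assumption for illustration:

      #include <stdbool.h>
      #include <stdint.h>

      static inline void unlock_wait_with_cas(uint32_t *lock)
      {
              for (;;) {
                      uint32_t v = __atomic_load_n(lock, __ATOMIC_RELAXED);

                      if ((uint16_t)v != (uint16_t)(v >> 16))
                              continue;       /* owner != next: still held */

                      /* Write the unlocked value back over itself. A successful
                       * CAS means the read was performed as part of an atomic
                       * read-modify-write, so it cannot be a stale value
                       * speculated ahead of the write portion. */
                      if (__atomic_compare_exchange_n(lock, &v, v, false,
                                                      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
                              return;
              }
      }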
    • arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks · 38b850a7
      Committed by Will Deacon
      spin_is_locked has grown two very different use-cases:
      
      (1) [The sane case] API functions may require a certain lock to be held
          by the caller and can therefore use spin_is_locked as part of an
          assert statement in order to verify that the lock is indeed held.
          For example, usage of assert_spin_locked.
      
      (2) [The insane case] There are two locks, where a CPU takes one of the
          locks and then checks whether or not the other one is held before
          accessing some shared state. For example, the "optimized locking" in
          ipc/sem.c.
      
      In the latter case, the sequence looks like:
      
        spin_lock(&sem->lock);
        if (!spin_is_locked(&sma->sem_perm.lock))
          /* Access shared state */
      
      and requires that the spin_is_locked check is ordered after taking the
      sem->lock. Unfortunately, since our spinlocks are implemented using a
      LDAXR/STXR sequence, the read of &sma->sem_perm.lock can be speculated
      before the STXR and consequently return a stale value.
      
      Whilst this hasn't been seen to cause issues in practice, PowerPC fixed
      the same issue in 51d7d520 ("powerpc: Add smp_mb() to
      arch_spin_is_locked()") and, although we did something similar for
      spin_unlock_wait in d86b8da0 ("arm64: spinlock: serialise
      spin_unlock_wait against concurrent lockers"), that doesn't actually take
      care of ordering against local acquisition of a different lock.
      
      This patch adds an smp_mb() to the start of our arch_spin_is_locked and
      arch_spin_unlock_wait routines to ensure that the lock value is always
      loaded after any other locks have been taken by the current CPU.
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
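      A rough C sketch of the fix, with a sequentially consistent fence
      standing in for smp_mb() and the same illustrative 32-bit lock-word
      encoding as above; this is not the kernel source:

      #include <stdbool.h>
      #include <stdint.h>

      static inline bool ticket_is_locked(const uint32_t *lock)
      {
              /* Full barrier first, so the load below cannot be satisfied
               * ahead of the store half of a lock acquisition already
               * performed by this CPU (the ipc/sem.c pattern in case (2)). */
              __atomic_thread_fence(__ATOMIC_SEQ_CST);

              uint32_t v = __atomic_load_n(lock, __ATOMIC_RELAXED);
              return (uint16_t)v != (uint16_t)(v >> 16);      /* owner != next */
      }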
  2. 04 December 2015 (1 commit)
    • arm64: spinlock: serialise spin_unlock_wait against concurrent lockers · d86b8da0
      Committed by Will Deacon
      Boqun Feng reported a rather nasty ordering issue with spin_unlock_wait
      on architectures implementing spin_lock with LL/SC sequences and acquire
      semantics:
      
       | CPU 1                   CPU 2                     CPU 3
       | ==================      ====================      ==============
       |                                                   spin_unlock(&lock);
       |                         spin_lock(&lock):
       |                           r1 = *lock; // r1 == 0;
       |                         o = READ_ONCE(object); // reordered here
       | object = NULL;
       | smp_mb();
       | spin_unlock_wait(&lock);
       |                           *lock = 1;
       | smp_mb();
       | o->dead = true;
       |                         if (o) // true
       |                           BUG_ON(o->dead); // true!!
      
      The crux of the problem is that spin_unlock_wait(&lock) can return on
      CPU 1 whilst CPU 2 is in the process of taking the lock. This can be
      resolved by upgrading spin_unlock_wait to a LOCK operation, forcing it
      to serialise against a concurrent locker and giving it acquire semantics
      in the process (although it is not at all clear whether this is needed -
      different callers seem to assume different things about the barrier
      semantics and architectures are similarly disjoint in their
      implementations of the macro).
      
      This patch implements spin_unlock_wait using an LL/SC sequence with
      acquire semantics on arm64. For v8.1 systems with the LSE atomics, the
      exclusive writeback is omitted, since the spin_lock operation is
      indivisible and no intermediate state can be observed.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
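      A simplified AArch64 inline-asm sketch of the LL/SC approach (no LSE
      alternative, no wfe-based waiting, illustrative names, not the exact
      kernel code): a load-acquire exclusive spins until owner == next, and a
      store-exclusive of the same value then serialises the wait against any
      concurrent locker:

      typedef struct {
              unsigned short owner;
              unsigned short next;
      } ticket_lock_t;

      static inline void unlock_wait_llsc(ticket_lock_t *lock)
      {
              unsigned int lockval, tmp;

              asm volatile(
              "1:     ldaxr   %w0, %2\n"                /* load-acquire exclusive of the lock word */
              "       eor     %w1, %w0, %w0, ror #16\n" /* zero iff owner == next */
              "       cbnz    %w1, 1b\n"                /* still locked: keep waiting */
              "       stxr    %w1, %w0, %2\n"           /* write the same value back */
              "       cbnz    %w1, 1b\n"                /* exclusive lost to a locker: go again */
              : "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
              :
              : "memory");
      }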
  3. 28 July 2015 (1 commit)
    • arm64: spinlock: fix ll/sc unlock on big-endian systems · c1d7cd22
      Committed by Will Deacon
      When unlocking a spinlock, we perform a read-modify-write on the owner
      ticket in order to increment it and store it back with release
      semantics.
      
      In the LL/SC case, we load the 16-bit ticket using a 32-bit load and
      therefore store back the wrong halfword on a big-endian system,
      corrupting the lock after the first unlock and killing the system dead.
      
      This patch fixes the unlock code to use 16-bit accessors consistently.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
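      A plain-C illustration of the fixed approach, using the same illustrative
      ticket-lock layout rather than the kernel's asm: only the 16-bit owner
      half is read and written, so the store lands on the correct halfword
      regardless of endianness:

      #include <stdint.h>

      typedef struct {
              uint16_t owner;
              uint16_t next;
      } ticket_lock_t;

      static inline void ticket_unlock(ticket_lock_t *lock)
      {
              /* 16-bit read-modify-write of the owner ticket. Loading the
               * whole 32-bit word and then storing a halfword back (the
               * original bug) picks up the wrong half on big-endian. */
              uint16_t owner = __atomic_load_n(&lock->owner, __ATOMIC_RELAXED);
              __atomic_store_n(&lock->owner, (uint16_t)(owner + 1), __ATOMIC_RELEASE);
      }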
  4. 27 July 2015 (2 commits)
  5. 18 December 2014 (1 commit)
  6. 08 February 2014 (1 commit)
  7. 24 October 2013 (2 commits)
  8. 08 June 2013 (1 commit)
  9. 12 February 2013 (1 commit)
    • arm64: atomics: fix grossly inconsistent asm constraints for exclusives · 3a0310eb
      Committed by Will Deacon
      Our uses of inline asm constraints for atomic operations are fairly
      wild and varied. We basically need to guarantee the following:
      
        1. Any instructions with barrier implications
           (load-acquire/store-release) have a "memory" clobber
      
        2. When performing exclusive accesses, the addressing mode is generated
           using the "Q" constraint
      
        3. Atomic blocks which use the condition flags have a "cc" clobber
      
      This patch addresses these concerns which, as well as fixing the
      semantics of the code, stops GCC complaining about impossible asm
      constraints.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
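      A self-contained sketch (AArch64 GCC inline asm, not the kernel's
      atomic.h) showing the first two rules in one routine: the exclusive
      access goes through a "Q"-constrained memory operand, and because the
      load-acquire/store-release pair has barrier semantics the asm carries a
      "memory" clobber. A variant built on flag-setting instructions (e.g.
      subs) would additionally need "cc" in the clobber list:

      static inline int example_add_return(int i, int *v)
      {
              unsigned int status;
              int result;

              asm volatile(
              "1:     ldaxr   %w0, %2\n"      /* load-acquire exclusive */
              "       add     %w0, %w0, %w3\n"
              "       stlxr   %w1, %w0, %2\n" /* store-release exclusive; status in %w1 */
              "       cbnz    %w1, 1b\n"
              : "=&r" (result), "=&r" (status), "+Q" (*v)     /* rule 2: "Q" for the exclusive address */
              : "r" (i)
              : "memory");                                    /* rule 1: barrier semantics need "memory" */

              return result;
      }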
  10. 17 September 2012 (1 commit)