locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb()

The qspinlock slowpath must ensure that the MCS node is fully initialised before it can be reached by another other CPU. This is currently enforced by using a RELEASE operation when updating the tail and also when linking the node into the waitqueue, since the control dependency off xchg_tail is insufficient to enforce sufficient ordering, see: 95bcade3 ("locking/qspinlock: Ensure node is initialised before updating prev->next") Back-to-back RELEASE operations may be expensive on some architectures, particularly those that implement them using fences under the hood. We can replace the two RELEASE operations with a single smp_wmb() fence and use RELAXED operations for the subsequent publishing of the node. Signed-off-by: N Will Deacon <will.deacon@arm.com> Acked-by: N Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: N Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-12-git-send-email-will.deacon@arm.comSigned-off-by: N Ingo Molnar <mingo@kernel.org>

locking/qspinlock: Elide back-to-back RELEASE operations with smp_wmb()
The qspinlock slowpath must ensure that the MCS node is fully initialised before it can be reached by another other CPU. This is currently enforced by using a RELEASE operation when updating the tail and also when linking the node into the waitqueue, since the control dependency off xchg_tail is insufficient to enforce sufficient ordering, see: 95bcade3 ("locking/qspinlock: Ensure node is initialised before updating prev->next") Back-to-back RELEASE operations may be expensive on some architectures, particularly those that implement them using fences under the hood. We can replace the two RELEASE operations with a single smp_wmb() fence and use RELAXED operations for the subsequent publishing of the node. Signed-off-by: N Will Deacon <will.deacon@arm.com> Acked-by: N Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: N Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boqun.feng@gmail.com Cc: linux-arm-kernel@lists.infradead.org Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1524738868-31318-12-git-send-email-will.deacon@arm.comSigned-off-by: N Ingo Molnar <mingo@kernel.org>
9d4646d1 · Will Deacon · Ingo Molnar · 626e5fbc · 9d4646d1
隐藏空白更改
内联并排

Showing with 17 addition and 16 deletion

kernel/locking/qspinlock.c kernel/locking/qspinlock.c +17 -16

未找到文件。
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -164,10 +164,10 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
 static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 {
 	/*
-	 * Use release semantics to make sure that the MCS node is properly
+	 * We can use relaxed semantics since the caller ensures that the
-	 * initialized before changing the tail code.
+	 * MCS node is properly initialized before updating the tail.
 	 */
-	return (u32)xchg_release(&lock->tail,
+	return (u32)xchg_relaxed(&lock->tail,
 				 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
 }
@@ -212,10 +212,11 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 	for (;;) {
 		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
 		/*
-		 * Use release semantics to make sure that the MCS node is
+		 * We can use relaxed semantics since the caller ensures that
-		 * properly initialized before changing the tail code.
+		 * the MCS node is properly initialized before updating the
+		 * tail.
 		 */
-		old = atomic_cmpxchg_release(&lock->val, val, new);
+		old = atomic_cmpxchg_relaxed(&lock->val, val, new);
 		if (old == val)
 			break;
@@ -388,12 +389,18 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 		goto release;
 	/*
+	 * Ensure that the initialisation of @node is complete before we
+	 * publish the updated tail via xchg_tail() and potentially link
+	 * @node into the waitqueue via WRITE_ONCE(prev->next, node) below.
+	 */
+	smp_wmb();
+	/*
+	 * Publish the updated tail.
 	 * We have already touched the queueing cacheline; don't bother with
 	 * pending stuff.
 	 *
 	 * p,*,* -> n,*,*
-	 *
-	 * RELEASE, such that the stores to @node must be complete.
 	 */
 	old = xchg_tail(lock, tail);
 	next = NULL;
@@ -405,14 +412,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	if (old & _Q_TAIL_MASK) {
 		prev = decode_tail(old);
-		/*
+		/* Link @node into the waitqueue. */
-		 * We must ensure that the stores to @node are observed before
+		WRITE_ONCE(prev->next, node);
-		 * the write to prev->next. The address dependency from
-		 * xchg_tail is not sufficient to ensure this because the read
-		 * component of xchg_tail is unordered with respect to the
-		 * initialisation of @node.
-		 */
-		smp_store_release(&prev->next, node);
 		pv_wait_node(node, prev);
 		arch_mcs_spin_lock_contended(&node->locked);