提交 · ed6aefed726a305bd36344e230d2a9e9301226fc · openanolis / cloud-kernel

02 6月, 2016 3 次提交

Revert "ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff" · ed6aefed

由 Vineet Gupta 提交于 5月 31, 2016

This reverts commit e78fdfef.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

ed6aefed

Revert "ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle" · 819f3602

由 Vineet Gupta 提交于 5月 31, 2016

This reverts commit b89aa12c.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL so revert the whole delayed retry series !
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

819f3602

Revert "ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff" · 42316a20

由 Vineet Gupta 提交于 5月 31, 2016

This reverts commit 10971638.

The issue was fixed in hardware in HS2.1C release and there are no known
external users of affected RTL - so revert thw whole delayed retry
series !
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

42316a20

09 5月, 2016 1 次提交

ARC: rwlock: disable interrupts in !LLSC variant · 2a1021fc

由 Noam Camus 提交于 6月 09, 2015

If we hold rwlock and interrupt occures we may
end up spinning on it for ever during softirq.
Note that this lock is an internal lock
and since the lock is free to be used from any context,
the lock needs to be IRQ-safe.

Below you may see an example for interrupt we get while
nl_table_lock is holding its rw->lock_mutex and we spinned
on it for ever.

The concept for the fix was taken from SPARC.

[2015-05-12 19:16:12] Stack Trace:
[2015-05-12 19:16:12]   arc_unwind_core+0xb8/0x11c
[2015-05-12 19:16:12]   dump_stack+0x68/0xac
[2015-05-12 19:16:12]   _raw_read_lock+0xa8/0xac
[2015-05-12 19:16:12]   netlink_broadcast_filtered+0x56/0x35c
[2015-05-12 19:16:12]   nlmsg_notify+0x42/0xa4
[2015-05-12 19:16:13]   neigh_update+0x1fe/0x44c
[2015-05-12 19:16:13]   neigh_event_ns+0x40/0xa4
[2015-05-12 19:16:13]   arp_process+0x46e/0x5a8
[2015-05-12 19:16:13]   __netif_receive_skb_core+0x358/0x500
[2015-05-12 19:16:13]   process_backlog+0x92/0x154
[2015-05-12 19:16:13]   net_rx_action+0xb8/0x188
[2015-05-12 19:16:13]   __do_softirq+0xda/0x1d8
[2015-05-12 19:16:14]   irq_exit+0x8a/0x8c
[2015-05-12 19:16:14]   arch_do_IRQ+0x6c/0xa8
[2015-05-12 19:16:14]   handle_interrupt_level1+0xe4/0xf0
Signed-off-by: NNoam Camus <noamc@ezchip.com>
Acked-by: NPeter Zijlstra <peterz@infradead.org>

2a1021fc

07 8月, 2015 1 次提交

ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff · 10971638

由 Vineet Gupta 提交于 8月 07, 2015

The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)
Suggested-by: NNigel Topham <ntopham@synopsys.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

10971638

04 8月, 2015 4 次提交

ARCv2: spinlock/rwlock: Reset retry delay when starting a new spin-wait cycle · b89aa12c

由 Vineet Gupta 提交于 7月 21, 2015

The previous commit for delayed retry of SCOND needs some fine tuning
for spin locks.

The backoff from delayed retry in conjunction with spin looping of lock
itself can potentially cause the delay counter to reach high values.
So to provide fairness to any lock operation, after a lock "seems"
available (i.e. just before first SCOND try0, reset the delay counter
back to starting value of 1

Essentially reset delay to 1 for a new spin-wait-loop-acquire cycle.
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

b89aa12c

ARCv2: spinlock/rwlock/atomics: Delayed retry of failed SCOND with exponential backoff · e78fdfef

由 Vineet Gupta 提交于 7月 14, 2015

This is to workaround the llock/scond livelock

HS38x4 could get into a LLOCK/SCOND livelock in case of multiple overlapping
coherency transactions in the SCU. The exclusive line state keeps rotating
among contenting cores leading to a never ending cycle. So break the cycle
by deferring the retry of failed exclusive access (SCOND). The actual delay
needed is function of number of contending cores as well as the unrelated
coherency traffic from other cores. To keep the code simple, start off with
small delay of 1 which would suffice most cases and in case of contention
double the delay. Eventually the delay is sufficient such that the coherency
pipeline is drained, thus a subsequent exclusive access would succeed.

Link: http://lkml.kernel.org/r/1438612568-28265-1-git-send-email-vgupta@synopsys.comAcked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

e78fdfef

ARC: LLOCK/SCOND based rwlock · 69cbe630

由 Vineet Gupta 提交于 7月 16, 2015

With LLOCK/SCOND, the rwlock counter can be atomically updated w/o need
for a guarding spin lock.

This in turn elides the EXchange instruction based spinning which causes
the cacheline transition to exclusive state and concurrent spinning
across cores would cause the line to keep bouncing around.
LLOCK/SCOND based implementation is superior as spinning on LLOCK keeps
the cacheline in shared state.
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

69cbe630

ARC: LLOCK/SCOND based spin_lock · ae7eae9e

由 Vineet Gupta 提交于 7月 14, 2015

Current spin_lock uses EXchange instruction to implement the atomic test
and set of lock location (reads orig value and ST 1). This however forces
the cacheline into exclusive state (because of the ST) and concurrent
loops in multiple cores will bounce the line around between cores.

Instead, use LLOCK/SCOND to implement the atomic test and set which is
better as line is in shared state while lock is spinning on LLOCK

The real motivation of this change however is to make way for future
changes in atomics to implement delayed retry (with backoff).
Initial experiment with delayed retry in atomics combined with orig
EX based spinlock was a total disaster (broke even LMBench) as
struct sock has a cache line sharing an atomic_t and spinlock. The
tight spinning on lock, caused the atomic retry to keep backing off
such that it would never finish.
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

ae7eae9e

25 6月, 2015 1 次提交

ARC: add smp barriers around atomics per Documentation/atomic_ops.txt · 2576c28e

由 Vineet Gupta 提交于 11月 20, 2014

 - arch_spin_lock/unlock were lacking the ACQUIRE/RELEASE barriers
   Since ARCv2 only provides load/load, store/store and all/all, we need
   the full barrier

 - LLOCK/SCOND based atomics, bitops, cmpxchg, which return modified
   values were lacking the explicit smp barriers.

 - Non LLOCK/SCOND varaints don't need the explicit barriers since that
   is implicity provided by the spin locks used to implement the
   critical section (the spin lock barriers in turn are also fixed in
   this commit as explained above

Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

2576c28e

27 9月, 2013 1 次提交

ARC: Workaround spinlock livelock in SMP SystemC simulation · 6c00350b

由 Vineet Gupta 提交于 9月 25, 2013

Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

	mov reg, 1		; 1=locked, 0=unlocked
retry:
	EX reg, [lock]		; load existing, store 1, atomically
	BREQ reg, 1, rety	; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.

Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:

           core1                                   core2
         --------                                --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read)            - LOCKED
3. spin unlock  [ST 0]     - UNLOCKED
                                         spin lock [EX r=0,w=1] - LOCKED
                      -- resched core 1----

5. spin lock [EX r=1] - ALREADY-LOCKED

                      -- resched core 2----
6.                                       rwlock(Write) - READER-LOCKED
7.                                       spin unlock [ST 0]
8.                                       rwlock failed, retry again

9.                                       spin lock  [EX r=0, w=1]
                      -- resched core 1----

10  spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
                      -- resched core 2----
...
...

The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

6c00350b

11 2月, 2013 1 次提交

ARC: Spinlock/rwlock/mutex primitives · 6e35fa2d

由 Vineet Gupta 提交于 1月 18, 2013

Signed-off-by: NVineet Gupta <vgupta@synopsys.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>

6e35fa2d

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功