- 06 10月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
As of 654672d4 (locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations) and 6d79ef2d (locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'), weakly ordered archs can benefit from more relaxed use of barriers when locking and unlocking, instead of regular full barrier semantics. While currently only arm64 supports such optimizations, updating corresponding locking primitives serves for other archs to immediately benefit as well, once the necessary machinery is implemented of course. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E.McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1443643395-17016-4-git-send-email-dave@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 9月, 2015 1 次提交
-
-
由 Juri Lelli 提交于
rt_mutex_waiter_less() check of task deadlines is open coded. Since this is subject to wraparound bugs, make it use the correct helper. Reported-by: NLuca Abeni <luca.abeni@unitn.it> Signed-off-by: NJuri Lelli <juri.lelli@arm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441188096-23021-4-git-send-email-juri.lelli@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 7月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
No one uses this anymore, and this is not the first time the idea of replacing it with a (now possible) userspace side. Lock stealing logic was removed long ago in when the lock was granted to the highest prio. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Darren Hart <dvhart@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1435782588-4177-2-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 20 6月, 2015 2 次提交
-
-
由 Davidlohr Bueso 提交于
... as of fb00aca4 (rtmutex: Turn the plist into an rb-tree) we no longer use plists for queuing any waiters. Update stale comments. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1432056298-18738-4-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
wake_futex_pi() wakes the task before releasing the hash bucket lock (HB). The first thing the woken up task usually does is to acquire the lock which requires the HB lock. On SMP Systems this leads to blocking on the HB lock which is released by the owner shortly after. This patch rearranges the unlock path by first releasing the HB lock and then waking up the task. [ tglx: Fixed up the rtmutex unlock path ] Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Link: http://lkml.kernel.org/r/20150617083350.GA2433@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 19 6月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
Mark the task for later wakeup after the wait_lock has been released. This way, once the next task is awoken, it will have a better chance to of finding the wait_lock free when continuing executing in __rt_mutex_slowlock() when trying to acquire the rtmutex, calling try_to_take_rt_mutex(). Upon contended scenarios, other tasks attempting take the lock may acquire it first, right after the wait_lock is released, but (a) this can also occur with the current code, as it relies on the spinlock fairness, and (b) we are dealing with the top-waiter anyway, so it will always take the lock next. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1432056298-18738-2-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 5月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
rt_mutex_trylock() must be called from thread context. It can be called from atomic regions (preemption or interrupts disabled), but not from hard/softirq/nmi context. Add a warning to alert abusers. The reasons for this are: 1) There is a potential deadlock in the slowpath 2) Another cpu which blocks on the rtmutex will boost the task which allegedly locked the rtmutex, but that cannot work because the hard/softirq context borrows the task context. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de>
-
- 13 5月, 2015 1 次提交
-
-
The rtmutex code is the only user of __HAVE_ARCH_CMPXCHG and we have a few other user of cmpxchg() which do not care about __HAVE_ARCH_CMPXCHG. This define was first introduced in 23f78d4a ("[PATCH] pi-futex: rt mutex core") which is v2.6.18. The generic cmpxchg was introduced later in 068fbad2 ("Add cmpxchg_local to asm-generic for per cpu atomic operations") which is v2.6.25. Back then something was required to get rtmutex working with the fast path on architectures without cmpxchg and this seems to be the result. It popped up recently on rt-users because ARM (v6+) does not define __HAVE_ARCH_CMPXCHG (even that it implements it) which results in slower locking performance in the fast path. To put some numbers on it: preempt -RT, am335x, 10 loops of 100000 invocations of rt_spin_lock() + rt_spin_unlock() (time "total" is the average of the 10 loops for the 100000 invocations, "loop" is "total / 100000 * 1000"): cmpxchg | slowpath used || cmpxchg used | total | loop || total | loop --------|-----------|-------||------------|------- ARMv6 | 9129.4 us | 91 ns || 3311.9 us | 33 ns generic | 9360.2 us | 94 ns || 10834.6 us | 108 ns ----------------------------||-------------------- Forcing it to generic cmpxchg() made things worse for the slowpath and even worse in cmpxchg() path. It boils down to 14ns more per lock+unlock in a cache hot loop so it might not be that much in real world. The last test was a substitute for pre ARMv6 machine but then I was able to perform the comparison on imx28 which is ARMv5 and therefore is always is using the generic cmpxchg implementation. And the numbers: | total | loop -------- |----------- |-------- slowpath | 263937.2 us | 2639 ns cmpxchg | 16934.2 us | 169 ns -------------------------------- The numbers are larger since the machine is slower in general. However, letting rtmutex use cmpxchg() instead the slowpath seem to improve things. Since from the ARM (tested on am335x + imx28) point of view always using cmpxchg() in rt_mutex_lock() + rt_mutex_unlock() makes sense I would drop the define. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: will.deacon@arm.com Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20150225175613.GE6823@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 08 5月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
Ronny reported that the following scenario is not handled correctly: T1 (prio = 10) lock(rtmutex); T2 (prio = 20) lock(rtmutex) boost T1 T1 (prio = 20) sys_set_scheduler(prio = 30) T1 prio = 30 .... sys_set_scheduler(prio = 10) T1 prio = 30 The last step is wrong as T1 should now be back at prio 20. Commit c365c292 ("sched: Consider pi boosting in setscheduler()") only handles the case where a boosted tasks tries to lower its priority. Fix it by taking the new effective priority into account for the decision whether a change of the priority is required. Reported-by: NRonny Meeus <ronny.meeus@gmail.com> Tested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Fixes: c365c292 ("sched: Consider pi boosting in setscheduler()") Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanosSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 4月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
The check for hrtimer_active() after starting the timer is pointless. If the timer is inactive it has expired already and therefor the task pointer is already NULL. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.081830481@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 3月, 2015 1 次提交
-
-
由 Tom(JeHyeon) Yeon 提交于
The following commit changed "deadlock_detect" to "chwalk": 8930ed80 ("rtmutex: Cleanup deadlock detector debug logic") do that rename in the function's documentation as well. Signed-off-by: NTom(JeHyeon) Yeon <tom.yeon@windriver.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/1426655010-31651-1-git-send-email-tom.yeon@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 3月, 2015 1 次提交
-
-
The "usual" path is: - rt_mutex_slowlock() - set_current_state() - task_blocks_on_rt_mutex() (ret 0) - __rt_mutex_slowlock() - sleep or not but do return with __set_current_state(TASK_RUNNING) - back to caller. In the early error case where task_blocks_on_rt_mutex() return -EDEADLK we never change the task's state back to RUNNING. I assume this is intended. Without this change after ww_mutex using rt_mutex the selftest passes but later I get plenty of: | bad: scheduling from the idle thread! backtraces. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: NMike Galbraith <umgwanakikbuti@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: afffc6c1 ("locking/rtmutex: Optimize setting task running after being blocked") Link: http://lkml.kernel.org/r/1425056229-22326-4-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 2月, 2015 1 次提交
-
-
With task_blocks_on_rt_mutex() returning early -EDEADLK we never add the waiter to the waitqueue. Later, we try to remove it via remove_waiter() and go boom in rt_mutex_top_waiter() because rb_entry() gives a NULL pointer. ( Tested on v3.18-RT where rtmutex is used for regular mutex and I tried to get one twice in a row. ) Not sure when this started but I guess 397335f0 ("rtmutex: Fix deadlock detector for real") or commit 3d5c9340 ("rtmutex: Handle deadlock detection smarter"). Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # for v3.16 and later kernels Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 04 2月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
We explicitly mark the task running after returning from a __rt_mutex_slowlock() call, which does the actual sleeping via wait-wake-trylocking. As such, this patch does two things: (1) refactors the code so that setting current to TASK_RUNNING is done by __rt_mutex_slowlock(), and not by the callers. The downside to this is that it becomes a bit unclear when at what point we block. As such I've added a comment that the task blocks when calling __rt_mutex_slowlock() so readers can figure out when it is running again. (2) relaxes setting current's state through __set_current_state(), instead of it's more expensive barrier alternative. There was no need for the implied barrier as we're obviously not planning on blocking. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1422857784.18096.1.camel@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 8月, 2014 1 次提交
-
-
由 Davidlohr Bueso 提交于
Specifically: Documentation/locking/lockdep-design.txt Documentation/locking/lockstat.txt Documentation/locking/mutex-design.txt Documentation/locking/rt-mutex-design.txt Documentation/locking/rt-mutex.txt Documentation/locking/spinlocks.txt Documentation/locking/ww-mutex-design.txt Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: jason.low2@hp.com Cc: aswin@hp.com Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: David Airlie <airlied@linux.ie> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Josef Bacik <jbacik@fusionio.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lubomir Rintel <lkundrak@v3.sk> Cc: Masanari Iida <standby24x7@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: fengguang.wu@intel.com Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 6月, 2014 9 次提交
-
-
由 Thomas Gleixner 提交于
In case the dead lock detector is enabled we follow the lock chain to the end in rt_mutex_adjust_prio_chain, even if we could stop earlier due to the priority/waiter constellation. But once we are no longer the top priority waiter in a certain step or the task holding the lock has already the same priority then there is no point in dequeing and enqueing along the lock chain as there is no change at all. So stop the queueing at this point. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031950.280830190@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The conditions under which deadlock detection is conducted are unclear and undocumented. Add constants instead of using 0/1 and provide a selection function which hides the additional debug dependency from the calling code. Add comments where needed. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031949.947264874@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The deadlock logic is only required for futexes. Remove the extra arguments for the public functions and also for the futex specific ones which get always called with deadlock detection enabled. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Exit right away, when the removed waiter was not the top priority waiter on the lock. Get rid of the extra indent level. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-
由 Thomas Gleixner 提交于
Add commentry to document the chain walk and the protection mechanisms and their scope. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Add a separate local variable for the boost/deboost logic to make the code more readable. Add comments where appropriate. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
There is no point to keep the task ref across the check for lock owner. Drop the ref before that, so the protection context is clear. Found while documenting the chain walk. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-
由 Thomas Gleixner 提交于
The current implementation of try_to_take_rtmutex() is correct, but requires more than a single brain twist to understand the clever encoded conditionals. Untangle it and document the cases proper. Looks less efficient at the first glance, but actually reduces the binary code size on x8664 by 80 bytes. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Oleg noticed that rtmutex_slowtrylock() has a pointless check for rt_mutex_owner(lock) != current. To avoid calling try_to_take_rtmutex() we really want to check whether the lock has an owner at all or whether the trylock failed because the owner is NULL, but the RT_MUTEX_HAS_WAITERS bit is set. This covers the lock is owned by caller situation as well. We can actually do this check lockless. trylock is taking a chance whether we take lock->wait_lock to do the check or not. Add comments to the function while at it. Reported-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-
- 16 6月, 2014 1 次提交
-
-
由 Thomas Gleixner 提交于
When the rtmutex fast path is enabled the slow unlock function can create the following situation: spin_lock(foo->m->wait_lock); foo->m->owner = NULL; rt_mutex_lock(foo->m); <-- fast path free = atomic_dec_and_test(foo->refcnt); rt_mutex_unlock(foo->m); <-- fast path if (free) kfree(foo); spin_unlock(foo->m->wait_lock); <--- Use after free. Plug the race by changing the slow unlock to the following scheme: while (!rt_mutex_has_waiters(m)) { /* Clear the waiters bit in m->owner */ clear_rt_mutex_waiters(m); owner = rt_mutex_owner(m); spin_unlock(m->wait_lock); if (cmpxchg(m->owner, owner, 0) == owner) return; spin_lock(m->wait_lock); } So in case of a new waiter incoming while the owner tries the slow path unlock we have two situations: unlock(wait_lock); lock(wait_lock); cmpxchg(p, owner, 0) == owner mark_rt_mutex_waiters(lock); acquire(lock); Or: unlock(wait_lock); lock(wait_lock); mark_rt_mutex_waiters(lock); cmpxchg(p, owner, 0) != owner enqueue_waiter(); unlock(wait_lock); lock(wait_lock); wakeup_next waiter(); unlock(wait_lock); lock(wait_lock); acquire(lock); If the fast path is disabled, then the simple m->owner = NULL; unlock(m->wait_lock); is sufficient as all access to m->owner is serialized via m->wait_lock; Also document and clarify the wakeup_next_waiter function as suggested by Oleg Nesterov. Reported-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 07 6月, 2014 2 次提交
-
-
由 Thomas Gleixner 提交于
When we walk the lock chain, we drop all locks after each step. So the lock chain can change under us before we reacquire the locks. That's harmless in principle as we just follow the wrong lock path. But it can lead to a false positive in the dead lock detection logic: T0 holds L0 T0 blocks on L1 held by T1 T1 blocks on L2 held by T2 T2 blocks on L3 held by T3 T4 blocks on L4 held by T4 Now we walk the chain lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> drop locks T2 times out and blocks on L0 Now we continue: lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all. Brad tried to work around that in the deadlock detection logic itself, but the more I looked at it the less I liked it, because it's crystal ball magic after the fact. We actually can detect a chain change very simple: lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> next_lock = T2->pi_blocked_on->lock; drop locks T2 times out and blocks on L0 Now we continue: lock T2 -> if (next_lock != T2->pi_blocked_on->lock) return; So if we detect that T2 is now blocked on a different lock we stop the chain walk. That's also correct in the following scenario: lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 -> next_lock = T2->pi_blocked_on->lock; drop locks T3 times out and drops L3 T2 acquires L3 and blocks on L4 now Now we continue: lock T2 -> if (next_lock != T2->pi_blocked_on->lock) return; We don't have to follow up the chain at that point, because T2 propagated our priority up to T4 already. [ Folded a cleanup patch from peterz ] Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reported-by: NBrad Mouring <bmouring@ni.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de Cc: stable@vger.kernel.org
-
由 Thomas Gleixner 提交于
Even in the case when deadlock detection is not requested by the caller, we can detect deadlocks. Right now the code stops the lock chain walk and keeps the waiter enqueued, even on itself. Silly not to yell when such a scenario is detected and to keep the waiter enqueued. Return -EDEADLK unconditionally and handle it at the call sites. The futex calls return -EDEADLK. The non futex ones dequeue the waiter, throw a warning and put the task into a schedule loop. Tagged for stable as it makes the code more robust. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brad Mouring <bmouring@ni.com> Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 28 5月, 2014 1 次提交
-
-
由 Thomas Gleixner 提交于
The current deadlock detection logic does not work reliably due to the following early exit path: /* * Drop out, when the task has no waiters. Note, * top_waiter can be NULL, when we are in the deboosting * mode! */ if (top_waiter && (!task_has_pi_waiters(task) || top_waiter != task_top_pi_waiter(task))) goto out_unlock_pi; So this not only exits when the task has no waiters, it also exits unconditionally when the current waiter is not the top priority waiter of the task. So in a nested locking scenario, it might abort the lock chain walk and therefor miss a potential deadlock. Simple fix: Continue the chain walk, when deadlock detection is enabled. We also avoid the whole enqueue, if we detect the deadlock right away (A-A). It's an optimization, but also prevents that another waiter who comes in after the detection and before the task has undone the damage observes the situation and detects the deadlock and returns -EDEADLOCK, which is wrong as the other task is not in a deadlock situation. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 23 2月, 2014 1 次提交
-
-
由 Thomas Gleixner 提交于
If a PI boosted task policy/priority is modified by a setscheduler() call we unconditionally dequeue and requeue the task if it is on the runqueue even if the new priority is lower than the current effective boosted priority. This can result in undesired reordering of the priority bucket list. If the new priority is less or equal than the current effective we just store the new parameters in the task struct and leave the scheduler class and the runqueue untouched. This is handled when the task deboosts itself. Only if the new priority is higher than the effective boosted priority we apply the change immediately. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> [ Rebase ontop of v3.14-rc1. ] Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Dario Faggioli <raistlin@linux.it> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1391803122-4425-7-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 1月, 2014 2 次提交
-
-
由 Dario Faggioli 提交于
Some method to deal with rt-mutexes and make sched_dl interact with the current PI-coded is needed, raising all but trivial issues, that needs (according to us) to be solved with some restructuring of the pi-code (i.e., going toward a proxy execution-ish implementation). This is under development, in the meanwhile, as a temporary solution, what this commits does is: - ensure a pi-lock owner with waiters is never throttled down. Instead, when it runs out of runtime, it immediately gets replenished and it's deadline is postponed; - the scheduling parameters (relative deadline and default runtime) used for that replenishments --during the whole period it holds the pi-lock-- are the ones of the waiting task with earliest deadline. Acting this way, we provide some kind of boosting to the lock-owner, still by using the existing (actually, slightly modified by the previous commit) pi-architecture. We would stress the fact that this is only a surely needed, all but clean solution to the problem. In the end it's only a way to re-start discussion within the community. So, as always, comments, ideas, rants, etc.. are welcome! :-) Signed-off-by: NDario Faggioli <raistlin@linux.it> Signed-off-by: NJuri Lelli <juri.lelli@gmail.com> [ Added !RT_MUTEXES build fix. ] Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Peter Zijlstra 提交于
Turn the pi-chains from plist to rb-tree, in the rt_mutex code, and provide a proper comparison function for -deadline and -priority tasks. This is done mainly because: - classical prio field of the plist is just an int, which might not be enough for representing a deadline; - manipulating such a list would become O(nr_deadline_tasks), which might be to much, as the number of -deadline task increases. Therefore, an rb-tree is used, and tasks are queued in it according to the following logic: - among two -priority (i.e., SCHED_BATCH/OTHER/RR/FIFO) tasks, the one with the higher (lower, actually!) prio wins; - among a -priority and a -deadline task, the latter always wins; - among two -deadline tasks, the one with the earliest deadline wins. Queueing and dequeueing functions are changed accordingly, for both the list of a task's pi-waiters and the list of tasks blocked on a pi-lock. Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NDario Faggioli <raistlin@linux.it> Signed-off-by: NJuri Lelli <juri.lelli@gmail.com> Signed-off-again-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-10-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 06 11月, 2013 1 次提交
-
-
由 Peter Zijlstra 提交于
Suggested-by: NIngo Molnar <mingo@kernel.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-p9ijt8div0hwldexwfm4nlhj@git.kernel.org [ Fixed build failure in kernel/rcu/tree_plugin.h. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 28 5月, 2013 1 次提交
-
-
由 Juri Lelli 提交于
Parameters and usage of rt_mutex_adjust_prio_chain() are already documented in Documentation/rt-mutex-design.txt. However, since this function is called from several paths with different semantics (related to the arguments), it is handy to have a quick reference directly in the code. Signed-off-by: NJuri Lelli <juri.lelli@gmail.com> Cc: Clark Williams <williams@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1368608650-7935-1-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 2月, 2013 1 次提交
-
-
由 Clark Williams 提交于
Move rt scheduler definitions out of include/linux/sched.h into new file include/linux/sched/rt.h Signed-off-by: NClark Williams <williams@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lanSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 12 12月, 2011 1 次提交
-
-
由 Paul E. McKenney 提交于
This reverts commit 5342e269. The approach taken in this patch was deemed too abusive to mutexes, and thus too likely to result in maintenance problems in the future. Instead, we will disallow RCU read-side critical sections that partially overlap with interrupt-disbled code segments. Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 31 10月, 2011 1 次提交
-
-
由 Paul Gortmaker 提交于
The changed files were only including linux/module.h for the EXPORT_SYMBOL infrastructure, and nothing else. Revector them onto the isolated export header for faster compile times. Nothing to see here but a whole lot of instances of: -#include <linux/module.h> +#include <linux/export.h> This commit is only changing the kernel dir; next targets will probably be mm, fs, the arch dirs, etc. Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
-
- 29 9月, 2011 1 次提交
-
-
由 Paul E. McKenney 提交于
Create a separate lockdep class for the rt_mutex used for RCU priority boosting and enable use of rt_mutex_lock() with irqs disabled. This prevents RCU priority boosting from falling prey to deadlocks when someone begins an RCU read-side critical section in preemptible state, but releases it with an irq-disabled lock held. Unfortunately, the scheduler's runqueue and priority-inheritance locks still must either completely enclose or be completely enclosed by any overlapping RCU read-side critical section. This version removes a redundant local_irq_restore() noted by Yong Zhang. Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
-
- 08 7月, 2011 1 次提交
-
-
由 Dima Zavin 提交于
This was legacy code brought over from the RT tree and is no longer necessary. Signed-off-by: NDima Zavin <dima@android.com> Acked-by: NThomas Gleixner <tglx@linutronix.de> Cc: Daniel Walker <dwalker@codeaurora.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/1310084879-10351-2-git-send-email-dima@android.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
-
- 28 1月, 2011 1 次提交
-
-
由 Lai Jiangshan 提交于
In current rtmutex, the pending owner may be boosted by the tasks in the rtmutex's waitlist when the pending owner is deboosted or a task in the waitlist is boosted. This boosting is unrelated, because the pending owner does not really take the rtmutex. It is not reasonable. Example. time1: A(high prio) onwers the rtmutex. B(mid prio) and C (low prio) in the waitlist. time2 A release the lock, B becomes the pending owner A(or other high prio task) continues to run. B's prio is lower than A, so B is just queued at the runqueue. time3 A or other high prio task sleeps, but we have passed some time The B and C's prio are changed in the period (time2 ~ time3) due to boosting or deboosting. Now C has the priority higher than B. ***Is it reasonable that C has to boost B and help B to get the rtmutex? NO!! I think, it is unrelated/unneed boosting before B really owns the rtmutex. We should give C a chance to beat B and win the rtmutex. This is the motivation of this patch. This patch *ensures* only the top waiter or higher priority task can take the lock. How? 1) we don't dequeue the top waiter when unlock, if the top waiter is changed, the old top waiter will fail and go to sleep again. 2) when requiring lock, it will get the lock when the lock is not taken and: there is no waiter OR higher priority than waiters OR it is top waiter. 3) In any time, the top waiter is changed, the top waiter will be woken up. The algorithm is much simpler than before, no pending owner, no boosting for pending owner. Other advantage of this patch: 1) The states of a rtmutex are reduced a half, easier to read the code. 2) the codes become shorter. 3) top waiter is not dequeued until it really take the lock: they will retain FIFO when it is stolen. Not advantage nor disadvantage 1) Even we may wakeup multiple waiters(any time when top waiter changed), we hardly cause "thundering herd", the number of wokenup task is likely 1 or very little. 2) two APIs are changed. rt_mutex_owner() will not return pending owner, it will return NULL when the top waiter is going to take the lock. rt_mutex_next_owner() always return the top waiter. will not return NULL if we have waiters because the top waiter is not dequeued. I have fixed the code that use these APIs. need updated after this patch is accepted 1) Document/* 2) the testcase scripts/rt-tester/t4-l2-pi-deboost.tst Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com> LKML-Reference: <4D3012D5.4060709@cn.fujitsu.com> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
-
- 15 12月, 2009 1 次提交
-
-
由 Thomas Gleixner 提交于
Convert locks which cannot be sleeping locks in preempt-rt to raw_spinlocks. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Acked-by: NIngo Molnar <mingo@elte.hu>
-