- 04 4月, 2017 1 次提交
-
-
由 Xunlei Pang 提交于
We should deboost before waking the high-priority task, such that we don't run two tasks with the same "state" (priority, deadline, sched_class, etc). In order to make sure the boosting task doesn't start running between unlock and deboost (due to 'spurious' wakeup), we move the deboost under the wait_lock, that way its serialized against the wait loop in __rt_mutex_slowlock(). Doing the deboost early can however lead to priority-inversion if current would get preempted after the deboost but before waking our high-prio task, hence we disable preemption before doing deboost, and enabling it after the wake up is over. This gets us the right semantic order, but most importantly however; this change ensures pointer stability for the next patch, where we have rt_mutex_setprio() cache a pointer to the top-most waiter task. If we, as before this change, do the wakeup first and then deboost, this pointer might point into thin air. [peterz: Changelog + patch munging] Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NXunlei Pang <xlpang@redhat.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NSteven Rostedt <rostedt@goodmis.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170323150216.110065320@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 24 3月, 2017 6 次提交
-
-
由 Peter Zijlstra 提交于
When PREEMPT_RT_FULL does the spinlock -> rt_mutex substitution the PI chain code will (falsely) report a deadlock and BUG. The problem is that it hold hb->lock (now an rt_mutex) while doing task_blocks_on_rt_mutex on the futex's pi_state::rtmutex. This, when interleaved just right with futex_unlock_pi() leads it to believe to see an AB-BA deadlock. Task1 (holds rt_mutex, Task2 (does FUTEX_LOCK_PI) does FUTEX_UNLOCK_PI) lock hb->lock lock rt_mutex (as per start_proxy) lock hb->lock Which is a trivial AB-BA. It is not an actual deadlock, because it won't be holding hb->lock by the time it actually blocks on the rt_mutex, but the chainwalk code doesn't know that and it would be a nightmare to handle this gracefully. To avoid this problem, do the same as in futex_unlock_pi() and drop hb->lock after acquiring wait_lock. This still fully serializes against futex_unlock_pi(), since adding to the wait_list does the very same lock dance, and removing it holds both locks. Aside of solving the RT problem this makes the lock and unlock mechanism symetric and reduces the hb->lock held time. Reported-and-tested-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Suggested-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104152.161341537@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
By changing futex_lock_pi() to use rt_mutex_*_proxy_lock() all wait_list modifications are done under both hb->lock and wait_lock. This closes the obvious interleave pattern between futex_lock_pi() and futex_unlock_pi(), but not entirely so. See below: Before: futex_lock_pi() futex_unlock_pi() unlock hb->lock lock hb->lock unlock hb->lock lock rt_mutex->wait_lock unlock rt_mutex_wait_lock -EAGAIN lock rt_mutex->wait_lock list_add unlock rt_mutex->wait_lock schedule() lock rt_mutex->wait_lock list_del unlock rt_mutex->wait_lock <idem> -EAGAIN lock hb->lock After: futex_lock_pi() futex_unlock_pi() lock hb->lock lock rt_mutex->wait_lock list_add unlock rt_mutex->wait_lock unlock hb->lock schedule() lock hb->lock unlock hb->lock lock hb->lock lock rt_mutex->wait_lock list_del unlock rt_mutex->wait_lock lock rt_mutex->wait_lock unlock rt_mutex_wait_lock -EAGAIN unlock hb->lock It does however solve the earlier starvation/live-lock scenario which got introduced with the -EAGAIN since unlike the before scenario; where the -EAGAIN happens while futex_unlock_pi() doesn't hold any locks; in the after scenario it happens while futex_unlock_pi() actually holds a lock, and then it is serialized on that lock. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104152.062785528@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
With the ultimate goal of keeping rt_mutex wait_list and futex_q waiters consistent it's necessary to split 'rt_mutex_futex_lock()' into finer parts, such that only the actual blocking can be done without hb->lock held. Split split_mutex_finish_proxy_lock() into two parts, one that does the blocking and one that does remove_waiter() when the lock acquire failed. When the rtmutex was acquired successfully the waiter can be removed in the acquisiton path safely, since there is no concurrency on the lock owner. This means that, except for futex_lock_pi(), all wait_list modifications are done with both hb->lock and wait_lock held. [bigeasy@linutronix.de: fix for futex_requeue_pi_signal_restart] Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104152.001659630@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
Since there's already two copies of this code, introduce a helper now before adding a third one. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.950039479@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
Part of what makes futex_unlock_pi() intricate is that rt_mutex_futex_unlock() -> rt_mutex_slowunlock() can drop rt_mutex::wait_lock. This means it cannot rely on the atomicy of wait_lock, which would be preferred in order to not rely on hb->lock so much. The reason rt_mutex_slowunlock() needs to drop wait_lock is because it can race with the rt_mutex fastpath, however futexes have their own fast path. Since futexes already have a bunch of separate rt_mutex accessors, complete that set and implement a rt_mutex variant without fastpath for them. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.702962446@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Peter Zijlstra 提交于
These are unused and clutter up the code. Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: juri.lelli@arm.com Cc: bigeasy@linutronix.de Cc: xlpang@redhat.com Cc: rostedt@goodmis.org Cc: mathieu.desnoyers@efficios.com Cc: jdesfossez@efficios.com Cc: dvhart@infradead.org Cc: bristot@redhat.com Link: http://lkml.kernel.org/r/20170322104151.652692478@infradead.orgSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 02 3月, 2017 3 次提交
-
-
由 Ingo Molnar 提交于
We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/debug.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Ingo Molnar 提交于
We are going to split <linux/sched/wake_q.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/wake_q.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 30 1月, 2017 1 次提交
-
-
由 Steven Rostedt (VMware) 提交于
Running my likely/unlikely profiler for 3 weeks on two production machines, I discovered that the unlikely() test in __rt_mutex_slowlock() checking if state is TASK_INTERRUPTIBLE is hit 100% of the time, making it a very likely case. The reason is, on a vanilla kernel, the majority case of calling rt_mutex() is from the futex code. This code is always called as TASK_INTERRUPTIBLE. In the -rt patch, this code is commonly called when PREEMPT_RT is enabled with TASK_UNINTERRUPTIBLE. But that's not the likely scenario. The rt_mutex() code should be optimized for the common vanilla case, and that is from a futex, with TASK_INTERRUPTIBLE as the state. Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170119113234.1efeedd1@gandalf.local.homeSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 02 12月, 2016 2 次提交
-
-
由 Thomas Gleixner 提交于
While debugging the unlock vs. dequeue race which resulted in state corruption of futexes the lockless nature of rt_mutex_proxy_unlock() caused some confusion. Add commentry to explain why it is safe to do this lockless. Add matching comments to rt_mutex_init_proxy_locked() for completeness sake. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20161130210030.591941927@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Thomas Gleixner 提交于
David reported a futex/rtmutex state corruption. It's caused by the following problem: CPU0 CPU1 CPU2 l->owner=T1 rt_mutex_lock(l) lock(l->wait_lock) l->owner = T1 | HAS_WAITERS; enqueue(T2) boost() unlock(l->wait_lock) schedule() rt_mutex_lock(l) lock(l->wait_lock) l->owner = T1 | HAS_WAITERS; enqueue(T3) boost() unlock(l->wait_lock) schedule() signal(->T2) signal(->T3) lock(l->wait_lock) dequeue(T2) deboost() unlock(l->wait_lock) lock(l->wait_lock) dequeue(T3) ===> wait list is now empty deboost() unlock(l->wait_lock) lock(l->wait_lock) fixup_rt_mutex_waiters() if (wait_list_empty(l)) { owner = l->owner & ~HAS_WAITERS; l->owner = owner ==> l->owner = T1 } lock(l->wait_lock) rt_mutex_unlock(l) fixup_rt_mutex_waiters() if (wait_list_empty(l)) { owner = l->owner & ~HAS_WAITERS; cmpxchg(l->owner, T1, NULL) ===> Success (l->owner = NULL) l->owner = owner ==> l->owner = T1 } That means the problem is caused by fixup_rt_mutex_waiters() which does the RMW to clear the waiters bit unconditionally when there are no waiters in the rtmutexes rbtree. This can be fatal: A concurrent unlock can release the rtmutex in the fastpath because the waiters bit is not set. If the cmpxchg() gets in the middle of the RMW operation then the previous owner, which just unlocked the rtmutex is set as the owner again when the write takes place after the successfull cmpxchg(). The solution is rather trivial: verify that the owner member of the rtmutex has the waiters bit set before clearing it. This does not require a cmpxchg() or other atomic operations because the waiters bit can only be set and cleared with the rtmutex wait_lock held. It's also safe against the fast path unlock attempt. The unlock attempt via cmpxchg() will either see the bit set and take the slowpath or see the bit cleared and release it atomically in the fastpath. It's remarkable that the test program provided by David triggers on ARM64 and MIPS64 really quick, but it refuses to reproduce on x86-64, while the problem exists there as well. That refusal might explain that this got not discovered earlier despite the bug existing from day one of the rtmutex implementation more than 10 years ago. Thanks to David for meticulously instrumenting the code and providing the information which allowed to decode this subtle problem. Reported-by: NDavid Daney <ddaney@caviumnetworks.com> Tested-by: NDavid Daney <david.daney@cavium.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Fixes: 23f78d4a ("[PATCH] pi-futex: rt mutex core") Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 21 11月, 2016 1 次提交
-
-
由 Waiman Long 提交于
Currently the wake_q data structure is defined by the WAKE_Q() macro. This macro, however, looks like a function doing something as "wake" is a verb. Even checkpatch.pl was confused as it reported warnings like WARNING: Missing a blank line after declarations #548: FILE: kernel/futex.c:3665: + int ret; + WAKE_Q(wake_q); This patch renames the WAKE_Q() macro to DEFINE_WAKE_Q() which clarifies what the macro is doing and eliminates the checkpatch.pl warnings. Signed-off-by: NWaiman Long <longman@redhat.com> Acked-by: NDavidlohr Bueso <dave@stgolabs.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1479401198-1765-1-git-send-email-longman@redhat.com [ Resolved conflict and added missing rename. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 08 6月, 2016 1 次提交
-
-
One warning should be enough to get one motivated to fix this. It is possible that this happens more than once and that starts flooding the output. Later the prints will be suppressed so we only get half of it. Depending on the console system used it might not be helpful. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1464356838-1755-1-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 26 1月, 2016 1 次提交
-
-
由 Thomas Gleixner 提交于
Sasha reported a lockdep splat about a potential deadlock between RCU boosting rtmutex and the posix timer it_lock. CPU0 CPU1 rtmutex_lock(&rcu->rt_mutex) spin_lock(&rcu->rt_mutex.wait_lock) local_irq_disable() spin_lock(&timer->it_lock) spin_lock(&rcu->mutex.wait_lock) --> Interrupt spin_lock(&timer->it_lock) This is caused by the following code sequence on CPU1 rcu_read_lock() x = lookup(); if (x) spin_lock_irqsave(&x->it_lock); rcu_read_unlock(); return x; We could fix that in the posix timer code by keeping rcu read locked across the spinlocked and irq disabled section, but the above sequence is common and there is no reason not to support it. Taking rt_mutex.wait_lock irq safe prevents the deadlock. Reported-by: NSasha Levin <sasha.levin@oracle.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
-
- 06 10月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
As of 654672d4 (locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations) and 6d79ef2d (locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'), weakly ordered archs can benefit from more relaxed use of barriers when locking and unlocking, instead of regular full barrier semantics. While currently only arm64 supports such optimizations, updating corresponding locking primitives serves for other archs to immediately benefit as well, once the necessary machinery is implemented of course. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NThomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E.McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1443643395-17016-4-git-send-email-dave@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 23 9月, 2015 1 次提交
-
-
由 Juri Lelli 提交于
rt_mutex_waiter_less() check of task deadlines is open coded. Since this is subject to wraparound bugs, make it use the correct helper. Reported-by: NLuca Abeni <luca.abeni@unitn.it> Signed-off-by: NJuri Lelli <juri.lelli@arm.com> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441188096-23021-4-git-send-email-juri.lelli@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 20 7月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
No one uses this anymore, and this is not the first time the idea of replacing it with a (now possible) userspace side. Lock stealing logic was removed long ago in when the lock was granted to the highest prio. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Darren Hart <dvhart@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1435782588-4177-2-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 20 6月, 2015 2 次提交
-
-
由 Davidlohr Bueso 提交于
... as of fb00aca4 (rtmutex: Turn the plist into an rb-tree) we no longer use plists for queuing any waiters. Update stale comments. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1432056298-18738-4-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
wake_futex_pi() wakes the task before releasing the hash bucket lock (HB). The first thing the woken up task usually does is to acquire the lock which requires the HB lock. On SMP Systems this leads to blocking on the HB lock which is released by the owner shortly after. This patch rearranges the unlock path by first releasing the HB lock and then waking up the task. [ tglx: Fixed up the rtmutex unlock path ] Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Link: http://lkml.kernel.org/r/20150617083350.GA2433@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 19 6月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
Mark the task for later wakeup after the wait_lock has been released. This way, once the next task is awoken, it will have a better chance to of finding the wait_lock free when continuing executing in __rt_mutex_slowlock() when trying to acquire the rtmutex, calling try_to_take_rt_mutex(). Upon contended scenarios, other tasks attempting take the lock may acquire it first, right after the wait_lock is released, but (a) this can also occur with the current code, as it relies on the spinlock fairness, and (b) we are dealing with the top-waiter anyway, so it will always take the lock next. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1432056298-18738-2-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 14 5月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
rt_mutex_trylock() must be called from thread context. It can be called from atomic regions (preemption or interrupts disabled), but not from hard/softirq/nmi context. Add a warning to alert abusers. The reasons for this are: 1) There is a potential deadlock in the slowpath 2) Another cpu which blocks on the rtmutex will boost the task which allegedly locked the rtmutex, but that cannot work because the hard/softirq context borrows the task context. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Siewior <bigeasy@linutronix.de>
-
- 13 5月, 2015 1 次提交
-
-
The rtmutex code is the only user of __HAVE_ARCH_CMPXCHG and we have a few other user of cmpxchg() which do not care about __HAVE_ARCH_CMPXCHG. This define was first introduced in 23f78d4a ("[PATCH] pi-futex: rt mutex core") which is v2.6.18. The generic cmpxchg was introduced later in 068fbad2 ("Add cmpxchg_local to asm-generic for per cpu atomic operations") which is v2.6.25. Back then something was required to get rtmutex working with the fast path on architectures without cmpxchg and this seems to be the result. It popped up recently on rt-users because ARM (v6+) does not define __HAVE_ARCH_CMPXCHG (even that it implements it) which results in slower locking performance in the fast path. To put some numbers on it: preempt -RT, am335x, 10 loops of 100000 invocations of rt_spin_lock() + rt_spin_unlock() (time "total" is the average of the 10 loops for the 100000 invocations, "loop" is "total / 100000 * 1000"): cmpxchg | slowpath used || cmpxchg used | total | loop || total | loop --------|-----------|-------||------------|------- ARMv6 | 9129.4 us | 91 ns || 3311.9 us | 33 ns generic | 9360.2 us | 94 ns || 10834.6 us | 108 ns ----------------------------||-------------------- Forcing it to generic cmpxchg() made things worse for the slowpath and even worse in cmpxchg() path. It boils down to 14ns more per lock+unlock in a cache hot loop so it might not be that much in real world. The last test was a substitute for pre ARMv6 machine but then I was able to perform the comparison on imx28 which is ARMv5 and therefore is always is using the generic cmpxchg implementation. And the numbers: | total | loop -------- |----------- |-------- slowpath | 263937.2 us | 2639 ns cmpxchg | 16934.2 us | 169 ns -------------------------------- The numbers are larger since the machine is slower in general. However, letting rtmutex use cmpxchg() instead the slowpath seem to improve things. Since from the ARM (tested on am335x + imx28) point of view always using cmpxchg() in rt_mutex_lock() + rt_mutex_unlock() makes sense I would drop the define. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: will.deacon@arm.com Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20150225175613.GE6823@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 08 5月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
Ronny reported that the following scenario is not handled correctly: T1 (prio = 10) lock(rtmutex); T2 (prio = 20) lock(rtmutex) boost T1 T1 (prio = 20) sys_set_scheduler(prio = 30) T1 prio = 30 .... sys_set_scheduler(prio = 10) T1 prio = 30 The last step is wrong as T1 should now be back at prio 20. Commit c365c292 ("sched: Consider pi boosting in setscheduler()") only handles the case where a boosted tasks tries to lower its priority. Fix it by taking the new effective priority into account for the decision whether a change of the priority is required. Reported-by: NRonny Meeus <ronny.meeus@gmail.com> Tested-by: NSteven Rostedt <rostedt@goodmis.org> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Fixes: c365c292 ("sched: Consider pi boosting in setscheduler()") Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505051806060.4225@nanosSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 4月, 2015 1 次提交
-
-
由 Thomas Gleixner 提交于
The check for hrtimer_active() after starting the timer is pointless. If the timer is inactive it has expired already and therefor the task pointer is already NULL. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.081830481@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 25 3月, 2015 1 次提交
-
-
由 Tom(JeHyeon) Yeon 提交于
The following commit changed "deadlock_detect" to "chwalk": 8930ed80 ("rtmutex: Cleanup deadlock detector debug logic") do that rename in the function's documentation as well. Signed-off-by: NTom(JeHyeon) Yeon <tom.yeon@windriver.com> Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/1426655010-31651-1-git-send-email-tom.yeon@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 01 3月, 2015 1 次提交
-
-
The "usual" path is: - rt_mutex_slowlock() - set_current_state() - task_blocks_on_rt_mutex() (ret 0) - __rt_mutex_slowlock() - sleep or not but do return with __set_current_state(TASK_RUNNING) - back to caller. In the early error case where task_blocks_on_rt_mutex() return -EDEADLK we never change the task's state back to RUNNING. I assume this is intended. Without this change after ww_mutex using rt_mutex the selftest passes but later I get plenty of: | bad: scheduling from the idle thread! backtraces. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: NMike Galbraith <umgwanakikbuti@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: afffc6c1 ("locking/rtmutex: Optimize setting task running after being blocked") Link: http://lkml.kernel.org/r/1425056229-22326-4-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 18 2月, 2015 1 次提交
-
-
With task_blocks_on_rt_mutex() returning early -EDEADLK we never add the waiter to the waitqueue. Later, we try to remove it via remove_waiter() and go boom in rt_mutex_top_waiter() because rb_entry() gives a NULL pointer. ( Tested on v3.18-RT where rtmutex is used for regular mutex and I tried to get one twice in a row. ) Not sure when this started but I guess 397335f0 ("rtmutex: Fix deadlock detector for real") or commit 3d5c9340 ("rtmutex: Handle deadlock detection smarter"). Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> # for v3.16 and later kernels Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 04 2月, 2015 1 次提交
-
-
由 Davidlohr Bueso 提交于
We explicitly mark the task running after returning from a __rt_mutex_slowlock() call, which does the actual sleeping via wait-wake-trylocking. As such, this patch does two things: (1) refactors the code so that setting current to TASK_RUNNING is done by __rt_mutex_slowlock(), and not by the callers. The downside to this is that it becomes a bit unclear when at what point we block. As such I've added a comment that the task blocks when calling __rt_mutex_slowlock() so readers can figure out when it is running again. (2) relaxes setting current's state through __set_current_state(), instead of it's more expensive barrier alternative. There was no need for the implied barrier as we're obviously not planning on blocking. Signed-off-by: NDavidlohr Bueso <dbueso@suse.de> Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1422857784.18096.1.camel@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 13 8月, 2014 1 次提交
-
-
由 Davidlohr Bueso 提交于
Specifically: Documentation/locking/lockdep-design.txt Documentation/locking/lockstat.txt Documentation/locking/mutex-design.txt Documentation/locking/rt-mutex-design.txt Documentation/locking/rt-mutex.txt Documentation/locking/spinlocks.txt Documentation/locking/ww-mutex-design.txt Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NPeter Zijlstra <peterz@infradead.org> Cc: jason.low2@hp.com Cc: aswin@hp.com Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Mason <clm@fb.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: David Airlie <airlied@linux.ie> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: David S. Miller <davem@davemloft.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Josef Bacik <jbacik@fusionio.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Lubomir Rintel <lkundrak@v3.sk> Cc: Masanari Iida <standby24x7@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: fengguang.wu@intel.com Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 22 6月, 2014 9 次提交
-
-
由 Thomas Gleixner 提交于
In case the dead lock detector is enabled we follow the lock chain to the end in rt_mutex_adjust_prio_chain, even if we could stop earlier due to the priority/waiter constellation. But once we are no longer the top priority waiter in a certain step or the task holding the lock has already the same priority then there is no point in dequeing and enqueing along the lock chain as there is no change at all. So stop the queueing at this point. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031950.280830190@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The conditions under which deadlock detection is conducted are unclear and undocumented. Add constants instead of using 0/1 and provide a selection function which hides the additional debug dependency from the calling code. Add comments where needed. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031949.947264874@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
由 Thomas Gleixner 提交于
The deadlock logic is only required for futexes. Remove the extra arguments for the public functions and also for the futex specific ones which get always called with deadlock detection enabled. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Exit right away, when the removed waiter was not the top priority waiter on the lock. Get rid of the extra indent level. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-
由 Thomas Gleixner 提交于
Add commentry to document the chain walk and the protection mechanisms and their scope. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Add a separate local variable for the boost/deboost logic to make the code more readable. Add comments where appropriate. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
There is no point to keep the task ref across the check for lock owner. Drop the ref before that, so the protection context is clear. Found while documenting the chain walk. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-
由 Thomas Gleixner 提交于
The current implementation of try_to_take_rtmutex() is correct, but requires more than a single brain twist to understand the clever encoded conditionals. Untangle it and document the cases proper. Looks less efficient at the first glance, but actually reduces the binary code size on x8664 by 80 bytes. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
-
由 Thomas Gleixner 提交于
Oleg noticed that rtmutex_slowtrylock() has a pointless check for rt_mutex_owner(lock) != current. To avoid calling try_to_take_rtmutex() we really want to check whether the lock has an owner at all or whether the trylock failed because the owner is NULL, but the RT_MUTEX_HAS_WAITERS bit is set. This covers the lock is owned by caller situation as well. We can actually do this check lockless. trylock is taking a chance whether we take lock->wait_lock to do the check or not. Add comments to the function while at it. Reported-by: NOleg Nesterov <oleg@redhat.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NSteven Rostedt <rostedt@goodmis.org> Reviewed-by: NLai Jiangshan <laijs@cn.fujitsu.com>
-