1. 24 Mar 2017, 5 commits
  2. 02 Mar 2017, 3 commits
  3. 30 Jan 2017, 1 commit
  4. 02 Dec 2016, 2 commits
    • locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked() · 84d82ec5
      Committed by Thomas Gleixner
      While debugging the unlock vs. dequeue race which resulted in state
      corruption of futexes, the lockless nature of rt_mutex_proxy_unlock()
      caused some confusion.
      
      Add commentary to explain why it is safe to do this locklessly. Add matching
      comments to rt_mutex_init_proxy_locked() for completeness' sake.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: David Daney <ddaney@caviumnetworks.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/20161130210030.591941927@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      84d82ec5
    • locking/rtmutex: Prevent dequeue vs. unlock race · dbb26055
      Committed by Thomas Gleixner
      David reported a futex/rtmutex state corruption. It's caused by the
      following problem:
      
      CPU0		CPU1		CPU2
      
      l->owner=T1
      		rt_mutex_lock(l)
      		lock(l->wait_lock)
      		l->owner = T1 | HAS_WAITERS;
      		enqueue(T2)
      		boost()
      		  unlock(l->wait_lock)
      		schedule()
      
      				rt_mutex_lock(l)
      				lock(l->wait_lock)
      				l->owner = T1 | HAS_WAITERS;
      				enqueue(T3)
      				boost()
      				  unlock(l->wait_lock)
      				schedule()
      		signal(->T2)	signal(->T3)
      		lock(l->wait_lock)
      		dequeue(T2)
      		deboost()
      		  unlock(l->wait_lock)
      				lock(l->wait_lock)
      				dequeue(T3)
      				  ===> wait list is now empty
      				deboost()
      				 unlock(l->wait_lock)
      		lock(l->wait_lock)
      		fixup_rt_mutex_waiters()
      		  if (wait_list_empty(l)) {
      		    owner = l->owner & ~HAS_WAITERS;
      		    l->owner = owner
      		     ==> l->owner = T1
      		  }
      
      				lock(l->wait_lock)
      rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
      				  if (wait_list_empty(l)) {
      				    owner = l->owner & ~HAS_WAITERS;
      cmpxchg(l->owner, T1, NULL)
       ===> Success (l->owner = NULL)
      				    l->owner = owner
      				     ==> l->owner = T1
      				  }
      
      That means the problem is caused by fixup_rt_mutex_waiters(), which does the
      RMW to clear the waiters bit unconditionally when there are no waiters in
      the rtmutex's rbtree.
      
      This can be fatal: A concurrent unlock can release the rtmutex in the
      fastpath because the waiters bit is not set. If the cmpxchg() gets in the
      middle of the RMW operation, then the previous owner, which just unlocked
      the rtmutex, is set as the owner again when the write takes place after the
      successful cmpxchg().
      
      The solution is rather trivial: verify that the owner member of the rtmutex
      has the waiters bit set before clearing it. This does not require a
      cmpxchg() or other atomic operations because the waiters bit can only be
      set and cleared with the rtmutex wait_lock held. It's also safe against the
      fast path unlock attempt. The unlock attempt via cmpxchg() will either see
      the bit set and take the slowpath or see the bit cleared and release it
      atomically in the fastpath.
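
      A minimal sketch of what that check amounts to, assuming the usual rtmutex
      helpers and the RT_MUTEX_HAS_WAITERS bit in the owner field (not the
      verbatim patch):

          static void fixup_rt_mutex_waiters(struct rt_mutex *lock)
          {
                  unsigned long owner, *p = (unsigned long *) &lock->owner;

                  /* Nothing to fix up while the rbtree still has waiters. */
                  if (rt_mutex_has_waiters(lock))
                          return;

                  /*
                   * Only clear the HAS_WAITERS bit if it is actually set;
                   * writing the owner word back unconditionally is what races
                   * with the lockless cmpxchg() unlock fast path.
                   */
                  owner = READ_ONCE(*p);
                  if (owner & RT_MUTEX_HAS_WAITERS)
                          WRITE_ONCE(*p, owner & ~RT_MUTEX_HAS_WAITERS);
          }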
      
      It's remarkable that the test program provided by David triggers on ARM64
      and MIPS64 really quickly, but it refuses to reproduce on x86-64, while the
      problem exists there as well. That refusal might explain why this was not
      discovered earlier despite the bug having existed since day one of the
      rtmutex implementation more than 10 years ago.
      
      Thanks to David for meticulously instrumenting the code and providing the
      information which allowed us to decode this subtle problem.
      Reported-by: David Daney <ddaney@caviumnetworks.com>
      Tested-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Fixes: 23f78d4a ("[PATCH] pi-futex: rt mutex core")
      Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      dbb26055
  5. 21 Nov 2016, 1 commit
  6. 08 Jun 2016, 1 commit
  7. 26 Jan 2016, 1 commit
    • rtmutex: Make wait_lock irq safe · b4abf910
      Committed by Thomas Gleixner
      Sasha reported a lockdep splat about a potential deadlock between RCU boosting
      rtmutex and the posix timer it_lock.
      
      CPU0					CPU1
      
      rtmutex_lock(&rcu->rt_mutex)
        spin_lock(&rcu->rt_mutex.wait_lock)
      					local_irq_disable()
      					spin_lock(&timer->it_lock)
      					spin_lock(&rcu->mutex.wait_lock)
      --> Interrupt
          spin_lock(&timer->it_lock)
      
      This is caused by the following code sequence on CPU1
      
           rcu_read_lock()
           x = lookup();
           if (x)
           	spin_lock_irqsave(&x->it_lock);
           rcu_read_unlock();
           return x;
      
      We could fix that in the posix timer code by keeping rcu read locked across
      the spinlocked and irq disabled section, but the above sequence is common and
      there is no reason not to support it.
      
      Making rt_mutex.wait_lock irq safe prevents the deadlock.
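
      Concretely, that means converting the wait_lock accesses from the plain to
      the irq-safe spinlock variants, along these lines (a representative call
      site sketch, not the full patch):

          unsigned long flags;

          /* Before: vulnerable to the interrupt scenario shown above */
          raw_spin_lock(&lock->wait_lock);
          /* ... manipulate the waiter tree ... */
          raw_spin_unlock(&lock->wait_lock);

          /* After: wait_lock is only ever taken with interrupts disabled */
          raw_spin_lock_irqsave(&lock->wait_lock, flags);
          /* ... manipulate the waiter tree ... */
          raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
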
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      b4abf910
  8. 06 Oct 2015, 1 commit
  9. 23 Sep 2015, 1 commit
  10. 20 Jul 2015, 1 commit
  11. 20 Jun 2015, 2 commits
  12. 19 Jun 2015, 1 commit
    • locking/rtmutex: Implement lockless top-waiter wakeup · 45ab4eff
      Committed by Davidlohr Bueso
      Mark the task for later wakeup after the wait_lock has been released.
      This way, once the next task is awoken, it will have a better chance
      of finding the wait_lock free when it continues executing in
      __rt_mutex_slowlock() and tries to acquire the rtmutex by calling
      try_to_take_rt_mutex(). In contended scenarios, other tasks attempting
      to take the lock may acquire it first, right after the wait_lock is
      released, but (a) this can also occur with the current code, as it
      relies on spinlock fairness, and (b) we are dealing with the top
      waiter anyway, so it will always take the lock next.
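
      A minimal sketch of the deferred-wakeup pattern this describes, using the
      kernel's wake_q mechanism; wake_q_add()/wake_up_q() are real kernel
      helpers (the init macro has been WAKE_Q() and later DEFINE_WAKE_Q(),
      depending on kernel version), while top_waiter_task here merely stands in
      for the task picked as top waiter:

          DEFINE_WAKE_Q(wake_q);

          raw_spin_lock(&lock->wait_lock);
          /* Queue the top waiter instead of waking it under wait_lock */
          wake_q_add(&wake_q, top_waiter_task);
          raw_spin_unlock(&lock->wait_lock);

          /* The actual wakeup happens only after wait_lock is released */
          wake_up_q(&wake_q);
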
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1432056298-18738-2-git-send-email-dave@stgolabs.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      45ab4eff
  13. 14 May 2015, 1 commit
    • rtmutex: Warn if trylock is called from hard/softirq context · 6ce47fd9
      Committed by Thomas Gleixner
      rt_mutex_trylock() must be called from thread context. It can be
      called from atomic regions (preemption or interrupts disabled), but
      not from hard/softirq/nmi context. Add a warning to alert abusers.
      
      The reasons for this are:
      
          1) There is a potential deadlock in the slowpath
      
          2) Another cpu which blocks on the rtmutex will boost the task
             which allegedly locked the rtmutex, but that cannot work
             because the hard/softirq context borrows the task context.
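
      A hedged sketch of the kind of check this adds to rt_mutex_trylock() (the
      exact predicate and warning variant in the patch may differ in detail):

          int __sched rt_mutex_trylock(struct rt_mutex *lock)
          {
                  /*
                   * Trylock from hard/softirq/NMI context cannot take part in
                   * PI boosting and risks deadlocking in the slowpath:
                   * warn and refuse the lock.
                   */
                  if (WARN_ON_ONCE(in_irq() || in_nmi() || in_serving_softirq()))
                          return 0;

                  return rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock);
          }
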
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      6ce47fd9
  14. 13 May 2015, 1 commit
    • locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG · cede8841
      Committed by Sebastian Andrzej Siewior
      The rtmutex code is the only user of __HAVE_ARCH_CMPXCHG, and we have a few
      other users of cmpxchg() which do not care about __HAVE_ARCH_CMPXCHG. This
      define was first introduced in 23f78d4a ("[PATCH] pi-futex: rt mutex core")
      which is v2.6.18. The generic cmpxchg was introduced later in 068fbad2
      ("Add cmpxchg_local to asm-generic for per cpu atomic operations") which is
      v2.6.25.
      Back then something was required to get rtmutex working with the fast
      path on architectures without cmpxchg and this seems to be the result.
      
      It popped up recently on rt-users because ARM (v6+) does not define
      __HAVE_ARCH_CMPXCHG (even though it implements it), which results in slower
      locking performance in the fast path.
      To put some numbers on it: preempt -RT, am335x, 10 loops of
      100000 invocations of rt_spin_lock() + rt_spin_unlock() (time "total" is
      the average of the 10 loops for the 100000 invocations, "loop" is
      "total / 100000 * 1000"):
      
           cmpxchg |    slowpath used  ||    cmpxchg used
                   |   total   | loop  ||   total    | loop
           --------|-----------|-------||------------|-------
           ARMv6   | 9129.4 us | 91 ns ||  3311.9 us |  33 ns
           generic | 9360.2 us | 94 ns || 10834.6 us | 108 ns
           ----------------------------||--------------------
      
      Forcing it to generic cmpxchg() made things worse for the slowpath and
      even worse in the cmpxchg() path. It boils down to 14 ns more per lock+unlock
      in a cache-hot loop, so it might not be that much in the real world.
      The last test was a substitute for a pre-ARMv6 machine, but then I was able
      to perform the comparison on imx28, which is ARMv5 and therefore always
      uses the generic cmpxchg implementation. And the numbers:
      
                    |   total     | loop
           -------- |-----------  |--------
           slowpath | 263937.2 us | 2639 ns
           cmpxchg  |  16934.2 us |  169 ns
           --------------------------------
      
      The numbers are larger since the machine is slower in general. However,
      letting rtmutex use cmpxchg() instead of the slowpath seems to improve things.
      
      Since, from the ARM (tested on am335x + imx28) point of view, always
      using cmpxchg() in rt_mutex_lock() + rt_mutex_unlock() makes sense, I
      would drop the define.
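
      The effect on the code is essentially that the fast-path cmpxchg helper
      loses its architecture guard, roughly as in this before/after sketch (not
      the verbatim diff):

          /* Before: fast path only on architectures advertising the define */
          #if defined(__HAVE_ARCH_CMPXCHG) && !defined(CONFIG_DEBUG_RT_MUTEXES)
          # define rt_mutex_cmpxchg(l, c, n)   (cmpxchg(&l->owner, c, n) == c)
          #else
          # define rt_mutex_cmpxchg(l, c, n)   (0)
          #endif

          /* After: always rely on cmpxchg(), generic or architecture specific */
          #ifndef CONFIG_DEBUG_RT_MUTEXES
          # define rt_mutex_cmpxchg(l, c, n)   (cmpxchg(&l->owner, c, n) == c)
          #else
          # define rt_mutex_cmpxchg(l, c, n)   (0)
          #endif
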
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: will.deacon@arm.com
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20150225175613.GE6823@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      cede8841
  15. 08 May 2015, 1 commit
  16. 22 Apr 2015, 1 commit
  17. 25 Mar 2015, 1 commit
  18. 01 Mar 2015, 1 commit
  19. 18 Feb 2015, 1 commit
  20. 04 Feb 2015, 1 commit
  21. 13 Aug 2014, 1 commit
    • locking/Documentation: Move locking related docs into Documentation/locking/ · 214e0aed
      Committed by Davidlohr Bueso
      Specifically:
        Documentation/locking/lockdep-design.txt
        Documentation/locking/lockstat.txt
        Documentation/locking/mutex-design.txt
        Documentation/locking/rt-mutex-design.txt
        Documentation/locking/rt-mutex.txt
        Documentation/locking/spinlocks.txt
        Documentation/locking/ww-mutex-design.txt
      Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: jason.low2@hp.com
      Cc: aswin@hp.com
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lubomir Rintel <lkundrak@v3.sk>
      Cc: Masanari Iida <standby24x7@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: fengguang.wu@intel.com
      Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      214e0aed
  22. 22 Jun 2014, 9 commits
  23. 16 Jun 2014, 1 commit
    • rtmutex: Plug slow unlock race · 27e35715
      Committed by Thomas Gleixner
      When the rtmutex fast path is enabled the slow unlock function can
      create the following situation:
      
      spin_lock(foo->m->wait_lock);
      foo->m->owner = NULL;
      	    			rt_mutex_lock(foo->m); <-- fast path
      				free = atomic_dec_and_test(foo->refcnt);
      				rt_mutex_unlock(foo->m); <-- fast path
      				if (free)
      				   kfree(foo);
      
      spin_unlock(foo->m->wait_lock); <--- Use after free.
      
      Plug the race by changing the slow unlock to the following scheme:
      
           while (!rt_mutex_has_waiters(m)) {
           	    /* Clear the waiters bit in m->owner */
      	    clear_rt_mutex_waiters(m);
            	    owner = rt_mutex_owner(m);
            	    spin_unlock(m->wait_lock);
            	    if (cmpxchg(m->owner, owner, 0) == owner)
            	       return;
            	    spin_lock(m->wait_lock);
           }
      
      So in case a new waiter comes in while the owner tries the slow
      path unlock, we have two situations:
      
       unlock(wait_lock);
      					lock(wait_lock);
       cmpxchg(p, owner, 0) == owner
       	    	   			mark_rt_mutex_waiters(lock);
      	 				acquire(lock);
      
      Or:
      
       unlock(wait_lock);
      					lock(wait_lock);
      	 				mark_rt_mutex_waiters(lock);
       cmpxchg(p, owner, 0) != owner
      					enqueue_waiter();
      					unlock(wait_lock);
       lock(wait_lock);
       wakeup_next waiter();
       unlock(wait_lock);
      					lock(wait_lock);
      					acquire(lock);
      
      If the fast path is disabled, then the simple
      
         m->owner = NULL;
         unlock(m->wait_lock);
      
      is sufficient, as all access to m->owner is serialized via
      m->wait_lock.
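
      Expressed as a C sketch of the slow unlock with the retry scheme above
      (helper names follow the pseudocode, not necessarily the exact functions
      in kernel/locking/rtmutex.c):

          static void rt_mutex_slowunlock_sketch(struct rt_mutex *lock)
          {
                  raw_spin_lock(&lock->wait_lock);

                  while (!rt_mutex_has_waiters(lock)) {
                          struct task_struct *owner = rt_mutex_owner(lock);

                          /* Clear the waiters bit while holding wait_lock ... */
                          clear_rt_mutex_waiters(lock);
                          raw_spin_unlock(&lock->wait_lock);

                          /*
                           * ... and only release ownership if no waiter showed
                           * up in between; otherwise retry with wait_lock held.
                           */
                          if (cmpxchg(&lock->owner, owner, NULL) == owner)
                                  return;

                          raw_spin_lock(&lock->wait_lock);
                  }

                  /* There are waiters: hand over via the regular slow path. */
                  wakeup_next_waiter(lock);
                  raw_spin_unlock(&lock->wait_lock);
          }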
      
      Also document and clarify the wakeup_next_waiter function as suggested
      by Oleg Nesterov.
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      27e35715
  24. 07 Jun 2014, 1 commit
    • rtmutex: Detect changes in the pi lock chain · 82084984
      Committed by Thomas Gleixner
      When we walk the lock chain, we drop all locks after each step. So the
      lock chain can change under us before we reacquire the locks. That's
      harmless in principle as we just follow the wrong lock path. But it
      can lead to a false positive in the deadlock detection logic:
      
      T0 holds L0
      T0 blocks on L1 held by T1
      T1 blocks on L2 held by T2
      T2 blocks on L3 held by T3
      T3 blocks on L4 held by T4
      
      Now we walk the chain
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> 
           lock T2 ->  adjust T2 ->  drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.
      
      Brad tried to work around that in the deadlock detection logic itself,
      but the more I looked at it the less I liked it, because it's crystal
      ball magic after the fact.
      
      We can actually detect a chain change very simply:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 -> 
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      So if we detect that T2 is now blocked on a different lock we stop the
      chain walk. That's also correct in the following scenario:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T3 times out and drops L3
      T2 acquires L3 and blocks on L4 now
      
      Now we continue:
      
      lock T2 -> 
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      We don't have to follow up the chain at that point, because T2
      propagated our priority up to T4 already.
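
      A hedged sketch of the revalidation step this describes; the helper and
      the surrounding function are simplified relative to
      rt_mutex_adjust_prio_chain():

          static struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
          {
                  return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
          }

          /* Called after the chain walk has reacquired task->pi_lock. */
          static bool chain_walk_still_valid(struct task_struct *task,
                                             struct rt_mutex *next_lock)
          {
                  /*
                   * next_lock was recorded, under task->pi_lock, right before
                   * all locks were dropped. If the task now blocks on a
                   * different lock, the chain changed under us: stop the walk,
                   * the priority has already been propagated.
                   */
                  return next_lock == task_blocked_on_lock(task);
          }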
      
      [ Folded a cleanup patch from peterz ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Brad Mouring <bmouring@ni.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de
      Cc: stable@vger.kernel.org
      82084984