1. 20 6月, 2015 2 次提交
  2. 19 6月, 2015 1 次提交
    • D
      locking/rtmutex: Implement lockless top-waiter wakeup · 45ab4eff
      Davidlohr Bueso 提交于
      Mark the task for later wakeup after the wait_lock has been released.
      This way, once the next task is awoken, it will have a better chance
      to of finding the wait_lock free when continuing executing in
      __rt_mutex_slowlock() when trying to acquire the rtmutex, calling
      try_to_take_rt_mutex(). Upon contended scenarios, other tasks attempting
      take the lock may acquire it first, right after the wait_lock is released,
      but (a) this can also occur with the current code, as it relies on the
      spinlock fairness, and (b) we are dealing with the top-waiter anyway,
      so it will always take the lock next.
      Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1432056298-18738-2-git-send-email-dave@stgolabs.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      45ab4eff
  3. 14 5月, 2015 1 次提交
    • T
      rtmutex: Warn if trylock is called from hard/softirq context · 6ce47fd9
      Thomas Gleixner 提交于
      rt_mutex_trylock() must be called from thread context. It can be
      called from atomic regions (preemption or interrupts disabled), but
      not from hard/softirq/nmi context. Add a warning to alert abusers.
      
      The reasons for this are:
      
          1) There is a potential deadlock in the slowpath
      
          2) Another cpu which blocks on the rtmutex will boost the task
             which allegedly locked the rtmutex, but that cannot work
             because the hard/softirq context borrows the task context.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      6ce47fd9
  4. 13 5月, 2015 1 次提交
    • S
      locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG · cede8841
      Sebastian Andrzej Siewior 提交于
      The rtmutex code is the only user of __HAVE_ARCH_CMPXCHG and we have a few
      other user of cmpxchg() which do not care about __HAVE_ARCH_CMPXCHG. This
      define was first introduced in 23f78d4a ("[PATCH] pi-futex: rt mutex core")
      which is v2.6.18. The generic cmpxchg was introduced later in 068fbad2
      ("Add cmpxchg_local to asm-generic for per cpu atomic operations") which is
      v2.6.25.
      Back then something was required to get rtmutex working with the fast
      path on architectures without cmpxchg and this seems to be the result.
      
      It popped up recently on rt-users because ARM (v6+) does not define
      __HAVE_ARCH_CMPXCHG (even that it implements it) which results in slower
      locking performance in the fast path.
      To put some numbers on it: preempt -RT, am335x, 10 loops of
      100000 invocations of rt_spin_lock() + rt_spin_unlock() (time "total" is
      the average of the 10 loops for the 100000 invocations, "loop" is
      "total / 100000 * 1000"):
      
           cmpxchg |    slowpath used  ||    cmpxchg used
                   |   total   | loop  ||   total    | loop
           --------|-----------|-------||------------|-------
           ARMv6   | 9129.4 us | 91 ns ||  3311.9 us |  33 ns
           generic | 9360.2 us | 94 ns || 10834.6 us | 108 ns
           ----------------------------||--------------------
      
      Forcing it to generic cmpxchg() made things worse for the slowpath and
      even worse in cmpxchg() path. It boils down to 14ns more per lock+unlock
      in a cache hot loop so it might not be that much in real world.
      The last test was a substitute for pre ARMv6 machine but then I was able
      to perform the comparison on imx28 which is ARMv5 and therefore is
      always is using the generic cmpxchg implementation. And the numbers:
      
                    |   total     | loop
           -------- |-----------  |--------
           slowpath | 263937.2 us | 2639 ns
           cmpxchg  |  16934.2 us |  169 ns
           --------------------------------
      
      The numbers are larger since the machine is slower in general. However,
      letting rtmutex use cmpxchg() instead the slowpath seem to improve things.
      
      Since from the ARM (tested on am335x + imx28) point of view always
      using cmpxchg() in rt_mutex_lock() + rt_mutex_unlock() makes sense I
      would drop the define.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: will.deacon@arm.com
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20150225175613.GE6823@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      cede8841
  5. 08 5月, 2015 1 次提交
  6. 22 4月, 2015 1 次提交
  7. 25 3月, 2015 1 次提交
  8. 01 3月, 2015 1 次提交
  9. 18 2月, 2015 1 次提交
  10. 04 2月, 2015 1 次提交
  11. 13 8月, 2014 1 次提交
    • D
      locking/Documentation: Move locking related docs into Documentation/locking/ · 214e0aed
      Davidlohr Bueso 提交于
      Specifically:
        Documentation/locking/lockdep-design.txt
        Documentation/locking/lockstat.txt
        Documentation/locking/mutex-design.txt
        Documentation/locking/rt-mutex-design.txt
        Documentation/locking/rt-mutex.txt
        Documentation/locking/spinlocks.txt
        Documentation/locking/ww-mutex-design.txt
      Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com>
      Acked-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: jason.low2@hp.com
      Cc: aswin@hp.com
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lubomir Rintel <lkundrak@v3.sk>
      Cc: Masanari Iida <standby24x7@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: fengguang.wu@intel.com
      Link: http://lkml.kernel.org/r/1406752916-3341-6-git-send-email-davidlohr@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      214e0aed
  12. 22 6月, 2014 9 次提交
  13. 16 6月, 2014 1 次提交
    • T
      rtmutex: Plug slow unlock race · 27e35715
      Thomas Gleixner 提交于
      When the rtmutex fast path is enabled the slow unlock function can
      create the following situation:
      
      spin_lock(foo->m->wait_lock);
      foo->m->owner = NULL;
      	    			rt_mutex_lock(foo->m); <-- fast path
      				free = atomic_dec_and_test(foo->refcnt);
      				rt_mutex_unlock(foo->m); <-- fast path
      				if (free)
      				   kfree(foo);
      
      spin_unlock(foo->m->wait_lock); <--- Use after free.
      
      Plug the race by changing the slow unlock to the following scheme:
      
           while (!rt_mutex_has_waiters(m)) {
           	    /* Clear the waiters bit in m->owner */
      	    clear_rt_mutex_waiters(m);
            	    owner = rt_mutex_owner(m);
            	    spin_unlock(m->wait_lock);
            	    if (cmpxchg(m->owner, owner, 0) == owner)
            	       return;
            	    spin_lock(m->wait_lock);
           }
      
      So in case of a new waiter incoming while the owner tries the slow
      path unlock we have two situations:
      
       unlock(wait_lock);
      					lock(wait_lock);
       cmpxchg(p, owner, 0) == owner
       	    	   			mark_rt_mutex_waiters(lock);
      	 				acquire(lock);
      
      Or:
      
       unlock(wait_lock);
      					lock(wait_lock);
      	 				mark_rt_mutex_waiters(lock);
       cmpxchg(p, owner, 0) != owner
      					enqueue_waiter();
      					unlock(wait_lock);
       lock(wait_lock);
       wakeup_next waiter();
       unlock(wait_lock);
      					lock(wait_lock);
      					acquire(lock);
      
      If the fast path is disabled, then the simple
      
         m->owner = NULL;
         unlock(m->wait_lock);
      
      is sufficient as all access to m->owner is serialized via
      m->wait_lock;
      
      Also document and clarify the wakeup_next_waiter function as suggested
      by Oleg Nesterov.
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      27e35715
  14. 07 6月, 2014 2 次提交
    • T
      rtmutex: Detect changes in the pi lock chain · 82084984
      Thomas Gleixner 提交于
      When we walk the lock chain, we drop all locks after each step. So the
      lock chain can change under us before we reacquire the locks. That's
      harmless in principle as we just follow the wrong lock path. But it
      can lead to a false positive in the dead lock detection logic:
      
      T0 holds L0
      T0 blocks on L1 held by T1
      T1 blocks on L2 held by T2
      T2 blocks on L3 held by T3
      T4 blocks on L4 held by T4
      
      Now we walk the chain
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> 
           lock T2 ->  adjust T2 ->  drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.
      
      Brad tried to work around that in the deadlock detection logic itself,
      but the more I looked at it the less I liked it, because it's crystal
      ball magic after the fact.
      
      We actually can detect a chain change very simple:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 -> 
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      So if we detect that T2 is now blocked on a different lock we stop the
      chain walk. That's also correct in the following scenario:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T3 times out and drops L3
      T2 acquires L3 and blocks on L4 now
      
      Now we continue:
      
      lock T2 -> 
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      We don't have to follow up the chain at that point, because T2
      propagated our priority up to T4 already.
      
      [ Folded a cleanup patch from peterz ]
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Reported-by: NBrad Mouring <bmouring@ni.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de
      Cc: stable@vger.kernel.org
      82084984
    • T
      rtmutex: Handle deadlock detection smarter · 3d5c9340
      Thomas Gleixner 提交于
      Even in the case when deadlock detection is not requested by the
      caller, we can detect deadlocks. Right now the code stops the lock
      chain walk and keeps the waiter enqueued, even on itself. Silly not to
      yell when such a scenario is detected and to keep the waiter enqueued.
      
      Return -EDEADLK unconditionally and handle it at the call sites.
      
      The futex calls return -EDEADLK. The non futex ones dequeue the
      waiter, throw a warning and put the task into a schedule loop.
      
      Tagged for stable as it makes the code more robust.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brad Mouring <bmouring@ni.com>
      Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      3d5c9340
  15. 28 5月, 2014 1 次提交
    • T
      rtmutex: Fix deadlock detector for real · 397335f0
      Thomas Gleixner 提交于
      The current deadlock detection logic does not work reliably due to the
      following early exit path:
      
      	/*
      	 * Drop out, when the task has no waiters. Note,
      	 * top_waiter can be NULL, when we are in the deboosting
      	 * mode!
      	 */
      	if (top_waiter && (!task_has_pi_waiters(task) ||
      			   top_waiter != task_top_pi_waiter(task)))
      		goto out_unlock_pi;
      
      So this not only exits when the task has no waiters, it also exits
      unconditionally when the current waiter is not the top priority waiter
      of the task.
      
      So in a nested locking scenario, it might abort the lock chain walk
      and therefor miss a potential deadlock.
      
      Simple fix: Continue the chain walk, when deadlock detection is
      enabled.
      
      We also avoid the whole enqueue, if we detect the deadlock right away
      (A-A). It's an optimization, but also prevents that another waiter who
      comes in after the detection and before the task has undone the damage
      observes the situation and detects the deadlock and returns
      -EDEADLOCK, which is wrong as the other task is not in a deadlock
      situation.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
      397335f0
  16. 23 2月, 2014 1 次提交
  17. 13 1月, 2014 2 次提交
    • D
      sched/deadline: Add SCHED_DEADLINE inheritance logic · 2d3d891d
      Dario Faggioli 提交于
      Some method to deal with rt-mutexes and make sched_dl interact with
      the current PI-coded is needed, raising all but trivial issues, that
      needs (according to us) to be solved with some restructuring of
      the pi-code (i.e., going toward a proxy execution-ish implementation).
      
      This is under development, in the meanwhile, as a temporary solution,
      what this commits does is:
      
       - ensure a pi-lock owner with waiters is never throttled down. Instead,
         when it runs out of runtime, it immediately gets replenished and it's
         deadline is postponed;
      
       - the scheduling parameters (relative deadline and default runtime)
         used for that replenishments --during the whole period it holds the
         pi-lock-- are the ones of the waiting task with earliest deadline.
      
      Acting this way, we provide some kind of boosting to the lock-owner,
      still by using the existing (actually, slightly modified by the previous
      commit) pi-architecture.
      
      We would stress the fact that this is only a surely needed, all but
      clean solution to the problem. In the end it's only a way to re-start
      discussion within the community. So, as always, comments, ideas, rants,
      etc.. are welcome! :-)
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      [ Added !RT_MUTEXES build fix. ]
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      2d3d891d
    • P
      rtmutex: Turn the plist into an rb-tree · fb00aca4
      Peter Zijlstra 提交于
      Turn the pi-chains from plist to rb-tree, in the rt_mutex code,
      and provide a proper comparison function for -deadline and
      -priority tasks.
      
      This is done mainly because:
       - classical prio field of the plist is just an int, which might
         not be enough for representing a deadline;
       - manipulating such a list would become O(nr_deadline_tasks),
         which might be to much, as the number of -deadline task increases.
      
      Therefore, an rb-tree is used, and tasks are queued in it according
      to the following logic:
       - among two -priority (i.e., SCHED_BATCH/OTHER/RR/FIFO) tasks, the
         one with the higher (lower, actually!) prio wins;
       - among a -priority and a -deadline task, the latter always wins;
       - among two -deadline tasks, the one with the earliest deadline
         wins.
      
      Queueing and dequeueing functions are changed accordingly, for both
      the list of a task's pi-waiters and the list of tasks blocked on
      a pi-lock.
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NDario Faggioli <raistlin@linux.it>
      Signed-off-by: NJuri Lelli <juri.lelli@gmail.com>
      Signed-off-again-by: NPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-10-git-send-email-juri.lelli@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      fb00aca4
  18. 06 11月, 2013 1 次提交
  19. 28 5月, 2013 1 次提交
  20. 08 2月, 2013 1 次提交
  21. 12 12月, 2011 1 次提交
  22. 31 10月, 2011 1 次提交
  23. 29 9月, 2011 1 次提交
    • P
      rcu: Permit rt_mutex_unlock() with irqs disabled · 5342e269
      Paul E. McKenney 提交于
      Create a separate lockdep class for the rt_mutex used for RCU priority
      boosting and enable use of rt_mutex_lock() with irqs disabled.  This
      prevents RCU priority boosting from falling prey to deadlocks when
      someone begins an RCU read-side critical section in preemptible state,
      but releases it with an irq-disabled lock held.
      
      Unfortunately, the scheduler's runqueue and priority-inheritance locks
      still must either completely enclose or be completely enclosed by any
      overlapping RCU read-side critical section.
      
      This version removes a redundant local_irq_restore() noted by
      Yong Zhang.
      Signed-off-by: NPaul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      5342e269
  24. 08 7月, 2011 1 次提交
  25. 28 1月, 2011 1 次提交
    • L
      rtmutex: Simplify PI algorithm and make highest prio task get lock · 8161239a
      Lai Jiangshan 提交于
      In current rtmutex, the pending owner may be boosted by the tasks
      in the rtmutex's waitlist when the pending owner is deboosted
      or a task in the waitlist is boosted. This boosting is unrelated,
      because the pending owner does not really take the rtmutex.
      It is not reasonable.
      
      Example.
      
      time1:
      A(high prio) onwers the rtmutex.
      B(mid prio) and C (low prio) in the waitlist.
      
      time2
      A release the lock, B becomes the pending owner
      A(or other high prio task) continues to run. B's prio is lower
      than A, so B is just queued at the runqueue.
      
      time3
      A or other high prio task sleeps, but we have passed some time
      The B and C's prio are changed in the period (time2 ~ time3)
      due to boosting or deboosting. Now C has the priority higher
      than B. ***Is it reasonable that C has to boost B and help B to
      get the rtmutex?
      
      NO!! I think, it is unrelated/unneed boosting before B really
      owns the rtmutex. We should give C a chance to beat B and
      win the rtmutex.
      
      This is the motivation of this patch. This patch *ensures*
      only the top waiter or higher priority task can take the lock.
      
      How?
      1) we don't dequeue the top waiter when unlock, if the top waiter
         is changed, the old top waiter will fail and go to sleep again.
      2) when requiring lock, it will get the lock when the lock is not taken and:
         there is no waiter OR higher priority than waiters OR it is top waiter.
      3) In any time, the top waiter is changed, the top waiter will be woken up.
      
      The algorithm is much simpler than before, no pending owner, no
      boosting for pending owner.
      
      Other advantage of this patch:
      1) The states of a rtmutex are reduced a half, easier to read the code.
      2) the codes become shorter.
      3) top waiter is not dequeued until it really take the lock:
         they will retain FIFO when it is stolen.
      
      Not advantage nor disadvantage
      1) Even we may wakeup multiple waiters(any time when top waiter changed),
         we hardly cause "thundering herd",
         the number of wokenup task is likely 1 or very little.
      2) two APIs are changed.
         rt_mutex_owner() will not return pending owner, it will return NULL when
                          the top waiter is going to take the lock.
         rt_mutex_next_owner() always return the top waiter.
      	                 will not return NULL if we have waiters
                               because the top waiter is not dequeued.
      
         I have fixed the code that use these APIs.
      
      need updated after this patch is accepted
      1) Document/*
      2) the testcase scripts/rt-tester/t4-l2-pi-deboost.tst
      Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4D3012D5.4060709@cn.fujitsu.com>
      Reviewed-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      8161239a
  26. 15 12月, 2009 2 次提交
  27. 06 8月, 2009 1 次提交
    • D
      rtmutex: Avoid deadlock in rt_mutex_start_proxy_lock() · 1bbf2083
      Darren Hart 提交于
      In the event of a lock steal or owner died,
      rt_mutex_start_proxy_lock() will give the rt_mutex to the
      waiting task, but it fails to release the wait_lock. This leads
      to subsequent deadlocks when other tasks try to acquire the
      rt_mutex.
      
      I also removed a few extra blank lines that really spaced this
      routine out. I must have been high on the \n when I wrote this
      originally...
      Signed-off-by: NDarren Hart <dvhltc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Dinakar Guniguntala <dino@in.ibm.com>
      Cc: John Stultz <johnstul@linux.vnet.ibm.com>
      LKML-Reference: <4A79D7F1.4000405@us.ibm.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      1bbf2083
  28. 13 6月, 2009 1 次提交