1. 24 2月, 2015 1 次提交
  2. 18 2月, 2015 3 次提交
    • D
      locking/rwsem: Set lock ownership ASAP · 7a215f89
      Davidlohr Bueso 提交于
      In order to optimize the spinning step, we need to set the lock
      owner as soon as the lock is acquired; after a successful counter
      cmpxchg operation, that is. This is particularly useful as rwsems
      need to set the owner to nil for readers, so there is a greater
      chance of falling out of the spinning. Currently we only set the
      owner much later in the game, in the more generic level -- latency
      can be specially bad when waiting for a node->next pointer when
      releasing the osq in up_write calls.
      
      As such, update the owner inside rwsem_try_write_lock (when the
      lock is obtained after blocking) and rwsem_try_write_lock_unqueued
      (when the lock is obtained while spinning). This requires creating
      a new internal rwsem.h header to share the owner related calls.
      
      Also cleanup some headers for mutex and rwsem.
      Suggested-by: NPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/1422609267-15102-4-git-send-email-dave@stgolabs.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7a215f89
    • J
      locking/mutex: Refactor mutex_spin_on_owner() · be1f7bf2
      Jason Low 提交于
      As suggested by Davidlohr, we could refactor mutex_spin_on_owner().
      
      Currently, we split up owner_running() with mutex_spin_on_owner().
      When the owner changes, we make duplicate owner checks which are not
      necessary. It also makes the code a bit obscure as we are using a
      second check to figure out why we broke out of the loop.
      
      This patch modifies it such that we remove the owner_running() function
      and the mutex_spin_on_owner() loop directly checks for if the owner changes,
      if the owner is not running, or if we need to reschedule. If the owner
      changes, we break out of the loop and return true. If the owner is not
      running or if we need to reschedule, then break out of the loop and return
      false.
      Suggested-by: NDavidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: NJason Low <jason.low2@hp.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: chegu_vinod@hp.com
      Cc: tglx@linutronix.de
      Link: http://lkml.kernel.org/r/1422914367-5574-3-git-send-email-jason.low2@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      be1f7bf2
    • J
      locking/mutex: In mutex_spin_on_owner(), return true when owner changes · 07d2413a
      Jason Low 提交于
      In the mutex_spin_on_owner(), we return true only if lock->owner == NULL.
      This was beneficial in situations where there were multiple threads
      simultaneously spinning for the mutex. If another thread got the lock
      while other spinner(s) were also doing mutex_spin_on_owner(), then the
      other spinners would stop spinning. This workaround helped reduce the
      chance that many spinners were simultaneously spinning for the mutex
      which can help reduce contention in highly contended cases.
      
      However, recent changes were made to the optimistic spinning code such
      that instead of having all spinners simultaneously spin for the mutex,
      we queue the spinners with an MCS lock such that only one thread spins
      for the mutex at a time. Furthermore, the OSQ optimizations ensure that
      spinners in the queue will stop waiting if it needs to reschedule.
      
      Now, we don't have to worry about multiple threads spinning on owner
      at the same time, and if lock->owner is not NULL at this point, it likely
      means another thread happens to obtain the lock in the fastpath. In this
      case, it would make sense for the spinner to continue spinning as long
      as the spinner doesn't need to schedule and the mutex owner is running.
      
      This patch changes this so that mutex_spin_on_owner() returns true when
      the lock owner changes, which means a thread will only stop spinning
      if it either needs to reschedule or if the lock owner is not running.
      
      We saw up to a 5% performance improvement in the fserver workload with
      this patch.
      Signed-off-by: NJason Low <jason.low2@hp.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NDavidlohr Bueso <dave@stgolabs.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: chegu_vinod@hp.com
      Cc: tglx@linutronix.de
      Link: http://lkml.kernel.org/r/1422914367-5574-2-git-send-email-jason.low2@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      07d2413a
  3. 04 2月, 2015 2 次提交
  4. 14 1月, 2015 3 次提交
  5. 28 10月, 2014 1 次提交
    • P
      locking/mutex: Don't assume TASK_RUNNING · 6f942a1f
      Peter Zijlstra 提交于
      We're going to make might_sleep() test for TASK_RUNNING, because
      blocking without TASK_RUNNING will destroy the task state by setting
      it to TASK_RUNNING.
      
      There are a few occasions where its 'valid' to call blocking
      primitives (and mutex_lock in particular) and not have TASK_RUNNING,
      typically such cases are right before we set TASK_RUNNING anyhow.
      
      Robustify the code by not assuming this; this has the beneficial side
      effect of allowing optional code emission for fixing the above
      might_sleep() false positives.
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: tglx@linutronix.de
      Cc: ilya.dryomov@inktank.com
      Cc: umgwanakikbuti@gmail.com
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140924082241.988560063@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6f942a1f
  6. 13 8月, 2014 4 次提交
  7. 17 7月, 2014 1 次提交
    • D
      arch, locking: Ciao arch_mutex_cpu_relax() · 3a6bfbc9
      Davidlohr Bueso 提交于
      The arch_mutex_cpu_relax() function, introduced by 34b133f8, is
      hacky and ugly. It was added a few years ago to address the fact
      that common cpu_relax() calls include yielding on s390, and thus
      impact the optimistic spinning functionality of mutexes. Nowadays
      we use this function well beyond mutexes: rwsem, qrwlock, mcs and
      lockref. Since the macro that defines the call is in the mutex header,
      any users must include mutex.h and the naming is misleading as well.
      
      This patch (i) renames the call to cpu_relax_lowlatency  ("relax, but
      only if you can do it with very low latency") and (ii) defines it in
      each arch's asm/processor.h local header, just like for regular cpu_relax
      functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
      and thus we can take it out of mutex.h. While this can seem redundant,
      I believe it is a good choice as it allows us to move out arch specific
      logic from generic locking primitives and enables future(?) archs to
      transparently define it, similarly to System Z.
      Signed-off-by: NDavidlohr Bueso <davidlohr@hp.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bharat Bhushan <r65777@freescale.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
      Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Joseph Myers <joseph@codesourcery.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Nicolas Pitre <nico@linaro.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: Qiaowei Ren <qiaowei.ren@intel.com>
      Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Stratos Karafotis <stratosk@semaphore.gr>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vasily Kulikov <segoon@openwall.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Wolfram Sang <wsa@the-dreams.de>
      Cc: adi-buildroot-devel@lists.sourceforge.net
      Cc: linux390@de.ibm.com
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-am33-list@redhat.com
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-c6x-dev@linux-c6x.org
      Cc: linux-cris-kernel@axis.com
      Cc: linux-hexagon@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux@lists.openrisc.net
      Cc: linux-m32r-ja@ml.linux-m32r.org
      Cc: linux-m32r@ml.linux-m32r.org
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: linux-metag@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-sh@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      3a6bfbc9
  8. 16 7月, 2014 2 次提交
    • J
      locking/spinlocks/mcs: Introduce and use init macro and function for osq locks · 4d9d951e
      Jason Low 提交于
      Currently, we initialize the osq lock by directly setting the lock's values. It
      would be preferable if we use an init macro to do the initialization like we do
      with other locks.
      
      This patch introduces and uses a macro and function for initializing the osq lock.
      Signed-off-by: NJason Low <jason.low2@hp.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Scott Norton <scott.norton@hp.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Waiman Long <waiman.long@hp.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Link: http://lkml.kernel.org/r/1405358872-3732-4-git-send-email-jason.low2@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      4d9d951e
    • J
      locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead · 90631822
      Jason Low 提交于
      The cancellable MCS spinlock is currently used to queue threads that are
      doing optimistic spinning. It uses per-cpu nodes, where a thread obtaining
      the lock would access and queue the local node corresponding to the CPU that
      it's running on. Currently, the cancellable MCS lock is implemented by using
      pointers to these nodes.
      
      In this patch, instead of operating on pointers to the per-cpu nodes, we
      store the CPU numbers in which the per-cpu nodes correspond to in atomic_t.
      A similar concept is used with the qspinlock.
      
      By operating on the CPU # of the nodes using atomic_t instead of pointers
      to those nodes, this can reduce the overhead of the cancellable MCS spinlock
      by 32 bits (on 64 bit systems).
      Signed-off-by: NJason Low <jason.low2@hp.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Scott Norton <scott.norton@hp.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Waiman Long <waiman.long@hp.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Josef Bacik <jbacik@fusionio.com>
      Link: http://lkml.kernel.org/r/1405358872-3732-3-git-send-email-jason.low2@hp.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      90631822
  9. 05 7月, 2014 4 次提交
  10. 12 3月, 2014 1 次提交
  11. 11 3月, 2014 6 次提交
  12. 14 2月, 2014 1 次提交
  13. 28 1月, 2014 2 次提交
  14. 11 11月, 2013 1 次提交
  15. 06 11月, 2013 1 次提交
  16. 19 10月, 2013 1 次提交
  17. 31 7月, 2013 1 次提交
  18. 26 7月, 2013 1 次提交
  19. 23 7月, 2013 1 次提交
    • D
      mutex: Do not unnecessarily deal with waiters · ec83f425
      Davidlohr Bueso 提交于
      Upon entering the slowpath, we immediately attempt to acquire
      the lock by checking if it is already unlocked. If we are lucky
      enough that this is the case, then we don't need to deal with
      any waiter related logic.
      
      Furthermore any checks for an empty wait_list are unnecessary as
      we already know that count is non-negative and hence no one is
      waiting for the lock.
      
      Move the count check and xchg calls to be done before any
      waiters are setup - including waiter debugging. Upon failure to
      acquire the lock, the xchg sets the counter to 0, instead of -1
      as it was originally. This can be done here since we set it back
      to -1 right at the beginning of the loop so other waiters are
      woken up when the lock is released.
      
      When tested on a 8-socket (80 core) system against a vanilla
      3.10-rc1 kernel, this patch provides some small performance
      benefits (+2-6%). While it could be considered in the noise
      level, the average percentages were stable across multiple runs
      and no performance regressions were seen. Two big winners, for
      small amounts of users (10-100), were the short and compute
      workloads had a +19.36% and +%15.76% in jobs per minute.
      
      Also change some break statements to 'goto slowpath', which IMO
      makes a little more intuitive to read.
      Signed-off-by: NDavidlohr Bueso <davidlohr.bueso@hp.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Acked-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1372450398.2106.1.camel@buesod1.americas.hpqcorp.netSigned-off-by: NIngo Molnar <mingo@kernel.org>
      ec83f425
  20. 22 7月, 2013 1 次提交
  21. 12 7月, 2013 1 次提交
  22. 26 6月, 2013 1 次提交
    • D
      mutex: Add w/w mutex slowpath debugging · 23010027
      Daniel Vetter 提交于
      Injects EDEADLK conditions at pseudo-random interval, with
      exponential backoff up to UINT_MAX (to ensure that every lock
      operation still completes in a reasonable time).
      
      This way we can test the wound slowpath even for ww mutex users
      where contention is never expected, and the ww deadlock
      avoidance algorithm is only needed for correctness against
      malicious userspace. An example would be protecting kernel
      modesetting properties, which thanks to single-threaded X isn't
      really expected to contend, ever.
      
      I've looked into using the CONFIG_FAULT_INJECTION
      infrastructure, but decided against it for two reasons:
      
      - EDEADLK handling is mandatory for ww mutex users and should
        never affect the outcome of a syscall. This is in contrast to -ENOMEM
        injection. So fine configurability isn't required.
      
      - The fault injection framework only allows to set a simple
        probability for failure. Now the probability that a ww mutex acquire
        stage with N locks will never complete (due to too many injected
        EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock
        operations for the completely uncontended case would be O(exp(N)).
        The per-acuiqire ctx exponential backoff solution choosen here only
        results in O(log N) overhead due to injection and so O(log N * N)
        lock operations. This way we can fail with high probability (and so
        have good test coverage even for fancy backoff and lock acquisition
        paths) without running into patalogical cases.
      
      Note that EDEADLK will only ever be injected when we managed to
      acquire the lock. This prevents any behaviour changes for users
      which rely on the EALREADY semantics.
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@canonical.com>
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: dri-devel@lists.freedesktop.org
      Cc: linaro-mm-sig@lists.linaro.org
      Cc: rostedt@goodmis.org
      Cc: daniel@ffwll.ch
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20130620113117.4001.21681.stgit@patserSigned-off-by: NIngo Molnar <mingo@kernel.org>
      23010027