1. 10 Sep 2018 (1 commit)
  2. 03 Jul 2018 (2 commits)
    • locking: Implement an algorithm choice for Wound-Wait mutexes · 08295b3b
      Committed by Thomas Hellstrom
      The current Wound-Wait mutex algorithm is actually not Wound-Wait but
      Wait-Die. Implement Wound-Wait as well, as a per-ww-class choice.
      Wound-Wait is, contrary to Wait-Die, a preemptive algorithm and is
      known to generate fewer backoffs. Testing shows this holds when the
      number of simultaneous contending transactions is small. As the number
      of simultaneous contending threads increases, Wound-Wait becomes
      inferior to Wait-Die in terms of elapsed time, possibly due to the
      larger number of locks held by sleeping transactions.
      
      Update documentation and callers.
      
      Timings using git://people.freedesktop.org/~thomash/ww_mutex_test
      tag patch-18-06-15
      
      Each thread runs 100000 batches of lock/unlock on 800 ww mutexes
      randomly chosen out of 100000, on a four-core Intel x86_64 machine:
      
      Algorithm    #threads       Rollbacks  time
      Wound-Wait   4              ~100       ~17s.
      Wait-Die     4              ~150000    ~19s.
      Wound-Wait   16             ~360000    ~109s.
      Wait-Die     16             ~450000    ~82s.
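The difference between the two policies can be sketched in a few lines of C (an illustrative toy, not the kernel implementation; the struct and function names are made up). A lower stamp means an older transaction:

```c
#include <stdbool.h>

/* Hypothetical sketch of the two back-off policies; not kernel code.
 * A lower stamp means an older (higher-priority) transaction. */
struct txn { unsigned long stamp; };

/* Wait-Die: a contending waiter YOUNGER than the holder backs off
 * (dies); an older waiter simply sleeps until the lock is free. */
static bool wait_die_waiter_backs_off(const struct txn *waiter,
                                      const struct txn *holder)
{
    return waiter->stamp > holder->stamp;
}

/* Wound-Wait: a contending waiter OLDER than the holder wounds the
 * holder, forcing it to drop its locks (preemption); a younger
 * waiter sleeps. */
static bool wound_wait_holder_backs_off(const struct txn *waiter,
                                        const struct txn *holder)
{
    return waiter->stamp < holder->stamp;
}
```

In both schemes the younger transaction is the one that ultimately rolls back; they differ in whether the waiter backs off voluntarily (Wait-Die) or the running lock holder is preempted (Wound-Wait), which is why Wound-Wait counts as preemptive.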
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-doc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Co-authored-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      08295b3b
    • locking: WW mutex cleanup · 55f036ca
      Committed by Peter Zijlstra
      Make the WW mutex code more readable by adding comments, splitting up
      functions and pointing out that we're actually using the Wait-Die
      algorithm.
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Gustavo Padovan <gustavo@padovan.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <seanpaul@chromium.org>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: linux-doc@vger.kernel.org
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      55f036ca
  3. 04 May 2018 (1 commit)
    • locking/mutex: Optimize __mutex_trylock_fast() · c427f695
      Committed by Peter Zijlstra
      Use try_cmpxchg() to avoid the pointless TEST instruction.
      Also add the (missing) atomic_long_try_cmpxchg*() wrappers.
      
      On x86_64 this gives:
      
      0000000000000710 <mutex_lock>:						0000000000000710 <mutex_lock>:
       710:   65 48 8b 14 25 00 00    mov    %gs:0x0,%rdx                      710:   65 48 8b 14 25 00 00    mov    %gs:0x0,%rdx
       717:   00 00                                                            717:   00 00
                              715: R_X86_64_32S       current_task                                    715: R_X86_64_32S       current_task
       719:   31 c0                   xor    %eax,%eax                         719:   31 c0                   xor    %eax,%eax
       71b:   f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)                 71b:   f0 48 0f b1 17          lock cmpxchg %rdx,(%rdi)
       720:   48 85 c0                test   %rax,%rax                         720:   75 02                   jne    724 <mutex_lock+0x14>
       723:   75 02                   jne    727 <mutex_lock+0x17>             722:   f3 c3                   repz retq
       725:   f3 c3                   repz retq                                724:   eb da                   jmp    700 <__mutex_lock_slowpath>
       727:   eb d7                   jmp    700 <__mutex_lock_slowpath>       726:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
       729:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)                         72d:   00 00 00
      
      On ARM64 this gives:
      
      0000000000000638 <mutex_lock>:						0000000000000638 <mutex_lock>:
           638:       d5384101        mrs     x1, sp_el0                           638:       d5384101        mrs     x1, sp_el0
           63c:       d2800002        mov     x2, #0x0                             63c:       d2800002        mov     x2, #0x0
           640:       f9800011        prfm    pstl1strm, [x0]                      640:       f9800011        prfm    pstl1strm, [x0]
           644:       c85ffc03        ldaxr   x3, [x0]                             644:       c85ffc03        ldaxr   x3, [x0]
           648:       ca020064        eor     x4, x3, x2                           648:       ca020064        eor     x4, x3, x2
           64c:       b5000064        cbnz    x4, 658 <mutex_lock+0x20>            64c:       b5000064        cbnz    x4, 658 <mutex_lock+0x20>
           650:       c8047c01        stxr    w4, x1, [x0]                         650:       c8047c01        stxr    w4, x1, [x0]
           654:       35ffff84        cbnz    w4, 644 <mutex_lock+0xc>             654:       35ffff84        cbnz    w4, 644 <mutex_lock+0xc>
           658:       b40000c3        cbz     x3, 670 <mutex_lock+0x38>            658:       b5000043        cbnz    x3, 660 <mutex_lock+0x28>
           65c:       a9bf7bfd        stp     x29, x30, [sp,#-16]!                 65c:       d65f03c0        ret
           660:       910003fd        mov     x29, sp                              660:       a9bf7bfd        stp     x29, x30, [sp,#-16]!
           664:       97ffffef        bl      620 <__mutex_lock_slowpath>          664:       910003fd        mov     x29, sp
           668:       a8c17bfd        ldp     x29, x30, [sp],#16                   668:       97ffffee        bl      620 <__mutex_lock_slowpath>
           66c:       d65f03c0        ret                                          66c:       a8c17bfd        ldp     x29, x30, [sp],#16
           670:       d65f03c0        ret                                          670:       d65f03c0        ret
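The effect can be reproduced in a userspace sketch with the GCC/Clang builtin: the compare-exchange returns the success boolean directly, so the compiler can branch on the flags left by the LOCK CMPXCHG instead of emitting a separate TEST. This mirrors the idea only; the kernel's __mutex_trylock_fast() and the atomic_long_try_cmpxchg() wrappers differ in detail:

```c
#include <stdbool.h>

/* Userspace analogue of the fast-path trylock: 0UL means unlocked,
 * a non-zero "task pointer" value marks the owner. The builtin's
 * return value IS the comparison result, which lets the compiler
 * branch straight on the cmpxchg condition flags. */
static bool trylock_fast(unsigned long *owner, unsigned long curr)
{
    unsigned long expected = 0UL;
    return __atomic_compare_exchange_n(owner, &expected, curr,
                                       false, __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED);
}
```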
      Reported-by: Matthew Wilcox <mawilcox@microsoft.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c427f695
  4. 20 Mar 2018 (1 commit)
  5. 16 May 2017 (1 commit)
    • mutex, futex: adjust kernel-doc markups to generate ReST · 7b4ff1ad
      Committed by Mauro Carvalho Chehab
      There are a few issues in some kernel-doc markups that were causing
      trouble with kernel-doc output in ReST format:
      
      ./kernel/futex.c:492: WARNING: Inline emphasis start-string without end-string.
      ./kernel/futex.c:1264: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./kernel/futex.c:1721: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./kernel/futex.c:2338: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./kernel/futex.c:2426: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./kernel/futex.c:2899: WARNING: Block quote ends without a blank line; unexpected unindent.
      ./kernel/futex.c:2972: WARNING: Block quote ends without a blank line; unexpected unindent.
      
      Fix them.
      
      No functional changes.
      Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      7b4ff1ad
  6. 02 Mar 2017 (3 commits)
  7. 30 Jan 2017 (1 commit)
  8. 14 Jan 2017 (14 commits)
    • locking/mutex, sched/wait: Add mutex_lock_io() · 1460cb65
      Committed by Tejun Heo
      We sometimes end up propagating IO blocking through mutexes; however,
      because there is currently no way of annotating mutex sleeps as
      iowait, there are cases where iowait and /proc/stat:procs_blocked
      report misleading numbers, obscuring the actual state of the system.
      
      This patch adds mutex_lock_io() so that mutex sleeps can be marked as
      iowait in those cases.
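The shape of the helper is a prepare/finish bracket around the blocking acquisition. Below is a hedged userspace sketch of that pattern only; the flag and toy lock stand in for the task's iowait state and the real mutex:

```c
#include <stdbool.h>

/* Userspace sketch of the annotation pattern: the kernel helper
 * brackets the blocking mutex_lock() so that any sleep inside is
 * accounted as iowait. The flag and toy trylock below are stand-ins
 * for task state and the real mutex. */
static bool in_iowait;              /* stands in for the task's iowait state */

static bool toy_lock(int *lock)     /* stands in for mutex_lock() */
{
    if (*lock)
        return false;               /* contended: would sleep here */
    *lock = 1;
    return true;
}

static bool lock_io(int *lock)
{
    bool saved = in_iowait;         /* "prepare": remember old state */
    in_iowait = true;               /* sleeps now count as iowait */
    bool ok = toy_lock(lock);
    in_iowait = saved;              /* "finish": restore annotation */
    return ok;
}
```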
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: adilger.kernel@dilger.ca
      Cc: jack@suse.com
      Cc: kernel-team@fb.com
      Cc: mingbo@fb.com
      Cc: tytso@mit.edu
      Link: http://lkml.kernel.org/r/1477673892-28940-4-git-send-email-tj@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1460cb65
    • locking/mutex: Initialize mutex_waiter::ww_ctx with poison when debugging · 977625a6
      Committed by Nicolai Hähnle
      Help catch cases where mutex_lock() is used directly on w/w mutexes,
      which would otherwise result in the w/w tasks reading uninitialized
      data.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-12-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      977625a6
    • locking/ww_mutex: Optimize ww-mutexes by yielding to other waiters from optimistic spin · c516df97
      Committed by Nicolai Hähnle
      Lock stealing is less beneficial for w/w mutexes since we may just end up
      backing off if we stole from a thread with an earlier acquire stamp that
      already holds another w/w mutex that we also need. So don't spin
      optimistically unless we are sure that there is no other waiter that might
      cause us to back off.
      
      Median timings taken of a contention-heavy GPU workload:
      
      Before:
      
        real    0m52.946s
        user    0m7.272s
        sys     1m55.964s
      
      After:
      
        real    0m53.086s
        user    0m7.360s
        sys     1m46.204s
      
      This particular workload still spends 20%-25% of CPU in mutex_spin_on_owner
      according to perf, but my attempts to further reduce this spinning based on
      various heuristics all lead to an increase in measured wall time despite
      the decrease in sys time.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-11-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c516df97
    • locking/ww_mutex: Re-check ww->ctx in the inner optimistic spin loop · 25f13b40
      Committed by Nicolai Hähnle
      In the following scenario, thread #1 should back off its attempt to lock
      ww1 and unlock ww2 (assuming the acquire context stamps are ordered
      accordingly).
      
          Thread #0               Thread #1
          ---------               ---------
                                  successfully lock ww2
          set ww1->base.owner
                                  attempt to lock ww1
                                  confirm ww1->ctx == NULL
                                  enter mutex_spin_on_owner
          set ww1->ctx
      
      What was likely to happen previously is:
      
          attempt to lock ww2
          refuse to spin because
            ww2->ctx != NULL
          schedule()
                                  detect thread #0 is off CPU
                                  stop optimistic spin
                                  return -EDEADLK
                                  unlock ww2
                                  wakeup thread #0
          lock ww2
      
      Now, we are more likely to see:
      
                                  detect ww1->ctx != NULL
                                  stop optimistic spin
                                  return -EDEADLK
                                  unlock ww2
          successfully lock ww2
      
      ... because thread #1 will stop its optimistic spin as soon as possible.
      
      The whole scenario is quite unlikely, since it requires thread #1 to get
      between thread #0 setting the owner and setting the ctx. But since we're
      idling here anyway, the additional check is basically free.
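The added check amounts to one extra condition in the spin loop. A toy version of the decision (all names hypothetical, heavily simplified from the kernel's spin-on-owner loop):

```c
#include <stdbool.h>

/* Simplified sketch of the spin-loop condition. Previously the
 * spinner only watched whether the owner stayed on-CPU; the patch
 * additionally watches ww->ctx, so the -EDEADLK back-off happens as
 * soon as the holder publishes a context instead of waiting for the
 * owner to schedule out. Types are toy stand-ins. */
struct toy_ww { void *ctx; };

static bool keep_spinning(bool owner_on_cpu, const struct toy_ww *ww,
                          bool we_have_ctx)
{
    if (we_have_ctx && ww->ctx)
        return false;       /* new check: back off immediately */
    return owner_on_cpu;    /* old check: spin while owner runs */
}
```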
      
      Found by inspection.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-10-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      25f13b40
    • locking/mutex: Improve inlining · 427b1820
      Committed by Peter Zijlstra
      Instead of inlining __mutex_lock_common() 5 times, once for each
      {state,ww} variant, reduce this to two: ww and !ww.

      Then add __always_inline to mutex_optimistic_spin(), so that it gets
      inlined in all 4 remaining cases, for all {waiter,ww} variants.
      
         text    data     bss     dec     hex filename
      
         6301       0       0    6301    189d defconfig-build/kernel/locking/mutex.o
         4053       0       0    4053     fd5 defconfig-build/kernel/locking/mutex.o
         4257       0       0    4257    10a1 defconfig-build/kernel/locking/mutex.o
      
      This reduces total text size and better separates the ww and !ww mutex
      code generation.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      427b1820
    • locking/ww_mutex: Optimize ww-mutexes by waking at most one waiter for backoff when acquiring the lock · 659cf9f5
      Committed by Nicolai Hähnle
      
      The wait list is sorted by stamp order, and the only waiting task that may
      have to back off is the first waiter with a context.
      
      The regular slow path does not have to wake any other tasks at all, since
      all other waiters that would have to back off were either woken up when
      the waiter was added to the list, or detected the condition before they
      added themselves.
      
      Median timings taken of a contention-heavy GPU workload:
      
      Without this series:
      
        real    0m59.900s
        user    0m7.516s
        sys     2m16.076s
      
      With changes up to and including this patch:
      
        real    0m52.946s
        user    0m7.272s
        sys     1m55.964s
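The wakeup rule follows directly from the stamp-sorted list: only the first waiter carrying a context can be the one that must back off. A toy sketch (hypothetical names, a plain array instead of the kernel wait list):

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy sketch of "wake at most one waiter for backoff": since the
 * wait list is stamp-sorted, the only waiter that may need to back
 * off is the first one that carries a context, so wake only it. */
struct w { bool has_ctx; bool woken; };

static void wake_backoff_candidate(struct w *q, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (q[i].has_ctx) {
            q[i].woken = true;   /* first contexted waiter only */
            return;
        }
    }
}
```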
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-9-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      659cf9f5
    • locking/ww_mutex: Notify waiters that have to back off while adding tasks to wait list · 200b1874
      Committed by Nicolai Hähnle
      While adding our task as a waiter, detect if another task should back off
      because of us.
      
      With this patch, we establish the invariant that the wait list contains
      at most one (sleeping) waiter with ww_ctx->acquired > 0, and this waiter
      will be the first waiter with a context.
      
      Since only waiters with ww_ctx->acquired > 0 have to back off, this allows
      us to be much more economical with wakeups.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-8-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      200b1874
    • locking/ww_mutex: Add waiters in stamp order · 6baa5c60
      Committed by Nicolai Hähnle
      Add regular waiters in stamp order. Keep adding waiters that have no
      context in FIFO order and take care not to starve them.
      
      While adding our task as a waiter, back off if we detect that there is
      a waiter with a lower stamp in front of us.
      
      Make sure to call lock_contended even when we back off early.
      
      For w/w mutexes, being first in the wait list is only stable when
      taking the lock without a context. Therefore, the purpose of the first
      flag is split into two: 'first' remains to indicate whether we want to
      spin optimistically, while 'handoff' indicates that we should be
      prepared to accept a handoff.
      
      For w/w locking with a context, we always accept handoffs after the
      first schedule(), to handle the following sequence of events:
      
       1. Task #0 unlocks and hands off to Task #2 which is first in line
      
       2. Task #1 adds itself in front of Task #2
      
       3. Task #2 wakes up and must accept the handoff even though it is no
          longer first in line
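The insertion rule can be sketched with a toy queue (hypothetical names; the real code additionally handles wakeups, back-off, and handoff while scanning): contextless waiters keep FIFO order and are skipped in the scan, while a stamped waiter slots in right behind the last older stamped waiter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy sketch of stamp-ordered insertion; a plain array stands in for
 * the kernel's wait list. Lower stamp == older waiter. */
struct waiter { unsigned long stamp; bool has_ctx; };

/* Scan from the tail, skipping contextless (FIFO) waiters, and
 * return the index right after the last stamped waiter older than
 * the new one. */
static size_t insert_pos(const struct waiter *q, size_t n,
                         unsigned long stamp)
{
    for (size_t i = n; i-- > 0; ) {
        if (q[i].has_ctx && q[i].stamp < stamp)
            return i + 1;
    }
    return 0;   /* oldest stamped waiter so far: head of the queue */
}
```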
      Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-7-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      6baa5c60
    • locking/ww_mutex: Remove the __ww_mutex_lock*() inline wrappers · c5470b22
      Committed by Nicolai Hähnle
      Keep the documentation in the header file since there is no good place
      for it in mutex.c: there are two rather different implementations with
      different EXPORT_SYMBOLs for each function.
      Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-6-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      c5470b22
    • locking/ww_mutex: Set use_ww_ctx even when locking without a context · ea9e0fb8
      Committed by Nicolai Hähnle
      We will add a new field to struct mutex_waiter. This field must be
      initialized for all waiters if any waiter uses the use_ww_ctx path.
      
      So there is a trade-off: Keep ww_mutex locking without a context on
      the faster non-use_ww_ctx path, at the cost of adding the
      initialization to all mutex locks (including non-ww_mutexes), or avoid
      the additional cost for non-ww_mutex locks, at the cost of adding
      additional checks to the use_ww_ctx path.
      
      We take the latter choice.  It may be worth eliminating the users of
      ww_mutex_lock(lock, NULL), but there are a lot of them.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-5-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ea9e0fb8
    • locking/ww_mutex: Extract stamp comparison to __ww_mutex_stamp_after() · 3822da3e
      Committed by Nicolai Hähnle
      The function will be re-used in subsequent patches.
      Signed-off-by: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <dev@mblankhorst.nl>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dri-devel@lists.freedesktop.org
      Link: http://lkml.kernel.org/r/1482346000-9927-4-git-send-email-nhaehnle@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3822da3e
    • locking/mutex: Fix mutex handoff · e274795e
      Committed by Peter Zijlstra
      While reviewing the ww_mutex patches, I noticed that it was still
      possible to (incorrectly) succeed for (incorrect) code like:
      
      	mutex_lock(&a);
      	mutex_lock(&a);
      
      This was possible if the second mutex_lock() would block (as expected)
      but then receive a spurious wakeup. At that point it would find itself
      at the front of the queue, request a handoff and instantly claim
      ownership and continue, since owner would point to itself.
      
      Avoid this scenario and simplify the code by introducing a third low
      bit to signal handoff pickup. So once we request handoff, unlock
      clears the handoff bit and sets the pickup bit along with the new
      owner.
      
      This also removes the need for the .handoff argument to
      __mutex_trylock(), since that becomes superfluous with PICKUP.
      
      In order to guarantee enough low bits, ensure task_struct alignment is
      at least L1_CACHE_BYTES (which seems a good idea regardless).
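With that alignment guaranteed, the owner word can carry the three flags in its low bits. A sketch of the packing (flag values illustrative, following the description above; the rest of the mutex internals is omitted):

```c
#include <stdint.h>

/* Low-bit flags packed into the owner word; an aligned task_struct
 * pointer has these bits free. PICKUP is the third bit the patch
 * introduces: unlock clears HANDOFF, sets PICKUP, and stores the new
 * owner, and only that owner's trylock may then succeed. */
#define MUTEX_FLAG_WAITERS  0x01UL  /* nonempty wait list */
#define MUTEX_FLAG_HANDOFF  0x02UL  /* first waiter requested handoff */
#define MUTEX_FLAG_PICKUP   0x04UL  /* handoff done, wait for pickup */
#define MUTEX_FLAGS         0x07UL

static uintptr_t owner_task(uintptr_t owner)  { return owner & ~MUTEX_FLAGS; }
static uintptr_t owner_flags(uintptr_t owner) { return owner &  MUTEX_FLAGS; }
```

Because a spuriously woken waiter finds PICKUP set only when it really was handed the lock, the self-handoff scenario above can no longer succeed.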
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 9d659ae1 ("locking/mutex: Add lock handoff to avoid starvation")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e274795e
    • sched/core: Remove set_task_state() · 642fa448
      Committed by Davidlohr Bueso
      This is a nasty interface and setting the state of a foreign task must
      not be done. As of the following commit:
      
        be628be0 ("bcache: Make gc wakeup sane, remove set_task_state()")
      
      ... everyone in the kernel calls set_task_state() with current, allowing
      the helper to be removed.
      
      However, as the comment indicates, it is still around for those archs
      where computing current is more expensive than using a pointer, at
      least in theory. An important arch that is affected is arm64; however,
      this has now been addressed [1] and performance is on par, making no
      difference between the two calls.
      
      Of all the callers, if any, it's the locking bits that would care most
      about this -- ie: we end up passing a tsk pointer to a lot of the
      lock slowpaths and setting ->state on it. The following numbers are
      based on two tests: first, a custom ad-hoc microbenchmark that just
      measures latencies (for ~65 million calls) of set_task_state() vs
      set_current_state().
      
      Secondly, for a higher-level overview, an unlink microbenchmark was
      used, which pounds on a single file with open, close, unlink combos
      with increasing thread counts (up to 4x ncpus). While the workload is
      quite unrealistic, it does contend heavily on the inode mutex (now an
      rwsem).
      
      [1] https://lkml.kernel.org/r/1483468021-8237-1-git-send-email-mark.rutland@arm.com
      
      == 1. x86-64 ==
      
      Avg runtime set_task_state():    601 msecs
      Avg runtime set_current_state(): 552 msecs
      
                                                  vanilla                 dirty
      Hmean    unlink1-processes-2      36089.26 (  0.00%)    38977.33 (  8.00%)
      Hmean    unlink1-processes-5      28555.01 (  0.00%)    29832.55 (  4.28%)
      Hmean    unlink1-processes-8      37323.75 (  0.00%)    44974.57 ( 20.50%)
      Hmean    unlink1-processes-12     43571.88 (  0.00%)    44283.01 (  1.63%)
      Hmean    unlink1-processes-21     34431.52 (  0.00%)    38284.45 ( 11.19%)
      Hmean    unlink1-processes-30     34813.26 (  0.00%)    37975.17 (  9.08%)
      Hmean    unlink1-processes-48     37048.90 (  0.00%)    39862.78 (  7.59%)
      Hmean    unlink1-processes-79     35630.01 (  0.00%)    36855.30 (  3.44%)
      Hmean    unlink1-processes-110    36115.85 (  0.00%)    39843.91 ( 10.32%)
      Hmean    unlink1-processes-141    32546.96 (  0.00%)    35418.52 (  8.82%)
      Hmean    unlink1-processes-172    34674.79 (  0.00%)    36899.21 (  6.42%)
      Hmean    unlink1-processes-203    37303.11 (  0.00%)    36393.04 ( -2.44%)
      Hmean    unlink1-processes-224    35712.13 (  0.00%)    36685.96 (  2.73%)
      
      == 2. ppc64le ==
      
      Avg runtime set_task_state():  938 msecs
      Avg runtime set_current_state: 940 msecs
      
                                                  vanilla                 dirty
      Hmean    unlink1-processes-2      19269.19 (  0.00%)    30704.50 ( 59.35%)
      Hmean    unlink1-processes-5      20106.15 (  0.00%)    21804.15 (  8.45%)
      Hmean    unlink1-processes-8      17496.97 (  0.00%)    17243.28 ( -1.45%)
      Hmean    unlink1-processes-12     14224.15 (  0.00%)    17240.21 ( 21.20%)
      Hmean    unlink1-processes-21     14155.66 (  0.00%)    15681.23 ( 10.78%)
      Hmean    unlink1-processes-30     14450.70 (  0.00%)    15995.83 ( 10.69%)
      Hmean    unlink1-processes-48     16945.57 (  0.00%)    16370.42 ( -3.39%)
      Hmean    unlink1-processes-79     15788.39 (  0.00%)    14639.27 ( -7.28%)
      Hmean    unlink1-processes-110    14268.48 (  0.00%)    14377.40 (  0.76%)
      Hmean    unlink1-processes-141    14023.65 (  0.00%)    16271.69 ( 16.03%)
      Hmean    unlink1-processes-172    13417.62 (  0.00%)    16067.55 ( 19.75%)
      Hmean    unlink1-processes-203    15293.08 (  0.00%)    15440.40 (  0.96%)
      Hmean    unlink1-processes-234    13719.32 (  0.00%)    16190.74 ( 18.01%)
      Hmean    unlink1-processes-265    16400.97 (  0.00%)    16115.22 ( -1.74%)
      Hmean    unlink1-processes-296    14388.60 (  0.00%)    16216.13 ( 12.70%)
      Hmean    unlink1-processes-320    15771.85 (  0.00%)    15905.96 (  0.85%)
      
      x86-64 (known to be fast for get_current()/this_cpu_read_stable()
      caching) and ppc64 (with paca) show similar improvements in the unlink
      microbenchmarks. The small delta for ppc64 (2ms) does not represent
      the gains on the unlink runs. In the case of x86, there was a decent
      amount of variation in the latency runs, but always within a 20 to
      50ms increase; ppc was more constant.
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@stgolabs.net
      Cc: mark.rutland@arm.com
      Link: http://lkml.kernel.org/r/1483479794-14013-5-git-send-email-dave@stgolabs.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      642fa448
    • D
      kernel/locking: Compute 'current' directly · d269a8b8
      Davidlohr Bueso authored
      This patch effectively replaces the tsk pointer dereference
      (which is obviously == current) with direct use of the
      get_current() macro. This is to make the removal of setting
      foreign task states smoother and painfully obvious. Performance
      win on some archs such as x86-64 and ppc64. On a microbenchmark
      that calls set_task_state() vs set_current_state(), and on an
      inode rwsem pounding benchmark doing unlink:
      
      == 1. x86-64 ==
      
      Avg runtime set_task_state():    601 msecs
      Avg runtime set_current_state(): 552 msecs
      
                                                  vanilla                 dirty
      Hmean    unlink1-processes-2      36089.26 (  0.00%)    38977.33 (  8.00%)
      Hmean    unlink1-processes-5      28555.01 (  0.00%)    29832.55 (  4.28%)
      Hmean    unlink1-processes-8      37323.75 (  0.00%)    44974.57 ( 20.50%)
      Hmean    unlink1-processes-12     43571.88 (  0.00%)    44283.01 (  1.63%)
      Hmean    unlink1-processes-21     34431.52 (  0.00%)    38284.45 ( 11.19%)
      Hmean    unlink1-processes-30     34813.26 (  0.00%)    37975.17 (  9.08%)
      Hmean    unlink1-processes-48     37048.90 (  0.00%)    39862.78 (  7.59%)
      Hmean    unlink1-processes-79     35630.01 (  0.00%)    36855.30 (  3.44%)
      Hmean    unlink1-processes-110    36115.85 (  0.00%)    39843.91 ( 10.32%)
      Hmean    unlink1-processes-141    32546.96 (  0.00%)    35418.52 (  8.82%)
      Hmean    unlink1-processes-172    34674.79 (  0.00%)    36899.21 (  6.42%)
      Hmean    unlink1-processes-203    37303.11 (  0.00%)    36393.04 ( -2.44%)
      Hmean    unlink1-processes-224    35712.13 (  0.00%)    36685.96 (  2.73%)
      
      == 2. ppc64le ==
      
      Avg runtime set_task_state():  938 msecs
      Avg runtime set_current_state(): 940 msecs
      
                                                  vanilla                 dirty
      Hmean    unlink1-processes-2      19269.19 (  0.00%)    30704.50 ( 59.35%)
      Hmean    unlink1-processes-5      20106.15 (  0.00%)    21804.15 (  8.45%)
      Hmean    unlink1-processes-8      17496.97 (  0.00%)    17243.28 ( -1.45%)
      Hmean    unlink1-processes-12     14224.15 (  0.00%)    17240.21 ( 21.20%)
      Hmean    unlink1-processes-21     14155.66 (  0.00%)    15681.23 ( 10.78%)
      Hmean    unlink1-processes-30     14450.70 (  0.00%)    15995.83 ( 10.69%)
      Hmean    unlink1-processes-48     16945.57 (  0.00%)    16370.42 ( -3.39%)
      Hmean    unlink1-processes-79     15788.39 (  0.00%)    14639.27 ( -7.28%)
      Hmean    unlink1-processes-110    14268.48 (  0.00%)    14377.40 (  0.76%)
      Hmean    unlink1-processes-141    14023.65 (  0.00%)    16271.69 ( 16.03%)
      Hmean    unlink1-processes-172    13417.62 (  0.00%)    16067.55 ( 19.75%)
      Hmean    unlink1-processes-203    15293.08 (  0.00%)    15440.40 (  0.96%)
      Hmean    unlink1-processes-234    13719.32 (  0.00%)    16190.74 ( 18.01%)
      Hmean    unlink1-processes-265    16400.97 (  0.00%)    16115.22 ( -1.74%)
      Hmean    unlink1-processes-296    14388.60 (  0.00%)    16216.13 ( 12.70%)
      Hmean    unlink1-processes-320    15771.85 (  0.00%)    15905.96 (  0.85%)
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@stgolabs.net
      Cc: mark.rutland@arm.com
      Link: http://lkml.kernel.org/r/1483479794-14013-4-git-send-email-dave@stgolabs.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d269a8b8
  9. 22 Nov 2016, 1 commit
    • P
      locking/mutex: Break out of expensive busy-loop on... · 05ffc951
      Pan Xinhui authored
      locking/mutex: Break out of expensive busy-loop on {mutex,rwsem}_spin_on_owner() when owner vCPU is preempted
      
      An over-committed guest with more vCPUs than pCPUs suffers heavy
      overhead in the two *_spin_on_owner() loops. This is caused by the
      lock holder preemption issue.
      
      Break out of the loop when the owner's vCPU is preempted, i.e. when
      vcpu_is_preempted(cpu) returns true.
      
      test-case:
      perf record -a perf bench sched messaging -g 400 -p && perf report
      
      before patch:
      20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
       8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
       4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
       3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
       2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
       2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
       2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
      
      after patch:
       9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
       5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
       4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
       3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
       3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
       3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
       2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
      Tested-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: David.Laight@ACULAB.COM
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: benh@kernel.crashing.org
      Cc: boqun.feng@gmail.com
      Cc: bsingharora@gmail.com
      Cc: dave@stgolabs.net
      Cc: kernellwp@gmail.com
      Cc: konrad.wilk@oracle.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: mpe@ellerman.id.au
      Cc: paulmck@linux.vnet.ibm.com
      Cc: paulus@samba.org
      Cc: rkrcmar@redhat.com
      Cc: virtualization@lists.linux-foundation.org
      Cc: will.deacon@arm.com
      Cc: xen-devel-request@lists.xenproject.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1478077718-37424-4-git-send-email-xinhui.pan@linux.vnet.ibm.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      05ffc951
  10. 21 Nov 2016, 1 commit
  11. 16 Nov 2016, 1 commit
  12. 25 Oct 2016, 5 commits
    • W
      locking/mutex: Enable optimistic spinning of woken waiter · b341afb3
      Waiman Long authored
      This patch makes the waiter that sets the HANDOFF flag start spinning
      instead of sleeping until the handoff is complete or the owner
      sleeps. Otherwise, the handoff will cause the optimistic spinners to
      abort spinning as the handed-off owner may not be running.
      Tested-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ding Tianhong <dingtianhong@huawei.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Will Deacon <Will.Deacon@arm.com>
      Link: http://lkml.kernel.org/r/1472254509-27508-2-git-send-email-Waiman.Long@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      b341afb3
    • W
      locking/mutex: Simplify some ww_mutex code in __mutex_lock_common() · a40ca565
      Waiman Long authored
      This patch removes some of the redundant ww_mutex code in
      __mutex_lock_common().
      Tested-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Ding Tianhong <dingtianhong@huawei.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Will Deacon <Will.Deacon@arm.com>
      Link: http://lkml.kernel.org/r/1472254509-27508-1-git-send-email-Waiman.Long@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      a40ca565
    • P
      locking/mutex: Restructure wait loop · 5bbd7e64
      Peter Zijlstra authored
      Doesn't really matter yet, but pull the HANDOFF and trylock out from
      under the wait_lock.
      
      The intention is to add an optimistic spin loop here, which requires
      we do not hold the wait_lock, so shuffle code around in preparation.
      
      Also clarify the purpose of taking the wait_lock in the wait loop;
      it's tempting to want to avoid it altogether, but the cancellation
      cases need it to avoid losing wakeups.
      Suggested-by: Waiman Long <waiman.long@hpe.com>
      Tested-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5bbd7e64
    • P
      locking/mutex: Add lock handoff to avoid starvation · 9d659ae1
      Peter Zijlstra authored
      Implement lock handoff to avoid lock starvation.
      
      Lock starvation is possible because mutex_lock() allows lock stealing,
      where a running (or optimistic spinning) task beats the woken waiter
      to the acquire.
      
      Lock stealing is an important performance optimization because waiting
      for a waiter to wake up and get runtime can take a significant time,
      during which everybody would stall on the lock.
      
      The down-side is of course that it allows for starvation.
      
      This patch has the waiter requesting a handoff if it fails to acquire
      the lock upon waking. This re-introduces some of the wait time,
      because once we do a handoff we have to wait for the waiter to wake up
      again.
      
      A future patch will add a round of optimistic spinning to attempt to
      alleviate this penalty, but if that turns out to not be enough, we can
      add a counter and only request handoff after multiple failed wakeups.
      
      There are a few tricky implementation details:
      
       - accepting a handoff must only be done in the wait-loop. Since the
         handoff condition is owner == current, it can easily cause
         recursive locking trouble.
      
       - accepting the handoff must be careful to provide the ACQUIRE
         semantics.
      
       - having the HANDOFF bit set on unlock requires care, we must not
         clear the owner.
      
       - we must be careful to not leave HANDOFF set after we've acquired
         the lock. The tricky scenario is setting the HANDOFF bit on an
         unlocked mutex.
      Tested-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Waiman Long <Waiman.Long@hpe.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9d659ae1
    • P
      locking/mutex: Rework mutex::owner · 3ca0ff57
      Peter Zijlstra authored
      The current mutex implementation has an atomic lock word and a
      non-atomic owner field.
      
      This disparity leads to a number of issues with the current mutex code
      as it means that we can have a locked mutex without an explicit owner
      (because the owner field has not been set, or already cleared).
      
      This leads to a number of weird corner cases, esp. between the
      optimistic spinning and debug code. Where the optimistic spinning
      code needs the owner field updated inside the lock region, the debug
      code is more relaxed because the whole lock is serialized by the
      wait_lock.
      
      Also, the spinning code itself has a few corner cases where we need to
      deal with a held lock without an owner field.
      
      Furthermore, it becomes even more of a problem when trying to fix
      starvation cases in the current code. We end up stacking special case
      on special case.
      
      To solve this rework the basic mutex implementation to be a single
      atomic word that contains the owner and uses the low bits for extra
      state.
      
      This matches how PI futexes and rt_mutex already work. By making the
      owner an integral part of the lock state, a lot of the problems
      disappear and we get a better option to deal with starvation cases:
      direct owner handoff.
      
      Changing the basic mutex does, however, invalidate all the arch-specific
      mutex code; this patch leaves that unused in place, and a later patch
      will remove it.
      Tested-by: Jason Low <jason.low2@hpe.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3ca0ff57
  13. 24 Jun 2016, 1 commit
  14. 03 Jun 2016, 1 commit
  15. 29 Feb 2016, 1 commit
  16. 06 Oct 2015, 1 commit
  17. 09 Apr 2015, 1 commit
    • J
      locking/mutex: Further simplify mutex_spin_on_owner() · 01ac33c1
      Jason Low authored
      Similar to what Linus suggested for rwsem_spin_on_owner(), in
      mutex_spin_on_owner() instead of having while (true) and
      breaking out of the spin loop on lock->owner != owner, we can
      have the loop directly check for while (lock->owner == owner) to
      improve the readability of the code.
      
      It also shrinks the code a bit:
      
         text    data     bss     dec     hex filename
         3721       0       0    3721     e89 mutex.o.before
         3705       0       0    3705     e79 mutex.o.after
      Signed-off-by: Jason Low <jason.low2@hp.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/1428521960-5268-2-git-send-email-jason.low2@hp.com
      [ Added code generation info. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      01ac33c1
  18. 24 Feb 2015, 1 commit
  19. 18 Feb 2015, 2 commits
    • D
      locking/rwsem: Set lock ownership ASAP · 7a215f89
      Davidlohr Bueso authored
      In order to optimize the spinning step, we need to set the lock
      owner as soon as the lock is acquired; after a successful counter
      cmpxchg operation, that is. This is particularly useful as rwsems
      need to set the owner to nil for readers, so there is a greater
      chance of falling out of spinning. Currently we only set the
      owner much later in the game, at the more generic level; latency
      can be especially bad when waiting for a node->next pointer when
      releasing the osq in up_write() calls.
      
      As such, update the owner inside rwsem_try_write_lock (when the
      lock is obtained after blocking) and rwsem_try_write_lock_unqueued
      (when the lock is obtained while spinning). This requires creating
      a new internal rwsem.h header to share the owner related calls.
      
      Also cleanup some headers for mutex and rwsem.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/1422609267-15102-4-git-send-email-dave@stgolabs.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7a215f89
    • J
      locking/mutex: Refactor mutex_spin_on_owner() · be1f7bf2
      Jason Low authored
      As suggested by Davidlohr, we could refactor mutex_spin_on_owner().
      
      Currently, owner_running() is split out from mutex_spin_on_owner().
      When the owner changes, we make duplicate owner checks that are not
      necessary. It also makes the code a bit obscure, as we use a
      second check to figure out why we broke out of the loop.
      
      This patch removes the owner_running() function and has the
      mutex_spin_on_owner() loop directly check whether the owner has changed,
      whether the owner is not running, or whether we need to reschedule. If
      the owner changes, we break out of the loop and return true. If the
      owner is not running or we need to reschedule, we break out of the loop
      and return false.
      Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
      Signed-off-by: Jason Low <jason.low2@hp.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: chegu_vinod@hp.com
      Cc: tglx@linutronix.de
      Link: http://lkml.kernel.org/r/1422914367-5574-3-git-send-email-jason.low2@hp.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      be1f7bf2