1. 05 May 2014, 1 commit
    • rwsem: Add comments to explain the meaning of the rwsem's count field · 3cf2f34e
      Authored by Tim Chen
      It took me quite a while to understand how rwsem's count field
      manifested itself in different scenarios.
      
      Add comments to provide a quick reference to the rwsem's count
      field for each scenario where readers and writers are contending
      for the lock.
      
      Hopefully it will be useful for future maintenance of the code and
      for people to get up to speed on how the logic in the code works.
      Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Paul E.McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1399060437.2970.146.camel@schen9-DESK
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3cf2f34e
  2. 14 Feb 2014, 1 commit
  3. 06 Nov 2013, 1 commit
  4. 08 May 2013, 1 commit
    • rwsem: check counter to avoid cmpxchg calls · 9607a85b
      Authored by Davidlohr Bueso
      This patch tries to reduce the amount of cmpxchg calls in the writer
      failed path by checking the counter value first before issuing the
      instruction.  If ->count is not set to RWSEM_WAITING_BIAS then there is
      no point wasting a cmpxchg call.
      
      Furthermore, Michel states "I suppose it helps due to the case where
      someone else steals the lock while we're trying to acquire
      sem->wait_lock."
      
      Two very different workloads and machines were used to see how this
      patch improves throughput: pgbench on a quad-core laptop and aim7 on a
      large 8 socket box with 80 cores.
      
      Some results comparing Michel's fast-path write lock stealing
      (tps-rwsem) on a quad-core laptop running pgbench:
      
        | db_size  | clients  |   tps-rwsem    |   tps-patch  |
        +----------+----------+----------------+--------------+
        | 160 MB   |        1 |           6906 |         9153 | + 32.5%
        | 160 MB   |        2 |          15931 |        22487 | + 41.1%
        | 160 MB   |        4 |          33021 |        32503 |
        | 160 MB   |        8 |          34626 |        34695 |
        | 160 MB   |       16 |          33098 |        34003 |
        | 160 MB   |       20 |          31343 |        31440 |
        | 160 MB   |       30 |          28961 |        28987 |
        | 160 MB   |       40 |          26902 |        26970 |
        | 160 MB   |       50 |          25760 |        25810 |
        +----------+----------+----------------+--------------+
        | 1.6 GB   |        1 |           7729 |         7537 |
        | 1.6 GB   |        2 |          19009 |        23508 | + 23.7%
        | 1.6 GB   |        4 |          33185 |        32666 |
        | 1.6 GB   |        8 |          34550 |        34318 |
        | 1.6 GB   |       16 |          33079 |        32689 |
        | 1.6 GB   |       20 |          31494 |        31702 |
        | 1.6 GB   |       30 |          28535 |        28755 |
        | 1.6 GB   |       40 |          27054 |        27017 |
        | 1.6 GB   |       50 |          25591 |        25560 |
        +----------+----------+----------------+--------------+
        | 7.6 GB   |        1 |           6224 |         7469 | + 20.0%
        | 7.6 GB   |        2 |          13611 |        12778 |
        | 7.6 GB   |        4 |          33108 |        32927 |
        | 7.6 GB   |        8 |          34712 |        34878 |
        | 7.6 GB   |       16 |          32895 |        33003 |
        | 7.6 GB   |       20 |          31689 |        31974 |
        | 7.6 GB   |       30 |          29003 |        28806 |
        | 7.6 GB   |       40 |          26683 |        26976 |
        | 7.6 GB   |       50 |          25925 |        25652 |
        +----------+----------+----------------+--------------+
      
      The aim7 workloads overall improved on top of Michel's patchset.
      For full graphs of how the rwsem series plus this patch behaves on
      a large 8-socket machine against a vanilla kernel, see:
      
        http://stgolabs.net/rwsem-aim7-results.tar.gz
      Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9607a85b
  5. 07 May 2013, 13 commits
  6. 19 Feb 2013, 1 commit
    • rwsem: Implement writer lock-stealing for better scalability · ce6711f3
      Authored by Alex Shi
      Commit 5a505085 ("mm/rmap: Convert the struct anon_vma::mutex
      to an rwsem") changed struct anon_vma::mutex to an rwsem, which
      caused aim7 fork_test performance to drop by 50%.
      
      Yuanhan Liu did the following excellent analysis:
      
          https://lkml.org/lkml/2013/1/29/84
      
      and found that the regression is caused by strict, serialized,
      FIFO sequential write-ownership of rwsems. Ingo suggested
      implementing opportunistic lock-stealing for the front writer
      task in the waitqueue.
      
      Yuanhan Liu implemented lock-stealing for spinlock-rwsems,
      which indeed recovered much of the regression - confirming
      the analysis that the main factor in the regression was the
      FIFO writer-fairness of rwsems.
      
      In this patch we allow lock-stealing to happen when the first
      waiter is also a writer. With that change in place the
      aim7 fork_test performance is fully recovered on my
      Intel NHM EP, NHM EX, SNB EP 2S and 4S test-machines.
      
      Reported-by: lkp@linux.intel.com
      Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: paul.gortmaker@windriver.com
      Link: https://lkml.org/lkml/2013/1/29/84
      Link: http://lkml.kernel.org/r/1360069915-31619-1-git-send-email-alex.shi@intel.com
      [ Small stylistic fixes, updated changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ce6711f3
  7. 08 Mar 2012, 1 commit
  8. 13 Sep 2011, 1 commit
  9. 27 Jan 2011, 1 commit
    • rwsem: Remove redundant asmregparm annotation · d1233754
      Authored by Thomas Gleixner
      Peter Zijlstra pointed out that the only user of asmregparm (x86)
      already compiles the kernel with -mregparm=3, so the annotation of
      the rwsem functions is redundant. Remove it.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Chris Zankel <chris@zankel.net>
      LKML-Reference: <alpine.LFD.2.00.1101262130450.31804@localhost6.localdomain6>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      d1233754
  10. 10 Aug 2010, 5 commits
    • rwsem: smaller wrappers around rwsem_down_failed_common · a8618a0e
      Authored by Michel Lespinasse
      More code can be pushed from rwsem_down_read_failed and
      rwsem_down_write_failed into rwsem_down_failed_common.
      
      The following change, which adds down_read_critical infrastructure
      support, also benefits from having the flags available in a register
      rather than having to fish them out of struct rwsem_waiter...
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Mike Waychison <mikew@google.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Ying Han <yinghan@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a8618a0e
    • rwsem: wake queued readers when writer blocks on active read lock · 424acaae
      Authored by Michel Lespinasse
      This change addresses the following situation:
      
      - Thread A acquires the rwsem for read
      - Thread B tries to acquire the rwsem for write, notices there is already
        an active owner for the rwsem.
      - Thread C tries to acquire the rwsem for read, notices that thread B already
        tried to acquire it.
      - Thread C grabs the spinlock and queues itself on the wait queue.
      - Thread B grabs the spinlock and queues itself behind C. At this point A is
        the only remaining active owner on the rwsem.
      
      In this situation thread B could notice that it was the last active writer
      on the rwsem, and decide to wake C to let it proceed in parallel with A
      since they both only want the rwsem for read.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Mike Waychison <mikew@google.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Ying Han <yinghan@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      424acaae
    • rwsem: let RWSEM_WAITING_BIAS represent any number of waiting threads · fd41b334
      Authored by Michel Lespinasse
      Previously each waiting thread added a bias of RWSEM_WAITING_BIAS.  With
      this change, the bias is added only once to indicate that the wait list is
      non-empty.
      
      This has a few nice properties which will be used in following changes:
      - when the spinlock is held and the waiter list is known to be non-empty,
        count < RWSEM_WAITING_BIAS  <=>  there is an active writer on that sem
      - count == RWSEM_WAITING_BIAS  <=>  there are waiting threads and no
                                           active readers/writers on that sem
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Mike Waychison <mikew@google.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Ying Han <yinghan@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fd41b334
    • rwsem: lighter active count checks when waking up readers · 70bdc6e0
      Authored by Michel Lespinasse
      In __rwsem_do_wake(), we can skip the active count check unless we come
      there from up_xxxx().  Also when checking the active count, it is not
      actually necessary to increment it; this allows us to get rid of the read
      side undo code and simplify the calculation of the final rwsem count
      adjustment once we've counted the reader threads to wake.
      
      The basic observation is the following.  When there are waiter threads on
      a rwsem and the spinlock is held, other threads can only increment the
      active count by trying to grab the rwsem in down_xxxx().  However
      down_xxxx() will notice there are waiter threads and take the down_failed
      path, blocking to acquire the spinlock on the way there.  Therefore, a
      thread observing an active count of zero with waiters queued and the
      spinlock held, is protected against other threads acquiring the rwsem
      until it wakes the last waiter or releases the spinlock.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Mike Waychison <mikew@google.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Ying Han <yinghan@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      70bdc6e0
    • rwsem: fully separate code paths to wake writers vs readers · 345af7bf
      Authored by Michel Lespinasse
      This is in preparation for later changes in the series.
      
      In __rwsem_do_wake(), the first queued waiter is checked first in order to
      determine whether it's a writer or a reader.  The code paths diverge at
      this point.  The code that checks and increments the rwsem active count is
      duplicated on both sides - the point is that later changes in the series
      will be able to independently modify both sides.
      Signed-off-by: Michel Lespinasse <walken@google.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Mike Waychison <mikew@google.com>
      Cc: Suleiman Souhlal <suleiman@google.com>
      Cc: Ying Han <yinghan@google.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      345af7bf
  11. 13 May 2010, 1 commit
  12. 30 Jan 2008, 1 commit
  13. 18 Dec 2007, 1 commit
  14. 11 Oct 2006, 1 commit
  15. 30 Sep 2006, 1 commit
  16. 04 Jul 2006, 2 commits
  17. 01 May 2005, 1 commit
  18. 17 Apr 2005, 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Authored by Linus Torvalds
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4