1. 14 Sep, 2009 2 commits
    • reiserfs, kill-the-BKL: fix unsafe j_flush_mutex lock · a412f9ef
      Frederic Weisbecker authored
      Impact: fix a deadlock
      
      The j_flush_mutex is acquired safely in journal.c:
      if we can't take it, we release the reiserfs per-superblock lock
      and wait a bit.
      
      But there is one remaining place, in kupdate_transactions(), where
      j_flush_mutex is still acquired the traditional way. Thus the following
      scenario (reported by lockdep) can happen:
      
      A                                 B
      
      mutex_lock(&write_lock)
          mutex_lock(&j_flush_mutex)
          mutex_unlock(&write_lock)
          sleep...
                                        mutex_lock(&write_lock)
                                        mutex_lock(&j_flush_mutex) //block
          mutex_lock(&write_lock) //deadlock
      
      Fix this by using reiserfs_mutex_lock_safe() in kupdate_transactions().
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@texware.it>
      Cc: Jeff Mahoney <jeffm@suse.com>
      LKML-Reference: <1239660635-12940-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • reiserfs: kill-the-BKL · 8ebc4232
      Frederic Weisbecker authored
      This patch is an attempt to remove the BKL-based locking scheme from
      reiserfs.
      
      It is loosely inspired by an old attempt by Peter Zijlstra:
      
         http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html
      
      The BKL is heavily used in this filesystem to prevent concurrent
      write accesses to the filesystem.
      
      Reiserfs relies heavily on two specific properties of the BKL:
      
      - It can be acquired recursively by the same task
      - It is released on schedule() calls and reacquired when schedule() returns
      
      The reiserfs write locking is built around these two properties, so it is
      very hard to simply replace the BKL with an ordinary mutex.
      
      - We need a recursive lock unless we want to restructure several blocks
        of the code.
      - We need to identify the sites where the BKL was implicitly released
        (schedule, wait, sync, etc.) so that we can in turn release and
        reacquire our new lock explicitly.
        Such implicit releases of the lock are often required to let other
        resource producers/consumers do their job; without them we can suffer
        unexpected starvation or deadlocks.
      
      So the new lock that replaces the BKL here is a per-superblock mutex with a
      specific property: it can be acquired recursively by the same task, like the
      BKL.
      
      For this purpose, we add a lock owner field and a lock depth field to the
      superblock information structure.
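
      As an illustration only (this is not the code from the patch), the
      owner-plus-depth idea can be sketched in user space with POSIX threads.
      The names struct sb_info, sb_write_lock() and sb_write_unlock() are
      hypothetical, and the extra guard mutex is an assumption of this sketch,
      added so that the owner check is free of data races in portable C:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the superblock information structure. */
struct sb_info {
    pthread_mutex_t lock;   /* the write lock itself (non-recursive) */
    pthread_mutex_t guard;  /* protects the three fields below */
    pthread_t owner;        /* meaningful only while owned is true */
    bool owned;
    int depth;              /* how many times the owner has taken the lock */
};

static void sb_info_init(struct sb_info *sbi)
{
    pthread_mutex_init(&sbi->lock, NULL);
    pthread_mutex_init(&sbi->guard, NULL);
    sbi->owned = false;
    sbi->depth = 0;
}

static void sb_write_lock(struct sb_info *sbi)
{
    pthread_mutex_lock(&sbi->guard);
    if (sbi->owned && pthread_equal(sbi->owner, pthread_self())) {
        sbi->depth++;                 /* nested acquisition by the owner */
        pthread_mutex_unlock(&sbi->guard);
        return;
    }
    pthread_mutex_unlock(&sbi->guard);

    pthread_mutex_lock(&sbi->lock);   /* may block until the owner is done */

    pthread_mutex_lock(&sbi->guard);
    sbi->owner = pthread_self();
    sbi->owned = true;
    sbi->depth = 1;
    pthread_mutex_unlock(&sbi->guard);
}

static void sb_write_unlock(struct sb_info *sbi)
{
    bool release;

    pthread_mutex_lock(&sbi->guard);
    /* Only the current owner may unlock. */
    assert(sbi->owned && pthread_equal(sbi->owner, pthread_self()));
    release = (--sbi->depth == 0);
    if (release)
        sbi->owned = false;
    pthread_mutex_unlock(&sbi->guard);

    if (release)
        pthread_mutex_unlock(&sbi->lock);  /* outermost unlock frees the mutex */
}
```

      Only the outermost sb_write_unlock() releases the underlying mutex; a
      nested pair merely adjusts the depth counter.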
      
      The first axis of this patch is to turn the reiserfs_write_(un)lock() functions
      into wrappers that manage this mutex. Some explicit calls to
      lock_kernel() have also been converted to the reiserfs_write_lock() helpers.
      
      The second axis is to find the important blocking sites (schedule...(),
      wait_on_buffer(), sync_dirty_buffer(), etc.) and explicitly release the
      write lock at these locations before blocking. Then we can
      safely wait for those who can give us resources or those who need some.
      Typically this is a fight between the current writer, the reiserfs workqueue
      (aka the async committer) and the pdflush threads.
      
      The third axis is a consequence of the second. The write lock is usually
      at the top of a lock dependency chain which can include the journal lock, the
      flush lock or the commit lock. So it is dangerous to release and then try to
      reacquire the write lock while we still hold other locks.
      
      This is fine with the BKL:
      
            T1                       T2
      
      lock_kernel()
          mutex_lock(A)
          unlock_kernel()
          // do something
                                  lock_kernel()
                                      mutex_lock(A) -> already locked by T1
                                      schedule() (and then unlock_kernel())
          lock_kernel()
          mutex_unlock(A)
          ....
      
      This is not fine with a mutex:
      
            T1                       T2
      
      mutex_lock(write)
          mutex_lock(A)
          mutex_unlock(write)
          // do something
                                 mutex_lock(write)
                                    mutex_lock(A) -> already locked by T1
                                    schedule()
      
          mutex_lock(write) -> already locked by T2
          deadlock
      
      The solution in this patch is to provide a helper which releases the write
      lock and sleeps a bit if we can't lock a mutex that depends on it. It is another
      simulation of the BKL behaviour.
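
      A rough user-space sketch of such a helper, assuming POSIX threads in
      place of kernel mutexes, might look as follows. The function name
      lock_dependent_safe() and the 1 ms nap are assumptions made for
      illustration; they are not taken from the patch:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/*
 * Take `dep` while the caller already holds `write`.
 * If `dep` is busy, drop `write`, nap briefly so the current holder of
 * `dep` can make progress, retake `write`, and try again.
 * Returns the number of retries that were needed.
 */
static int lock_dependent_safe(pthread_mutex_t *dep, pthread_mutex_t *write)
{
    /* 1 ms is an arbitrary value chosen for this sketch. */
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };
    int retries = 0;

    assert(dep != write);

    while (pthread_mutex_trylock(dep) != 0) {
        pthread_mutex_unlock(write);   /* never sleep while holding write */
        nanosleep(&nap, NULL);
        pthread_mutex_lock(write);
        retries++;
    }
    return retries;                    /* both write and dep are held here */
}
```

      Because the write lock is never held while waiting for the dependent
      mutex, the T1/T2 interleaving shown above can no longer block both
      tasks forever.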
      
      The last axis is to locate the fs callbacks that are called with the BKL held,
      according to Documentation/filesystems/Locking.
      
      Those are:
      
      - reiserfs_remount
      - reiserfs_fill_super
      - reiserfs_put_super
      
      Reiserfs didn't need to explicitly lock because of the context of these callbacks.
      But now we must take care of that with the new locking.
      
      After this patch, reiserfs suffers from a slight performance regression (for now).
      On UP, a high volume write with dd reports an average of 27 MB/s instead
      of 30 MB/s without the patch applied.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Ingo Molnar <mingo@elte.hu>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Bron Gondwana <brong@fastmail.fm>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      LKML-Reference: <1239070789-13354-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 11 Jul, 2009 1 commit
  3. 31 Mar, 2009 6 commits
    • reiserfs: rename p_s_sb to sb · a9dd3643
      Jeff Mahoney authored
      This patch is a simple s/p_s_sb/sb/g to the reiserfs code.  This is the
      first in a series of patches to rip out some of the awful variable
      naming in reiserfs.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: strip trailing whitespace · 0222e657
      Jeff Mahoney authored
      This patch strips trailing whitespace from the reiserfs code.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: rearrange journal abort · 32e8b106
      Jeff Mahoney authored
      This patch kills off reiserfs_journal_abort as it is never called, and
      combines __reiserfs_journal_abort_{soft,hard} into one function called
      reiserfs_abort_journal, which performs the same work. It is silent
      as opposed to the old version, since the message was always issued
      after a regular 'abort' message.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: rework reiserfs_panic · c3a9c210
      Jeff Mahoney authored
      ReiserFS panics can be somewhat inconsistent.
      In some cases:
       * a unique identifier may be associated with it
       * the function name may be included
       * the device may be printed separately
      
      This patch aims to make warnings more consistent. reiserfs_warning() prints
      the device name, so printing it a second time is not required. The function
      name for a warning is always helpful in debugging, so it is now automatically
      inserted into the output. Hans has stated that every warning should have
      a unique identifier. Some cases lack them, others really shouldn't have them.
      reiserfs_warning() now expects an id associated with each message. In the
      rare case where one isn't needed, "" will suffice.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: rework reiserfs_warning · 45b03d5e
      Jeff Mahoney authored
      ReiserFS warnings can be somewhat inconsistent.
      In some cases:
       * a unique identifier may be associated with it
       * the function name may be included
       * the device may be printed separately
      
      This patch aims to make warnings more consistent. reiserfs_warning() prints
      the device name, so printing it a second time is not required. The function
      name for a warning is always helpful in debugging, so it is now automatically
      inserted into the output. Hans has stated that every warning should have
      a unique identifier. Some cases lack them, others really shouldn't have them.
      reiserfs_warning() now expects an id associated with each message. In the
      rare case where one isn't needed, "" will suffice.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • reiserfs: audit transaction ids to always be unsigned ints · 600ed416
      Jeff Mahoney authored
      This patch fixes up the reiserfs code such that transaction ids are
      always unsigned ints.  In places they can currently be signed ints or
      unsigned longs.
      
      The former just causes an annoying clm-2200 warning and may join a
      transaction when it should wait.
      
      The latter is just for correctness since the disk format uses a 32-bit
      transaction id.  There aren't any runtime problems that result from it
      not wrapping at the correct location since the value is truncated
      correctly even on big endian systems.  The 0 value might make it to
      disk, but the mount-time checks will bump it to 10 itself.
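
      The arithmetic behind the last paragraph can be illustrated with a small
      stand-alone sketch. All function names here are hypothetical, and the
      "bump 0 to 10" rule is simply restated from the message above rather
      than copied from the reiserfs source:

```c
#include <assert.h>
#include <stdint.h>

/* Advance a 32-bit transaction id; unsigned arithmetic wraps modulo 2^32. */
static uint32_t next_trans_id(uint32_t id)
{
    return id + 1u;
}

/* What a 32-bit on-disk field keeps of a wider in-memory counter. */
static uint32_t disk_trans_id(uint64_t wide)
{
    return (uint32_t)wide;   /* truncation keeps the low 32 bits */
}

/* Mount-time fixup as described in the message: 0 is bumped to 10. */
static uint32_t mount_fixup(uint32_t id)
{
    return id == 0 ? 10u : id;
}

/* Small self-check tying the three helpers together. */
static void trans_id_selfcheck(void)
{
    uint64_t wide = UINT32_MAX;   /* wider counter about to pass 2^32 */
    uint32_t narrow = UINT32_MAX; /* 32-bit counter about to wrap */

    wide += 1;
    narrow = next_trans_id(narrow);

    /* Both views agree on the value that would reach the disk. */
    assert(disk_trans_id(wide) == narrow);
    assert(narrow == 0);
    assert(mount_fixup(narrow) == 10);
}
```

      The wider counter and the 32-bit counter store the same on-disk value,
      which is why the message treats the unsigned long case as a correctness
      cleanup rather than a runtime bug.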
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 21 Oct, 2008 4 commits
  5. 05 Aug, 2008 2 commits
  6. 26 Jul, 2008 3 commits
  7. 30 Apr, 2008 1 commit
  8. 28 Apr, 2008 2 commits
  9. 19 Apr, 2008 1 commit
  10. 20 Oct, 2007 2 commits
  11. 17 Oct, 2007 3 commits
  12. 09 May, 2007 2 commits
  13. 30 Nov, 2006 1 commit
  14. 22 Nov, 2006 1 commit
  15. 21 Oct, 2006 1 commit
    • [PATCH] separate bdi congestion functions from queue congestion functions · 3fcfab16
      Andrew Morton authored
      Separate out the concept of "queue congestion" from "backing-dev congestion".
      Congestion is a backing-dev concept, not a queue concept.
      
      The blk_* congestion functions are retained, as wrappers around the core
      backing-dev congestion functions.
      
      This proper layering is needed so that NFS can cleanly use the congestion
      functions, and so that CONFIG_BLOCK=n actually links.
      
      Cc: "Thomas Maier" <balagi@justmail.de>
      Cc: "Jens Axboe" <jens.axboe@oracle.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Peter Osterlund <petero2@telia.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  16. 04 Oct, 2006 1 commit
  17. 30 Sep, 2006 1 commit
    • [PATCH] Fix reiserfs latencies caused by data=ordered · a3172027
      Chris Mason authored
      ReiserFS does periodic cleanup of old transactions in order to limit the
      length of time a journal replay may take after a crash.  Sometimes, writing
      metadata from an old (already committed) transaction may require committing
      a newer transaction, which also requires writing all data=ordered buffers.
      This can cause very long stalls on journal_begin.
      
      This patch makes sure new transactions will not need to be committed before
      trying a periodic reclaim of an old transaction.  It is low risk because if
      a bad decision is made, it just means a slightly longer journal replay
      after a crash.
      Signed-off-by: Chris Mason <mason@suse.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 01 Jul, 2006 1 commit
  19. 27 Jun, 2006 1 commit
  20. 26 Mar, 2006 1 commit
  21. 03 Mar, 2006 1 commit
  22. 02 Feb, 2006 2 commits