1. 22 9月, 2009 2 次提交
  2. 14 9月, 2009 2 次提交
    • F
      kill-the-BKL/reiserfs: only acquire the write lock once in reiserfs_dirty_inode · dc8f6d89
      Frederic Weisbecker 提交于
      Impact: fix a deadlock
      
      reiserfs_dirty_inode() is the super_operations::dirty_inode() callback
      of reiserfs. It can be called from different contexts where the write
      lock can be already held.
      
      But this function also grab the write lock (possibly recursively).
      Subsequent release of the lock before sleep will actually not release
      the lock if the caller of mark_inode_dirty() (which in turn calls
      reiserfs_dirty_inode()) already owns the lock.
      
      A typical case:
      
      reiserfs_write_end() {
      	acquire_write_lock()
      	mark_inode_dirty() {
      		reiserfs_dirty_inode() {
      			reacquire_write_lock() {
      				journal_begin() {
      					do_journal_begin_r() {
      						/*
      						 * fail to release, still
      						 * one depth of lock
      						 */
      						release_write_lock()
      						reiserfs_wait_on_write_block() {
      							wait_event()
      
      The event is usually provided by something which needs the write lock but
      it hasn't been released.
      
      We use reiserfs_write_lock_once() here to ensure we only grab the
      write lock in one level.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@texware.it>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      LKML-Reference: <1239680065-25013-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      dc8f6d89
    • F
      reiserfs: kill-the-BKL · 8ebc4232
      Frederic Weisbecker 提交于
      This patch is an attempt to remove the Bkl based locking scheme from
      reiserfs and is intended.
      
      It is a bit inspired from an old attempt by Peter Zijlstra:
      
         http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html
      
      The bkl is heavily used in this filesystem to prevent from
      concurrent write accesses on the filesystem.
      
      Reiserfs makes a deep use of the specific properties of the Bkl:
      
      - It can be acqquired recursively by a same task
      - It is released on the schedule() calls and reacquired when schedule() returns
      
      The two properties above are a roadmap for the reiserfs write locking so it's
      very hard to simply replace it with a common mutex.
      
      - We need a recursive-able locking unless we want to restructure several blocks
        of the code.
      - We need to identify the sites where the bkl was implictly relaxed
        (schedule, wait, sync, etc...) so that we can in turn release and
        reacquire our new lock explicitly.
        Such implicit releases of the lock are often required to let other
        resources producer/consumer do their job or we can suffer unexpected
        starvations or deadlocks.
      
      So the new lock that replaces the bkl here is a per superblock mutex with a
      specific property: it can be acquired recursively by a same task, like the
      bkl.
      
      For such purpose, we integrate a lock owner and a lock depth field on the
      superblock information structure.
      
      The first axis on this patch is to turn reiserfs_write_(un)lock() function
      into a wrapper to manage this mutex. Also some explicit calls to
      lock_kernel() have been converted to reiserfs_write_lock() helpers.
      
      The second axis is to find the important blocking sites (schedule...(),
      wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
      release of the write lock on these locations before blocking. Then we can
      safely wait for those who can give us resources or those who need some.
      Typically this is a fight between the current writer, the reiserfs workqueue
      (aka the async commiter) and the pdflush threads.
      
      The third axis is a consequence of the second. The write lock is usually
      on top of a lock dependency chain which can include the journal lock, the
      flush lock or the commit lock. So it's dangerous to release and trying to
      reacquire the write lock while we still hold other locks.
      
      This is fine with the bkl:
      
            T1                       T2
      
      lock_kernel()
          mutex_lock(A)
          unlock_kernel()
          // do something
                                  lock_kernel()
                                      mutex_lock(A) -> already locked by T1
                                      schedule() (and then unlock_kernel())
          lock_kernel()
          mutex_unlock(A)
          ....
      
      This is not fine with a mutex:
      
            T1                       T2
      
      mutex_lock(write)
          mutex_lock(A)
          mutex_unlock(write)
          // do something
                                 mutex_lock(write)
                                    mutex_lock(A) -> already locked by T1
                                    schedule()
      
          mutex_lock(write) -> already locked by T2
          deadlock
      
      The solution in this patch is to provide a helper which releases the write
      lock and sleep a bit if we can't lock a mutex that depend on it. It's another
      simulation of the bkl behaviour.
      
      The last axis is to locate the fs callbacks that are called with the bkl held,
      according to Documentation/filesystem/Locking.
      
      Those are:
      
      - reiserfs_remount
      - reiserfs_fill_super
      - reiserfs_put_super
      
      Reiserfs didn't need to explicitly lock because of the context of these callbacks.
      But now we must take care of that with the new locking.
      
      After this patch, reiserfs suffers from a slight performance regression (for now).
      On UP, a high volume write with dd reports an average of 27 MB/s instead
      of 30 MB/s without the patch applied.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Bron Gondwana <brong@fastmail.fm>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      LKML-Reference: <1239070789-13354-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8ebc4232
  3. 09 7月, 2009 1 次提交
  4. 24 6月, 2009 2 次提交
  5. 12 6月, 2009 5 次提交
  6. 18 5月, 2009 1 次提交
  7. 09 5月, 2009 2 次提交
  8. 03 4月, 2009 1 次提交
    • C
      fs/reiserfs: return f_fsid for statfs(2) · 651d0623
      Coly Li 提交于
      Make reiserfs3 return f_fsid info for statfs(2).  By Andreas' suggestion,
      this patch populates a persistent f_fsid between boots/mounts with help of
      on-disk uuid record.
      
      Randy Dunlap reported a compiling error from v2 patch like:
          fs/built-in.o: In function `reiserfs_statfs':
          super.c:(.text+0x7332b): undefined reference to `crc32_le'
          super.c:(.text+0x7333f): undefined reference to `crc32_le'
      Also he provided helpful solution to fix this error. The modification of v3
      patch is based on Randy's suggestion, add 'select CRC32' in fs/reiserfs/Kconfig.
      Signed-off-by: NColy Li <coly.li@suse.de>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      651d0623
  9. 31 3月, 2009 7 次提交
  10. 26 3月, 2009 2 次提交
  11. 10 1月, 2009 1 次提交
    • T
      filesystem freeze: add error handling of write_super_lockfs/unlockfs · c4be0c1d
      Takashi Sato 提交于
      Currently, ext3 in mainline Linux doesn't have the freeze feature which
      suspends write requests.  So, we cannot take a backup which keeps the
      filesystem's consistency with the storage device's features (snapshot and
      replication) while it is mounted.
      
      In many case, a commercial filesystem (e.g.  VxFS) has the freeze feature
      and it would be used to get the consistent backup.
      
      If Linux's standard filesystem ext3 has the freeze feature, we can do it
      without a commercial filesystem.
      
      So I have implemented the ioctls of the freeze feature.
      I think we can take the consistent backup with the following steps.
      1. Freeze the filesystem with the freeze ioctl.
      2. Separate the replication volume or create the snapshot
         with the storage device's feature.
      3. Unfreeze the filesystem with the unfreeze ioctl.
      4. Take the backup from the separated replication volume
         or the snapshot.
      
      This patch:
      
      VFS:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they can return an error.
      Rename write_super_lockfs and unlockfs of the super block operation
      freeze_fs and unfreeze_fs to avoid a confusion.
      
      ext3, ext4, xfs, gfs2, jfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that write_super_lockfs returns an error if needed,
      and unlockfs always returns 0.
      
      reiserfs:
      Changed the type of write_super_lockfs and unlockfs from "void"
      to "int" so that they always return 0 (success) to keep a current behavior.
      Signed-off-by: NTakashi Sato <t-sato@yk.jp.nec.com>
      Signed-off-by: NMasayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
      Cc: <xfs-masters@oss.sgi.com>
      Cc: <linux-ext4@vger.kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dave Kleikamp <shaggy@austin.ibm.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c4be0c1d
  12. 06 1月, 2009 2 次提交
  13. 23 10月, 2008 1 次提交
  14. 13 8月, 2008 1 次提交
  15. 01 8月, 2008 1 次提交
    • A
      [PATCH] fix races and leaks in vfs_quota_on() users · 77e69dac
      Al Viro 提交于
      * new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the
        pathname resolution.
      * callers of vfs_quota_on() that do their own pathname resolution and
        checks based on it are switched to vfs_quota_on_path(); that way we
        avoid the races.
      * reiserfs leaked dentry/vfsmount references on several failure exits.
      Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
      77e69dac
  16. 27 7月, 2008 1 次提交
  17. 26 7月, 2008 4 次提交
  18. 05 7月, 2008 1 次提交
  19. 28 4月, 2008 3 次提交