1. 10 8月, 2010 1 次提交
  2. 06 3月, 2010 1 次提交
  3. 02 1月, 2010 2 次提交
    • F
      reiserfs: Warn on lock relax if taken recursively · c4a62ca3
      Frederic Weisbecker 提交于
      When we relax the reiserfs lock to avoid creating unwanted
      dependencies against others locks while grabbing these,
      we want to ensure it has not been taken recursively, otherwise
      the lock won't be really relaxed. Only its depth will be decreased.
      The unwanted dependency would then actually happen.
      
      To prevent from that, add a reiserfs_lock_check_recursive() call
      in the places that need it.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      c4a62ca3
    • F
      reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion · 0719d343
      Frederic Weisbecker 提交于
      i_xattr_sem depends on the reiserfs lock. But after we grab
      i_xattr_sem, we may relax/relock the reiserfs lock while waiting
      on a freezed filesystem, creating a dependency inversion between
      the two locks.
      
      In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's
      create a reiserfs_down_read_safe() that acts like
      reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing
      another lock to avoid undesired dependencies induced by the
      heivyweight reiserfs lock.
      
      This fixes the following warning:
      
      [  990.005931] =======================================================
      [  990.012373] [ INFO: possible circular locking dependency detected ]
      [  990.013233] 2.6.33-rc1 #1
      [  990.013233] -------------------------------------------------------
      [  990.013233] dbench/1891 is trying to acquire lock:
      [  990.013233]  (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]
      [  990.013233] but task is already holding lock:
      [  990.013233]  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]
      [  990.013233] which lock already depends on the new lock.
      [  990.013233]
      [  990.013233]
      [  990.013233] the existing dependency chain (in reverse order) is:
      [  990.013233]
      [  990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}:
      [  990.013233]        [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560
      [  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]        [<ffffffff814ac194>] down_write+0x44/0x80
      [  990.013233]        [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      [  990.013233]
      [  990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
      [  990.013233]        [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
      [  990.013233]        [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]        [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
      [  990.013233]        [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
      [  990.013233]        [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]        [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
      [  990.013233]        [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
      [  990.013233]        [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]        [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]        [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]        [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]        [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]        [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]        [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]        [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      [  990.013233]
      [  990.013233] other info that might help us debug this:
      [  990.013233]
      [  990.013233] 2 locks held by dbench/1891:
      [  990.013233]  #0:  (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0
      [  990.013233]  #1:  (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]
      [  990.013233] stack backtrace:
      [  990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1
      [  990.013233] Call Trace:
      [  990.013233]  [<ffffffff81061639>] print_circular_bug+0xe9/0xf0
      [  990.013233]  [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
      [  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]  [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
      [  990.013233]  [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0
      [  990.013233]  [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140
      [  990.013233]  [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0
      [  990.013233]  [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
      [  990.013233]  [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
      [  990.013233]  [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
      [  990.013233]  [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
      [  990.013233]  [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
      [  990.013233]  [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0
      [  990.013233]  [<ffffffff8115a6aa>] user_set+0x8a/0x90
      [  990.013233]  [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
      [  990.013233]  [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
      [  990.013233]  [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
      [  990.013233]  [<ffffffff810e2780>] setxattr+0xc0/0x150
      [  990.013233]  [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100
      [  990.013233]  [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10
      [  990.013233]  [<ffffffff810560a3>] ? cpu_clock+0x43/0x50
      [  990.013233]  [<ffffffff810c6820>] ? fget+0xb0/0x110
      [  990.013233]  [<ffffffff810c6770>] ? fget+0x0/0x110
      [  990.013233]  [<ffffffff81002ddc>] ? sysret_check+0x27/0x62
      [  990.013233]  [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
      [  990.013233]  [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
      Reported-and-tested-by: NChristian Kujau <lists@nerdbynature.de>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      0719d343
  4. 17 12月, 2009 1 次提交
    • F
      reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion · 47376ceb
      Frederic Weisbecker 提交于
      The reiserfs lock -> inode mutex dependency gets inverted when we
      relax the lock while walking to the tree.
      
      To fix this, use a specialized version of reiserfs_mutex_lock_safe
      that takes care of mutex subclasses. Then we can grab the inode
      mutex with I_MUTEX_XATTR subclass without any reiserfs lock
      dependency.
      
      This fixes the following report:
      
      [ INFO: possible circular locking dependency detected ]
      2.6.32-06793-gf4054253-dirty #2
      -------------------------------------------------------
      mv/18566 is trying to acquire lock:
       (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1110708>] reiserfs_write_lock+0x28=
      /0x40
      
      but task is already holding lock:
       (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
      reiserfs_for_each_xattr+0x10c/0x380
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&sb->s_type->i_mutex_key#5/3){+.+.+.}:
             [<c104f723>] validate_chain+0xa23/0xf70
             [<c1050155>] __lock_acquire+0x4e5/0xa70
             [<c105075a>] lock_acquire+0x7a/0xa0
             [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
             [<c11102b4>] reiserfs_for_each_xattr+0x84/0x380
             [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
             [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
             [<c10a565c>] generic_delete_inode+0x9c/0x150
             [<c10a574d>] generic_drop_inode+0x3d/0x60
             [<c10a4667>] iput+0x47/0x50
             [<c109cc0b>] do_unlinkat+0xdb/0x160
             [<c109cca0>] sys_unlink+0x10/0x20
             [<c1002c50>] sysenter_do_call+0x12/0x36
      
      -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
             [<c104fc68>] validate_chain+0xf68/0xf70
             [<c1050155>] __lock_acquire+0x4e5/0xa70
             [<c105075a>] lock_acquire+0x7a/0xa0
             [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
             [<c1110708>] reiserfs_write_lock+0x28/0x40
             [<c1103d6b>] search_by_key+0x1f7b/0x21b0
             [<c10e73ef>] search_by_entry_key+0x1f/0x3b0
             [<c10e77f7>] reiserfs_find_entry+0x77/0x400
             [<c10e81e5>] reiserfs_lookup+0x85/0x130
             [<c109a144>] __lookup_hash+0xb4/0x110
             [<c109b763>] lookup_one_len+0xb3/0x100
             [<c1110350>] reiserfs_for_each_xattr+0x120/0x380
             [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
             [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
             [<c10a565c>] generic_delete_inode+0x9c/0x150
             [<c10a574d>] generic_drop_inode+0x3d/0x60
             [<c10a4667>] iput+0x47/0x50
             [<c10a1c4f>] dentry_iput+0x6f/0xf0
             [<c10a1d74>] d_kill+0x24/0x50
             [<c10a396b>] dput+0x5b/0x120
             [<c109ca89>] sys_renameat+0x1b9/0x230
             [<c109cb28>] sys_rename+0x28/0x30
             [<c1002c50>] sysenter_do_call+0x12/0x36
      
      other info that might help us debug this:
      
      2 locks held by mv/18566:
       #0:  (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [<c109b6ac>]
      lock_rename+0xcc/0xd0
       #1:  (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
      reiserfs_for_each_xattr+0x10c/0x380
      
      stack backtrace:
      Pid: 18566, comm: mv Tainted: G         C 2.6.32-06793-gf4054253-dirty #2
      Call Trace:
       [<c134b252>] ? printk+0x18/0x1e
       [<c104e790>] print_circular_bug+0xc0/0xd0
       [<c104fc68>] validate_chain+0xf68/0xf70
       [<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
       [<c1050155>] __lock_acquire+0x4e5/0xa70
       [<c105075a>] lock_acquire+0x7a/0xa0
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c134c76f>] mutex_lock_nested+0x5f/0x2b0
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c134b60a>] ? schedule+0x27a/0x440
       [<c1110708>] reiserfs_write_lock+0x28/0x40
       [<c1103d6b>] search_by_key+0x1f7b/0x21b0
       [<c1050176>] ? __lock_acquire+0x506/0xa70
       [<c1051267>] ? lock_release_non_nested+0x1e7/0x340
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c104e3ab>] ? trace_hardirqs_on+0xb/0x10
       [<c1042a55>] ? T.316+0x15/0x1a0
       [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
       [<c10e73ef>] search_by_entry_key+0x1f/0x3b0
       [<c134bf2a>] ? __mutex_unlock_slowpath+0x9a/0x120
       [<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
       [<c10e77f7>] reiserfs_find_entry+0x77/0x400
       [<c10e81e5>] reiserfs_lookup+0x85/0x130
       [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
       [<c109a144>] __lookup_hash+0xb4/0x110
       [<c109b763>] lookup_one_len+0xb3/0x100
       [<c1110350>] reiserfs_for_each_xattr+0x120/0x380
       [<c110ffe0>] ? delete_one_xattr+0x0/0x1c0
       [<c1003342>] ? math_error+0x22/0x150
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c1110615>] reiserfs_delete_xattrs+0x15/0x50
       [<c1110708>] ? reiserfs_write_lock+0x28/0x40
       [<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
       [<c10a561f>] ? generic_delete_inode+0x5f/0x150
       [<c10ef4f0>] ? reiserfs_delete_inode+0x0/0x140
       [<c10a565c>] generic_delete_inode+0x9c/0x150
       [<c10a574d>] generic_drop_inode+0x3d/0x60
       [<c10a4667>] iput+0x47/0x50
       [<c10a1c4f>] dentry_iput+0x6f/0xf0
       [<c10a1d74>] d_kill+0x24/0x50
       [<c10a396b>] dput+0x5b/0x120
       [<c109ca89>] sys_renameat+0x1b9/0x230
       [<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
       [<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
       [<c1042dde>] ? cpu_clock+0x4e/0x60
       [<c1350825>] ? do_page_fault+0x155/0x370
       [<c1041816>] ? up_read+0x16/0x30
       [<c1350825>] ? do_page_fault+0x155/0x370
       [<c109cb28>] sys_rename+0x28/0x30
       [<c1002c50>] sysenter_do_call+0x12/0x36
      Reported-by: NAlexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      47376ceb
  5. 16 12月, 2009 2 次提交
  6. 15 10月, 2009 1 次提交
    • F
      kill-the-bkl/reiserfs: definitely drop the bkl from reiserfs_ioctl() · 205cb37b
      Frederic Weisbecker 提交于
      The reiserfs ioctl path doesn't need the big kernel lock anymore , now
      that the filesystem synchronizes through its own lock.
      
      We can then turn reiserfs_ioctl() into an unlocked_ioctl callback.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Laurent Riffard <laurent.riffard@free.fr>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      205cb37b
  7. 14 9月, 2009 6 次提交
    • F
      kill-the-bkl/reiserfs: acquire the inode mutex safely · c72e0575
      Frederic Weisbecker 提交于
      While searching a pathname, an inode mutex can be acquired
      in do_lookup() which calls reiserfs_lookup() which in turn
      acquires the write lock.
      
      On the other side reiserfs_fill_super() can acquire the write_lock
      and then call reiserfs_lookup_privroot() which can acquire an
      inode mutex (the root of the mount point).
      
      So we theoretically risk an AB - BA lock inversion that could lead
      to a deadlock.
      
      As for other lock dependencies found since the bkl to mutex
      conversion, the fix is to use reiserfs_mutex_lock_safe() which
      drops the lock dependency to the write lock.
      
      [ Impact: fix a possible deadlock with reiserfs ]
      
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      c72e0575
    • F
      kill-the-bkl/reiserfs: conditionaly release the write lock on fs_changed() · d663af80
      Frederic Weisbecker 提交于
      The goal of fs_changed() is to check whether the tree changed during a
      schedule(). This is a BKL legacy.
      
      A recent patch added an explicit unconditional release/reacquire of the
      write lock around the cond_resched() called inside fs_changed.
      
      But it's wasteful to unconditionally do that, we are creating superfluous
      lock contention in !TIF_NEED_RESCHED case.
      
      This patch manage that by calling reiserfs_cond_resched() from fs_changed()
      which only releases the lock if we are going to reschedule.
      
      [ Impact: inject less lock contention and tree job retries ]
      
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      d663af80
    • F
      kill-the-BKL/reiserfs: add reiserfs_cond_resched() · e43d3f21
      Frederic Weisbecker 提交于
      Usually, when we call cond_resched(), we want the write lock
      to be released and then reacquired once we return from scheduling.
      Not only does it follow the previous bkl based locking scheme, but
      it also let other waiters to get the lock.
      
      But if we aren't going to reschedule(), such as in !TIF_NEED_RESCHED
      case, it's useless to release the lock. Worse, if we release and reacquire
      the lock whereas it is not needed, we create useless contentions. Also
      if someone takes the lock while we are modifying or reading the tree,
      there are good chances we'll have to retry our operation, eg if the
      block we were seeeking has moved.
      
      So this patch introduces a helper which only unlock the write lock
      if we are going to schedule.
      
      [ Impact: prepare to inject less lock contention and less tree operation attempts ]
      Reported-by: NAndi Kleen <andi@firstfloor.org>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      e43d3f21
    • F
      kill-the-BKL/reiserfs: release write lock on fs_changed() · f32049dc
      Frederic Weisbecker 提交于
      fs_changed() is a macro used by reiserfs to check whether its tree has been
      rebalanced. It has been designed to check parallel changes on the tree after
      calling a sleeping function, which released the Bkl.
      
      fs_changed() also calls cond_resched(), so that if rescheduling is needed,
      we are in the best place to do that, since we check if the tree has changed
      just after (because of the bkl release on schedule()).
      
      Even if we are not anymore using the Bkl, we still want to release the lock
      while we reschedule, so that other waiters for the lock can acquire it safely,
      because of the following __fs_changed() check.
      
      [ Impact: release the reiserfs write lock when it is not needed ]
      
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      f32049dc
    • F
      kill-the-BKL/reiserfs: provide a tool to lock only once the write lock · daf88c89
      Frederic Weisbecker 提交于
      Sometimes we don't want to recursively hold the per superblock write
      lock because we want to be sure it is actually released when we come
      to sleep.
      
      This patch introduces the necessary tools for that.
      
      reiserfs_write_lock_once() does the same job than reiserfs_write_lock()
      except that it won't try to acquire recursively the lock if the current
      task already owns it. Also the lock_depth before the call of this function
      is returned.
      
      reiserfs_write_unlock_once() unlock only if reiserfs_write_lock_once()
      returned a depth equal to -1, ie: only if it actually locked.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Alessio Igor Bogani <abogani@texware.it>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Alexander Beregalov <a.beregalov@gmail.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      LKML-Reference: <1239680065-25013-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      daf88c89
    • F
      reiserfs: kill-the-BKL · 8ebc4232
      Frederic Weisbecker 提交于
      This patch is an attempt to remove the Bkl based locking scheme from
      reiserfs and is intended.
      
      It is a bit inspired from an old attempt by Peter Zijlstra:
      
         http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html
      
      The bkl is heavily used in this filesystem to prevent from
      concurrent write accesses on the filesystem.
      
      Reiserfs makes a deep use of the specific properties of the Bkl:
      
      - It can be acqquired recursively by a same task
      - It is released on the schedule() calls and reacquired when schedule() returns
      
      The two properties above are a roadmap for the reiserfs write locking so it's
      very hard to simply replace it with a common mutex.
      
      - We need a recursive-able locking unless we want to restructure several blocks
        of the code.
      - We need to identify the sites where the bkl was implictly relaxed
        (schedule, wait, sync, etc...) so that we can in turn release and
        reacquire our new lock explicitly.
        Such implicit releases of the lock are often required to let other
        resources producer/consumer do their job or we can suffer unexpected
        starvations or deadlocks.
      
      So the new lock that replaces the bkl here is a per superblock mutex with a
      specific property: it can be acquired recursively by a same task, like the
      bkl.
      
      For such purpose, we integrate a lock owner and a lock depth field on the
      superblock information structure.
      
      The first axis on this patch is to turn reiserfs_write_(un)lock() function
      into a wrapper to manage this mutex. Also some explicit calls to
      lock_kernel() have been converted to reiserfs_write_lock() helpers.
      
      The second axis is to find the important blocking sites (schedule...(),
      wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
      release of the write lock on these locations before blocking. Then we can
      safely wait for those who can give us resources or those who need some.
      Typically this is a fight between the current writer, the reiserfs workqueue
      (aka the async commiter) and the pdflush threads.
      
      The third axis is a consequence of the second. The write lock is usually
      on top of a lock dependency chain which can include the journal lock, the
      flush lock or the commit lock. So it's dangerous to release and trying to
      reacquire the write lock while we still hold other locks.
      
      This is fine with the bkl:
      
            T1                       T2
      
      lock_kernel()
          mutex_lock(A)
          unlock_kernel()
          // do something
                                  lock_kernel()
                                      mutex_lock(A) -> already locked by T1
                                      schedule() (and then unlock_kernel())
          lock_kernel()
          mutex_unlock(A)
          ....
      
      This is not fine with a mutex:
      
            T1                       T2
      
      mutex_lock(write)
          mutex_lock(A)
          mutex_unlock(write)
          // do something
                                 mutex_lock(write)
                                    mutex_lock(A) -> already locked by T1
                                    schedule()
      
          mutex_lock(write) -> already locked by T2
          deadlock
      
      The solution in this patch is to provide a helper which releases the write
      lock and sleep a bit if we can't lock a mutex that depend on it. It's another
      simulation of the bkl behaviour.
      
      The last axis is to locate the fs callbacks that are called with the bkl held,
      according to Documentation/filesystem/Locking.
      
      Those are:
      
      - reiserfs_remount
      - reiserfs_fill_super
      - reiserfs_put_super
      
      Reiserfs didn't need to explicitly lock because of the context of these callbacks.
      But now we must take care of that with the new locking.
      
      After this patch, reiserfs suffers from a slight performance regression (for now).
      On UP, a high volume write with dd reports an average of 27 MB/s instead
      of 30 MB/s without the patch applied.
      Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: NIngo Molnar <mingo@elte.hu>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Bron Gondwana <brong@fastmail.fm>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      LKML-Reference: <1239070789-13354-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      8ebc4232
  8. 19 6月, 2009 1 次提交
    • J
      reiserfs: fix warnings with gcc 4.4 · 1d965fe0
      Jeff Mahoney 提交于
      Several code paths in reiserfs have a construct like:
      
       if (is_direntry_le_ih(ih = B_N_PITEM_HEAD(src, item_num))) ...
      
      which, in addition to being ugly, end up causing compiler warnings with
      gcc 4.4.0.  Previous compilers didn't issue a warning.
      
      fs/reiserfs/do_balan.c:1273: warning: operation on `aux_ih' may be undefined
      fs/reiserfs/lbalance.c:393: warning: operation on `ih' may be undefined
      fs/reiserfs/lbalance.c:421: warning: operation on `ih' may be undefined
      fs/reiserfs/lbalance.c:777: warning: operation on `ih' may be undefined
      
      I believe this is due to the ih being passed to macros which evaluate the
      argument more than once.  This is old code and we haven't seen any
      problems with it, but this patch eliminates the warnings.
      
      It converts the multiple evaluation macros to static inlines and does a
      preassignment for the cases that were causing the warnings because that
      code is just ugly.
      Reported-by: NChris Mason <mason@oracle.com>
      Signed-off-by: NJeff Mahoney <jeffm@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1d965fe0
  9. 31 3月, 2009 17 次提交
  10. 03 2月, 2009 2 次提交
    • J
      headers_check fix cleanup: linux/reiserfs_fs.h · 750e1c18
      Jaswinder Singh Rajput 提交于
      Only REISERFS_IOC_* definitions are required for user space
      rest should be in #ifdef __KERNEL__ as pointed by Arnd Bergmann.
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      750e1c18
    • J
      headers_check fix: linux/reinserfs_fs.h · 11d9f653
      Jaswinder Singh Rajput 提交于
      fix the following 'make headers_check' warnings:
      
        usr/include/linux/reiserfs_fs.h:687: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:995: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:997: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1467: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1760: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1764: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1766: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1769: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1771: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1805: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1948: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1949: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1950: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1951: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1962: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1963: extern's make no sense in userspace
        usr/include/linux/reiserfs_fs.h:1964: extern's make no sense in userspace
      Signed-off-by: NJaswinder Singh Rajput <jaswinderrajput@gmail.com>
      11d9f653
  11. 17 10月, 2008 1 次提交
  12. 26 7月, 2008 1 次提交
  13. 28 4月, 2008 1 次提交
    • J
      reiserfs: unpack tails on quota files · d5dee5c3
      Jan Kara 提交于
      Quota files cannot have tails because quota_write and quota_read functions do
      not support them.  So far when quota files did have tail, we just refused to
      turn quotas on it.  Sadly this check has been wrong and so there are now
      plenty installations where quota files don't have NOTAIL flag set and so now
      after fixing the check, they suddently fail to turn quotas on.  Since it's
      easy to unpack the tail from kernel, do this from reiserfs_quota_on() which
      solves the problem and is generally nicer to users anyway.
      Signed-off-by: NJan Kara <jack@suse.cz>
      Reported-by: <urhausen@urifabi.net>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d5dee5c3
  14. 09 2月, 2008 1 次提交
  15. 22 10月, 2007 1 次提交
  16. 20 10月, 2007 1 次提交