1. 04 2月, 2009 1 次提交
    • C
      Btrfs: Change btree locking to use explicit blocking points · b4ce94de
      Chris Mason 提交于
      Most of the btrfs metadata operations can be protected by a spinlock,
      but some operations still need to schedule.
      
      So far, btrfs has been using a mutex along with a trylock loop,
      most of the time it is able to avoid going for the full mutex, so
      the trylock loop is a big performance gain.
      
      This commit is step one for getting rid of the blocking locks entirely.
      btrfs_tree_lock takes a spinlock, and the code explicitly switches
      to a blocking lock when it starts an operation that can schedule.
      
      We'll be able get rid of the blocking locks in smaller pieces over time.
      Tracing allows us to find the most common cause of blocking, so we
      can start with the hot spots first.
      
      The basic idea is:
      
      btrfs_tree_lock() returns with the spin lock held
      
      btrfs_set_lock_blocking() sets the EXTENT_BUFFER_BLOCKING bit in
      the extent buffer flags, and then drops the spin lock.  The buffer is
      still considered locked by all of the btrfs code.
      
      If btrfs_tree_lock gets the spinlock but finds the blocking bit set, it drops
      the spin lock and waits on a wait queue for the blocking bit to go away.
      
      Much of the code that needs to set the blocking bit finishes without actually
      blocking a good percentage of the time.  So, an adaptive spin is still
      used against the blocking bit to avoid very high context switch rates.
      
      btrfs_clear_lock_blocking() clears the blocking bit and returns
      with the spinlock held again.
      
      btrfs_tree_unlock() can be called on either blocking or spinning locks,
      it does the right thing based on the blocking bit.
      
      ctree.c has a helper function to set/clear all the locked buffers in a
      path as blocking.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      b4ce94de
  2. 06 1月, 2009 1 次提交
  3. 30 9月, 2008 1 次提交
    • C
      Btrfs: add and improve comments · d352ac68
      Chris Mason 提交于
      This improves the comments at the top of many functions.  It didn't
      dive into the guts of functions because I was trying to
      avoid merging problems with the new allocator and back reference work.
      
      extent-tree.c and volumes.c were both skipped, and there is definitely
      more work todo in cleaning and commenting the code.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      d352ac68
  4. 25 9月, 2008 7 次提交
    • C
      btrfs_search_slot: reduce lock contention by cowing in two stages · 65b51a00
      Chris Mason 提交于
      A btree block cow has two parts, the first is to allocate a destination
      block and the second is to copy the old bock over.
      
      The first part needs locks in the extent allocation tree, and may need to
      do IO.  This changeset splits that into a separate function that can be
      called without any tree locks held.
      
      btrfs_search_slot is changed to drop its path and start over if it has
      to COW a contended block.  This often means that many writers will
      pre-alloc a new destination for a the same contended block, but they
      cache their prealloc for later use on lower levels in the tree.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      65b51a00
    • Y
      Btrfs: implement memory reclaim for leaf reference cache · bcc63abb
      Yan 提交于
      The memory reclaiming issue happens when snapshot exists. In that
      case, some cache entries may not be used during old snapshot dropping,
      so they will remain in the cache until umount.
      
      The patch adds a field to struct btrfs_leaf_ref to record create time. Besides,
      the patch makes all dead roots of a given snapshot linked together in order of
      create time. After a old snapshot was completely dropped, we check the dead
      root list and remove all cache entries created before the oldest dead root in
      the list.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      bcc63abb
    • C
    • C
      Btrfs: Use mutex_lock_nested for tree locking · 6dddcbeb
      Chris Mason 提交于
      Lockdep has the notion of locking subclasses so that you can identify
      locks you expect to be taken after other locks of the same class.  This
      changes the per-extent buffer btree locking routines to use a subclass based
      on the level in the tree.
      
      Unfortunately, lockdep can only handle 8 total subclasses, and the btrfs
      max level is also 8.  So when lockdep is on, use a lower max level.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      6dddcbeb
    • C
      Btrfs: Use a mutex in the extent buffer for tree block locking · a61e6f29
      Chris Mason 提交于
      This replaces the use of the page cache lock bit for locking, which wasn't
      suitable for block size < page size and couldn't be used recursively.
      
      The mutexes alone don't fix either problem, but they are the first step.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      a61e6f29
    • C
      Btrfs: Reduce contention on the root node · f9efa9c7
      Chris Mason 提交于
      This calls unlock_up sooner in btrfs_search_slot in order to decrease the
      amount of work done with the higher level tree locks held.
      
      Also, it changes btrfs_tree_lock to spin for a big against the page lock
      before scheduling.  This makes a big difference in context switch rate under
      highly contended workloads.
      
      Longer term, a better locking structure is needed than the page lock.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      f9efa9c7
    • C
      Btrfs: Start btree concurrency work. · 925baedd
      Chris Mason 提交于
      The allocation trees and the chunk trees are serialized via their own
      dedicated mutexes.  This means allocation location is still not very
      fine grained.
      
      The main FS btree is protected by locks on each block in the btree.  Locks
      are taken top / down, and as processing finishes on a given level of the
      tree, the lock is released after locking the lower level.
      
      The end result of a search is now a path where only the lowest level
      is locked.  Releasing or freeing the path drops any locks held.
      Signed-off-by: NChris Mason <chris.mason@oracle.com>
      925baedd