1. 13 Feb, 2009 1 commit
    • C
      Btrfs: make a lockdep class for the extent buffer locks · 4008c04a
      Chris Mason committed
      Btrfs is currently using spin_lock_nested with a nested value based
      on the tree depth of the block.  But, this doesn't quite work because
      the max tree depth is bigger than what spin_lock_nested can deal with,
      and because locks are sometimes taken before the level field is filled in.
      
      The solution here is to use lockdep_set_class_and_name instead, and to
      set the class before unlocking the pages when the block is read from the
      disk and just after init of a freshly allocated tree block.
      
      btrfs_clear_path_blocking is also changed to take the locks in the proper
      order, and it also makes sure all the locks currently held are properly
      set to blocking before it tries to retake the spinlocks.  Otherwise, lockdep
      gets upset about bad lock ordering.
      
      The lockdep magic came from Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  2. 10 Feb, 2009 1 commit
    • C
      Btrfs: don't use spin_is_contended · 284b066a
      Chris Mason committed
      Btrfs was using spin_is_contended to see if it should drop locks before
      doing extent allocations during btrfs_search_slot.  The idea was to avoid
      expensive searches in the tree unless the lock was actually contended.
      
      But, spin_is_contended is specific to the ticket spinlocks on x86, so this
      is causing compile errors everywhere else.
      
      In practice, the contention could easily appear some time after we started
      doing the extent allocation, and it makes more sense to always drop the lock
      instead.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  3. 04 Feb, 2009 1 commit
    • C
      Btrfs: Change btree locking to use explicit blocking points · b4ce94de
      Chris Mason committed
      Most of the btrfs metadata operations can be protected by a spinlock,
      but some operations still need to schedule.
      
      So far, btrfs has been using a mutex along with a trylock loop.
      Most of the time it is able to avoid going for the full mutex, so
      the trylock loop is a big performance gain.
      
      This commit is step one for getting rid of the blocking locks entirely.
      btrfs_tree_lock takes a spinlock, and the code explicitly switches
      to a blocking lock when it starts an operation that can schedule.
      
      We'll be able to get rid of the blocking locks in smaller pieces over time.
      Tracing allows us to find the most common cause of blocking, so we
      can start with the hot spots first.
      
      The basic idea is:
      
      btrfs_tree_lock() returns with the spin lock held
      
      btrfs_set_lock_blocking() sets the EXTENT_BUFFER_BLOCKING bit in
      the extent buffer flags, and then drops the spin lock.  The buffer is
      still considered locked by all of the btrfs code.
      
      If btrfs_tree_lock gets the spinlock but finds the blocking bit set, it drops
      the spin lock and waits on a wait queue for the blocking bit to go away.
      
      Much of the code that needs to set the blocking bit finishes without actually
      blocking a good percentage of the time.  So, an adaptive spin is still
      used against the blocking bit to avoid very high context switch rates.
      
      btrfs_clear_lock_blocking() clears the blocking bit and returns
      with the spinlock held again.
      
      btrfs_tree_unlock() can be called on either blocking or spinning locks;
      it does the right thing based on the blocking bit.
      
      ctree.c has a helper function to set/clear all the locked buffers in a
      path as blocking.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
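The protocol described in b4ce94de can be sketched in userspace C. This is a minimal illustrative model, not the kernel code: `struct eb_lock` and the `eb_*` function names are hypothetical, a C11 `atomic_flag` stands in for the kernel spinlock, an `atomic_bool` stands in for the EXTENT_BUFFER_BLOCKING bit, and plain busy-waiting stands in for both the wait queue and the adaptive spin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock state of one extent buffer. */
struct eb_lock {
    atomic_flag spin;     /* models the spinlock */
    atomic_bool blocking; /* models the EXTENT_BUFFER_BLOCKING bit */
};

static void eb_init(struct eb_lock *l)
{
    atomic_flag_clear(&l->spin);
    atomic_init(&l->blocking, false);
}

static void spin_acquire(struct eb_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->spin, memory_order_acquire))
        ; /* busy-wait */
}

static void spin_release(struct eb_lock *l)
{
    atomic_flag_clear_explicit(&l->spin, memory_order_release);
}

/* One attempt: returns true with the spinlock held, false otherwise. */
static bool eb_try_tree_lock(struct eb_lock *l)
{
    if (atomic_flag_test_and_set_explicit(&l->spin, memory_order_acquire))
        return false;              /* spinlock is held by someone else */
    if (atomic_load(&l->blocking)) {
        spin_release(l);           /* owner holds the lock in blocking mode */
        return false;
    }
    return true;
}

/* Returns with the spinlock held. The commit waits on a wait queue here. */
static void eb_tree_lock(struct eb_lock *l)
{
    while (!eb_try_tree_lock(l))
        ;
}

/* Owner only: set the blocking bit, then drop the spinlock. */
static void eb_set_lock_blocking(struct eb_lock *l)
{
    atomic_store(&l->blocking, true);
    spin_release(l);
}

/* Owner only: retake the spinlock, then clear the blocking bit. */
static void eb_clear_lock_blocking(struct eb_lock *l)
{
    spin_acquire(l);
    atomic_store(&l->blocking, false);
}

/* Works for both modes, chosen by the blocking bit. */
static void eb_tree_unlock(struct eb_lock *l)
{
    if (atomic_load(&l->blocking))
        atomic_store(&l->blocking, false); /* the commit also wakes waiters */
    else
        spin_release(l);
}
```

While the blocking bit is set, the buffer still counts as locked even though the spinlock itself is free, which is why `eb_try_tree_lock` has to check the bit after taking the spinlock.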
  4. 06 Jan, 2009 1 commit
  5. 30 Sep, 2008 1 commit
    • C
      Btrfs: add and improve comments · d352ac68
      Chris Mason committed
      This improves the comments at the top of many functions.  It didn't
      dive into the guts of functions because I was trying to
      avoid merging problems with the new allocator and back reference work.
      
      extent-tree.c and volumes.c were both skipped, and there is definitely
      more work to do in cleaning and commenting the code.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  6. 25 Sep, 2008 7 commits
    • C
      btrfs_search_slot: reduce lock contention by cowing in two stages · 65b51a00
      Chris Mason committed
      A btree block COW has two parts: the first is to allocate a destination
      block and the second is to copy the old block over.
      
      The first part needs locks in the extent allocation tree, and may need to
      do IO.  This changeset splits that into a separate function that can be
      called without any tree locks held.
      
      btrfs_search_slot is changed to drop its path and start over if it has
      to COW a contended block.  This often means that many writers will
      pre-alloc a new destination for the same contended block, but they
      cache their prealloc for later use on lower levels in the tree.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
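The two-stage COW from 65b51a00 can be sketched as follows. This is a simplified userspace illustration under stated assumptions, not the btrfs implementation: `struct sketch_block`, `struct sketch_prealloc`, `prealloc_dest` and `cow_block` are hypothetical names, `calloc` stands in for extent allocation, and the tree locks are only indicated in comments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical tree block with a small payload. */
struct sketch_block {
    char data[16];
};

/* Hypothetical per-writer cache holding one preallocated destination. */
struct sketch_prealloc {
    struct sketch_block *dest;
};

/*
 * Stage 1: allocate the destination block. Meant to run with no tree
 * locks held, since a real allocation may take other locks or do IO.
 * A cached destination left over from an earlier attempt is reused.
 */
static int prealloc_dest(struct sketch_prealloc *pa)
{
    if (pa->dest)
        return 0;
    pa->dest = calloc(1, sizeof(*pa->dest));
    return pa->dest ? 0 : -1;
}

/*
 * Stage 2: copy the old block into the preallocated destination.
 * Meant to run with the tree lock held; it never allocates.
 * Returns NULL if stage 1 has not been run.
 */
static struct sketch_block *cow_block(const struct sketch_block *old,
                                      struct sketch_prealloc *pa)
{
    struct sketch_block *copy = pa->dest;

    if (!copy)
        return NULL;
    memcpy(copy, old, sizeof(*copy));
    pa->dest = NULL; /* the prealloc has been consumed */
    return copy;
}
```

A writer that loses the race for a contended block keeps its `dest` in the cache, so the allocation is not wasted and can serve a later COW lower in the tree.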
    • Y
      Btrfs: implement memory reclaim for leaf reference cache · bcc63abb
      Yan committed
      The memory reclaiming issue happens when a snapshot exists. In that
      case, some cache entries may not be used while an old snapshot is dropped,
      so they remain in the cache until umount.
      
      The patch adds a field to struct btrfs_leaf_ref to record its create time. It
      also links all dead roots of a given snapshot together in order of create
      time. After an old snapshot has been completely dropped, we check the dead
      root list and remove all cache entries created before the oldest dead root in
      the list.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
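The reclaim rule from bcc63abb (drop every cache entry created before the oldest remaining dead root) can be illustrated with a small sketch. The names here are assumptions for illustration only: `struct sketch_leaf_ref`, its `create_gen` and `in_cache` fields, and `reclaim_older_than` are hypothetical, and a flat array stands in for the real cache structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry carrying the recorded create time. */
struct sketch_leaf_ref {
    unsigned long create_gen; /* when the entry was created */
    int in_cache;             /* 1 while the entry is still cached */
};

/*
 * Evict every cached entry created before the oldest dead root that is
 * still on the list. Returns the number of entries evicted.
 */
static size_t reclaim_older_than(struct sketch_leaf_ref *refs, size_t n,
                                 unsigned long oldest_dead_root_gen)
{
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        if (refs[i].in_cache && refs[i].create_gen < oldest_dead_root_gen) {
            refs[i].in_cache = 0;
            removed++;
        }
    }
    return removed;
}
```

Because the dead roots are kept in create-time order, the oldest one is simply the head of the list, so finding the cutoff is cheap.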
    • C
    • C
      Btrfs: Use mutex_lock_nested for tree locking · 6dddcbeb
      Chris Mason committed
      Lockdep has the notion of locking subclasses so that you can identify
      locks you expect to be taken after other locks of the same class.  This
      changes the per-extent buffer btree locking routines to use a subclass based
      on the level in the tree.
      
      Unfortunately, lockdep can only handle 8 total subclasses, and the btrfs
      max level is also 8.  So when lockdep is on, use a lower max level.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • C
      Btrfs: Use a mutex in the extent buffer for tree block locking · a61e6f29
      Chris Mason committed
      This replaces the use of the page cache lock bit for locking, which wasn't
      suitable for block size < page size and couldn't be used recursively.
      
      The mutexes alone don't fix either problem, but they are the first step.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
    • C
      Btrfs: Reduce contention on the root node · f9efa9c7
      Chris Mason committed
      This calls unlock_up sooner in btrfs_search_slot in order to decrease the
      amount of work done with the higher level tree locks held.
      
      Also, it changes btrfs_tree_lock to spin for a bit against the page lock
      before scheduling.  This makes a big difference in context switch rate under
      highly contended workloads.
      
      Longer term, a better locking structure is needed than the page lock.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
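The "spin for a bit before scheduling" idea from f9efa9c7 can be sketched with a bounded spin loop. This is an illustration under assumptions, not the btrfs code: `spin_for_lock` and `unlock_flag` are hypothetical names, a C11 `atomic_flag` stands in for the page lock, and the spin count is an arbitrary parameter chosen by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Try to take the lock up to max_spins times without sleeping.
 * Returns true if the lock was acquired. Returns false if the spin
 * budget ran out, in which case the caller should fall back to a
 * sleeping wait (that is, let the scheduler run something else).
 */
static bool spin_for_lock(atomic_flag *lock, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (!atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
            return true;
    }
    return false;
}

/* Release a lock taken by spin_for_lock(). */
static void unlock_flag(atomic_flag *lock)
{
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

When lock hold times are short, the spin usually succeeds and the context switch is avoided; the bound keeps a long holder from burning CPU indefinitely.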
    • C
      Btrfs: Start btree concurrency work. · 925baedd
      Chris Mason committed
      The allocation trees and the chunk trees are serialized via their own
      dedicated mutexes.  This means allocation location is still not very
      fine grained.
      
      The main FS btree is protected by locks on each block in the btree.  Locks
      are taken top / down, and as processing finishes on a given level of the
      tree, the lock is released after locking the lower level.
      
      The end result of a search is now a path where only the lowest level
      is locked.  Releasing or freeing the path drops any locks held.
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
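The top-down locking order from 925baedd (lock the lower level, then release the level above) can be modeled in a few lines. This is a simplified sketch with hypothetical names (`struct sketch_path`, `search_down`, `release_path`, `count_locked`); boolean flags stand in for the per-block locks, and one block per level stands in for the blocks a real search would visit.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_LEVEL 8 /* illustrative bound on tree height */

/* Hypothetical path: one lock flag per tree level, 0 is the leaf. */
struct sketch_path {
    bool locked[SKETCH_MAX_LEVEL];
};

/*
 * Walk from root_level down to level 0. Each step locks the lower
 * level first and only then releases the level above it.
 * root_level must be in [0, SKETCH_MAX_LEVEL).
 * Returns the largest number of locks held at any one time.
 */
static int search_down(struct sketch_path *p, int root_level)
{
    int held = 0, max_held = 0;

    for (int level = root_level; level >= 0; level--) {
        p->locked[level] = true;          /* lock this level */
        held++;
        if (held > max_held)
            max_held = held;
        if (level < root_level) {
            p->locked[level + 1] = false; /* now release the parent */
            held--;
        }
    }
    return max_held;
}

/* Releasing the path drops any locks still held. */
static void release_path(struct sketch_path *p)
{
    for (int i = 0; i < SKETCH_MAX_LEVEL; i++)
        p->locked[i] = false;
}

/* Count the levels that are currently locked. */
static int count_locked(const struct sketch_path *p)
{
    int n = 0;

    for (int i = 0; i < SKETCH_MAX_LEVEL; i++)
        n += p->locked[i] ? 1 : 0;
    return n;
}
```

At most two levels are locked at once in this model, and after the walk only level 0 remains locked, matching the commit's description of the search result.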