1. 29 Oct, 2010 1 commit
    • Btrfs: create special free space cache inode · 0af3d00b
      Josef Bacik committed
      To save the free space cache, we need an inode to hold the data and a special
      item that points at the right inode for each block group.  So first, create a
      special item that points to the right inode and records the number of extent
      entries and the number of bitmaps we will have.  We truncate and pre-allocate
      space every time to make sure it's up to date.
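
As a rough illustration of the special item described above, here is a minimal C sketch. The struct and function names (cache_key, free_space_cache_item, cache_item_init, cache_item_total_entries) are hypothetical and simplified; this is not the on-disk format, only the three pieces of information the message lists: which inode holds the cache, how many extent entries, and how many bitmaps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified key that locates the cache inode. */
struct cache_key {
    uint64_t objectid;   /* inode number of the cache inode */
    uint8_t  type;       /* item type, unused in this sketch */
    uint64_t offset;
};

/* Hypothetical per-block-group item; not the real on-disk layout. */
struct free_space_cache_item {
    struct cache_key inode_location;  /* which inode holds the cache data */
    uint64_t num_extent_entries;      /* extent entries stored in the cache */
    uint64_t num_bitmaps;             /* bitmap entries stored in the cache */
};

/* Fill in an item for one block group. */
static void cache_item_init(struct free_space_cache_item *item,
                            uint64_t inode_objectid,
                            uint64_t nr_extents, uint64_t nr_bitmaps)
{
    assert(item != NULL);
    memset(item, 0, sizeof(*item));
    item->inode_location.objectid = inode_objectid;
    item->num_extent_entries = nr_extents;
    item->num_bitmaps = nr_bitmaps;
}

/* Total number of records a reader would expect to find in the inode. */
static uint64_t cache_item_total_entries(const struct free_space_cache_item *item)
{
    return item->num_extent_entries + item->num_bitmaps;
}
```

A loader could compare the counts in the item against what it actually reads from the inode and throw the cache away on a mismatch.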
      
      This feature is turned on as soon as you mount with -o space_cache.  It is
      still safe to boot into old kernels; they will just generate the cache the
      old-fashioned way.  When you boot back into a newer kernel, we will notice that
      the filesystem was modified but the cache was not, and automatically discard
      the cache.
      Signed-off-by: Josef Bacik <josef@redhat.com>
  2. 24 Jul, 2009 1 commit
    • Btrfs: use hybrid extents+bitmap rb tree for free space · 96303081
      Josef Bacik committed
      Currently btrfs has a problem where it can use a ridiculous amount of RAM simply
      tracking free space.  As free space gets fragmented, we end up with thousands of
      entries on an rb-tree per block group, which usually spans 1 gig of area.  Since
      we currently never flush the free space cache back to disk, this gets to be a
      bit unwieldy on large filesystems with lots of fragmentation.
      
      This patch solves this problem by using PAGE_SIZE bitmaps for parts of the free
      space cache.  Initially we calculate a threshold of extent entries we can
      handle, which is however many extent entries we can cram into 16k of RAM.  The
      maximum amount of RAM that should ever be used to track 1 gigabyte of disk space
      will be 32k of RAM, which scales much better than we did before.
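
The arithmetic above can be sketched in C as follows. This is only an illustration under stated assumptions: the per-entry size passed to extent_threshold is hypothetical, and the bitmap figures assume a 4 KiB page, a 4 KiB sector, and one bit per sector. Under those assumptions one PAGE_SIZE bitmap covers 128 MiB, so eight bitmaps (32 KiB) cover 1 GiB, which is one way to arrive at the 32k figure. All function names are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Budget for plain extent entries, taken from the commit message. */
#define EXTENT_BUDGET_BYTES (16u * 1024u)

/* How many extent entries fit in the 16k budget.
 * entry_size is an assumption; the real structure size may differ. */
static uint64_t extent_threshold(uint64_t entry_size)
{
    assert(entry_size > 0);
    return EXTENT_BUDGET_BYTES / entry_size;
}

/* Disk bytes covered by one page-sized bitmap, one bit per sector. */
static uint64_t bytes_per_bitmap(uint64_t page_size, uint64_t sector_size)
{
    return page_size * 8 * sector_size;
}

/* RAM needed if an entire span were tracked with bitmaps only. */
static uint64_t bitmap_ram_for(uint64_t span, uint64_t page_size,
                               uint64_t sector_size)
{
    uint64_t per = bytes_per_bitmap(page_size, sector_size);
    uint64_t nr = (span + per - 1) / per;   /* round up */
    return nr * page_size;
}
```

With a hypothetical 64-byte entry, extent_threshold(64) gives 256 entries before bitmaps are needed.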
      
      Once we pass the extent threshold, we start adding bitmaps and using those
      instead for tracking the free space.  This patch also makes it so that any free
      space that's smaller than 4 * sectorsize goes straight into a bitmap.  This is
      nice since we try to allocate out of the front of a block group, so if the
      front of a block group is heavily fragmented and then has a huge chunk of free
      space at the end, we go ahead and add the fragmented areas to bitmaps and use a
      normal extent entry to track the big chunk at the back of the block group.
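
The two rules in this paragraph can be read as a small decision function. The sketch below is an interpretation, not the patch's code: should_use_bitmap and its parameters are hypothetical names, and treating "pass the threshold" as "entry count has reached the threshold" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a newly freed range should be tracked in a bitmap
 * rather than as a plain extent entry.
 *
 * Rule 1: once the number of extent entries reaches the threshold,
 *         everything new goes into bitmaps.
 * Rule 2: ranges smaller than 4 * sector_size always go into bitmaps. */
static bool should_use_bitmap(uint64_t free_bytes, uint64_t sector_size,
                              uint64_t nr_extent_entries,
                              uint64_t extent_threshold)
{
    assert(sector_size > 0);
    if (nr_extent_entries >= extent_threshold)
        return true;
    return free_bytes < 4 * sector_size;
}
```

With a 4096-byte sector, an 8 KiB hole lands in a bitmap while a 1 MiB chunk keeps a normal extent entry, as long as the threshold has not been reached.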
      
      I've also taken the opportunity to revamp how we search for free space.
      Previously we indexed free space via an offset-indexed rb tree and a
      bytes-indexed rb tree.  I've dropped the bytes-indexed rb tree and use only the
      offset-indexed rb tree.  This cuts the number of tree operations we were doing
      previously down by half, and gives us a slightly better allocation pattern,
      since we will always start from a specific offset and search forward from
      there, instead of searching for the size we need and trying to get it as close
      as possible to the offset we want.
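
To illustrate the "start from an offset and search forward" pattern, here is a minimal C sketch. It deliberately swaps the rb tree for a sorted array with a binary search, which is simpler to show and is not what the patch uses. The names free_extent and find_free_forward are hypothetical, and the entries are assumed to be sorted by offset and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct free_extent {
    uint64_t offset;  /* start of the free range */
    uint64_t bytes;   /* length of the free range */
};

/* Find the first free range at or after 'start' that can hold 'want'
 * bytes. Returns the entry index and stores the usable offset in
 * *found_offset, or returns -1 if nothing fits. */
static long find_free_forward(const struct free_extent *entries, size_t n,
                              uint64_t start, uint64_t want,
                              uint64_t *found_offset)
{
    /* Binary search for the first entry that ends after 'start'. */
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (entries[mid].offset + entries[mid].bytes <= start)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Walk forward from there until an entry is large enough. */
    for (size_t i = lo; i < n; i++) {
        uint64_t begin = entries[i].offset > start ? entries[i].offset : start;
        uint64_t end = entries[i].offset + entries[i].bytes;
        if (end - begin >= want) {
            *found_offset = begin;
            return (long)i;
        }
    }
    return -1;
}
```

Only one index is maintained, so each insert or removal touches a single ordered structure instead of two.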
      
      I've given this a healthy amount of testing pre-new format stuff, as well as
      post-new format stuff.  I've booted up my Fedora box, which is installed on btrfs,
      with this patch and ran with it for a few days without issues.  I've not seen
      any performance regressions in any of my tests.
      
      Since the last patch Yan Zheng fixed a problem where we could have overlapping
      entries, so updating their offset inline would cause problems.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@redhat.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  3. 10 Jun, 2009 1 commit
    • Btrfs: add mount -o ssd_spread to spread allocations out · 451d7585
      Chris Mason committed
      Some SSDs perform best when reusing block numbers often, while
      others perform much better when clustering strictly allocates
      big chunks of unused space.
      
      The default mount -o ssd will find rough groupings of blocks
      where there are a bunch of free blocks that might have some
      allocated blocks mixed in.
      
      mount -o ssd_spread will make sure there are no allocated blocks
      mixed in.  It should perform better on lower end SSDs.
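
The difference between the two options can be pictured as two acceptance tests on a candidate window of blocks. The C sketch below is a simplified model, not the allocator's code: the enum, window_acceptable, and the min_free parameter are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two mount options. */
enum cluster_policy { POLICY_SSD, POLICY_SSD_SPREAD };

/* allocated[i] is true when block i of the candidate window is in use.
 * -o ssd:        accept a rough grouping, as long as at least min_free
 *                blocks in the window are free.
 * -o ssd_spread: accept only a window with no allocated blocks at all. */
static bool window_acceptable(const bool *allocated, size_t n,
                              size_t min_free, enum cluster_policy policy)
{
    size_t free_blocks = 0;
    for (size_t i = 0; i < n; i++) {
        if (!allocated[i])
            free_blocks++;
    }
    if (policy == POLICY_SSD_SPREAD)
        return free_blocks == n;
    return free_blocks >= min_free;
}
```

A window with one allocated block out of four passes under the ssd policy with min_free of 3, but is rejected under ssd_spread.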
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
  4. 03 Apr, 2009 1 commit
    • Btrfs: rework allocation clustering · fa9c0d79
      Chris Mason committed
      Because btrfs is copy-on-write, we end up picking new locations for
      blocks very often.  This makes it fairly difficult to maintain perfect
      read patterns over time, but we can at least do some optimizations
      for writes.
      
      This is done today by remembering the last place we allocated and
      trying to find a free space hole big enough to hold more than just one
      allocation.  The end result is that we tend to write sequentially to
      the drive.
      
      This happens all the time for metadata and it happens for data
      when mounted -o ssd.  But the way we record it is fairly racy
      and it tends to fragment the free space over time because we are trying
      to allocate fairly large areas at once.
      
      This commit gets rid of the races by adding a free space cluster object
      with dedicated locking to make sure that only one process at a time
      is out replacing the cluster.
      
      The free space fragmentation is somewhat solved by allowing a cluster
      to be made up of smaller free space extents.  This part definitely
      adds some CPU time to the cluster allocations, but it allows the allocator
      to consume the small holes left behind by COW.
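
One way to picture a cluster made up of smaller extents is a forward walk that groups nearby free ranges until they add up to the wanted size. The C sketch below is a guess at the general shape, not the commit's algorithm: build_cluster, struct cluster, and the max_gap parameter are hypothetical, the entries are assumed to be sorted by offset and non-overlapping, and the dedicated cluster locking the message mentions is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct free_extent {
    uint64_t offset;
    uint64_t bytes;
};

/* Result: a run of consecutive entries that together form the cluster. */
struct cluster {
    size_t first;          /* index of the first entry in the run */
    size_t count;          /* number of entries in the run */
    uint64_t total_bytes;  /* free bytes summed over the run */
};

/* Walk the entries in offset order, accumulating neighbours whose gap
 * from the previous entry is at most max_gap. Restart the run when the
 * gap is too large. Succeed once the run holds at least 'want' bytes. */
static bool build_cluster(const struct free_extent *e, size_t n,
                          uint64_t want, uint64_t max_gap,
                          struct cluster *out)
{
    size_t first = 0;
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i > first) {
            uint64_t prev_end = e[i - 1].offset + e[i - 1].bytes;
            if (e[i].offset - prev_end > max_gap) {
                first = i;   /* too far away, start a new run here */
                total = 0;
            }
        }
        total += e[i].bytes;
        if (total >= want) {
            out->first = first;
            out->count = i - first + 1;
            out->total_bytes = total;
            return true;
        }
    }
    return false;
}
```

The extra walk is where the added CPU time would come from, and accepting several small entries is what lets the small holes get reused.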
      Signed-off-by: Chris Mason <chris.mason@oracle.com>