1. 12 Nov, 2013 12 commits
  2. 21 Sep, 2013 4 commits
  3. 01 Sep, 2013 13 commits
  4. 20 Jul, 2013 3 commits
  5. 02 Jul, 2013 4 commits
    • Btrfs: make the chunk allocator completely tree lockless · 6df9a95e
      Josef Bacik committed
      When adjusting the enospc rules for relocation I ran into a deadlock because we
      were relocating the only system chunk and that forced us to try and allocate a
      new system chunk while holding locks in the chunk tree, which caused us to
      deadlock.  To fix this I've moved all of the dev extent addition and chunk
      addition out to the delayed chunk completion stuff.  We still keep the in-memory
      stuff which makes sure everything is consistent.
      
      One change I had to make was to search the commit root of the device tree to
      find a free dev extent, and hold onto any chunk em's that we allocated in that
      transaction so we do not allocate the same dev extent twice.  This has the side
      effect of fixing a bug with balance that has been there ever since balance
      existed.  Basically you can free a block group and its dev extent and then
      immediately allocate that dev extent for a new block group and write stuff to
      that dev extent, all within the same transaction.  So if you happen to crash
      during a balance you could come back to a completely broken file system.  This
      patch should keep these sorts of things from happening in the future since we
      won't be able to allocate free'd dev extents until after the transaction
      commits.  This has passed all of the xfstests and my super annoying stress test
      followed by a balance.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: check if we can nocow if we don't have data space · 7ee9e440
      Josef Bacik committed
      We always just try and reserve data space when we write, but if we are out of
      space but have prealloc'ed extents we should still successfully write.  This
      patch will try and see if we can write to prealloc'ed space and, if we can, go
      ahead and allow the write to continue.  With this patch we now pass xfstests
      generic/274.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc · 925a6efb
      Josef Bacik committed
      try_to_writeback_inodes_sb_nr returns 1 if writeback is already underway, which
      is completely fraking useless for us as we need to make sure pages are actually
      written before we go and check if there are ordered extents.  So replace this
      with an open coding of try_to_writeback_inodes_sb_nr minus the writeback
      underway check so that we are sure to actually have flushed some dirty pages out
      and will have ordered extents to use.  With this patch xfstests generic/273 now
      passes.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: use a percpu to keep track of possibly pinned bytes · b150a4f1
      Josef Bacik committed
      There are all of these checks in the ENOSPC code to see if committing the
      transaction would free up enough space to make the allocation.  This is because
      early on we just committed the transaction and hoped and prayed, which resulted
      in cases where it took _forever_ to get an ENOSPC when we really were out of
      space.  So we check space_info->bytes_pinned, except this isn't completely true
      because it doesn't account for space we may free that is stuck in delayed refs.
      So tests like xfstests 226 would fail because we wouldn't commit the transaction
      to free up the data space.  So instead add a percpu counter that will be a
      little fuzzier: it will add bytes as soon as we try to free up the space, and
      remove any space it doesn't actually free up when we get around to doing the
      actual free.  We then 0 out this counter every transaction period so we have a
      better idea of how much space we will actually free up by committing this
      transaction.  With this patch we now pass xfstests 226.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
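The accounting described in the last commit above can be pictured with a small toy model. This is only a sketch in Python, not kernel code: the class `PossiblyPinned` and its method names are hypothetical, and a single integer stands in for the percpu counter. It shows the three steps the message names: count bytes when a free is queued, subtract whatever was not actually freed once the delayed ref runs, and zero the estimate at each transaction commit.

```python
class PossiblyPinned:
    """Toy model of a fuzzy 'possibly pinned bytes' estimate (hypothetical names)."""

    def __init__(self):
        # Stands in for the percpu counter; a plain int is enough for a sketch.
        self.possibly_pinned = 0

    def queue_free(self, num_bytes):
        # Add bytes as soon as we try to free the space (the free is still
        # sitting in delayed refs at this point).
        self.possibly_pinned += num_bytes

    def run_delayed_free(self, num_bytes, actually_freed):
        # When the delayed ref finally runs, remove the part that was not freed.
        self.possibly_pinned -= num_bytes - actually_freed

    def commit_would_help(self, needed):
        # ENOSPC check: is committing the transaction likely to free enough?
        return self.possibly_pinned >= needed

    def commit(self):
        # The estimate is reset every transaction.
        self.possibly_pinned = 0
```

In this model a caller that needs `needed` bytes would commit the transaction only when `commit_would_help(needed)` is true, and otherwise report ENOSPC right away.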
  6. 01 Jul, 2013 2 commits
    • Btrfs: fix transaction throttling for delayed refs · 1be41b78
      Josef Bacik committed
      Dave has this fs_mark script that can make btrfs abort with a sufficient amount of
      RAM.  This is because with more RAM we can keep more dirty metadata in cache
      which in a roundabout way makes for many more pending delayed refs.  What
      happens is we end up not throttling the transaction enough so when we go to
      commit the transaction when we've completely filled the file system we'll
      abort() because we use all of the space in the global reserve and we still have
      delayed refs to run.  To fix this we need to make the delayed ref flushing and
      the transaction throttling dependent upon the number of delayed refs that we
      have instead of how much reserved space is left in the global reserve.  With
      this patch we not only stop aborting transactions but we also get a smoother run
      speed with fs_mark and it makes us about 10% faster.  Thanks,
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: wake up delayed ref flushing waiters on abort · f971fe29
      Josef Bacik committed
      I hit a deadlock because we aborted when flushing delayed refs but didn't wake
      any of the other flushers up and so everybody was just sleeping forever.  This
      should fix the problem.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
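The deadlock in the last commit above follows a common pattern: waiters sleep until a flush completes, so an error path that never signals them leaves them asleep. The sketch below models that pattern with Python's `threading.Condition`. It is an illustration only, not the kernel's waitqueue code, and `DelayedRefFlusher` and its methods are hypothetical names.

```python
import threading


class DelayedRefFlusher:
    """Toy model: waiters must be woken on abort as well as on success."""

    def __init__(self):
        self.cond = threading.Condition()
        self.flushing = False
        self.aborted = False

    def start_flush(self):
        with self.cond:
            self.flushing = True

    def wait_for_flush(self, timeout=None):
        # Sleep until the flush is over or the transaction aborted.
        # Returns False only if the timeout expired first.
        with self.cond:
            return self.cond.wait_for(
                lambda: not self.flushing or self.aborted, timeout
            )

    def finish_flush(self):
        with self.cond:
            self.flushing = False
            self.cond.notify_all()

    def abort(self):
        with self.cond:
            self.aborted = True
            self.flushing = False
            # Without this wake-up on the error path, every waiter would
            # keep sleeping forever.
            self.cond.notify_all()
```

Dropping the `notify_all()` call in `abort()` reproduces the hang in this model: waiters with no timeout would never return.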
  7. 14 Jun, 2013 2 commits
    • Btrfs: exclude logged extents before replying when we are mixed · 8c2a1a30
      Josef Bacik committed
      With non-mixed block groups we replay the logs before we're allowed to do any
      writes, so we get away with not pinning/removing the data extents until right
      when we replay them.  However with mixed block groups we allocate out of the
      same pool, so we could easily allocate a metadata block that was logged in our
      tree log.  To deal with this we just need to notice that we have mixed block
      groups and do the normal excluding/removal dance during the pin stage of the log
      replay and that way we don't allocate metadata blocks from areas where we have logged
      data extents.  With this patch we now pass xfstests generic/311 with mixed
      block groups turned on.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
    • Btrfs: simplify unlink reservations · d52be818
      Josef Bacik committed
      Dave pointed out a problem where if you filled up a file system as much as
      possible you couldn't remove any files.  The whole unlink reservation thing is
      convoluted because it tries to guess if it's going to add space to unlink
      something or not, and has all these odd uncommented cases where it simply does
      not try.  So to fix this I've added a way to conditionally steal from the global
      reserve if we can't make our normal reservation.  If we have more than half the
      space in the global reserve free we will go ahead and steal from the global
      reserve.  With this patch Dave's reproducer now works and I can rm all the files
      on the file system.  Thanks,
      Reported-by: David Sterba <dsterba@suse.cz>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
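The fallback rule in the last commit above ("steal from the global reserve if more than half of it is free") can be written down as a small decision function. This is a hedged sketch, not the kernel implementation: the function name, its parameters, and the string return values are hypothetical, and the extra requirement that the free part of the reserve must also cover the request is an assumption added for the model.

```python
def reserve_for_unlink(needed, free_space, reserve_size, reserve_used):
    """Return which pool satisfies an unlink reservation, or None for ENOSPC.

    All names here are illustrative; sizes are in bytes.
    """
    # The normal reservation succeeds whenever ordinary free space covers it.
    if free_space >= needed:
        return "normal"

    reserve_free = reserve_size - reserve_used
    # Steal only when more than half of the global reserve is free.
    # Assumption of this sketch: the free part must also cover the request.
    if reserve_free * 2 > reserve_size and reserve_free >= needed:
        return "global_reserve"

    return None
```

On a completely full file system (`free_space == 0`) the unlink still gets its reservation as long as the global reserve is mostly unused, which is the situation Dave's reproducer exercises.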