1. 29 9月, 2015 1 次提交
  2. 09 8月, 2015 4 次提交
  3. 02 7月, 2015 1 次提交
  4. 03 6月, 2015 1 次提交
  5. 13 4月, 2015 1 次提交
    • D
      btrfs: qgroup: do a reservation in a higher level. · e2d1f923
      Dongsheng Yang 提交于
      There are two problems in qgroup:
      
      a). The PAGE_CACHE is 4K, even when we are writing a data of 1K,
      qgroup will reserve a 4K size. It will cause the last 3K in a qgroup
      is not available to user.
      
      b). When user is writing a inline data, qgroup will not reserve it,
      it means this is a window we can exceed the limit of a qgroup.
      
      The main idea of this patch is reserving the data size of write_bytes
      rather than the reserve_bytes. It means qgroup will not care about
      the data size btrfs will reserve for user, but only care about the
      data size user is going to write. Then reserve it when user want to
      write and release it in transaction committed.
      
      In this way, qgroup can be released from the complex procedure in
      btrfs and only do the reserve when user want to write and account
      when the data is written in commit_transaction().
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e2d1f923
  6. 11 4月, 2015 1 次提交
    • C
      Btrfs: allow block group cache writeout outside critical section in commit · 1bbc621e
      Chris Mason 提交于
      We loop through all of the dirty block groups during commit and write
      the free space cache.  In order to make sure the cache is currect, we do
      this while no other writers are allowed in the commit.
      
      If a large number of block groups are dirty, this can introduce long
      stalls during the final stages of the commit, which can block new procs
      trying to change the filesystem.
      
      This commit changes the block group cache writeout to take appropriate
      locks and allow it to run earlier in the commit.  We'll still have to
      redo some of the block groups, but it means we can get most of the work
      out of the way without blocking the entire FS.
      Signed-off-by: NChris Mason <clm@fb.com>
      1bbc621e
  7. 13 12月, 2014 2 次提交
  8. 04 10月, 2014 2 次提交
    • J
      Btrfs: fix build_backref_tree issue with multiple shared blocks · bbe90514
      Josef Bacik 提交于
      Marc Merlin sent me a broken fs image months ago where it would blow up in the
      upper->checked BUG_ON() in build_backref_tree.  This is because we had a
      scenario like this
      
      block a -- level 4 (not shared)
         |
      block b -- level 3 (reloc block, shared)
         |
      block c -- level 2 (not shared)
         |
      block d -- level 1 (shared)
         |
      block e -- level 0 (shared)
      
      We go to build a backref tree for block e, we notice block d is shared and add
      it to the list of blocks to lookup it's backrefs for.  Now when we loop around
      we will check edges for the block, so we will see we looked up block c last
      time.  So we lookup block d and then see that the block that points to it is
      block c and we can just skip that edge since we've already been up this path.
      The problem is because we clear need_check when we see block d (as it is shared)
      we never add block b as needing to be checked.  And because block c is in our
      path already we bail out before we walk up to block b and add it to the backref
      check list.
      
      To fix this we need to reset need_check if we trip over a block that doesn't
      need to be checked.  This will make sure that any subsequent blocks in the path
      as we're walking up afterwards are added to the list to be processed.  With this
      patch I can now mount Marc's fs image and it'll complete the balance without
      panicing.  Thanks,
      Reported-by: NMarc MERLIN <marc@merlins.org>
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      bbe90514
    • J
      Btrfs: cleanup error handling in build_backref_tree · 75bfb9af
      Josef Bacik 提交于
      When balance panics it tends to panic in the
      
      BUG_ON(!upper->checked);
      
      test, because it means it couldn't build the backref tree properly.  This is
      annoying to users and frankly a recoverable error, nothing in this function is
      actually fatal since it is just an in-memory building of the backrefs for a
      given bytenr.  So go through and change all the BUG_ON()'s to ASSERT()'s, and
      fix the BUG_ON(!upper->checked) thing to just return an error.
      
      This patch also fixes the error handling so it tears down the work we've done
      properly.  This code was horribly broken since we always just panic'ed instead
      of actually erroring out, so it needed to be completely re-worked.  With this
      patch my broken image no longer panics when I mount it.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      75bfb9af
  9. 02 10月, 2014 4 次提交
  10. 18 9月, 2014 2 次提交
  11. 10 6月, 2014 2 次提交
  12. 07 4月, 2014 1 次提交
    • J
      Btrfs: do not reset last_snapshot after relocation · ba8b0289
      Josef Bacik 提交于
      This was done to allow NO_COW to continue to be NO_COW after relocation but it
      is not right.  When relocating we will convert blocks to FULL_BACKREF that we
      relocate.  We can leave some of these full backref blocks behind if they are not
      cow'ed out during the relocation, like if we fail the relocation with ENOSPC and
      then just drop the reloc tree.  Then when we go to cow the block again we won't
      lookup the extent flags because we won't think there has been a snapshot
      recently which means we will do our normal ref drop thing instead of adding back
      a tree ref and dropping the shared ref.  This will cause btrfs_free_extent to
      blow up because it can't find the ref we are trying to free.  This was found
      with my ref verifying tool.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      ba8b0289
  13. 11 3月, 2014 1 次提交
  14. 29 1月, 2014 6 次提交
  15. 12 12月, 2013 3 次提交
  16. 12 11月, 2013 8 次提交
    • M
      Btrfs: rename btrfs_start_all_delalloc_inodes · 91aef86f
      Miao Xie 提交于
      rename the function -- btrfs_start_all_delalloc_inodes(), and make its
      name be compatible to btrfs_wait_ordered_roots(), since they are always
      used at the same place.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      91aef86f
    • M
      Btrfs: don't wait for the completion of all the ordered extents · b0244199
      Miao Xie 提交于
      It is very likely that there are lots of ordered extents in the filesytem,
      if we wait for the completion of all of them when we want to reclaim some
      space for the metadata space reservation, we would be blocked for a long
      time. The performance would drop down suddenly for a long time.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      b0244199
    • D
      btrfs: Use WARN_ON()'s return value in place of WARN_ON(1) · fae7f21c
      Dulshani Gunawardhana 提交于
      Use WARN_ON()'s return value in place of WARN_ON(1) for cleaner source
      code that outputs a more descriptive warnings. Also fix the styling
      warning of redundant braces that came up as a result of this fix.
      Signed-off-by: NDulshani Gunawardhana <dulshani.gunawardhana89@gmail.com>
      Reviewed-by: NZach Brown <zab@redhat.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      fae7f21c
    • W
      Btrfs: use 'u64' rather than 'int' to get extent's generation · 7fdf4b60
      Wang Shilong 提交于
      We define a 'int' to get extent's generation by mistake,fix it.
      Signed-off-by: NWang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      7fdf4b60
    • J
      Btrfs: stop committing the transaction so much during relocate · 9e6a0c52
      Josef Bacik 提交于
      I noticed with my horrible snapshot excercisor that we were taking forever to
      relocate the larger the file system got.  This appeared to be because we were
      committing the transaction _constantly_.  There were a few places where we do
      braindead things with metadata reservation, like start a transaction and then
      try to refill the block rsv, which not only keeps us from committing a
      transaction during the enospc stuff, but keeps us from doing some of the harder
      flushing work which will make us more likely to need to commit the transaction.
      We also were checking the block rsv and committing the transaction if the block
      rsv was below a certain threshold, but we were doing this in a place where we
      don't actually keep anything in the block rsv so this was always ending up false
      so we always committed the transaction in this case.  I tested this to make sure
      it didn't break anything, but it takes about 10 hours to get the box to this
      state so I don't know how much of an impact it will really make.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      9e6a0c52
    • J
      Btrfs: return an error from btrfs_wait_ordered_range · 0ef8b726
      Josef Bacik 提交于
      I noticed that if the free space cache has an error writing out it's data it
      won't actually error out, it will just carry on.  This is because it doesn't
      check the return value of btrfs_wait_ordered_range, which didn't actually return
      anything.  So fix this in order to keep us from making free space cache look
      valid when it really isnt.  Thanks,
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      0ef8b726
    • M
      Btrfs: fix BUG_ON() casued by the reserved space migration · 20dd2cbf
      Miao Xie 提交于
      When we did space balance and snapshot creation at the same time, we might
      meet the following oops:
       kernel BUG at fs/btrfs/inode.c:3038!
       [SNIP]
       Call Trace:
       [<ffffffffa0411ec7>] btrfs_orphan_cleanup+0x293/0x407 [btrfs]
       [<ffffffffa042dc45>] btrfs_mksubvol.isra.28+0x259/0x373 [btrfs]
       [<ffffffffa042de85>] btrfs_ioctl_snap_create_transid+0x126/0x156 [btrfs]
       [<ffffffffa042dff1>] btrfs_ioctl_snap_create_v2+0xd0/0x121 [btrfs]
       [<ffffffffa0430b2c>] btrfs_ioctl+0x414/0x1854 [btrfs]
       [<ffffffff813b60b7>] ? __do_page_fault+0x305/0x379
       [<ffffffff811215a9>] vfs_ioctl+0x1d/0x39
       [<ffffffff81121d7c>] do_vfs_ioctl+0x32d/0x3e2
       [<ffffffff81057fe7>] ? finish_task_switch+0x80/0xb8
       [<ffffffff81121e88>] SyS_ioctl+0x57/0x83
       [<ffffffff813b39ff>] ? do_device_not_available+0x12/0x14
       [<ffffffff813b99c2>] system_call_fastpath+0x16/0x1b
       [SNIP]
       RIP  [<ffffffffa040da40>] btrfs_orphan_add+0xc3/0x126 [btrfs]
      
      The reason of the problem is that the relocation root creation stole
      the reserved space, which was reserved for orphan item deletion.
      
      There are several ways to fix this problem, one is to increasing
      the reserved space size of the space balace, and then we can use
      that space to create the relocation tree for each fs/file trees.
      But it is hard to calculate the suitable size because we doesn't
      know how many fs/file trees we need relocate.
      
      We fixed this problem by reserving the space for relocation root creation
      actively since the space it need is very small (one tree block, used for
      root node copy), then we use that reserved space to create the
      relocation tree. If we don't reserve space for relocation tree creation,
      we will use the reserved space of the balance.
      Signed-off-by: NMiao Xie <miaox@cn.fujitsu.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      20dd2cbf
    • J
      Btrfs: relocate csums properly with prealloc extents · 4577b014
      Josef Bacik 提交于
      A user reported a problem where they were getting csum errors when running a
      balance and running systemd's journal.  This is because systemd is awesome and
      fallocate()'s its log space and writes into it.  Unfortunately we assume that
      when we read in all the csums for an extent that they are sequential starting at
      the bytenr we care about.  This obviously isn't the case for prealloc extents,
      where we could have written to the middle of the prealloc extent only, which
      means the csum would be for the bytenr in the middle of our range and not the
      front of our range.  Fix this by offsetting the new bytenr we are logging to
      based on the original bytenr the csum was for.  With this patch I no longer see
      the csum errors I was seeing.  Thanks,
      
      Cc: stable@vger.kernel.org
      Reported-by: NChris Murphy <lists@colorremedies.com>
      Signed-off-by: NJosef Bacik <jbacik@fusionio.com>
      Signed-off-by: NChris Mason <chris.mason@fusionio.com>
      4577b014