1. 06 Aug, 2018 4 commits
  2. 12 Apr, 2018 1 commit
  3. 31 Mar, 2018 5 commits
  4. 30 Jun, 2017 3 commits
    • Q
      btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges · bc42bda2
      Qu Wenruo committed on
      [BUG]
      For the following case, btrfs can underflow qgroup reserved space
      at an error path:
      (Page size 4K, function name without "btrfs_" prefix)
      
               Task A                  |             Task B
      ----------------------------------------------------------------------
      Buffered_write [0, 2K)           |
      |- check_data_free_space()       |
      |  |- qgroup_reserve_data()      |
      |     Range aligned to page      |
      |     range [0, 4K)          <<< |
      |     4K bytes reserved      <<< |
      |- copy pages to page cache      |
                                       | Buffered_write [2K, 4K)
                                       | |- check_data_free_space()
                                       | |  |- qgroup_reserve_data()
                                       | |     Range aligned to page
                                       | |     range [0, 4K)
                                       | |     Already reserved by A <<<
                                       | |     0 bytes reserved      <<<
                                       | |- delalloc_reserve_metadata()
                                       | |  And it *FAILED* (Maybe EQUOTA)
                                       | |- free_reserved_data_space()
                                            |- qgroup_free_data()
                                               Range aligned to page range
                                               [0, 4K)
                                               Freeing 4K
      (Special thanks to Chandan for the detailed report and analysis)
      
      [CAUSE]
      In the case above, Task B frees the reserved data range [0, 4K), which
      was actually reserved by Task A.

      At writeback time, the page dirtied by Task A goes through the
      writeback routine, which frees the 4K of reserved data space again at
      file extent insert time, causing the qgroup underflow.
      
      [FIX]
      For btrfs_qgroup_free_data(), add a @reserved parameter so that it only
      frees data ranges reserved by the previous btrfs_qgroup_reserve_data().
      In the case above, Task B will then try to free 0 bytes, so no underflow
      occurs.
      Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
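The race and the fix described in bc42bda2 can be sketched in user space. This is a toy model, not the kernel code: `toy_qgroup`, `toy_changeset`, `toy_reserve_data()` and `toy_free_data()` are hypothetical names, and `toy_changeset` only stands in for the role that struct extent_changeset plays in the commit message. Each reserve call records which pages it newly reserved, and the free path releases only those pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_NR_PAGES  16u

/* Toy qgroup: one flag per page plus the total reserved byte count. */
struct toy_qgroup {
	bool page_reserved[TOY_NR_PAGES];
	uint64_t reserved;
};

/* Hypothetical stand-in for an extent changeset: pages reserved by one caller. */
struct toy_changeset {
	bool page_changed[TOY_NR_PAGES];
};

/*
 * Reserve [start, start + len), rounded out to page boundaries.
 * Pages that are already reserved are skipped, so they are neither
 * charged again nor recorded in @cs. Returns the bytes newly reserved.
 */
static uint64_t toy_reserve_data(struct toy_qgroup *qg,
				 struct toy_changeset *cs,
				 uint64_t start, uint64_t len)
{
	uint64_t newly = 0;

	if (len == 0)
		return 0;
	for (uint64_t i = start / TOY_PAGE_SIZE;
	     i <= (start + len - 1) / TOY_PAGE_SIZE && i < TOY_NR_PAGES; i++) {
		if (qg->page_reserved[i])
			continue;
		qg->page_reserved[i] = true;
		cs->page_changed[i] = true;
		newly += TOY_PAGE_SIZE;
	}
	qg->reserved += newly;
	return newly;
}

/* Free only the pages recorded in @cs. Returns the bytes freed. */
static uint64_t toy_free_data(struct toy_qgroup *qg,
			      const struct toy_changeset *cs)
{
	uint64_t freed = 0;

	for (uint64_t i = 0; i < TOY_NR_PAGES; i++) {
		if (cs->page_changed[i] && qg->page_reserved[i]) {
			qg->page_reserved[i] = false;
			freed += TOY_PAGE_SIZE;
		}
	}
	assert(qg->reserved >= freed);	/* would trip on an underflow */
	qg->reserved -= freed;
	return freed;
}
```

In this model, Task A reserving [0, 2K) is charged 4K, Task B reserving [2K, 4K) is charged 0 bytes, and Task B's error path frees 0 bytes because its changeset is empty.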
    • Q
      btrfs: qgroup: Introduce extent changeset for qgroup reserve functions · 364ecf36
      Qu Wenruo committed on
      Introduce a new parameter, struct extent_changeset, for
      btrfs_qgroup_reserve_data() and its callers.

      The extent_changeset is already used in btrfs_qgroup_reserve_data() to
      record which ranges the current call reserved, so that those ranges
      can be freed in error paths.

      It has to be exported to callers because, in the buffered write error
      path, without knowing exactly which ranges the current allocation
      reserved, we can free space that was not reserved by us.

      This leads to qgroup reserved space underflow.
      Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • Q
      btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function · d1b8b94a
      Qu Wenruo committed on
      Quite a lot of qgroup corruption happens because
      btrfs_qgroup_prepare_account_extents() is called at the wrong time.

      Since the safest time to call it is just before
      btrfs_qgroup_account_extents(), there is no need to keep these 2
      functions separate.

      Merging them makes the code cleaner and less bug-prone.
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      [ changelog and comment adjustments ]
      Signed-off-by: David Sterba <dsterba@suse.com>
  5. 18 Apr, 2017 3 commits
  6. 17 Feb, 2017 1 commit
    • Q
      btrfs: qgroup: Move half of the qgroup accounting time out of commit trans · fb235dc0
      Qu Wenruo committed on
      As Filipe pointed out, the most time-consuming parts of qgroup are
      btrfs_qgroup_account_extents() and
      btrfs_qgroup_prepare_account_extents(), which both call
      btrfs_find_all_roots() to get the old_roots and new_roots ulists.

      What makes things worse is that we call the expensive
      btrfs_find_all_roots() at transaction commit time with
      TRANS_STATE_COMMIT_DOING, which blocks all incoming transactions.

      Such behavior is necessary for the @new_roots search, as the current
      btrfs_find_all_roots() can't do it correctly otherwise, so we call it
      just before switching commit roots.

      However, for the @old_roots search it is not necessary: the search is
      based on commit_root, so it will always be correct and we can move it
      out of transaction commit.

      This patch moves the @old_roots search out of commit_transaction(), so
      in theory we can halve the qgroup time consumption in
      commit_transaction().

      Note that this won't speed up qgroup overall: the total time
      consumption stays the same, only the performance stall is reduced.
      
      Cc: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  7. 14 Feb, 2017 1 commit
    • J
      btrfs: allow unlink to exceed subvolume quota · 003d7c59
      Jeff Mahoney committed on
      Once a qgroup limit is exceeded, it's impossible to restore normal
      operation to the subvolume without modifying the limit or removing
      the subvolume.  This is a surprising situation for many users used
      to the typical workflow with quotas on other file systems where it's
      possible to remove files until the used space is back under the limit.
      
      When we go to unlink a file and start the transaction, we'll hit
      the qgroup limit while trying to reserve space for the items we'll
      modify while removing the file.  We discussed last month how best
      to handle this situation and agreed that there is no perfect solution.
      The best principle-of-least-surprise solution is to handle it similarly
      to how we already handle ENOSPC when unlinking, which is to allow
      the operation to succeed with the expectation that it will ultimately
      release space under most circumstances.
      
      This patch modifies the transaction start path to select whether to
      honor the qgroups limits.  btrfs_start_transaction_fallback_global_rsv
      is the only caller that skips enforcement.  The reservation and tracking
      still happens normally -- it just skips the enforcement step.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
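The "reserve and track, but skip enforcement" behavior described in 003d7c59 can be illustrated with a minimal sketch. The struct `toy_qgroup_limit`, the function `toy_qgroup_reserve()` and its `enforce` flag are assumptions made for this example, not the kernel's actual signatures. The reservation is recorded in both cases; only the limit check is conditional.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy qgroup with an exclusive-bytes limit (0 means "no limit"). */
struct toy_qgroup_limit {
	uint64_t max_excl;
	uint64_t excl;		/* bytes already accounted */
	uint64_t reserved;	/* bytes reserved but not yet accounted */
};

/*
 * Reserve @num_bytes. With @enforce set, fail with -EDQUOT if the
 * reservation would exceed the limit. With @enforce clear (the unlink
 * case in this sketch), the limit check is skipped but the reservation
 * is still tracked.
 */
static int toy_qgroup_reserve(struct toy_qgroup_limit *qg,
			      uint64_t num_bytes, bool enforce)
{
	if (enforce && qg->max_excl != 0 &&
	    qg->excl + qg->reserved + num_bytes > qg->max_excl)
		return -EDQUOT;

	qg->reserved += num_bytes;
	return 0;
}
```

With a qgroup that is already at its limit, an enforced reservation fails with -EDQUOT and leaves the counters untouched, while a non-enforced one succeeds and is still counted in `reserved`.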
  8. 06 Dec, 2016 1 commit
  9. 30 Nov, 2016 3 commits
  10. 25 Aug, 2016 2 commits
    • Q
      btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() · cb93b52c
      Qu Wenruo committed on
      Refactor btrfs_qgroup_insert_dirty_extent() into two functions:
      1. btrfs_qgroup_insert_dirty_extent_nolock()
         Almost the same as the original code.
         For delayed_ref usage, which has delayed refs locked.

         Change the return value type to int, since the caller never needs
         the pointer, but only needs to know whether to free the allocated
         memory.

      2. btrfs_qgroup_insert_dirty_extent()
         The more encapsulated version.

         Does the delayed_refs locking, memory allocation, quota enabled
         check and other things.

      The original design was to keep exported functions to a minimum, but
      since more btrfs hacks have been exposed, like replacing paths in
      balance, we need to record dirty extents manually, so we have to add
      such functions.

      Also, add comments for both functions to inform developers how to keep
      qgroup correct when doing such hacks.
      
      Cc: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • J
      btrfs: waiting on qgroup rescan should not always be interruptible · d06f23d6
      Jeff Mahoney committed on
      We wait on qgroup rescan completion in three places: file system
      shutdown, the quota disable ioctl, and the rescan wait ioctl.  If the
      user sends a signal while we're waiting, we continue happily along.  This
      is expected behavior for the rescan wait ioctl.  It's racy in the shutdown
      path but mostly works due to other unrelated synchronization points.
      In the quota disable path, it Oopses the kernel pretty much immediately.
      
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  11. 26 Jul, 2016 1 commit
  12. 22 Oct, 2015 7 commits
  13. 11 Jun, 2015 5 commits
  14. 13 Apr, 2015 3 commits
    • D
      btrfs: qgroup: do a reservation in a higher level. · e2d1f923
      Dongsheng Yang committed on
      There are two problems in qgroup:

      a). The page cache page size is 4K, so even when we write 1K of data,
      qgroup reserves 4K. As a result, the last 3K in a qgroup is not
      available to the user.

      b). When the user writes inline data, qgroup does not reserve it,
      which leaves a window in which we can exceed the limit of a qgroup.

      The main idea of this patch is to reserve write_bytes rather than
      reserve_bytes. That is, qgroup no longer cares about the data size
      btrfs reserves for the user, only about the data size the user is
      going to write. It reserves that size when the user wants to write and
      releases it when the transaction is committed.

      In this way, qgroup is decoupled from the complex procedure in btrfs:
      it only reserves when the user wants to write and accounts when the
      data is written in commit_transaction().
      Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
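Problem a) in e2d1f923 is simple arithmetic, sketched below under the assumption of a 4K page. The helper names `toy_reserve_bytes()` and `toy_quota_overcharge()` are made up for this illustration: the first computes the page-aligned size that would be reserved for a write, the second the part of that reservation that the user never writes.

```c
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Page-aligned size covering the write [pos, pos + write_bytes). */
static uint64_t toy_reserve_bytes(uint64_t pos, uint64_t write_bytes)
{
	uint64_t start = pos - (pos % TOY_PAGE_SIZE);	/* round down */
	uint64_t end = pos + write_bytes;

	if (end % TOY_PAGE_SIZE)			/* round up */
		end += TOY_PAGE_SIZE - (end % TOY_PAGE_SIZE);
	return end - start;
}

/*
 * Quota made unavailable when qgroup is charged the page-aligned size
 * instead of the bytes the user actually writes.
 */
static uint64_t toy_quota_overcharge(uint64_t pos, uint64_t write_bytes)
{
	return toy_reserve_bytes(pos, write_bytes) - write_bytes;
}
```

For a 1K write at offset 0, `toy_reserve_bytes()` gives 4096 and `toy_quota_overcharge()` gives 3072, which is the "last 3K" the commit message refers to. Charging `write_bytes` directly makes the overcharge zero by construction.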
    • D
      Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use. · 31193213
      Dongsheng Yang committed on
      Currently, for pre_alloc or delay_alloc, the bytes are accounted in
      space_info by three counters:
      space_info->bytes_may_use --- space_info->reserved --- space_info->used.
      In qgroup, on the other hand, there are only two counters to account
      the bytes, qgroup->reserved and qgroup->excl. qg->reserved accounts
      bytes in space_info->bytes_may_use and qg->excl accounts bytes in
      space_info->used, so the bytes in space_info->reserved are not
      accounted in qgroup. This leaves a window in which we can exceed the
      quota limit while bytes are in space_info->reserved.
      
      Example:
      	# btrfs quota enable /mnt
      	# btrfs qgroup limit -e 10M /mnt
      	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
      	# sync
      	# btrfs qgroup show -pcre /mnt
      qgroupid rfer     excl     max_rfer max_excl parent  child
      -------- ----     ----     -------- -------- ------  -----
      0/5      20987904 20987904 0        10485760 ---     ---
      
      qg->excl is 20987904, larger than max_excl 10485760.

      This patch introduces a new counter named may_use to qgroup, so that
      there are three counters in qgroup to account bytes in space_info,
      as below.
      space_info->bytes_may_use --- space_info->reserved --- space_info->used.
      qgroup->may_use           --- qgroup->reserved     --- qgroup->excl
      
      With this patch applied:
      	# btrfs quota enable /mnt
      	# btrfs qgroup limit -e 10M /mnt
      	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
      fallocate: /mnt/data9: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data10: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data11: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data12: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data13: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data14: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data15: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data16: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data17: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data18: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data19: fallocate failed: Disk quota exceeded
      	# sync
      	# btrfs qgroup show -pcre /mnt
      qgroupid rfer    excl    max_rfer max_excl parent  child
      -------- ----    ----    -------- -------- ------  -----
      0/5      9453568 9453568 0        10485760 ---     ---
      Reported-by: Cyril SCETBON <cyril.scetbon@free.fr>
      Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
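The three-counter idea in 31193213 can be sketched as a toy model. The names `toy_qg`, `toy_qg_charge()`, `toy_qg_to_reserved()` and `toy_qg_to_excl()` are hypothetical, and the limit check shown here (sum of all three counters against max_excl) is an assumption of this sketch rather than the kernel's exact rule. The point is that bytes stay visible to the limit check at every stage, so there is no window in which they go uncounted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy qgroup mirroring the may_use / reserved / excl stages. */
struct toy_qg {
	uint64_t max_excl;	/* 0 means "no limit" */
	uint64_t may_use;
	uint64_t reserved;
	uint64_t excl;
};

/* Charge @bytes to may_use, checking the limit against all three counters. */
static int toy_qg_charge(struct toy_qg *qg, uint64_t bytes)
{
	if (qg->max_excl != 0 &&
	    qg->may_use + qg->reserved + qg->excl + bytes > qg->max_excl)
		return -EDQUOT;
	qg->may_use += bytes;
	return 0;
}

/* Move @bytes from may_use to reserved; the total stays the same. */
static void toy_qg_to_reserved(struct toy_qg *qg, uint64_t bytes)
{
	assert(qg->may_use >= bytes);
	qg->may_use -= bytes;
	qg->reserved += bytes;
}

/* Move @bytes from reserved to excl; the total stays the same. */
static void toy_qg_to_excl(struct toy_qg *qg, uint64_t bytes)
{
	assert(qg->reserved >= bytes);
	qg->reserved -= bytes;
	qg->excl += bytes;
}
```

Replaying the commit's experiment in this model (a 10M limit and twenty 1M allocations) lets ten allocations through and rejects the rest with -EDQUOT. The real run above admits fewer files, presumably because the kernel also accounts space that this toy does not model.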
    • D