1. 25 11月, 2015 2 次提交
    • M
      btrfs: qgroup: account shared subtree during snapshot delete · 82bd101b
      Mark Fasheh 提交于
      Commit 0ed4792a ('btrfs: qgroup: Switch to new extent-oriented qgroup
      mechanism.') removed our qgroup accounting during
      btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
      going bad shortly after a snapshot is removed.
      
      Fix this by adding a dirty extent record when we encounter extents during
      our shared subtree walk. This effectively restores the functionality we had
      with the original shared subtree walking code in 1152651a (btrfs: qgroup:
      account shared subtrees during snapshot delete).
      
      The idea with the original patch (and this one) is that shared subtrees can
      get skipped during drop_snapshot. The shared subtree walk then allows us a
      chance to visit those extents and add them to the qgroup work for later
      processing. This ultimately makes the accounting for drop snapshot work.
      
      The new qgroup code nicely handles all the other extents during the tree
      walk via the ref dec/inc functions so we don't have to add actions beyond
      what we had originally.
      Signed-off-by: NMark Fasheh <mfasheh@suse.de>
      Signed-off-by: NChris Mason <clm@fb.com>
      82bd101b
    • J
      btrfs: qgroup: fix quota disable during rescan · 967ef513
      Justin Maggard 提交于
      There's a race condition that leads to a NULL pointer dereference if you
      disable quotas while a quota rescan is running.  To fix this, we just need
      to wait for the quota rescan worker to actually exit before tearing down
      the quota structures.
      Signed-off-by: NJustin Maggard <jmaggard@netgear.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      967ef513
  2. 05 11月, 2015 3 次提交
    • F
      Btrfs: fix sleeping inside atomic context in qgroup rescan worker · 3b2ba7b3
      Filipe Manana 提交于
      We are holding a btree path with spinning locks and then we attempt to
      clone an extent buffer, which calls kmem_cache_alloc() and this function
      can sleep, causing the following trace to be reported on a debug kernel:
      
      [107118.218536] BUG: sleeping function called from invalid context at mm/slab.c:2871
      [107118.224110] in_atomic(): 1, irqs_disabled(): 0, pid: 19148, name: kworker/u32:3
      [107118.226120] INFO: lockdep is turned off.
      [107118.226843] Preemption disabled at:[<ffffffffa05ffa22>] btrfs_clear_lock_blocking_rw+0x96/0xea [btrfs]
      
      [107118.229175] CPU: 3 PID: 19148 Comm: kworker/u32:3 Tainted: G        W       4.3.0-rc5-btrfs-next-17+ #1
      [107118.231326] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
      [107118.233687] Workqueue: btrfs-qgroup-rescan btrfs_qgroup_rescan_helper [btrfs]
      [107118.236835]  0000000000000000 ffff880424bf3b78 ffffffff812566f4 0000000000000000
      [107118.238369]  ffff880424bf3ba0 ffffffff81070664 ffffffff817f1cd5 0000000000000b37
      [107118.239769]  0000000000000000 ffff880424bf3bc8 ffffffff8107070a 0000000000008850
      [107118.241244] Call Trace:
      [107118.241729]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
      [107118.242602]  [<ffffffff81070664>] ___might_sleep+0x23a/0x241
      [107118.243586]  [<ffffffff8107070a>] __might_sleep+0x9f/0xa6
      [107118.244532]  [<ffffffff8115af70>] cache_alloc_debugcheck_before+0x25/0x36
      [107118.245939]  [<ffffffff8115d52b>] kmem_cache_alloc+0x50/0x215
      [107118.246930]  [<ffffffffa05e627e>] __alloc_extent_buffer+0x2a/0x11f [btrfs]
      [107118.248121]  [<ffffffffa05ecb1a>] btrfs_clone_extent_buffer+0x3d/0xdd [btrfs]
      [107118.249451]  [<ffffffffa06239ea>] btrfs_qgroup_rescan_worker+0x16d/0x434 [btrfs]
      [107118.250755]  [<ffffffff81087481>] ? arch_local_irq_save+0x9/0xc
      [107118.251754]  [<ffffffffa05f7952>] normal_work_helper+0x14c/0x32a [btrfs]
      [107118.252899]  [<ffffffffa05f7952>] ? normal_work_helper+0x14c/0x32a [btrfs]
      [107118.254195]  [<ffffffffa05f7c82>] btrfs_qgroup_rescan_helper+0x12/0x14 [btrfs]
      [107118.255436]  [<ffffffff81063b23>] process_one_work+0x24a/0x4ac
      [107118.263690]  [<ffffffff81064285>] worker_thread+0x206/0x2c2
      [107118.264888]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
      [107118.267413]  [<ffffffff8106904d>] kthread+0xef/0xf7
      [107118.268417]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
      [107118.269505]  [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70
      [107118.270491]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
      
      So just use blocking locks for our path to solve this.
      This fixes the patch titled:
        "btrfs: qgroup: Don't copy extent buffer to do qgroup rescan"
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      3b2ba7b3
    • F
      Btrfs: fix race waiting for qgroup rescan worker · 190631f1
      Filipe Manana 提交于
      We were initializing the completion (fs_info->qgroup_rescan_completion)
      object after releasing the qgroup rescan lock, which gives a small time
      window for a rescan waiter to not actually wait for the rescan worker
      to finish. Example:
      
               CPU 1                                                     CPU 2
      
       fs_info->qgroup_rescan_completion->done is 0
      
       btrfs_qgroup_rescan_worker()
         complete_all(&fs_info->qgroup_rescan_completion)
           sets fs_info->qgroup_rescan_completion->done
           to UINT_MAX / 2
      
       ... do some other stuff ....
      
       qgroup_rescan_init()
         mutex_lock(&fs_info->qgroup_rescan_lock)
         set flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
           in fs_info->qgroup_flags
         mutex_unlock(&fs_info->qgroup_rescan_lock)
      
                                                             btrfs_qgroup_wait_for_completion()
                                                               mutex_lock(&fs_info->qgroup_rescan_lock)
                                                               sees flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
                                                                 in fs_info->qgroup_flags
                                                               mutex_unlock(&fs_info->qgroup_rescan_lock)
      
                                                               wait_for_completion_interruptible(
                                                                 &fs_info->qgroup_rescan_completion)
      
                                                                 fs_info->qgroup_rescan_completion->done
                                                                 is > 0 so it returns immediately
      
        init_completion(&fs_info->qgroup_rescan_completion)
          sets fs_info->qgroup_rescan_completion->done to 0
      
      So fix this by initializing the completion object while holding the mutex
      fs_info->qgroup_rescan_lock.
      Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      190631f1
    • J
      btrfs: qgroup: exit the rescan worker during umount · 7343dd61
      Justin Maggard 提交于
      I was hitting a consistent NULL pointer dereference during shutdown that
      showed the trace running through end_workqueue_bio().  I traced it back to
      the endio_meta_workers workqueue being poked after it had already been
      destroyed.
      
      Eventually I found that the root cause was a qgroup rescan that was still
      in progress while we were stopping all the btrfs workers.
      
      Currently we explicitly pause balance and scrub operations in
      close_ctree(), but we do nothing to stop the qgroup rescan.  We should
      probably be doing the same for qgroup rescan, but that's a much larger
      change.  This small change is good enough to allow me to unmount without
      crashing.
      Signed-off-by: NJustin Maggard <jmaggard@netgear.com>
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      7343dd61
  3. 27 10月, 2015 2 次提交
  4. 22 10月, 2015 7 次提交
  5. 07 8月, 2015 2 次提交
  6. 01 7月, 2015 1 次提交
  7. 11 6月, 2015 10 次提交
  8. 03 6月, 2015 1 次提交
  9. 13 4月, 2015 12 次提交
    • Q
      btrfs: quota: Automatically update related qgroups or mark INCONSISTENT flags... · 9c8b35b1
      Qu Wenruo 提交于
      btrfs: quota: Automatically update related qgroups or mark INCONSISTENT flags when assigning/deleting a qgroup relations.
      
      Operation like qgroups assigning/deleting qgroup relations will mostly
      cause qgroup data inconsistent, since it needs to do the full rescan to
      determine whether shared extents are exclusive or still shared in
      parent qgroups.
      
      But there are some exceptions, like qgroup with only exclusive extents
      (qgroup->excl == qgroup->rfer), in that case, we only needs to
      modify all its parents' excl and rfer.
      
      So this patch adds a quick path for such qgroup in qgroup
      assign/remove routine, and if quick path failed, the qgroup status will
      be marked INCONSISTENT, and return 1 to info user-land.
      
      BTW since the quick path is much the same of qgroup_excl_accounting(),
      so move the core of it to __qgroup_excl_accounting() and reuse it.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      9c8b35b1
    • D
      btrfs: qgroup: clear STATUS_FLAG_ON in disabling quota. · 8ea0ec9e
      Dongsheng Yang 提交于
      we forgot to clear STATUS_FLAG_ON in quota_disable(), it
      will cause a problem shown as below:
      
      	# mount /dev/sdc /mnt
      	# btrfs quota enable /mnt
      	# btrfs quota disable /mnt
      	# btrfs quota rescan /mnt
      	quota rescan started <--- expecting it fail here.
      	# echo $?
      	0
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      8ea0ec9e
    • Q
      btrfs: Update btrfs qgroup status item when rescan is done. · 53b7cde9
      Qu Wenruo 提交于
      Update qgroup status when rescan is done.
      
      Before this patch, status item is not updated on rescan finish, which
      causing the RESCAN and INCONSISTENT flags never cleared.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      53b7cde9
    • Q
      btrfs: qgroup: Fix dead judgement on qgroup_rescan_leaf() return value. · 3393168d
      Qu Wenruo 提交于
      Old qgroup_rescan_leaf() comment indicates ret == 2 as complete and
      cleared INCONSISTENT flag.
      
      This is not true since it will never return 2, and inside it no codes
      will clear INCONSISTENT flag.
      The flag clearance is done in btrfs_qgroup_rescan_work().
      This caused the bug that INCONSISTENT flag is never cleared.
      
      So change the comment and fix the dead judgment.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      3393168d
    • Q
      btrfs: Check qgroup level in kernel qgroup assign. · 8465ecec
      Qu Wenruo 提交于
      Although we have qgroup level check in btrfs-progs, it's not enough
      since other programe may still call ioctl directly not using
      btrfs-progs. For example, systemd.
      
      But it's btrfs-progs to be blame since we don't provide a
      full-function(like subvolume create things) btrfs library with enough
      check, and only rely on kernel ioctl.
      
      So Add level checks in kernel too.
      Signed-off-by: NQu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      8465ecec
    • D
      btrfs: qgroup: allow to remove qgroup which has parent but no child. · f5a6b1c5
      Dongsheng Yang 提交于
      When a qgroup has parents but no child, it should be removable in
      Theory I think. But currently, we can not remove it when it has
      either parent or child.
      
      Example:
      	# btrfs quota enable /mnt
      	# btrfs qgroup create 1/0 /mnt
      	# btrfs qgroup create 2/0 /mnt
      	# btrfs qgroup assign 1/0 2/0 /mnt
      	# btrfs qgroup show -pcre /mnt
      qgroupid rfer  excl  max_rfer max_excl parent  child
      -------- ----  ----  -------- -------- ------  -----
      0/5      16384 16384 0        0        ---     ---
      1/0      0     0     0        0        2/0     ---
      2/0      0     0     0        0        ---     1/0
      
      At this time, there is no subvol or qgroup depending on it.
      Just a qgroup 2/0 is its parent, but 2/0 can work well without
      1/0. So I think 1/0 should be removalbe. But:
      	# btrfs qgroup destroy 1/0 /mnt
      ERROR: unable to destroy quota group: Device or resource busy
      
      This patch remove the check of qgroup->parent in removing it,
      then we can remove a qgroup when it has a parent.
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      f5a6b1c5
    • D
      btrfs: qgroup: return EINVAL if level of parent is not higher than child's. · 09870d27
      Dongsheng Yang 提交于
      When we create a subvol inheriting a qgroup, we need to check the level
      of them. Otherwise, there is a chance a qgroup can inherit another qgroup
      at the same level.
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      09870d27
    • D
      btrfs: qgroup: do a reservation in a higher level. · e2d1f923
      Dongsheng Yang 提交于
      There are two problems in qgroup:
      
      a). The PAGE_CACHE is 4K, even when we are writing a data of 1K,
      qgroup will reserve a 4K size. It will cause the last 3K in a qgroup
      is not available to user.
      
      b). When user is writing a inline data, qgroup will not reserve it,
      it means this is a window we can exceed the limit of a qgroup.
      
      The main idea of this patch is reserving the data size of write_bytes
      rather than the reserve_bytes. It means qgroup will not care about
      the data size btrfs will reserve for user, but only care about the
      data size user is going to write. Then reserve it when user want to
      write and release it in transaction committed.
      
      In this way, qgroup can be released from the complex procedure in
      btrfs and only do the reserve when user want to write and account
      when the data is written in commit_transaction().
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      e2d1f923
    • D
      Btrfs: qgroup: Introduce a may_use to account space_info->bytes_may_use. · 31193213
      Dongsheng Yang 提交于
      Currently, for pre_alloc or delay_alloc, the bytes will be accounted
      in space_info by the three guys.
      space_info->bytes_may_use --- space_info->reserved --- space_info->used.
      But on the other hand, in qgroup, there are only two counters to account the
      bytes, qgroup->reserved and qgroup->excl. And qg->reserved accounts
      bytes in space_info->bytes_may_use and qg->excl accounts bytes in
      space_info->used. So the bytes in space_info->reserved is not accounted
      in qgroup. If so, there is a window we can exceed the quota limit when
      bytes is in space_info->reserved.
      
      Example:
      	# btrfs quota enable /mnt
      	# btrfs qgroup limit -e 10M /mnt
      	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
      	# sync
      	# btrfs qgroup show -pcre /mnt
      qgroupid rfer     excl     max_rfer max_excl parent  child
      -------- ----     ----     -------- -------- ------  -----
      0/5      20987904 20987904 0        10485760 ---     ---
      
      qg->excl is 20987904 larger than max_excl 10485760.
      
      This patch introduce a new counter named may_use to qgroup, then
      there are three counters in qgroup to account bytes in space_info
      as below.
      space_info->bytes_may_use --- space_info->reserved --- space_info->used.
      qgroup->may_use           --- qgroup->reserved     --- qgroup->excl
      
      With this patch applied:
      	# btrfs quota enable /mnt
      	# btrfs qgroup limit -e 10M /mnt
      	# for((i=0;i<20;i++));do fallocate -l 1M /mnt/data$i; done
      fallocate: /mnt/data9: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data10: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data11: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data12: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data13: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data14: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data15: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data16: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data17: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data18: fallocate failed: Disk quota exceeded
      fallocate: /mnt/data19: fallocate failed: Disk quota exceeded
      	# sync
      	# btrfs qgroup show -pcre /mnt
      qgroupid rfer    excl    max_rfer max_excl parent  child
      -------- ----    ----    -------- -------- ------  -----
      0/5      9453568 9453568 0        10485760 ---     ---
      Reported-by: NCyril SCETBON <cyril.scetbon@free.fr>
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      31193213
    • D
    • D
      btrfs: qgroup: fix limit args override whole limit struct · 03477d94
      Dongsheng Yang 提交于
      btrfs_limit_group use arg limit to override the old qgroup_limit of
      corresponding qgroup. However, we should override part of old qgroup_limit
      according to the bit which has been set in arg limit.
      Signed-off-by: NFan Chengniang <fancn.fnst@cn.fujitsu.com>
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      03477d94
    • D
      btrfs: qgroup: update limit info in function btrfs_run_qgroups(). · d3001ed3
      Dongsheng Yang 提交于
      When we commit_transaction(), qgroups in btree should be updated.
      But, limit info is not considered currently. It will cause a problem
      when a qgroup of a snapshot inherit the limit info from srcqgroup,
      then there is an inconsistency.
      Signed-off-by: NDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      d3001ed3