1. 17 Feb, 2017 12 commits
  2. 14 Feb, 2017 2 commits
    • btrfs: allow unlink to exceed subvolume quota · 003d7c59
      Committed by Jeff Mahoney
      Once a qgroup limit is exceeded, it's impossible to restore normal
      operation to the subvolume without modifying the limit or removing
      the subvolume.  This is a surprising situation for many users used
      to the typical workflow with quotas on other file systems where it's
      possible to remove files until the used space is back under the limit.
      
      When we go to unlink a file and start the transaction, we'll hit
      the qgroup limit while trying to reserve space for the items we'll
      modify while removing the file.  We discussed last month how best
      to handle this situation and agreed that there is no perfect solution.
      The best principle-of-least-surprise solution is to handle it similarly
      to how we already handle ENOSPC when unlinking, which is to allow
      the operation to succeed with the expectation that it will ultimately
      release space under most circumstances.
      
      This patch modifies the transaction start path to select whether to
      honor the qgroup limits.  btrfs_start_transaction_fallback_global_rsv
      is the only caller that skips enforcement.  The reservation and tracking
      still happen normally -- only the enforcement step is skipped.

      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
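The enforcement switch described in this commit can be illustrated with a small user-space model. This is a sketch only, not the kernel code: `Qgroup`, `reserve`, `start_transaction` and the `enforce` parameter are hypothetical names, and the byte counts are arbitrary.

```python
class QgroupLimitExceeded(Exception):
    """Stands in for the quota error a real reservation would return."""


class Qgroup:
    """Toy model of one qgroup: a byte limit plus reserved-space tracking."""

    def __init__(self, limit):
        self.limit = limit
        self.reserved = 0

    def reserve(self, num_bytes, enforce=True):
        # Enforcement is the only step that is skipped; the reservation
        # itself is still recorded so it can be released later.
        if enforce and self.reserved + num_bytes > self.limit:
            raise QgroupLimitExceeded(num_bytes)
        self.reserved += num_bytes


def start_transaction(qgroup, num_bytes, enforce=True):
    """Hypothetical stand-in for the transaction start path."""
    qgroup.reserve(num_bytes, enforce=enforce)


qg = Qgroup(limit=100)
start_transaction(qg, 100)                 # an ordinary write fills the quota
try:
    start_transaction(qg, 10)              # further enforced reservations fail
    over_limit_write_ok = True
except QgroupLimitExceeded:
    over_limit_write_ok = False
start_transaction(qg, 10, enforce=False)   # the unlink path still proceeds
```

In this model the unlink reservation pushes `reserved` past the limit for a moment, on the expectation stated in the commit message that the unlink will ultimately release space.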
    • btrfs: Add WARN_ON for qgroup reserved underflow · 18dc22c1
      Committed by Qu Wenruo
      Goldwyn Rodrigues has exposed and fixed a bug which underflows btrfs
      qgroup reserved space and leads to a non-writable fs.

      This reminds us that we don't have enough underflow checks for qgroup
      reserved space.

      In the underflow case, we should not actually underflow the numbers but
      warn and keep qgroup working.

      So add more checks on qgroup reserved space and add WARN_ON() and
      btrfs_warn() for any underflow case.

      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
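A minimal sketch of the "warn but keep working" idea, written as a user-space model. The function name `release_reserved` is hypothetical, and clamping to zero is an assumption about how "not really underflow" could be realized; the commit message itself only says to warn and keep qgroup working.

```python
import warnings


def release_reserved(reserved, num_bytes):
    """Return the reserved byte count after freeing num_bytes."""
    if num_bytes > reserved:
        # Warn and clamp instead of letting an unsigned counter wrap around.
        warnings.warn(
            f"qgroup reserved space underflow: have {reserved}, to free {num_bytes}"
        )
        return 0
    return reserved - num_bytes
```

For example, `release_reserved(10, 4)` returns 6, while `release_reserved(4, 10)` emits a warning and returns 0 rather than a huge wrapped value.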
  3. 06 Dec, 2016 4 commits
  4. 30 Nov, 2016 3 commits
  5. 26 Nov, 2016 1 commit
    • Btrfs: fix qgroup rescan worker initialization · 8d9eddad
      Committed by Filipe Manana
      We were setting the qgroup_rescan_running flag to true only after the
      rescan worker started (which is a task run by a queue). So if a user
      space task starts a rescan and immediately after asks to wait for the
      rescan worker to finish, this second call might happen before the rescan
      worker task starts running, in which case the rescan wait ioctl returns
      immediately, not waiting for the rescan worker to finish.
      
      This was making the fstest btrfs/022 fail very often.
      
      Fixes: d2c609b8 (btrfs: properly track when rescan worker is running)
      Cc: stable@vger.kernel.org # 4.4+
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
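The ordering problem in this commit can be modelled in user space with threads. This is a sketch under assumptions, not the kernel implementation: `RescanState`, `start_rescan` and `wait_for_completion` are hypothetical names, and a `threading.Event` stands in for the kernel completion. The point it shows is that the "running" flag is set by the task that starts the rescan, before the worker is queued, so a waiter arriving early still sees it.

```python
import threading


class RescanState:
    """Toy model of the rescan bookkeeping."""

    def __init__(self):
        self.lock = threading.Lock()
        self.running = False
        self.done = threading.Event()

    def start_rescan(self, work):
        with self.lock:
            # Mark the rescan as running before the worker is queued.
            self.running = True
            self.done.clear()
        worker = threading.Thread(target=self._worker, args=(work,))
        worker.start()
        return worker

    def _worker(self, work):
        work()
        with self.lock:
            self.running = False
        self.done.set()

    def wait_for_completion(self):
        with self.lock:
            running = self.running
        if running:
            self.done.wait()


gate = threading.Event()            # holds the worker back, like a busy queue
state = RescanState()
worker = state.start_rescan(gate.wait)
running_right_after_start = state.running
gate.set()
state.wait_for_completion()
worker.join()
```

Had the flag been set inside `_worker` instead, `wait_for_completion` could observe `running == False` while the worker was still waiting to be scheduled and return at once.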
  6. 27 Sep, 2016 2 commits
  7. 26 Sep, 2016 1 commit
    • Btrfs: add a flags field to btrfs_fs_info · afcdd129
      Committed by Josef Bacik
      We have a lot of random ints in btrfs_fs_info that can be put into flags.  This
      is mostly equivalent, with the exception of how we deal with quota going on or
      off: instead of having a pending state that the current quota_enabled gets
      set to, we now set a flag when we are turning quota on or off and deal with
      that appropriately.  Thanks,

      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
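The change from separate ints to flag bits, including the "turning on" and "turning off" states, might look like the following user-space sketch. The flag names and bit values are hypothetical, and resolving the pending bits at a later safe point is an assumption; the commit message only says the flag is dealt with "appropriately".

```python
# Hypothetical bit values; the kernel's flag names and numbering may differ.
QUOTA_ENABLED = 1 << 0
QUOTA_ENABLING = 1 << 1
QUOTA_DISABLING = 1 << 2


def request_quota(flags, enable):
    """Record the requested transition instead of flipping the state directly."""
    return flags | (QUOTA_ENABLING if enable else QUOTA_DISABLING)


def apply_pending_quota(flags):
    """Resolve pending transitions at some later safe point (assumed here)."""
    if flags & QUOTA_ENABLING:
        flags = (flags & ~QUOTA_ENABLING) | QUOTA_ENABLED
    if flags & QUOTA_DISABLING:
        flags &= ~(QUOTA_DISABLING | QUOTA_ENABLED)
    return flags


flags = apply_pending_quota(request_quota(0, enable=True))
enabled_after_on = bool(flags & QUOTA_ENABLED)
flags = apply_pending_quota(request_quota(flags, enable=False))
```

Packing the state into one integer keeps the enabled and pending states in a single field, which is the consolidation the commit describes.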
  8. 25 Aug, 2016 3 commits
    • btrfs: qgroup: Refactor btrfs_qgroup_insert_dirty_extent() · cb93b52c
      Committed by Qu Wenruo
      Refactor the btrfs_qgroup_insert_dirty_extent() function into two functions:
      1. btrfs_qgroup_insert_dirty_extent_nolock()
         Almost the same as the original code.
         For delayed_ref usage, where the delayed refs are already locked.

         Change the return value type to int, since the caller never needs the
         pointer, but only needs to know whether to free the allocated
         memory.

      2. btrfs_qgroup_insert_dirty_extent()
         The more encapsulated version.

         Handles the delayed_refs locking, memory allocation, quota-enabled check
         and other things.

      The original design was to keep exported functions to a minimum, but since
      more btrfs hacks have been exposed, like replacing a path in balance, we
      need to record dirty extents manually, so we have to add such functions.

      Also, add comments for both functions to inform developers how to keep
      qgroup correct when doing hacks.
      
      Cc: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-and-Tested-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
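The split into a `_nolock` primitive and an encapsulated wrapper is a common pattern, sketched here as a user-space model. All names (`DelayedRefs`, `insert_dirty_extent_nolock`, `insert_dirty_extent`) and the dictionary-based record are hypothetical simplifications; only the division of responsibilities follows the commit message.

```python
import threading


class DelayedRefs:
    """Toy container: a lock plus dirty extent records keyed by bytenr."""

    def __init__(self):
        self.lock = threading.Lock()
        self.dirty_extents = {}


def insert_dirty_extent_nolock(refs, record):
    """Caller must hold refs.lock.

    Returns 1 if a record for this bytenr already exists (the caller should
    discard its own record), otherwise inserts it and returns 0.
    """
    if record["bytenr"] in refs.dirty_extents:
        return 1
    refs.dirty_extents[record["bytenr"]] = record
    return 0


def insert_dirty_extent(refs, bytenr, num_bytes, quota_enabled=True):
    """Encapsulated version: checks, allocation and locking in one place."""
    if not quota_enabled or bytenr == 0 or num_bytes == 0:
        return 0
    record = {"bytenr": bytenr, "num_bytes": num_bytes}
    with refs.lock:
        existed = insert_dirty_extent_nolock(refs, record)
    if existed:
        del record  # stands in for freeing the unused allocation
    return 0


refs = DelayedRefs()
insert_dirty_extent(refs, 4096, 4096)
insert_dirty_extent(refs, 4096, 4096)   # duplicate is detected and dropped
```

Returning an int from the `_nolock` variant is enough because, as the message notes, the caller only needs to know whether its allocation was consumed.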
    • btrfs: waiting on qgroup rescan should not always be interruptible · d06f23d6
      Committed by Jeff Mahoney
      We wait on qgroup rescan completion in three places: file system
      shutdown, the quota disable ioctl, and the rescan wait ioctl.  If the
      user sends a signal while we're waiting, we continue happily along.  This
      is expected behavior for the rescan wait ioctl.  It's racy in the shutdown
      path but mostly works due to other unrelated synchronization points.
      In the quota disable path, it Oopses the kernel pretty much immediately.
      
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: properly track when rescan worker is running · d2c609b8
      Committed by Jeff Mahoney
      The qgroup_flags field is overloaded such that it reflects the on-disk
      status of qgroups and the runtime state.  The BTRFS_QGROUP_STATUS_FLAG_RESCAN
      flag is used to indicate that a rescan operation is in progress, but if
      the file system is unmounted while a rescan is running, the rescan
      operation is paused.  If the file system is then mounted read-only,
      the flag will still be present but the rescan operation will not have
      been resumed.  When we go to umount, btrfs_qgroup_wait_for_completion
      will see the flag and interpret it to mean that the rescan worker is
      still running and will wait for a completion that will never come.
      
      This patch uses a separate flag to indicate when the worker is
      running.  The locking and state surrounding the qgroup rescan worker
      needs a lot of attention beyond this patch but this is enough to
      avoid a hung umount.
      
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
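The hung umount comes from reading a persisted status bit as if it were runtime state. The following user-space sketch separates the two; `FsInfo`, `mount`, `must_wait_on_umount` and the bit value of `STATUS_FLAG_RESCAN` are illustrative assumptions rather than the kernel's definitions.

```python
from dataclasses import dataclass

STATUS_FLAG_RESCAN = 1 << 1   # assumed bit value, for illustration only


@dataclass
class FsInfo:
    qgroup_flags: int = 0          # persisted status, survives unmount
    rescan_running: bool = False   # runtime only, starts False on every mount


def mount(on_disk_flags, read_only):
    fs = FsInfo(qgroup_flags=on_disk_flags)
    # A paused rescan is only resumed on a writable mount.
    if on_disk_flags & STATUS_FLAG_RESCAN and not read_only:
        fs.rescan_running = True
    return fs


def must_wait_on_umount(fs):
    # Consult the runtime flag, not the on-disk one.
    return fs.rescan_running


fs = mount(STATUS_FLAG_RESCAN, read_only=True)
```

After a read-only mount the on-disk bit is still set in `fs.qgroup_flags`, but `must_wait_on_umount(fs)` is false, so umount does not wait for a completion that will never come.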
  9. 26 Jul, 2016 3 commits
  10. 26 May, 2016 1 commit
  11. 29 Apr, 2016 2 commits
  12. 04 Apr, 2016 2 commits
    • btrfs: Add qgroup tracing · 0f5dcf8d
      Committed by Mark Fasheh
      This patch adds tracepoints to the qgroup code on both the reporting side
      (insert_dirty_extents) and the accounting side. Taken together it allows us
      to see what qgroup operations have happened, and what their result was.

      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: handle non-fatal errors in btrfs_qgroup_inherit() · 918c2ee1
      Committed by Mark Fasheh
      create_pending_snapshot() will go readonly on _any_ error return from
      btrfs_qgroup_inherit(). If qgroups are enabled, a user can crash their fs by
      just making a snapshot and asking it to inherit from an invalid qgroup. For
      example:
      
      $ btrfs sub snap -i 1/10 /btrfs/ /btrfs/foo
      
      Will cause a transaction abort.
      
      Fix this by only throwing errors in btrfs_qgroup_inherit() when we know
      going readonly is acceptable.
      
      The following xfstests test case reproduces this bug:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        here=`pwd`
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
        	cd /
        	rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # remove previous $seqres.full before test
        rm -f $seqres.full
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
      
        rm -f $seqres.full
      
        _scratch_mkfs
        _scratch_mount
        _run_btrfs_util_prog quota enable $SCRATCH_MNT
        # The qgroup '1/10' does not exist and should be silently ignored
        _run_btrfs_util_prog subvolume snapshot -i 1/10 $SCRATCH_MNT $SCRATCH_MNT/snap1
      
        _scratch_unmount
      
        echo "Silence is golden"
      
        status=0
        exit

      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
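The behaviour the test expects, that a nonexistent qgroup is silently ignored instead of aborting the snapshot, can be sketched as a user-space model. The function names and the string form of the qgroup ids are hypothetical; this does not mirror the actual checks inside btrfs_qgroup_inherit().

```python
def usable_inherit_sources(existing_qgroups, requested):
    """Return the requested qgroup ids that exist; unknown ids are skipped."""
    return [qgid for qgid in requested if qgid in existing_qgroups]


def create_snapshot(existing_qgroups, inherit_from):
    """Hypothetical snapshot path: an invalid qgroup no longer aborts it."""
    sources = usable_inherit_sources(existing_qgroups, inherit_from)
    return {"created": True, "inherited_from": sources}


result = create_snapshot({"0/5", "1/1"}, ["1/10", "1/1"])
```

Here `1/10` does not exist, so it is dropped from the inherit list while the snapshot itself is still created.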
  13. 25 Nov, 2015 2 commits
    • btrfs: qgroup: account shared subtree during snapshot delete · 82bd101b
      Committed by Mark Fasheh
      Commit 0ed4792a ('btrfs: qgroup: Switch to new extent-oriented qgroup
      mechanism.') removed our qgroup accounting during
      btrfs_drop_snapshot(). Predictably, this results in qgroup numbers
      going bad shortly after a snapshot is removed.
      
      Fix this by adding a dirty extent record when we encounter extents during
      our shared subtree walk. This effectively restores the functionality we had
      with the original shared subtree walking code in 1152651a (btrfs: qgroup:
      account shared subtrees during snapshot delete).
      
      The idea with the original patch (and this one) is that shared subtrees can
      get skipped during drop_snapshot. The shared subtree walk then allows us a
      chance to visit those extents and add them to the qgroup work for later
      processing. This ultimately makes the accounting for drop snapshot work.
      
      The new qgroup code nicely handles all the other extents during the tree
      walk via the ref dec/inc functions so we don't have to add actions beyond
      what we had originally.

      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: Chris Mason <clm@fb.com>
    • btrfs: qgroup: fix quota disable during rescan · 967ef513
      Committed by Justin Maggard
      There's a race condition that leads to a NULL pointer dereference if you
      disable quotas while a quota rescan is running.  To fix this, we just need
      to wait for the quota rescan worker to actually exit before tearing down
      the quota structures.

      Signed-off-by: Justin Maggard <jmaggard@netgear.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  14. 05 Nov, 2015 2 commits
    • Btrfs: fix sleeping inside atomic context in qgroup rescan worker · 3b2ba7b3
      Committed by Filipe Manana
      We are holding a btree path with spinning locks and then we attempt to
      clone an extent buffer, which calls kmem_cache_alloc() and this function
      can sleep, causing the following trace to be reported on a debug kernel:
      
      [107118.218536] BUG: sleeping function called from invalid context at mm/slab.c:2871
      [107118.224110] in_atomic(): 1, irqs_disabled(): 0, pid: 19148, name: kworker/u32:3
      [107118.226120] INFO: lockdep is turned off.
      [107118.226843] Preemption disabled at:[<ffffffffa05ffa22>] btrfs_clear_lock_blocking_rw+0x96/0xea [btrfs]
      
      [107118.229175] CPU: 3 PID: 19148 Comm: kworker/u32:3 Tainted: G        W       4.3.0-rc5-btrfs-next-17+ #1
      [107118.231326] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
      [107118.233687] Workqueue: btrfs-qgroup-rescan btrfs_qgroup_rescan_helper [btrfs]
      [107118.236835]  0000000000000000 ffff880424bf3b78 ffffffff812566f4 0000000000000000
      [107118.238369]  ffff880424bf3ba0 ffffffff81070664 ffffffff817f1cd5 0000000000000b37
      [107118.239769]  0000000000000000 ffff880424bf3bc8 ffffffff8107070a 0000000000008850
      [107118.241244] Call Trace:
      [107118.241729]  [<ffffffff812566f4>] dump_stack+0x4e/0x79
      [107118.242602]  [<ffffffff81070664>] ___might_sleep+0x23a/0x241
      [107118.243586]  [<ffffffff8107070a>] __might_sleep+0x9f/0xa6
      [107118.244532]  [<ffffffff8115af70>] cache_alloc_debugcheck_before+0x25/0x36
      [107118.245939]  [<ffffffff8115d52b>] kmem_cache_alloc+0x50/0x215
      [107118.246930]  [<ffffffffa05e627e>] __alloc_extent_buffer+0x2a/0x11f [btrfs]
      [107118.248121]  [<ffffffffa05ecb1a>] btrfs_clone_extent_buffer+0x3d/0xdd [btrfs]
      [107118.249451]  [<ffffffffa06239ea>] btrfs_qgroup_rescan_worker+0x16d/0x434 [btrfs]
      [107118.250755]  [<ffffffff81087481>] ? arch_local_irq_save+0x9/0xc
      [107118.251754]  [<ffffffffa05f7952>] normal_work_helper+0x14c/0x32a [btrfs]
      [107118.252899]  [<ffffffffa05f7952>] ? normal_work_helper+0x14c/0x32a [btrfs]
      [107118.254195]  [<ffffffffa05f7c82>] btrfs_qgroup_rescan_helper+0x12/0x14 [btrfs]
      [107118.255436]  [<ffffffff81063b23>] process_one_work+0x24a/0x4ac
      [107118.263690]  [<ffffffff81064285>] worker_thread+0x206/0x2c2
      [107118.264888]  [<ffffffff8106407f>] ? rescuer_thread+0x2cb/0x2cb
      [107118.267413]  [<ffffffff8106904d>] kthread+0xef/0xf7
      [107118.268417]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
      [107118.269505]  [<ffffffff8147d10f>] ret_from_fork+0x3f/0x70
      [107118.270491]  [<ffffffff81068f5e>] ? kthread_parkme+0x24/0x24
      
      So just use blocking locks for our path to solve this.
      This fixes the patch titled:
        "btrfs: qgroup: Don't copy extent buffer to do qgroup rescan"

      Signed-off-by: Filipe Manana <fdmanana@suse.com>
    • Btrfs: fix race waiting for qgroup rescan worker · 190631f1
      Committed by Filipe Manana
      We were initializing the completion (fs_info->qgroup_rescan_completion)
      object after releasing the qgroup rescan lock, which gives a small time
      window for a rescan waiter to not actually wait for the rescan worker
      to finish. Example:
      
               CPU 1                                                     CPU 2
      
       fs_info->qgroup_rescan_completion->done is 0
      
       btrfs_qgroup_rescan_worker()
         complete_all(&fs_info->qgroup_rescan_completion)
           sets fs_info->qgroup_rescan_completion->done
           to UINT_MAX / 2
      
       ... do some other stuff ....
      
       qgroup_rescan_init()
         mutex_lock(&fs_info->qgroup_rescan_lock)
         set flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
           in fs_info->qgroup_flags
         mutex_unlock(&fs_info->qgroup_rescan_lock)
      
                                                             btrfs_qgroup_wait_for_completion()
                                                               mutex_lock(&fs_info->qgroup_rescan_lock)
                                                               sees flag BTRFS_QGROUP_STATUS_FLAG_RESCAN
                                                                 in fs_info->qgroup_flags
                                                               mutex_unlock(&fs_info->qgroup_rescan_lock)
      
                                                               wait_for_completion_interruptible(
                                                                 &fs_info->qgroup_rescan_completion)
      
                                                                 fs_info->qgroup_rescan_completion->done
                                                                 is > 0 so it returns immediately
      
        init_completion(&fs_info->qgroup_rescan_completion)
          sets fs_info->qgroup_rescan_completion->done to 0
      
      So fix this by initializing the completion object while holding the mutex
      fs_info->qgroup_rescan_lock.

      Signed-off-by: Filipe Manana <fdmanana@suse.com>
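The interleaving in the diagram above can be replayed sequentially in a small user-space model. This is a sketch, not kernel code: `Completion` is a toy stand-in whose `done` counter only mimics the semantics described in the message, and `replay` is a hypothetical helper that runs the steps in the order shown.

```python
class Completion:
    """Toy completion: done > 0 means waiters do not block."""

    def __init__(self):
        self.done = 0

    def complete_all(self):
        self.done = 1   # any positive value releases every waiter

    def reinit(self):
        self.done = 0

    def wait_would_block(self):
        return self.done == 0


def replay(init_under_lock):
    """Replay the interleaving and report whether the waiter actually waits."""
    completion = Completion()
    completion.complete_all()          # a previous rescan worker finished

    # qgroup_rescan_init(), inside the critical section
    rescan_flag = True
    if init_under_lock:
        completion.reinit()            # fixed ordering

    # The waiter takes the lock next, sees the flag and then waits.
    waiter_blocks = rescan_flag and completion.wait_would_block()

    if not init_under_lock:
        completion.reinit()            # buggy ordering: too late for the waiter
    return waiter_blocks
```

With `init_under_lock=False` the waiter sees the stale positive `done` value and returns immediately; with `init_under_lock=True` it blocks until the new worker completes.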