1. 28 Feb, 2017 1 commit
  2. 17 Feb, 2017 3 commits
  3. 14 Feb, 2017 3 commits
    • btrfs: allow unlink to exceed subvolume quota · 003d7c59
      Jeff Mahoney committed
      Once a qgroup limit is exceeded, it's impossible to restore normal
      operation to the subvolume without modifying the limit or removing
      the subvolume.  This is a surprising situation for many users used
      to the typical workflow with quotas on other file systems where it's
      possible to remove files until the used space is back under the limit.
      
      When we go to unlink a file and start the transaction, we'll hit
      the qgroup limit while trying to reserve space for the items we'll
      modify while removing the file.  We discussed last month how best
      to handle this situation and agreed that there is no perfect solution.
      The best principle-of-least-surprise solution is to handle it similarly
      to how we already handle ENOSPC when unlinking, which is to allow
      the operation to succeed with the expectation that it will ultimately
      release space under most circumstances.
      
      This patch modifies the transaction start path to select whether to
      honor the qgroups limits.  btrfs_start_transaction_fallback_global_rsv
      is the only caller that skips enforcement.  The reservation and tracking
      still happens normally -- it just skips the enforcement step.
      Signed-off-by: Jeff Mahoney <jeffm@suse.com>
      Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
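The idea of reserving and tracking space as usual while skipping only the limit check can be sketched roughly as follows. This is a minimal userspace illustration, not the kernel code: `struct toy_qgroup`, `toy_qgroup_reserve`, and the `enforce` parameter are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a qgroup; not the kernel's structure. */
struct toy_qgroup {
	uint64_t limit;    /* maximum bytes allowed */
	uint64_t reserved; /* bytes currently reserved */
};

/*
 * Reserve num_bytes against the group. The reservation is always
 * tracked; the limit check is applied only when enforce is true.
 */
static int toy_qgroup_reserve(struct toy_qgroup *qg, uint64_t num_bytes,
			      bool enforce)
{
	if (enforce && qg->reserved + num_bytes > qg->limit)
		return -EDQUOT;
	qg->reserved += num_bytes;
	return 0;
}
```

In this sketch an unlink-like caller would pass `enforce = false`, so the reservation succeeds even when the group is already at its limit, on the expectation that the operation will release space afterwards.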
    • btrfs: Make btrfs_ino take a struct btrfs_inode · 4a0cc7ca
      Nikolay Borisov committed
      Currently btrfs_ino takes a struct inode, and this causes a lot of
      internal btrfs functions which consume this ino to take a VFS inode
      rather than btrfs' own struct btrfs_inode. In order to fix this "leak"
      of VFS structs into the internals of btrfs, it's first necessary to
      eliminate all uses of struct inode for the purpose of obtaining the
      ino. This patch does that by using BTRFS_I to convert an inode to
      btrfs_inode. With this problem eliminated, subsequent patches will
      start eliminating the passing of struct inode altogether, eventually
      resulting in much cleaner code.
      Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
      [ fix btrfs_get_extent tracepoint prototype ]
      Signed-off-by: David Sterba <dsterba@suse.com>
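The conversion from a VFS inode to the filesystem's own inode structure relies on the embedded-member pattern. Below is a simplified userspace sketch of that pattern; the `toy_` names and the `objectid` field are assumptions made for illustration, and the real kernel structures carry many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the VFS inode. */
struct inode {
	unsigned long i_ino;
};

/* Toy stand-in for the filesystem inode, which embeds the VFS inode. */
struct toy_btrfs_inode {
	uint64_t objectid;      /* filesystem-level inode number (assumed field) */
	struct inode vfs_inode; /* embedded VFS inode */
};

/* Recover the containing structure from a pointer to its embedded member. */
static struct toy_btrfs_inode *TOY_BTRFS_I(struct inode *inode)
{
	return (struct toy_btrfs_inode *)
		((char *)inode - offsetof(struct toy_btrfs_inode, vfs_inode));
}

/* Takes the filesystem-level inode directly, not the VFS inode. */
static uint64_t toy_btrfs_ino(const struct toy_btrfs_inode *inode)
{
	return inode->objectid;
}
```

A caller that only holds a `struct inode *` would write `toy_btrfs_ino(TOY_BTRFS_I(inode))`, which keeps the conversion visible at the boundary instead of hiding it inside the helper.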
    • Btrfs: ACCESS_ONCE cleanup · 20c7bcec
      Seraphime Kirkovski committed
      This replaces the ACCESS_ONCE macro with the corresponding
      READ_ONCE and WRITE_ONCE macros.
      Signed-off-by: Seraphime Kirkovski <kirkseraph@gmail.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
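For readers unfamiliar with the split between read and write accessors, here is a simplified userspace approximation. The `TOY_` macros below are assumptions written for this example using a volatile cast and the GCC/Clang `__typeof__` extension; the kernel's actual definitions are more elaborate.

```c
#include <assert.h>

/* Simplified approximations; not the kernel's definitions. */
#define TOY_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int shared_flag;

/* Store through a volatile lvalue so the compiler emits exactly one write. */
static void toy_set_flag(int v)
{
	TOY_WRITE_ONCE(shared_flag, v);
}

/* Load through a volatile lvalue so the compiler emits exactly one read. */
static int toy_get_flag(void)
{
	return TOY_READ_ONCE(shared_flag);
}
```

Having separate read and write forms makes the direction of each access explicit at the call site, which a single combined macro does not.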
  4. 06 Dec, 2016 9 commits
  5. 28 Sep, 2016 1 commit
  6. 27 Sep, 2016 4 commits
  7. 26 Sep, 2016 1 commit
    • Btrfs: add a flags field to btrfs_fs_info · afcdd129
      Josef Bacik committed
      We have a lot of random ints in btrfs_fs_info that can be put into flags.  This
      is mostly equivalent, with the exception of how we deal with quota going on or
      off: now we set a flag when we are turning it on or off and deal with
      that appropriately, rather than just having a pending state that the current
      quota_enabled gets set to.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
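The move from separate ints to bits in one flags word, including the "turning on" and "turning off" markers for quota, might look roughly like this. It is a userspace sketch under assumptions: the `TOY_FS_*` bit names and helper functions are hypothetical, and the helpers are plain (non-atomic) bit operations, unlike the kernel's bitops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit numbers within the flags word. */
enum {
	TOY_FS_QUOTA_ENABLED,
	TOY_FS_QUOTA_ENABLING,
	TOY_FS_QUOTA_DISABLING,
};

struct toy_fs_info {
	unsigned long flags; /* replaces several separate int fields */
};

static void toy_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void toy_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static bool toy_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* At a commit point, turn a pending on/off request into the real state. */
static void toy_apply_pending_quota(struct toy_fs_info *fs)
{
	if (toy_test_bit(TOY_FS_QUOTA_ENABLING, &fs->flags)) {
		toy_clear_bit(TOY_FS_QUOTA_ENABLING, &fs->flags);
		toy_set_bit(TOY_FS_QUOTA_ENABLED, &fs->flags);
	}
	if (toy_test_bit(TOY_FS_QUOTA_DISABLING, &fs->flags)) {
		toy_clear_bit(TOY_FS_QUOTA_DISABLING, &fs->flags);
		toy_clear_bit(TOY_FS_QUOTA_ENABLED, &fs->flags);
	}
}
```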
  8. 25 Aug, 2016 1 commit
    • btrfs: fix fsfreeze hang caused by delayed iputs deal · 9e7cc91a
      Wang Xiaoguang committed
      When running fstests generic/068, we sometimes got the deadlock below:
        xfs_io          D ffff8800331dbb20     0  6697   6693 0x00000080
        ffff8800331dbb20 ffff88007acfc140 ffff880034d895c0 ffff8800331dc000
        ffff880032d243e8 fffffffeffffffff ffff880032d24400 0000000000000001
        ffff8800331dbb38 ffffffff816a9045 ffff880034d895c0 ffff8800331dbba8
        Call Trace:
        [<ffffffff816a9045>] schedule+0x35/0x80
        [<ffffffff816abab2>] rwsem_down_read_failed+0xf2/0x140
        [<ffffffff8118f5e1>] ? __filemap_fdatawrite_range+0xd1/0x100
        [<ffffffff8134f978>] call_rwsem_down_read_failed+0x18/0x30
        [<ffffffffa06631fc>] ? btrfs_alloc_block_rsv+0x2c/0xb0 [btrfs]
        [<ffffffff810d32b5>] percpu_down_read+0x35/0x50
        [<ffffffff81217dfc>] __sb_start_write+0x2c/0x40
        [<ffffffffa067f5d5>] start_transaction+0x2a5/0x4d0 [btrfs]
        [<ffffffffa067f857>] btrfs_join_transaction+0x17/0x20 [btrfs]
        [<ffffffffa068ba34>] btrfs_evict_inode+0x3c4/0x5d0 [btrfs]
        [<ffffffff81230a1a>] evict+0xba/0x1a0
        [<ffffffff812316b6>] iput+0x196/0x200
        [<ffffffffa06851d0>] btrfs_run_delayed_iputs+0x70/0xc0 [btrfs]
        [<ffffffffa067f1d8>] btrfs_commit_transaction+0x928/0xa80 [btrfs]
        [<ffffffffa0646df0>] btrfs_freeze+0x30/0x40 [btrfs]
        [<ffffffff81218040>] freeze_super+0xf0/0x190
        [<ffffffff81229275>] do_vfs_ioctl+0x4a5/0x5c0
        [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
        [<ffffffff810038cf>] ? syscall_trace_enter_phase1+0x11f/0x140
        [<ffffffff81229409>] SyS_ioctl+0x79/0x90
        [<ffffffff81003c12>] do_syscall_64+0x62/0x110
        [<ffffffff816acbe1>] entry_SYSCALL64_slow_path+0x25/0x25
      
      From this warning, freeze_super() already holds SB_FREEZE_FS, but
      btrfs_freeze() then calls btrfs_commit_transaction(). If
      btrfs_commit_transaction() finds that it has delayed iputs to handle,
      it'll call start_transaction(), which will try to get the SB_FREEZE_FS
      lock again, and a deadlock occurs.
      
      The root cause is that in btrfs, sync_filesystem(sb) does not make
      sure all metadata is updated. Some code may still add delayed iputs;
      see the sample race window below:
      
               CPU1                                  |         CPU2
      |-> freeze_super()                             |
          |-> sync_filesystem(sb);                   |
          |                                          |-> cleaner_kthread()
          |                                          |   |-> btrfs_delete_unused_bgs()
          |                                          |       |-> btrfs_remove_chunk()
          |                                          |           |-> btrfs_remove_block_group()
          |                                          |               |-> btrfs_add_delayed_iput()
          |                                          |
          |-> sb->s_writers.frozen = SB_FREEZE_FS;   |
          |-> sb_wait_write(sb, SB_FREEZE_FS);       |
          |   acquire SB_FREEZE_FS lock.             |
          |                                          |
          |-> btrfs_freeze()                         |
              |-> btrfs_commit_transaction()         |
                  |-> btrfs_run_delayed_iputs()      |
                  |   will handle delayed iputs,     |
                  |   that means start_transaction() |
                  |   will be called, which will try |
                  |   to get SB_FREEZE_FS lock.      |
      
      To fix this issue, introduce an "int fs_frozen" to record internally whether
      the fs has been frozen. If the fs has been frozen, we cannot handle delayed iputs.
      Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ add comment to btrfs_freeze ]
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Chris Mason <clm@fb.com>
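The fix described above amounts to a guard around delayed-iput processing. A minimal single-threaded sketch, assuming hypothetical `toy_` names and a counter in place of the real delayed-iput list:

```c
#include <assert.h>

/* Hypothetical model of the relevant filesystem state. */
struct toy_fs_info {
	int fs_frozen;     /* nonzero while the filesystem is frozen */
	int delayed_iputs; /* number of queued inode releases */
};

/* Processing delayed iputs may need to start a new transaction. */
static void toy_run_delayed_iputs(struct toy_fs_info *fs)
{
	fs->delayed_iputs = 0;
}

static void toy_commit_transaction(struct toy_fs_info *fs)
{
	/* ... commit work elided ... */

	/* Skip delayed iputs while frozen; they are handled after unfreeze. */
	if (!fs->fs_frozen)
		toy_run_delayed_iputs(fs);
}

static void toy_freeze(struct toy_fs_info *fs)
{
	fs->fs_frozen = 1;
	toy_commit_transaction(fs);
}

static void toy_unfreeze(struct toy_fs_info *fs)
{
	fs->fs_frozen = 0;
}
```

The queued iputs are not lost in this model: they stay pending until a later commit runs with the filesystem unfrozen.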
  9. 26 Jul, 2016 3 commits
  10. 23 Jun, 2016 1 commit
    • Btrfs: track transid for delayed ref flushing · 31b9655f
      Josef Bacik committed
      Using the offwakecputime bpf script I noticed most of our time was spent waiting
      on the delayed ref throttling.  This is what is supposed to happen, but
      sometimes the transaction can commit and then we're waiting for throttling that
      doesn't matter anymore.  So change this stuff to be a little smarter by tracking
      the transid we were in when we initiated the throttling.  If the transaction we
      get is different then we can just bail out.  This resulted in a 50% speedup in
      my fs_mark test, and reduced the amount of time spent throttling by 60 seconds
      over the entire run (which is about 30 minutes).  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
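The "remember the transid and bail out if it changed" idea can be illustrated with a small model. Everything here is an assumption for the sake of the example: `struct toy_fs`, its `generation` field, and `toy_throttle` are hypothetical and do not mirror the kernel's delayed-ref code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model; not the kernel's data structures. */
struct toy_fs {
	uint64_t generation; /* id of the currently running transaction */
	int pending_refs;    /* delayed refs still waiting to be flushed */
};

/*
 * Flush up to budget delayed refs on behalf of transaction transid.
 * Returns the number flushed; stops as soon as the running transaction
 * is no longer the one the caller was throttled for.
 */
static int toy_throttle(struct toy_fs *fs, uint64_t transid, int budget)
{
	int flushed = 0;

	while (flushed < budget && fs->pending_refs > 0) {
		if (fs->generation != transid)
			break; /* that transaction already committed */
		fs->pending_refs--;
		flushed++;
	}
	return flushed;
}
```

A caller records the transid when throttling begins and passes it in; if a commit has happened in the meantime, the call returns without doing work that no longer matters.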
  11. 18 Jun, 2016 2 commits
  12. 13 May, 2016 1 commit
  13. 12 May, 2016 2 commits
    • btrfs: build fixup for qgroup_account_snapshot · 2c1984f2
      David Sterba committed
      The macro btrfs_std_error got renamed to btrfs_handle_fs_error in an
      independent branch for the same merge target (4.7). To make the code
      compilable for bisectability reasons, add a temporary stub.
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: qgroup: Fix qgroup accounting when creating snapshot · 6426c7ad
      Qu Wenruo committed
      Current btrfs qgroup design implies a requirement that after calling
      btrfs_qgroup_account_extents() there must be a commit root switch.
      
      Normally this is OK, as btrfs_qgroup_account_extents() is only called
      inside btrfs_commit_transaction() just before commit_cowonly_roots().
      
      However, there is an exception at create_pending_snapshot(), which will
      call btrfs_qgroup_account_extents() without any commit root switch.
      
      In case of creating a snapshot whose parent root is itself (create a
      snapshot of fs tree), it will corrupt qgroup by the following trace:
      (skipped unrelated data)
      ======
      btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1
      qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0
      qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384
      btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0
      ======
      
      The problem here is that in the first qgroup_account_extent(), the
      nr_new_roots of the extent is 1, which means its reference got
      increased, and qgroup increased its rfer and excl.
      
      But at the second qgroup_account_extent(), its reference got decreased,
      and between these two qgroup_account_extent() calls there is no root
      switch. This leads to the same nr_old_roots, and this extent just got
      ignored by qgroup, which means this extent is wrongly accounted.
      
      Fix it by calling commit_cowonly_roots() after qgroup_account_extent() in
      create_pending_snapshot(), with the needed preparation.
      
      Mark: I added a check at the top of qgroup_account_snapshot() to skip this
      code if qgroups are turned off. xfstest btrfs/122 exposes this problem.
      Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Mark Fasheh <mfasheh@suse.de>
      Signed-off-by: David Sterba <dsterba@suse.com>
  14. 29 Apr, 2016 1 commit
  15. 28 Apr, 2016 1 commit
  16. 18 Feb, 2016 2 commits
  17. 07 Jan, 2016 3 commits
  18. 17 Dec, 2015 1 commit
    • Btrfs: fix memory leaks after transaction is aborted · 7785a663
      Filipe Manana committed
      When a transaction is aborted, or its commit fails before writing the new
      superblock and calling btrfs_finish_extent_commit(), we leak reference
      counts on the block groups attached to the transaction's delete_bgs list,
      because btrfs_finish_extent_commit() is never called for those two cases.
      Fix this by dropping their references at btrfs_put_transaction(), which
      is called when transactions are aborted (by making the transaction kthread
      commit the transaction) or if their commits fail.
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
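Dropping the leftover block group references when the last reference to the transaction goes away could be modeled as below. This is a sketch under assumptions: the `toy_` structures, the singly linked list, and the plain integer reference counts are simplifications invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block group with a plain reference count. */
struct toy_block_group {
	int refs;
	struct toy_block_group *next; /* link on the transaction's list */
};

/* Hypothetical transaction holding references on block groups to delete. */
struct toy_transaction {
	int use_count;
	struct toy_block_group *deleted_bgs;
};

/*
 * Drop one reference on the transaction. When the last one goes away,
 * also drop the references still held on any block groups left on the
 * list, which is the path taken after an abort or a failed commit.
 */
static void toy_put_transaction(struct toy_transaction *trans)
{
	if (--trans->use_count > 0)
		return;

	while (trans->deleted_bgs != NULL) {
		struct toy_block_group *bg = trans->deleted_bgs;

		trans->deleted_bgs = bg->next;
		bg->next = NULL;
		bg->refs--; /* release the reference the list was holding */
	}
}
```

On a successful commit the list would already be empty by this point, so the loop only does work in the failure cases.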