1. 02 Feb, 2018 7 commits
    • N
      btrfs: Ignore errors from btrfs_qgroup_trace_extent_post · 952bd3db
      Nikolay Borisov committed
      Running generic/019 with qgroups on the scratch device enabled is almost
      guaranteed to trigger the BUG_ON in btrfs_free_tree_block. It's supposed
      to trigger only on -ENOMEM; in reality, however, it's possible to get
      -EIO from btrfs_qgroup_trace_extent_post. This function just finds the
      roots of the extent being tracked and sets the qrecord->old_roots list.
      If this operation fails nothing critical happens except the quota
      accounting can be considered wrong. In such case just set the
      INCONSISTENT flag for the quota and print a warning, rather than killing
      off the system. Additionally, it's possible to trigger a BUG_ON in
      btrfs_truncate_inode_items as well.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      [ error message adjustments ]
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix unexpected -EEXIST when creating new inode · 900c9981
      Liu Bo committed
      The highest objectid, which is assigned to new inode, is decided at
      the time of initializing fs roots.  However, in cases where log replay
      gets processed, the btree which fs root owns might be changed, so we
      have to search it again for the highest objectid, otherwise creating
      new inode would end up with -EEXIST.
      
      cc: <stable@vger.kernel.org> v4.4-rc6+
      Fixes: f32e48e9 ("Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots")
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix use-after-free on root->orphan_block_rsv · 1a932ef4
      Liu Bo committed
      I got these from running generic/475,
      
      WARNING: CPU: 0 PID: 26384 at fs/btrfs/inode.c:3326 btrfs_orphan_commit_root+0x1ac/0x2b0 [btrfs]
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
      IP: btrfs_block_rsv_release+0x1c/0x70 [btrfs]
      Call Trace:
        btrfs_orphan_release_metadata+0x9f/0x200 [btrfs]
        btrfs_orphan_del+0x10d/0x170 [btrfs]
        btrfs_setattr+0x500/0x640 [btrfs]
        notify_change+0x7ae/0x870
        do_truncate+0xca/0x130
        vfs_truncate+0x2ee/0x3d0
        do_sys_truncate+0xaf/0xf0
        SyS_truncate+0xe/0x10
        entry_SYSCALL_64_fastpath+0x1f/0x96
      
      The race is between btrfs_orphan_commit_root and btrfs_orphan_del,
              t1                                        t2
      btrfs_orphan_commit_root                     btrfs_orphan_del
         spin_lock
         check (&root->orphan_inodes)
         root->orphan_block_rsv = NULL;
         spin_unlock
                                                   atomic_dec(&root->orphan_inodes);
                                                   access root->orphan_block_rsv
      
      Accessing root->orphan_block_rsv must be done before decreasing
      root->orphan_inodes.
      
      cc: <stable@vger.kernel.org> v3.12+
      Fixes: 703c88e0 ("Btrfs: fix tracking of orphan inode count")
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
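The ordering rule from the orphan_block_rsv commit above can be illustrated with a small user-space model. This is only a sketch: `struct root`, `struct block_rsv`, `commit_root()` and the two `orphan_del_*()` helpers are hypothetical stand-ins rather than the btrfs definitions, the spinlock is omitted, and the race is replayed deterministically through a callback instead of real threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the btrfs ones). */
struct block_rsv { int reserved; };

struct root {
    atomic_int orphan_inodes;            /* count of orphan inodes */
    struct block_rsv *orphan_block_rsv;  /* reservation shared by them */
};

typedef void (*interleave_fn)(struct root *);

/* Models t1: drop the reservation once no orphan inodes remain. */
static void commit_root(struct root *r)
{
    if (atomic_load(&r->orphan_inodes) == 0)
        r->orphan_block_rsv = NULL;
}

/* Buggy order in t2: decrement first, then touch the reservation.
 * 'other' is invoked between the two steps to replay the race. */
static int orphan_del_buggy(struct root *r, interleave_fn other)
{
    atomic_fetch_sub(&r->orphan_inodes, 1);
    if (other)
        other(r);
    if (!r->orphan_block_rsv)
        return -1;                       /* the kernel would oops here */
    r->orphan_block_rsv->reserved--;
    return 0;
}

/* Fixed order in t2: touch the reservation while the count still pins it. */
static int orphan_del_fixed(struct root *r, interleave_fn other)
{
    if (!r->orphan_block_rsv)
        return -1;
    r->orphan_block_rsv->reserved--;
    if (other)
        other(r);                        /* count is still non-zero */
    atomic_fetch_sub(&r->orphan_inodes, 1);
    return 0;
}
```

With one orphan inode outstanding, replaying `commit_root()` in the window makes the buggy variant find a NULL reservation, while the fixed variant completes because the counter is still non-zero when `commit_root()` runs.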
    • L
      Btrfs: fix btrfs_evict_inode to handle abnormal inodes correctly · e8f1bc14
      Liu Bo committed
      This regression is introduced in
      commit 3d48d981 ("btrfs: Handle uninitialised inode eviction").
      
      There are two problems,
      
      a) it is ->destroy_inode() that does the final free on inode, not
         ->evict_inode(),
      b) clear_inode() must be called before ->evict_inode() returns.
      
      This could end up hitting BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
      in evict() because I_CLEAR is set in clear_inode().
      
      Fixes: commit 3d48d981 ("btrfs: Handle uninitialised inode eviction")
      Cc: <stable@vger.kernel.org> # v4.7-rc6+
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix extent state leak from tree log · 55237a5f
      Liu Bo committed
      It's possible that btrfs_sync_log() bails out after one of the two
      btrfs_write_marked_extents() calls, which convert the extent state's
      bit from EXTENT_DIRTY/EXTENT_NEW into EXTENT_NEED_WAIT. However, only
      EXTENT_DIRTY and EXTENT_NEW are searched by free_log_tree(), so extent
      states with EXTENT_NEED_WAIT are leaked.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix crash due to not cleaning up tree log block's dirty bits · 1846430c
      Liu Bo committed
      When the whole fs flips into readonly status due to failures in
      critical sections, the log tree's blocks are still dirty, and this leads
      to a use-after-free crash at umount time:
      
      umount
       -> close_ctree
          -> stop workers
          -> iput(btree_inode)
             -> iput_final
                -> write_inode_now
                   -> ...
                     -> queue job on stop'd workers
      
      cc: <stable@vger.kernel.org> v3.12+
      Fixes: 681ae509 ("Btrfs: cleanup reserved space when freeing tree log on error")
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix deadlock in run_delalloc_nocow · e8916699
      Liu Bo committed
      @cur_offset is not set back to what it should be (@cow_start) if
      btrfs_next_leaf() returns an error, and the range [cow_start,
      cur_offset) remains locked forever.
      
      cc: <stable@vger.kernel.org>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
  2. 30 Jan, 2018 2 commits
  3. 23 Jan, 2018 7 commits
  4. 22 Jan, 2018 24 commits
    • N
      btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it · b03ebd99
      Nikolay Borisov committed
      No functional changes; this just makes the code more readable.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
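To show what replacing an open-coded alignment test with IS_ALIGNED amounts to, here is a user-space sketch. The macro below is a simplified stand-in for the kernel's `IS_ALIGNED()`, and the open-coded mask test is an assumption about the general shape of the code being replaced, not a quote from btrfs_truncate_block(). Both forms require the alignment to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified user-space stand-in for the kernel macro; 'a' must be a
 * power of two. */
#define IS_ALIGNED(x, a) ((((uint64_t)(x)) & ((uint64_t)(a) - 1)) == 0)

/* Open-coded form: mask off the low bits and compare against zero. */
static bool aligned_opencoded(uint64_t offset, uint32_t blocksize)
{
    return (offset & (blocksize - 1)) == 0;
}

/* Same check expressed through the macro, which names the intent. */
static bool aligned_macro(uint64_t offset, uint32_t blocksize)
{
    return IS_ALIGNED(offset, blocksize);
}
```

Both helpers return the same result for every offset, which is why such a change carries no functional difference.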
    • L
      Btrfs: noinline merge_extent_mapping · 5f4791f4
      Liu Bo committed
      In order to debug subtle bugs around merge_extent_mapping(), perf probe
      can be used to check the arguments, but sometimes merge_extent_mapping()
      gets inlined by the compiler and cannot be probed.
      
      This adds the noinline attribute to merge_extent_mapping().
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: add WARN_ONCE to detect unexpected error from merge_extent_mapping · 9a7e10e7
      Liu Bo committed
      This is a subtle case, so in order to understand the problem, it'd be good
      to know the content of existing and em when any error occurs.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: extent map selftest: dio write vs dio read · cd77f4f8
      Liu Bo committed
      This test case simulates the racy situation of dio write vs dio read,
      and checks whether btrfs_get_extent() would return -EEXIST.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: extent map selftest: buffered write vs dio read · fd87526f
      Liu Bo committed
      This test case simulates the racy situation of buffered write vs dio
      read, and checks whether btrfs_get_extent() would return -EEXIST.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: add extent map selftests · 72b28077
      Liu Bo committed
      We've observed that btrfs_get_extent() and merge_extent_mapping() could
      return -EEXIST in several cases caused by race conditions, e.g. dio
      read vs dio write, which makes the problem very tricky to reproduce.
      
      This adds extent map selftests in order to simulate those racy situations.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      [ minor string adjustments ]
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: move extent map specific code to extent_map.c · c04e61b5
      Liu Bo committed
      These helpers are extent map specific, move them to extent_map.c.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: add helper for em merge logic · 7b4df058
      Liu Bo committed
      This is preparatory work for the following extent map selftest, which
      runs tests against em merge logic.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: fix unexpected EEXIST from btrfs_get_extent · 18e83ac7
      Liu Bo committed
      This fixes a corner case that is caused by a race of dio write vs dio
      read/write.
      
      Here is how the race could happen.
      
      Suppose that no extent map has been loaded into memory yet.
      There is a file extent [0, 32K), two jobs are running concurrently
      against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
      read from [0, 4K) or [4K, 8K).
      
      t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K, 32K).
      
      ------------------------------------------------------
                   t1                                t2
            btrfs_get_blocks_direct()         btrfs_get_blocks_direct()
             -> btrfs_get_extent()              -> btrfs_get_extent()
                 -> lookup_extent_mapping()
                 -> add_extent_mapping()            -> lookup_extent_mapping()
                    # load [0, 32K)
             -> btrfs_new_extent_direct()
                 -> btrfs_drop_extent_cache()
                    # split [0, 32K) and
                    # drop [8K, 32K)
                 -> add_extent_mapping()
                    # add [8K, 32K)
                                                    -> add_extent_mapping()
                                                       # handle -EEXIST when adding
                                                       # [0, 32K)
      ------------------------------------------------------
      About how t2(dio read/write) runs into -EEXIST:
      
      a) add_extent_mapping() gets -EEXIST for adding em [0, 32k),
      
      b) search_extent_mapping() then returns [0, 8k) as the existing em,
         even though start == existing->start, em is [0, 32k) so that
         extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k,
      
      c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k)
         (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree,
      
      d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write,
         which confuses applications.
      
      Here is a summary of all the possible situations:
      1) start < existing->start
      
                  +-----------+em+-----------+
      +--prev---+ |     +-------------+      |
      |         | |     |             |      |
      +---------+ +     +---+existing++      ++
                      +
                      |
                      +
                   start
      
      2) start == existing->start
      
            +------------em------------+
            |     +-------------+      |
            |     |             |      |
            +     +----existing-+      +
                  |
                  |
                  +
               start
      
      3) start > existing->start && start < (existing->start + existing->len)
      
            +------------em------------+
            |     +-------------+      |
            |     |             |      |
            +     +----existing-+      +
                     |
                     |
                     +
                   start
      
      4) start >= (existing->start + existing->len)
      
      +-----------+em+-----------+
      |     +-------------+      | +--next---+
      |     |             |      | |         |
      +     +---+existing++      + +---------+
                            +
                            |
                            +
                         start
      
      As we can see, it turns out that if start is within existing em (front
      inclusive), then the existing em should be returned as is, otherwise,
      we try our best to merge candidate em with sibling ems to form a
      larger em (in order to reduce the total number of em).
      Reported-by: David Vallender <david.vallender@landmark.co.uk>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
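The conclusion of the btrfs_get_extent commit above (return the existing em when start falls inside it, front inclusive; otherwise try to merge) can be sketched as a small decision function. The `struct em`, `enum em_action` and `on_eexist()` names are hypothetical simplifications for illustration, not the kernel's extent map API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified extent map: only what the decision needs. */
struct em {
    uint64_t start;
    uint64_t len;
};

enum em_action {
    EM_USE_EXISTING,   /* cases 2 and 3: start lies inside existing */
    EM_MERGE           /* cases 1 and 4: start lies outside existing */
};

static uint64_t em_end(const struct em *em)
{
    return em->start + em->len;
}

/* Decide what to do after add_extent_mapping() reported -EEXIST and
 * 'existing' was found as the overlapping extent map. */
static enum em_action on_eexist(const struct em *existing, uint64_t start)
{
    if (start >= existing->start && start < em_end(existing))
        return EM_USE_EXISTING;
    return EM_MERGE;
}
```

In the race described above, `existing` is [0, 8K) and t2 reads from offset 0 or 4K, so the sketch selects `EM_USE_EXISTING` instead of attempting a zero-length merge.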
    • L
      Btrfs: fix incorrect block_len in merge_extent_mapping · a520a7e0
      Liu Bo committed
      %block_len could be checked on deciding if two em are mergeable.
      
      merge_extent_mapping() has only added the front pad if the front part
      of em gets truncated, but it's possible that the end part gets
      truncated.
      
      For both compressed extent and inline extent, em->block_len is not
      adjusted accordingly, and for regular extent, em->block_len always
      equals em->len, hence this sets em->block_len to em->len.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • M
      btrfs: Remove unused readahead spinlock · 3cbf26da
      Matthew Wilcox committed
      The reada_lock in struct btrfs_device was only initialised, and not
      actually used.  That's good because there's another lock also called
      reada_lock in the btrfs_fs_info that was quite heavily used.  Remove
      this one.
      Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: raid56: fix race between merge_bio and rbio_orig_end_io · 7583d8d0
      Liu Bo committed
      Before rbio_orig_end_io() goes to free rbio, rbio may get merged with
      more bios from other rbios and rbio->bio_list becomes non-empty,
      in that case, these newly merged bios don't end properly.
      
      Once unlock_stripe() is done, rbio->bio_list will not be updated any
      more and we can call bio_endio() on all queued bios.
      
      It should only happen in error-out cases, the normal path of recover
      and full stripe write have already set RBIO_RMW_LOCKED_BIT to disable
      merge before doing IO, so rbio_orig_end_io() called by them doesn't
      have the above issue.
      Reported-by: Jérôme Carretero <cJ-ko@zougloub.eu>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: do not cache rbio pages if using raid6 recover · 44ac474d
      Liu Bo committed
      Since raid6 recover tries all possible combinations of failed stripes,
      
      - when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and
        raid6_2data_recov(), it may change the in-memory content of failed
        stripes, if such a raid bio is cached, a later raid write rmw or recover
        can steal @stripe_pages from it instead of reading from disks, such that
        it carries the wrong content to do write rmw or recovery and ends up
        with corruption or recovery failures.
      
      - when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached
        because the only failed stripe which contains @rbio->bio_pages gets
        modified, others remain the same so that their in-memory content is
        consistent with their on-disk content.
      
      This adds a check to skip caching rbio if using raid6 recover.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • L
      Btrfs: raid56: iterate raid56 internal bio with bio_for_each_segment_all · 0198e5b7
      Liu Bo committed
      Bio iterated by set_bio_pages_uptodate() is raid56 internal one, so it
      will never be a BIO_CLONED bio, and since this is called by end_io
      functions, bio->bi_iter.bi_size is zero, we mustn't use
      bio_for_each_segment() as that is a no-op if bi_size is zero.
      
      Fixes: 6592e58c ("Btrfs: fix write corruption due to bio cloning on raid5/6")
      Cc: <stable@vger.kernel.org> # v4.12-rc6+
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • S
      btrfs: correct wrong comment about magic number of index_cnt · df6703e1
      Su Yue committed
      There is no function named btrfs_get_inode_index_count.
      Explanation for magic number index_cnt=2 in btrfs_new_inode() is
      actually located in btrfs_set_inode_index_count().
      
      So replace 'btrfs_get_inode_index_count' in the comment by
      'btrfs_set_inode_index_count'.
      Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • N
      btrfs: Make btrfs_inode_rsv_release static · d2560ebd
      Nikolay Borisov committed
      It's not used outside of extent-tree so there is no reason to not be
      static.
      Signed-off-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • A
      btrfs: cleanup btrfs_free_stale_device() usage · 1c94da9d
      Anand Jain committed
      We call btrfs_free_stale_device() only when we alloc a new struct
      btrfs_device (ret=1), so move it closer to where we alloc the new
      device. Also drop the comments.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • D
      btrfs: tree-check: reduce stack consumption in check_dir_item · e2683fc9
      David Sterba committed
      I've noticed that the updated item checker stack consumption increased
      dramatically in 542f5385e20cf97447 ("btrfs: tree-checker: Add checker
      for dir item")
      
      tree-checker.c:check_leaf                    +552 (176 -> 728)
      
      The array is 255 bytes long, dynamic allocation would slow down the
      sanity checks so it's more reasonable to keep it on-stack. Moving the
      variable to the scope of use reduces the stack usage again
      
      tree-checker.c:check_leaf                    -264 (728 -> 464)
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • X
      btrfs: use correct string length in DEV_INFO ioctl · 6670d4c2
      Xiongfeng Wang committed
      gcc-8 reports:
      
      fs/btrfs/ioctl.c: In function 'btrfs_ioctl':
      ./include/linux/string.h:245:9: warning: '__builtin_strncpy' specified
      bound 1024 equals destination size [-Wstringop-truncation]
      
      We need one less byte or call strlcpy() to make it a nul-terminated
      string. This is done on the next line anyway, but we want to avoid the
      warning.
      Signed-off-by: Xiongfeng Wang <xiongfeng.wang@linaro.org>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ update changelog ]
      Signed-off-by: David Sterba <dsterba@suse.com>
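The "one less byte" option mentioned in the DEV_INFO commit above can be sketched in user space as follows. `copy_path()` is a hypothetical helper written for illustration; it is not the code in fs/btrfs/ioctl.c, and the buffer sizes in the usage note are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy 'src' into 'dst', which holds 'dst_size' bytes. The bound passed
 * to strncpy() is one less than the destination size, and the final byte
 * is set explicitly, so the result is always NUL-terminated. */
static void copy_path(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

With an 8-byte buffer, a longer source such as "/dev/sdb-long-name" is truncated to 7 characters plus the terminator, while a short source is copied unchanged.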
    • A
      btrfs: fail mount when sb flag is not in BTRFS_SUPER_FLAG_SUPP · 6f794e3c
      Anand Jain committed
      It appears from the original commit [1] that there isn't any design
      specific reason not to fail the mount instead of just warning. This
      patch will change it to fail.
      
      [1]
       commit 319e4d06
          btrfs: Enhance super validation check
      
      Fixes: 319e4d06 ("btrfs: Enhance super validation check")
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • A
      btrfs: add support for SUPER_FLAG_CHANGING_FSID · 98820a7e
      Anand Jain committed
      The UUID change by btrfstune sets SUPER_FLAG_CHANGING_FSID and resets it
      only when changing fsid is complete. It's not a good idea to mount the
      device at any point in between, as reading metadata blocks would fail
      with a UUID mismatch.
      
      This patch doesn't add SUPER_FLAG_CHANGING_FSID into
      BTRFS_SUPER_FLAG_SUPP list, so mount will fail (along with the fix in
      the next patch) when SUPER_FLAG_CHANGING_FSID is set.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ update changelog ]
      Signed-off-by: David Sterba <dsterba@suse.com>
    • A
      btrfs: define SUPER_FLAG_METADUMP_V2 · e2731e55
      Anand Jain committed
      btrfs-progs uses super flag bit BTRFS_SUPER_FLAG_METADUMP_V2 (1ULL << 34).
      So just define that in kernel so that we know it's been used.
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
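The three super-flag commits above share one mechanism: any superblock flag bit outside the supported mask makes the mount fail. A minimal sketch of that check follows. Only the METADUMP_V2 bit position comes from the commit text; `SKETCH_FLAG_CHANGING_FSID`, `SKETCH_FLAG_SUPPORTED_EXAMPLE`, `SKETCH_SUPER_FLAG_SUPP` and `check_super_flags()` are hypothetical names with assumed values, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit position stated in the commit message. */
#define BTRFS_SUPER_FLAG_METADUMP_V2   (1ULL << 34)

/* Assumed bit positions, chosen only for illustration. */
#define SKETCH_FLAG_CHANGING_FSID      (1ULL << 35)
#define SKETCH_FLAG_SUPPORTED_EXAMPLE  (1ULL << 32)

/* Hypothetical supported mask; the real BTRFS_SUPER_FLAG_SUPP differs. */
#define SKETCH_SUPER_FLAG_SUPP         (SKETCH_FLAG_SUPPORTED_EXAMPLE)

/* Return 0 if every set bit is supported, -EINVAL otherwise, which
 * models failing the mount rather than merely warning. */
static int check_super_flags(uint64_t flags)
{
    if (flags & ~SKETCH_SUPER_FLAG_SUPP)
        return -EINVAL;
    return 0;
}
```

Because the changing-fsid bit is deliberately left out of the mask in this sketch, a superblock carrying it is rejected, which matches the behaviour the commits describe.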
    • L
      Btrfs: avoid losing data raid profile when deleting a device · a6f93c71
      Liu Bo committed
      We've avoided losing the data raid profile when doing balance, but it
      turns out that deleting a device could also result in the same
      problem.
      
      Say we have 3 disks, and they're created with '-d raid1' profile.
      
      - We have chunk P (the only data chunk on the empty btrfs).
      
      - Suppose that chunk P's two raid1 copies reside in disk A and disk B.
      
      - Now, 'btrfs device remove disk B'
               btrfs_rm_device()
                 -> btrfs_shrink_device()
                    -> btrfs_relocate_chunk() #relocate any chunk on disk B
                                               to other places.
      
      - Chunk P will be removed and a new chunk will be created to hold
        those data, but as chunk P is the only one holding raid1 profile,
        after it goes away, the new chunk will be created as single profile
        which is our default profile.
      
      This fixes the problem by creating an empty data chunk before
      relocating the data chunk.
      
      Metadata/System chunk are supposed to have non-zero bytes all the time
      so their raid profile is preserved.
      Reported-by: James Alandt <James.Alandt@wdc.com>
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • F
      Btrfs: fix space leak after fallocate and zero range operations · 81fdf638
      Filipe Manana committed
      If we do a buffered write after a zero range operation that has an
      unaligned (with the filesystem's sector size) end which also falls within
      an unwritten (prealloc) extent that is currently beyond the inode's
      i_size, and the zero range operation has the flag FALLOC_FL_KEEP_SIZE,
      we end up leaking data and metadata space. This happens because when
      zeroing a range we call btrfs_truncate_block(), which does delalloc
      (loads the page and partially zeroes its content), and in the buffered
      write path we only clear existing delalloc space reservation for the
      range we are writing into if that range starts at an offset smaller than
      the inode's i_size, which makes sense since we can not have delalloc
      extents beyond the i_size, only unwritten extents are allowed.
      
      Example reproducer:
      
       $ mkfs.btrfs -f /dev/sdb
       $ mount /dev/sdb /mnt
       $ xfs_io -f -c "falloc -k 428K 4K" /mnt/foobar
       $ xfs_io -c "fzero -k 0 430K" /mnt/foobar
       $ xfs_io -c "pwrite -S 0xaa 428K 4K" /mnt/foobar
       $ umount /mnt
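The offsets in the reproducer can be checked with a little arithmetic. The sketch below assumes a 4K sector size (the commit does not state it) and uses hypothetical helpers: `partial_tail_block()` computes the block that a zero range with an unaligned end has to partially zero, and `write_clears_delalloc()` models the condition described above for the buffered write path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KiB        1024ULL
#define SECTORSIZE (4 * KiB)   /* assumed sector size */

struct range {
    uint64_t start;            /* inclusive */
    uint64_t end;              /* exclusive */
};

/* Round x down to a multiple of a, where a is a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t a)
{
    return x & ~(a - 1);
}

/* Block that is only partially covered by a zero range ending at 'end'. */
static struct range partial_tail_block(uint64_t end)
{
    struct range r;

    r.start = round_down_pow2(end, SECTORSIZE);
    r.end = r.start + SECTORSIZE;
    return r;
}

/* The buffered write path clears an existing delalloc reservation only
 * when the write starts below i_size. */
static bool write_clears_delalloc(uint64_t write_start, uint64_t i_size)
{
    return write_start < i_size;
}
```

For the reproducer, the zero range ends at 430K, so the partially zeroed block is [428K, 432K), the same block that was preallocated. Both fallocate calls keep the size, so i_size stays 0 and the write at 428K does not satisfy the condition for clearing the earlier reservation.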
      
      After the unmount we get the metadata and data space leaks reported in
      dmesg/syslog:
      
       [95794.602253] ------------[ cut here ]------------
       [95794.603322] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9561 btrfs_destroy_inode+0x4e/0x206 [btrfs]
       [95794.605167] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.613000] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.614448] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.615972] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.617114] RIP: 0010:btrfs_destroy_inode+0x4e/0x206 [btrfs]
       [95794.618001] RSP: 0018:ffffc90001737d00 EFLAGS: 00010202
       [95794.618721] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.619645] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.620711] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.621932] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.623124] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.624188] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.625578] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.626522] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.627647] Call Trace:
       [95794.628128]  destroy_inode+0x3d/0x55
       [95794.628573]  evict+0x177/0x17e
       [95794.629010]  dispose_list+0x50/0x71
       [95794.629478]  evict_inodes+0x132/0x141
       [95794.630289]  generic_shutdown_super+0x3f/0x10b
       [95794.630864]  kill_anon_super+0x12/0x1c
       [95794.631383]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.631930]  deactivate_locked_super+0x30/0x68
       [95794.632539]  deactivate_super+0x36/0x39
       [95794.633200]  cleanup_mnt+0x49/0x67
       [95794.633818]  __cleanup_mnt+0x12/0x14
       [95794.634416]  task_work_run+0x82/0xa6
       [95794.634902]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.635525]  syscall_return_slowpath+0x18c/0x1af
       [95794.636122]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.636834] RIP: 0033:0x7fa678cb99a7
       [95794.637370] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.638672] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.639596] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.640703] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.641773] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.643150] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.644249] Code: ff 4c 8b a8 80 06 00 00 48 8b 87 c0 01 00 00 48 85 c0 74 02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 <0f> ff 83 bb 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00
       [95794.646929] ---[ end trace e95877675c6ec007 ]---
       [95794.647751] ------------[ cut here ]------------
       [95794.648509] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9562 btrfs_destroy_inode+0x59/0x206 [btrfs]
       [95794.649842] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.654659] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.655894] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.657546] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.658433] RIP: 0010:btrfs_destroy_inode+0x59/0x206 [btrfs]
       [95794.659279] RSP: 0018:ffffc90001737d00 EFLAGS: 00010202
       [95794.660054] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.660753] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.661513] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.662289] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.663393] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.664342] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.665673] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.666593] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.667629] Call Trace:
       [95794.668065]  destroy_inode+0x3d/0x55
       [95794.668637]  evict+0x177/0x17e
       [95794.669179]  dispose_list+0x50/0x71
       [95794.669830]  evict_inodes+0x132/0x141
       [95794.670416]  generic_shutdown_super+0x3f/0x10b
       [95794.671103]  kill_anon_super+0x12/0x1c
       [95794.671786]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.672552]  deactivate_locked_super+0x30/0x68
       [95794.673393]  deactivate_super+0x36/0x39
       [95794.674107]  cleanup_mnt+0x49/0x67
       [95794.674706]  __cleanup_mnt+0x12/0x14
       [95794.675279]  task_work_run+0x82/0xa6
       [95794.675795]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.676507]  syscall_return_slowpath+0x18c/0x1af
       [95794.677275]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.678006] RIP: 0033:0x7fa678cb99a7
       [95794.678600] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.679739] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.680779] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.681837] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.682867] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.683891] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.684843] Code: c0 01 00 00 48 85 c0 74 02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 0f ff 83 bb 40 ff ff ff 00 74 02 <0f> ff 48 83 bb f8 fe ff ff 00 74 02 0f ff 48 83 bb 00 ff ff ff
       [95794.687156] ---[ end trace e95877675c6ec008 ]---
       [95794.687876] ------------[ cut here ]------------
       [95794.688579] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9565 btrfs_destroy_inode+0x7d/0x206 [btrfs]
       [95794.689735] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.695015] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.696396] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.697956] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.698925] RIP: 0010:btrfs_destroy_inode+0x7d/0x206 [btrfs]
       [95794.699763] RSP: 0018:ffffc90001737d00 EFLAGS: 00010206
       [95794.700434] RAX: 0000000000000000 RBX: ffff880070fa1418 RCX: ffffc90001737c7c
       [95794.701445] RDX: 0000000175aa0240 RSI: 0000000000000001 RDI: ffff880070fa1418
       [95794.702448] RBP: ffffc90001737d38 R08: 0000000000000000 R09: 0000000000000000
       [95794.703557] R10: ffffc90001737c48 R11: ffff88007123e158 R12: ffff880075b6a000
       [95794.704441] R13: ffff88006145c000 R14: ffff880070fa1418 R15: ffff880070c3b4a0
       [95794.705270] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.706341] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.707001] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.708030] Call Trace:
       [95794.708466]  destroy_inode+0x3d/0x55
       [95794.709071]  evict+0x177/0x17e
       [95794.709497]  dispose_list+0x50/0x71
       [95794.709973]  evict_inodes+0x132/0x141
       [95794.710564]  generic_shutdown_super+0x3f/0x10b
       [95794.711200]  kill_anon_super+0x12/0x1c
       [95794.711633]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.712139]  deactivate_locked_super+0x30/0x68
       [95794.712608]  deactivate_super+0x36/0x39
       [95794.713093]  cleanup_mnt+0x49/0x67
       [95794.713514]  __cleanup_mnt+0x12/0x14
       [95794.713933]  task_work_run+0x82/0xa6
       [95794.714543]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.715247]  syscall_return_slowpath+0x18c/0x1af
       [95794.715952]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.716653] RIP: 0033:0x7fa678cb99a7
       [95794.721100] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.722052] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.722856] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.723698] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.724736] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.725928] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.726728] Code: 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00 74 02 0f ff 48 83 bb 00 ff ff ff 00 74 02 0f ff 48 83 bb 30 ff ff ff 00 74 02 <0f> ff 48 83 bb 08 ff ff ff 00 74 02 0f ff 4d 85 e4 0f 84 52 01
       [95794.729203] ---[ end trace e95877675c6ec009 ]---
       [95794.841054] ------------[ cut here ]------------
       [95794.841829] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:5831 btrfs_free_block_groups+0x235/0x36a [btrfs]
       [95794.843425] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.850658] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.852590] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.854752] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.855812] RIP: 0010:btrfs_free_block_groups+0x235/0x36a [btrfs]
       [95794.856811] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.857805] RAX: 0000000080000000 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.859014] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.860270] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.861525] R10: ffffc90001737c80 R11: 00000000000337fd R12: 0000000000000000
       [95794.862700] R13: ffff88006145c0c0 R14: ffff88021b61a800 R15: ffff88006145c100
       [95794.863810] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.865149] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.866099] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.867198] Call Trace:
       [95794.867626]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.868188]  ? evict_inodes+0x132/0x141
       [95794.869037]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.870400]  generic_shutdown_super+0x6a/0x10b
       [95794.871262]  kill_anon_super+0x12/0x1c
       [95794.872046]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.872746]  deactivate_locked_super+0x30/0x68
       [95794.873687]  deactivate_super+0x36/0x39
       [95794.874639]  cleanup_mnt+0x49/0x67
       [95794.875504]  __cleanup_mnt+0x12/0x14
       [95794.876126]  task_work_run+0x82/0xa6
       [95794.876788]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.877777]  syscall_return_slowpath+0x18c/0x1af
       [95794.878381]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.878888] RIP: 0033:0x7fa678cb99a7
       [95794.879307] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.880204] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.881640] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.882690] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.883538] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.884562] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.885664] Code: 89 ef e8 07 ec 32 e1 e8 9d c0 ea e0 48 8d b3 28 02 00 00 48 83 c9 ff 31 d2 48 89 df e8 29 c5 ff ff 48 83 bb 80 02 00 00 00 74 02 <0f> ff 48 83 bb 88 02 00 00 00 74 02 0f ff 48 83 bb d8 02 00 00
       [95794.887980] ---[ end trace e95877675c6ec00a ]---
       [95794.888739] ------------[ cut here ]------------
       [95794.889405] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:5832 btrfs_free_block_groups+0x241/0x36a [btrfs]
       [95794.891020] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.897551] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.898509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.899685] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.900592] RIP: 0010:btrfs_free_block_groups+0x241/0x36a [btrfs]
       [95794.901387] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.902300] RAX: 0000000080000000 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.903260] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.904332] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.905300] R10: ffffc90001737c80 R11: 00000000000337fd R12: 0000000000000000
       [95794.906439] R13: ffff88006145c0c0 R14: ffff88021b61a800 R15: ffff88006145c100
       [95794.907459] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.908625] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.909511] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.910630] Call Trace:
       [95794.911153]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.911837]  ? evict_inodes+0x132/0x141
       [95794.912344]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.912975]  generic_shutdown_super+0x6a/0x10b
       [95794.913788]  kill_anon_super+0x12/0x1c
       [95794.914424]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.915142]  deactivate_locked_super+0x30/0x68
       [95794.915831]  deactivate_super+0x36/0x39
       [95794.916433]  cleanup_mnt+0x49/0x67
       [95794.917045]  __cleanup_mnt+0x12/0x14
       [95794.917665]  task_work_run+0x82/0xa6
       [95794.918309]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.919021]  syscall_return_slowpath+0x18c/0x1af
       [95794.919722]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.920426] RIP: 0033:0x7fa678cb99a7
       [95794.921039] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.922303] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.923335] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.924364] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.925435] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.926533] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.927557] Code: 48 8d b3 28 02 00 00 48 83 c9 ff 31 d2 48 89 df e8 29 c5 ff ff 48 83 bb 80 02 00 00 00 74 02 0f ff 48 83 bb 88 02 00 00 00 74 02 <0f> ff 48 83 bb d8 02 00 00 00 74 02 0f ff 48 83 bb e0 02 00 00
       [95794.930166] ---[ end trace e95877675c6ec00b ]---
       [95794.930961] ------------[ cut here ]------------
       [95794.931727] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:9953 btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.932729] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.938394] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.939842] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.941455] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.942336] RIP: 0010:btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.943268] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.944127] RAX: ffff8802004fd0e8 RBX: ffff88006145c000 RCX: 0000000000000001
       [95794.945211] RDX: 00000001810af668 RSI: 0000000000000002 RDI: 00000000ffffffff
       [95794.946316] RBP: ffffc90001737d98 R08: 0000000000000000 R09: ffffffff817e22b9
       [95794.947271] R10: ffffc90001737c80 R11: 00000000000337fd R12: ffff8802004fd0e8
       [95794.948219] R13: ffff88006145c0c0 R14: ffff88006145e598 R15: ffff88006145c100
       [95794.949193] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.950495] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.951338] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95794.952361] Call Trace:
       [95794.952811]  close_ctree+0x1db/0x2b8 [btrfs]
       [95794.953522]  ? evict_inodes+0x132/0x141
       [95794.954543]  btrfs_put_super+0x15/0x17 [btrfs]
       [95794.955231]  generic_shutdown_super+0x6a/0x10b
       [95794.955916]  kill_anon_super+0x12/0x1c
       [95794.956414]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95794.956953]  deactivate_locked_super+0x30/0x68
       [95794.957635]  deactivate_super+0x36/0x39
       [95794.958256]  cleanup_mnt+0x49/0x67
       [95794.958701]  __cleanup_mnt+0x12/0x14
       [95794.959181]  task_work_run+0x82/0xa6
       [95794.959635]  prepare_exit_to_usermode+0xe1/0x10c
       [95794.960182]  syscall_return_slowpath+0x18c/0x1af
       [95794.960731]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95794.961438] RIP: 0033:0x7fa678cb99a7
       [95794.961990] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95794.963111] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95794.963975] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95794.964680] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95794.965763] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95794.966868] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95794.967800] Code: 00 00 00 4c 8b a3 98 25 00 00 49 83 bc 24 60 ff ff ff 00 75 16 49 83 bc 24 68 ff ff ff 00 75 0b 49 83 bc 24 70 ff ff ff 00 74 16 <0f> ff 49 8d b4 24 18 ff ff ff 31 c9 31 d2 48 89 df e8 93 7a ff
       [95794.970629] ---[ end trace e95877675c6ec00c ]---
       [95794.971451] BTRFS info (device sdi): space_info 1 has 7680000 free, is not full
       [95794.972351] BTRFS info (device sdi): space_info total=8388608, used=704512, pinned=0, reserved=0, may_use=4096, readonly=0
       [95794.973595] ------------[ cut here ]------------
       [95794.974353] WARNING: CPU: 0 PID: 31496 at fs/btrfs/extent-tree.c:9953 btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.980163] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata scsi_mod e1000 [last unloaded: btrfs]
       [95794.986461] CPU: 0 PID: 31496 Comm: umount Tainted: G        W       4.14.0-rc6-btrfs-next-54+ #1
       [95794.987591] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
       [95794.988929] task: ffff880075aa0240 task.stack: ffffc90001734000
       [95794.989922] RIP: 0010:btrfs_free_block_groups+0x2bc/0x36a [btrfs]
       [95794.990715] RSP: 0018:ffffc90001737d70 EFLAGS: 00010206
       [95794.991431] RAX: ffff88020f6e70e8 RBX: ffff88006145c000 RCX: ffffffff8115a906
       [95794.992455] RDX: ffffffff8115a902 RSI: ffff880075aa0b40 RDI: ffff880075aa0b40
       [95794.993535] RBP: ffffc90001737d98 R08: 0000000000000020 R09: fffffffffffffff7
       [95794.994573] R10: 00000000ffffffc4 R11: ffff8800633b1bc0 R12: ffff88020f6e70e8
       [95794.996250] R13: 0000000000000038 R14: ffff88006145e598 R15: 0000000000000000
       [95794.997233] FS:  00007fa6793c92c0(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
       [95794.998592] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       [95794.999484] CR2: 000056338670d048 CR3: 00000000610dc005 CR4: 00000000001606f0
       [95795.000542] Call Trace:
       [95795.001138]  close_ctree+0x1db/0x2b8 [btrfs]
       [95795.001885]  ? evict_inodes+0x132/0x141
       [95795.002407]  btrfs_put_super+0x15/0x17 [btrfs]
       [95795.003093]  generic_shutdown_super+0x6a/0x10b
       [95795.003720]  kill_anon_super+0x12/0x1c
       [95795.004353]  btrfs_kill_super+0x16/0x21 [btrfs]
       [95795.005095]  deactivate_locked_super+0x30/0x68
       [95795.005716]  deactivate_super+0x36/0x39
       [95795.006388]  cleanup_mnt+0x49/0x67
       [95795.006939]  __cleanup_mnt+0x12/0x14
       [95795.007512]  task_work_run+0x82/0xa6
       [95795.008124]  prepare_exit_to_usermode+0xe1/0x10c
       [95795.008994]  syscall_return_slowpath+0x18c/0x1af
       [95795.009831]  entry_SYSCALL_64_fastpath+0xab/0xad
       [95795.010610] RIP: 0033:0x7fa678cb99a7
       [95795.011193] RSP: 002b:00007ffccf0aaed8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
       [95795.012327] RAX: 0000000000000000 RBX: 0000563386706030 RCX: 00007fa678cb99a7
       [95795.013432] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000056338670ca90
       [95795.014558] RBP: 000056338670ca90 R08: 000056338670c740 R09: 0000000000000015
       [95795.015577] R10: 00000000000006b4 R11: 0000000000000246 R12: 00007fa6791bae64
       [95795.016569] R13: 0000000000000000 R14: 0000563386706210 R15: 00007ffccf0ab160
       [95795.017662] Code: 00 00 00 4c 8b a3 98 25 00 00 49 83 bc 24 60 ff ff ff 00 75 16 49 83 bc 24 68 ff ff ff 00 75 0b 49 83 bc 24 70 ff ff ff 00 74 16 <0f> ff 49 8d b4 24 18 ff ff ff 31 c9 31 d2 48 89 df e8 93 7a ff
       [95795.020538] ---[ end trace e95877675c6ec00d ]---
       [95795.021259] BTRFS info (device sdi): space_info 4 has 1072775168 free, is not full
       [95795.022390] BTRFS info (device sdi): space_info total=1073741824, used=114688, pinned=0, reserved=0, may_use=786432, readonly=65536
      
      Fix this by ensuring the zero range operation does not call
      btrfs_truncate_block() if the corresponding extent is an unwritten one
      (it's pointless anyway, since reading from an unwritten extent yields
      zeroes).
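
      The rule described above can be sketched as a small user-space model.
      This is only an illustration under stated assumptions, not the kernel
      change itself: ExtentKind, Extent and edges_needing_zeroing are
      hypothetical names, and the 4096-byte block size is assumed.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ExtentKind(Enum):
    """Hypothetical extent states for this model (not kernel identifiers)."""
    UNWRITTEN = auto()  # preallocated, never written: reads return zeroes
    WRITTEN = auto()    # holds real data on disk


@dataclass
class Extent:
    start: int
    length: int
    kind: ExtentKind


def edges_needing_zeroing(extent, offset, length, block_size=4096):
    """Return the unaligned edges of [offset, offset + length) whose partial
    block would have to be zeroed explicitly (the role btrfs_truncate_block()
    plays in the commit message).

    For an unwritten extent nothing is returned, because reading from it
    already yields zeroes and zeroing a partial block would be pointless.
    """
    if extent.kind is ExtentKind.UNWRITTEN:
        return []
    end = offset + length
    return [pos for pos in (offset, end) if pos % block_size != 0]


if __name__ == "__main__":
    prealloc = Extent(0, 8192, ExtentKind.UNWRITTEN)
    written = Extent(0, 8192, ExtentKind.WRITTEN)
    print(edges_needing_zeroing(prealloc, 100, 1000))
    print(edges_needing_zeroing(written, 100, 1000))
```

      In this model a zero range request over an unwritten extent never
      asks for partial-block zeroing, while the same request over a written
      extent reports each unaligned edge.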
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Tested-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      81fdf638