1. 24 3月, 2020 9 次提交
  2. 31 1月, 2020 3 次提交
    • J
      btrfs: drop the -EBUSY case in __extent_writepage_io · 5ab58055
      Josef Bacik 提交于
      Now that we only return 0 or -EAGAIN from btrfs_writepage_cow_fixup, we
      do not need this -EBUSY case.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      5ab58055
    • N
      btrfs: Correctly handle empty trees in find_first_clear_extent_bit · 5750c375
      Nikolay Borisov 提交于
      Raviu reported that running his regular fs_trim segfaulted with the
      following backtrace:
      
      [  237.525947] assertion failed: prev, in ../fs/btrfs/extent_io.c:1595
      [  237.525984] ------------[ cut here ]------------
      [  237.525985] kernel BUG at ../fs/btrfs/ctree.h:3117!
      [  237.525992] invalid opcode: 0000 [#1] SMP PTI
      [  237.525998] CPU: 4 PID: 4423 Comm: fstrim Tainted: G     U     OE     5.4.14-8-vanilla #1
      [  237.526001] Hardware name: ASUSTeK COMPUTER INC.
      [  237.526044] RIP: 0010:assfail.constprop.58+0x18/0x1a [btrfs]
      [  237.526079] Call Trace:
      [  237.526120]  find_first_clear_extent_bit+0x13d/0x150 [btrfs]
      [  237.526148]  btrfs_trim_fs+0x211/0x3f0 [btrfs]
      [  237.526184]  btrfs_ioctl_fitrim+0x103/0x170 [btrfs]
      [  237.526219]  btrfs_ioctl+0x129a/0x2ed0 [btrfs]
      [  237.526227]  ? filemap_map_pages+0x190/0x3d0
      [  237.526232]  ? do_filp_open+0xaf/0x110
      [  237.526238]  ? _copy_to_user+0x22/0x30
      [  237.526242]  ? cp_new_stat+0x150/0x180
      [  237.526247]  ? do_vfs_ioctl+0xa4/0x640
      [  237.526278]  ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
      [  237.526283]  do_vfs_ioctl+0xa4/0x640
      [  237.526288]  ? __do_sys_newfstat+0x3c/0x60
      [  237.526292]  ksys_ioctl+0x70/0x80
      [  237.526297]  __x64_sys_ioctl+0x16/0x20
      [  237.526303]  do_syscall_64+0x5a/0x1c0
      [  237.526310]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      That was due to btrfs_fs_device::aloc_tree being empty. Initially I
      thought this wasn't possible and as a percaution have put the assert in
      find_first_clear_extent_bit. Turns out this is indeed possible and could
      happen when a file system with SINGLE data/metadata profile has a 2nd
      device added. Until balance is run or a new chunk is allocated on this
      device it will be completely empty.
      
      In this case find_first_clear_extent_bit should return the full range
      [0, -1ULL] and let the caller handle this i.e for trim the end will be
      capped at the size of actual device.
      
      Link: https://lore.kernel.org/linux-btrfs/izW2WNyvy1dEDweBICizKnd2KDwDiDyY2EYQr4YCwk7pkuIpthx-JRn65MPBde00ND6V0_Lh8mW0kZwzDiLDv25pUYWxkskWNJnVP0kgdMA=@protonmail.com/
      Fixes: 45bfcfc1 ("btrfs: Implement find_first_clear_extent_bit")
      CC: stable@vger.kernel.org # 5.2+
      Signed-off-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      5750c375
    • J
      btrfs: flush write bio if we loop in extent_write_cache_pages · 42ffb0bf
      Josef Bacik 提交于
      There exists a deadlock with range_cyclic that has existed forever.  If
      we loop around with a bio already built we could deadlock with a writer
      who has the page locked that we're attempting to write but is waiting on
      a page in our bio to be written out.  The task traces are as follows
      
        PID: 1329874  TASK: ffff889ebcdf3800  CPU: 33  COMMAND: "kworker/u113:5"
         #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
         #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
         #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
         #3 [ffffc900297bb708] __lock_page at ffffffff811f145b
         #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
         #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
         #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
         #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
         #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
         #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd
      
        PID: 2167901  TASK: ffff889dc6a59c00  CPU: 14  COMMAND:
        "aio-dio-invalid"
         #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
         #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
         #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
         #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
         #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
         #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
         #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
         #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
         #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
         #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032
      
      I used drgn to find the respective pages we were stuck on
      
      page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
      page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874
      
      As you can see the kworker is waiting for bit 0 (PG_locked) on index
      7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
      8148.  aio-dio-invalid has 7680, and the kworker epd looks like the
      following
      
        crash> struct extent_page_data ffffc900297bbbb0
        struct extent_page_data {
          bio = 0xffff889f747ed830,
          tree = 0xffff889eed6ba448,
          extent_locked = 0,
          sync_io = 0
        }
      
      Probably worth mentioning as well that it waits for writeback of the
      page to complete while holding a lock on it (at prepare_pages()).
      
      Using drgn I walked the bio pages looking for page
      0xffffea00fbfc7500 which is the one we're waiting for writeback on
      
        bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
        for i in range(0, bio.bi_vcnt.value_()):
            bv = bio.bi_io_vec[i]
            if bv.bv_page.value_() == 0xffffea00fbfc7500:
      	  print("FOUND IT")
      
      which validated what I suspected.
      
      The fix for this is simple, flush the epd before we loop back around to
      the beginning of the file during writeout.
      
      Fixes: b293f02e ("Btrfs: Add writepages support")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      42ffb0bf
  3. 20 1月, 2020 6 次提交
  4. 13 12月, 2019 1 次提交
  5. 19 11月, 2019 5 次提交
    • D
      btrfs: drop bdev argument from submit_extent_page · fa17ed06
      David Sterba 提交于
      After previous patches removing bdev being passed around to set it to
      bio, it has become unused in submit_extent_page. So it now has "only" 13
      parameters.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      fa17ed06
    • D
      btrfs: remove extent_map::bdev · a019e9e1
      David Sterba 提交于
      We can now remove the bdev from extent_map. Previous patches made sure
      that bio_set_dev is correctly in all places and that we don't need to
      grab it from latest_bdev or pass it around inside the extent map.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      a019e9e1
    • D
      btrfs: drop bio_set_dev where not needed · 1a418027
      David Sterba 提交于
      bio_set_dev sets a bdev to a bio and is not only setting a pointer bug
      also changing some state bits if there was a different bdev set before.
      This is one thing that's not needed.
      
      Another thing is that setting a bdev at bio allocation time is too early
      and actually does not work with plain redundancy profiles, where each
      time we submit a bio to a device, the bdev is set correctly.
      
      In many places the bio bdev is set to latest_bdev that seems to serve as
      a stub pointer "just to put something to bio". But we don't have to do
      that.
      
      Where do we know which bdev to set:
      
      * for regular IO: submit_stripe_bio that's called by btrfs_map_bio
      
      * repair IO: repair_io_failure, read or write from specific device
      
      * super block write (using buffer_heads but uses raw bdev) and barriers
      
      * scrub: this does not use all regular IO paths as it needs to reach all
        copies, verify and fixup eventually, and for that all bdev management
        is independent
      
      * raid56: rbio_add_io_page, for the RMW write
      
      * integrity-checker: does it's own low-level block tracking
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1a418027
    • D
      btrfs: get bdev directly from fs_devices in submit_extent_page · 429aebc0
      David Sterba 提交于
      This is preparatory patch to remove @bdev parameter from
      submit_extent_page. It can't be removed completely, because the cgroups
      need it for wbc when initializing the bio
      
      wbc_init_bio
        bio_associate_blkg_from_css
          dereference bdev->bi_disk->queue
      
      The bdev pointer is the same as latest_bdev, thus no functional change.
      We can retrieve it from fs_devices that's reachable through several
      dereferences. The local variable shadows the parameter, but that's only
      temporary.
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      429aebc0
    • D
      btrfs: sink write_flags to __extent_writepage_io · 57e5ffeb
      David Sterba 提交于
      __extent_writepage reads write flags from wbc and passes both to
      __extent_writepage_io. This makes write_flags redundant and we can
      remove it.
      Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: NNikolay Borisov <nborisov@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      57e5ffeb
  6. 18 11月, 2019 9 次提交
    • T
      btrfs: Avoid getting stuck during cyclic writebacks · f7bddf1e
      Tejun Heo 提交于
      During a cyclic writeback, extent_write_cache_pages() uses done_index
      to update the writeback_index after the current run is over.  However,
      instead of current index + 1, it gets to to the current index itself.
      
      Unfortunately, this, combined with returning on EOF instead of looping
      back, can lead to the following pathlogical behavior.
      
      1. There is a single file which has accumulated enough dirty pages to
         trigger balance_dirty_pages() and the writer appending to the file
         with a series of short writes.
      
      2. balance_dirty_pages kicks in, wakes up background writeback and sleeps.
      
      3. Writeback kicks in and the cursor is on the last page of the dirty
         file.  Writeback is started or skipped if already in progress.  As
         it's EOF, extent_write_cache_pages() returns and the cursor is set
         to done_index which is pointing to the last page.
      
      4. Writeback is done.  Nothing happens till balance_dirty_pages
         finishes, at which point we go back to #1.
      
      This can almost completely stall out writing back of the file and keep
      the system over dirty threshold for a long time which can mess up the
      whole system.  We encountered this issue in production with a package
      handling application which can reliably reproduce the issue when
      running under tight memory limits.
      
      Reading the comment in the error handling section, this seems to be to
      avoid accidentally skipping a page in case the write attempt on the
      page doesn't succeed.  However, this concern seems bogus.
      
      On each page, the code either:
      
      * Skips and moves onto the next page.
      
      * Fails issue and sets done_index to index + 1.
      
      * Successfully issues and continue to the next page if budget allows
        and not EOF.
      
      IOW, as long as it's not EOF and there's budget, the code never
      retries writing back the same page.  Only when a page happens to be
      the last page of a particular run, we end up retrying the page, which
      can't possibly guarantee anything data integrity related.  Besides,
      cyclic writes are only used for non-syncing writebacks meaning that
      there's no data integrity implication to begin with.
      
      Fix it by always setting done_index past the current page being
      processed.
      
      Note that this problem exists in other writepages too.
      
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      f7bddf1e
    • C
      Btrfs: extent_write_locked_range() should attach inode->i_wb · dbb70bec
      Chris Mason 提交于
      extent_write_locked_range() is used when we're falling back to buffered
      IO from inside of compression.  It allocates its own wbc and should
      associate it with the inode's i_wb to make sure the IO goes down from
      the correct cgroup.
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      dbb70bec
    • C
      Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios · ec39f769
      Chris Mason 提交于
      Async CRCs and compression submit IO through helper threads, which means
      they have IO priority inversions when cgroup IO controllers are in use.
      
      This flags all of the writes submitted by btrfs helper threads as
      REQ_CGROUP_PUNT.  submit_bio() will punt these to dedicated per-blkcg
      work items to avoid the priority inversion.
      
      For the compression code, we take a reference on the wbc's blkg css and
      pass it down to the async workers.
      
      For the async CRCs, the bio already has the correct css, we just need to
      tell the block layer to use REQ_CGROUP_PUNT.
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      Modified-and-reviewed-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      ec39f769
    • C
      Btrfs: only associate the locked page with one async_chunk struct · 1d53c9e6
      Chris Mason 提交于
      The btrfs writepages function collects a large range of pages flagged
      for delayed allocation, and then sends them down through the COW code
      for processing.  When compression is on, we allocate one async_chunk
      structure for every 512K, and then run those pages through the
      compression code for IO submission.
      
      writepages starts all of this off with a single page, locked by the
      original call to extent_write_cache_pages(), and it's important to keep
      track of this page because it has already been through
      clear_page_dirty_for_io().
      
      The btrfs async_chunk struct has a pointer to the locked_page, and when
      we're redirtying the page because compression had to fallback to
      uncompressed IO, we use page->index to decide if a given async_chunk
      struct really owns that page.
      
      But, this is racey.  If a given delalloc range is broken up into two
      async_chunks (chunkA and chunkB), we can end up with something like
      this:
      
       compress_file_range(chunkA)
       submit_compress_extents(chunkA)
       submit compressed bios(chunkA)
       put_page(locked_page)
      
      				 compress_file_range(chunkB)
      				 ...
      
      Or:
      
       async_cow_submit
        submit_compressed_extents <--- falls back to buffered writeout
         cow_file_range
          extent_clear_unlock_delalloc
           __process_pages_contig
             put_page(locked_pages)
      
      					    async_cow_submit
      
      The end result is that chunkA is completed and cleaned up before chunkB
      even starts processing.  This means we can free locked_page() and reuse
      it elsewhere.  If we get really lucky, it'll have the same page->index
      in its new home as it did before.
      
      While we're processing chunkB, we might decide we need to fall back to
      uncompressed IO, and so compress_file_range() will call
      __set_page_dirty_nobufers() on chunkB->locked_page.
      
      Without cgroups in use, this creates as a phantom dirty page, which
      isn't great but isn't the end of the world. What can happen, it can go
      through the fixup worker and the whole COW machinery again:
      
      in submit_compressed_extents():
        while (async extents) {
        ...
          cow_file_range
          if (!page_started ...)
            extent_write_locked_range
          else if (...)
            unlock_page
          continue;
      
      This hasn't been observed in practice but is still possible.
      
      With cgroups in use, we might crash in the accounting code because
      page->mapping->i_wb isn't set.
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
        IP: percpu_counter_add_batch+0x11/0x70
        PGD 66534e067 P4D 66534e067 PUD 66534f067 PMD 0
        Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
        CPU: 16 PID: 2172 Comm: rm Not tainted
        RIP: 0010:percpu_counter_add_batch+0x11/0x70
        RSP: 0018:ffffc9000a97bbe0 EFLAGS: 00010286
        RAX: 0000000000000005 RBX: 0000000000000090 RCX: 0000000000026115
        RDX: 0000000000000030 RSI: ffffffffffffffff RDI: 0000000000000090
        RBP: 0000000000000000 R08: fffffffffffffff5 R09: 0000000000000000
        R10: 00000000000260c0 R11: ffff881037fc26c0 R12: ffffffffffffffff
        R13: ffff880fe4111548 R14: ffffc9000a97bc90 R15: 0000000000000001
        FS:  00007f5503ced480(0000) GS:ffff880ff7200000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000d0 CR3: 00000001e0459005 CR4: 0000000000360ee0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
         account_page_cleaned+0x15b/0x1f0
         __cancel_dirty_page+0x146/0x200
         truncate_cleanup_page+0x92/0xb0
         truncate_inode_pages_range+0x202/0x7d0
         btrfs_evict_inode+0x92/0x5a0
         evict+0xc1/0x190
         do_unlinkat+0x176/0x280
         do_syscall_64+0x63/0x1a0
         entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      The fix here is to make asyc_chunk->locked_page NULL everywhere but the
      one async_chunk struct that's allowed to do things to the locked page.
      
      Link: https://lore.kernel.org/linux-btrfs/c2419d01-5c84-3fb4-189e-4db519d08796@suse.com/
      Fixes: 771ed689 ("Btrfs: Optimize compressed writeback and reads")
      Reviewed-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      [ update changelog from mail thread discussion ]
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      1d53c9e6
    • J
      btrfs: move the failrec tree stuff into extent-io-tree.h · b3f167aa
      Josef Bacik 提交于
      This needs to be cleaned up in the future, but for now it belongs to the
      extent-io-tree stuff since it uses the internal tree search code.
      Needed to export get_state_failrec and set_state_failrec as well since
      we're not going to move the actual IO part of the failrec stuff out at
      this point.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      b3f167aa
    • J
      btrfs: export find_delalloc_range · 083e75e7
      Josef Bacik 提交于
      This utilizes internal stuff to the extent_io_tree, so we need to export
      it before we move it.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      083e75e7
    • J
      btrfs: move extent_io_tree defs to their own header · 9c7d3a54
      Josef Bacik 提交于
      extent_io.c/h are huge, encompassing a bunch of different things.  The
      extent_io_tree code can live on its own, so separate this out.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      9c7d3a54
    • J
      btrfs: separate out the extent io init function · 6f0d04f8
      Josef Bacik 提交于
      We are moving extent_io_tree into it's on file, so separate out the
      extent_state init stuff from extent_io_tree_init().
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      6f0d04f8
    • J
      btrfs: separate out the extent leak code · 33ca832f
      Josef Bacik 提交于
      We check both extent buffer and extent state leaks in the same function,
      separate these two functions out so we can move them around.
      Signed-off-by: NJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: NDavid Sterba <dsterba@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      33ca832f
  7. 24 9月, 2019 2 次提交
  8. 12 9月, 2019 1 次提交
    • F
      Btrfs: fix unwritten extent buffers and hangs on future writeback attempts · 18dfa711
      Filipe Manana 提交于
      The lock_extent_buffer_io() returns 1 to the caller to tell it everything
      went fine and the callers needs to start writeback for the extent buffer
      (submit a bio, etc), 0 to tell the caller everything went fine but it does
      not need to start writeback for the extent buffer, and a negative value if
      some error happened.
      
      When it's about to return 1 it tries to lock all pages, and if a try lock
      on a page fails, and we didn't flush any existing bio in our "epd", it
      calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or
      an error. The page might have been locked elsewhere, not with the goal
      of starting writeback of the extent buffer, and even by some code other
      than btrfs, like page migration for example, so it does not mean the
      writeback of the extent buffer was already started by some other task,
      so returning a 0 tells the caller (btree_write_cache_pages()) to not
      start writeback for the extent buffer. Note that epd might currently have
      either no bio, so flush_write_bio() returns 0 (success) or it might have
      a bio for another extent buffer with a lower index (logical address).
      
      Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the
      extent buffer and writeback is never started for the extent buffer,
      future attempts to writeback the extent buffer will hang forever waiting
      on that bit to be cleared, since it can only be cleared after writeback
      completes. Such hang is reported with a trace like the following:
      
        [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds.
        [49887.347059]       Not tainted 5.2.13-gentoo #2
        [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        [49887.347062] btrfs-transacti D    0  1752      2 0x80004000
        [49887.347064] Call Trace:
        [49887.347069]  ? __schedule+0x265/0x830
        [49887.347071]  ? bit_wait+0x50/0x50
        [49887.347072]  ? bit_wait+0x50/0x50
        [49887.347074]  schedule+0x24/0x90
        [49887.347075]  io_schedule+0x3c/0x60
        [49887.347077]  bit_wait_io+0x8/0x50
        [49887.347079]  __wait_on_bit+0x6c/0x80
        [49887.347081]  ? __lock_release.isra.29+0x155/0x2d0
        [49887.347083]  out_of_line_wait_on_bit+0x7b/0x80
        [49887.347084]  ? var_wake_function+0x20/0x20
        [49887.347087]  lock_extent_buffer_for_io+0x28c/0x390
        [49887.347089]  btree_write_cache_pages+0x18e/0x340
        [49887.347091]  do_writepages+0x29/0xb0
        [49887.347093]  ? kmem_cache_free+0x132/0x160
        [49887.347095]  ? convert_extent_bit+0x544/0x680
        [49887.347097]  filemap_fdatawrite_range+0x70/0x90
        [49887.347099]  btrfs_write_marked_extents+0x53/0x120
        [49887.347100]  btrfs_write_and_wait_transaction.isra.4+0x38/0xa0
        [49887.347102]  btrfs_commit_transaction+0x6bb/0x990
        [49887.347103]  ? start_transaction+0x33e/0x500
        [49887.347105]  transaction_kthread+0x139/0x15c
      
      So fix this by not overwriting the return value (ret) with the result
      from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK
      bit in case flush_write_bio() returns an error, otherwise it will hang
      any future attempts to writeback the extent buffer, and undo all work
      done before (set back EXTENT_BUFFER_DIRTY, etc).
      
      This is a regression introduced in the 5.2 kernel.
      
      Fixes: 2e3c2513 ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
      Fixes: f4340622 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up")
      Reported-by: NZdenek Sojka <zsojka@seznam.cz>
      Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#uReported-by: NStefan Priebe - Profihost AG <s.priebe@profihost.ag>
      Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#tReported-by: NDrazen Kacar <drazen.kacar@oradian.com>
      Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377Signed-off-by: NFilipe Manana <fdmanana@suse.com>
      Signed-off-by: NDavid Sterba <dsterba@suse.com>
      18dfa711
  9. 09 9月, 2019 2 次提交
  10. 10 7月, 2019 1 次提交
  11. 06 7月, 2019 1 次提交