1. 26 Sep, 2022 4 commits
  2. 26 Jul, 2022 3 commits
    • btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read · 0b078d9d
      Committed by Christoph Hellwig
      This flag was used to communicate that the low-level compression code
      already did verify the checksum to the high-level I/O completion code.
      
      But it has been unused for a long time as the upper btrfs_bio for the
      decompressed data had a NULL csum pointer basically since that pointer
      existed and the code already checks for that a little later.
      
      Note that this does not affect the other use of the checked flag, which
      is only used for the COW fixup worker.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: fix repair of compressed extents · 81bd9328
      Committed by Christoph Hellwig
      Currently the checksum of compressed extents is verified based on the
      compressed data and the lower btrfs_bio, but the actual repair process
      is driven by end_bio_extent_readpage on the upper btrfs_bio for the
      decompressed data.
      
      This has a bunch of issues, including not being able to properly
      communicate the failed mirror up in case the I/O submission got
      preempted, a general loss of information on whether an error was an
      I/O error or a checksum verification failure, and most importantly
      that this design causes btrfs_clean_io_failure to eventually write
      back the uncompressed good data onto the disk sectors that are
      supposed to contain compressed data.
      
      Fix this by moving the repair to the lower btrfs_bio.  To do so, a fair
      amount of code has to be reshuffled:
      
       a) the lower btrfs_bio now needs a valid csum pointer.  The easiest way
          to achieve that is to pass NULL to btrfs_lookup_bio_sums and just use
          the btrfs_bio management of csums.  For a compressed_bio that is
          split into multiple btrfs_bios this means additional memory
          allocations, but the code becomes a lot more regular.
       b) checksum verification now runs directly on the lower btrfs_bio instead
          of the compressed_bio.  This actually nicely simplifies the end I/O
          processing.
       c) btrfs_repair_one_sector can't just look up the logical address for
          the file offset any more, as there are no corresponding relative
          offsets that apply to the file offset and the logical address for
          compressed extents.  Instead require that the saved bvec_iter in the
          btrfs_bio is filled out for all read bios and use that, which again
          removes a fair amount of code.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: simplify the pending I/O counting in struct compressed_bio · 524bcd1e
      Committed by Christoph Hellwig
      Instead of counting the sectors just count the bios, with an extra
      reference held during submission.  This significantly simplifies the
      submission side error handling.
      
      This slightly changes completion and error handling of
      btrfs_submit_compressed_{read,write} because with the old code the
      compressed_bio could have been completed in
      submit_compressed_{read,write} only if there was an error during
      submission for one of the lower bios, whilst with the new code there is a
      chance for this to happen even for successful submission if all the
      lower bios complete before the end of the function is reached.
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Reviewed-by: Boris Burkov <boris@bur.io>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David Sterba <dsterba@suse.com>
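The bio-counting scheme described in this commit can be illustrated with a small userspace sketch. This is not the kernel code: `struct cb_sketch`, `cb_init`, `cb_get` and `cb_put` are hypothetical names, and C11 atomics stand in for the kernel's counter. The counter starts at 1 (the reference held by the submitter), gains one per lower bio, and whoever drops the last reference completes the compressed_bio.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct compressed_bio; only the counter is modeled. */
struct cb_sketch {
	atomic_int pending_ios;	/* in-flight lower bios, plus 1 while submitting */
	bool completed;		/* set by whoever drops the last reference */
};

/* Start with the extra reference that the submitter holds. */
static void cb_init(struct cb_sketch *cb)
{
	atomic_init(&cb->pending_ios, 1);
	cb->completed = false;
}

/* Take one reference for a lower bio right before it is submitted. */
static void cb_get(struct cb_sketch *cb)
{
	atomic_fetch_add(&cb->pending_ios, 1);
}

/*
 * Drop one reference, either from a lower bio's end I/O handler or from the
 * submitter at the end of submission. Returns true if this call completed
 * the compressed bio.
 */
static bool cb_put(struct cb_sketch *cb)
{
	int old = atomic_fetch_sub(&cb->pending_ios, 1);

	assert(old > 0);
	if (old == 1) {
		cb->completed = true;
		return true;
	}
	return false;
}
```

Because the submitter's reference is dropped through the same `cb_put()` path, completion can happen inside the submit function when all lower bios finish early, which matches the behavior change the commit message points out.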
  3. 25 Jul, 2022 5 commits
  4. 15 Jul, 2022 1 commit
  5. 16 May, 2022 4 commits
  6. 06 Apr, 2022 1 commit
  7. 14 Mar, 2022 6 commits
  8. 07 Jan, 2022 1 commit
  9. 03 Jan, 2022 1 commit
  10. 27 Oct, 2021 14 commits
    • Revert "btrfs: compression: drop kmap/kunmap from generic helpers" · 3a60f653
      Committed by David Sterba
      This reverts commit 4c2bf276.
      
      The kmaps in compression code are still needed; dropping them causes
      crashes on 32-bit machines (ARM, x86). Reproducible e.g. by running
      fstest btrfs/004 with LZO or ZSTD compression enabled.
      
      Link: https://lore.kernel.org/all/CAJCQCtT+OuemovPO7GZk8Y8=qtOObr0XTDp8jh4OHD6y84AFxw@mail.gmail.com/
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=214839
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: subpage: make end_compressed_bio_writeback() compatible · 741ec653
      Committed by Qu Wenruo
      In end_compressed_writeback() we just clear the full page writeback.
      For subpage case, if there are two delalloc ranges in the same page, the
      2nd range will trigger a BUG_ON() as the page writeback is already
      cleared by previous range.
      
      Fix it by using btrfs_page_clamp_clear_writeback() helper.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: subpage: make btrfs_submit_compressed_write() compatible · bbbff01a
      Committed by Qu Wenruo
      There is a WARN_ON() checking if @start is aligned to PAGE_SIZE, not
      sectorsize, which will cause false alert for subpage.  Fix it to check
      against sectorsize.
      
      Furthermore:
      
      - Use ASSERT() to do the check
        So that in the future we may skip the check for production build
      
      - Also check alignment for @len
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
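To see why checking against PAGE_SIZE gives a false alert on subpage setups, consider a 4K sectorsize on a 64K page system. The following is a minimal userspace sketch, not the kernel code: `is_aligned` and `range_is_sector_aligned` are hypothetical helpers, and plain `assert()` stands in for the kernel's ASSERT().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/* Check both the start and the length of a range against the sector size. */
static bool range_is_sector_aligned(uint64_t start, uint64_t len,
				    uint32_t sectorsize)
{
	return is_aligned(start, sectorsize) && is_aligned(len, sectorsize);
}
```

With a start of 4096 and a length of 8192, `range_is_sector_aligned(4096, 8192, 4096)` holds, while `is_aligned(4096, 65536)` does not, which is the false alert the commit removes.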
    • btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write · 91507240
      Committed by Qu Wenruo
      Currently btrfs_submit_compressed_write() will check
      btrfs_bio_fits_in_stripe() each time a new page is going to be added.
      Even if compressed extent is small, we don't really need to do that for
      every page.
      
      Align the behavior to extent_io.c, by determining the stripe boundary
      when allocating a bio.
      
      Unlike extent_io.c, in compressed.c we don't need to bother with things
      like different bio flags, thus no need to re-use bio_ctrl.
      
      Here we just manually introduce new local variable, next_stripe_start,
      and use that value returned from alloc_compressed_bio() to calculate
      the stripe boundary.
      
      Then each time we add some page range into the bio, we check if we
      reached the boundary.  And if reached, submit it.
      
      Also, since we have @cur_disk_bytenr to determine whether we're the last
      bio, we don't need an explicit last_bio: tag for error handling any more.
      
      And since we use @cur_disk_bytenr to wait, there is no need for
      pending_bios; also remove it to save some memory in compressed_bio.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_read · f472c28f
      Committed by Qu Wenruo
      Currently btrfs_submit_compressed_read() will check
      btrfs_bio_fits_in_stripe() each time a new page is going to be added.
      Even if compressed extent is small, we don't really need to do that for
      every page.
      
      This patch will align the behavior to extent_io.c, by determining the
      stripe boundary when allocating a bio.
      
      Unlike extent_io.c, in compressed.c we don't need to bother with things
      like different bio flags, thus no need to re-use bio_ctrl.
      
      Here we just manually introduce new local variable, next_stripe_start,
      and teach alloc_compressed_bio() to calculate the stripe boundary.
      
      Then each time we add some page range into the bio, we check if we
      reached the boundary.  And if reached, submit it.
      
      Also, since we have @cur_disk_byte to determine whether we're the last
      bio, we don't need an explicit last_bio: tag for error handling any more.
      
      And since we can use @cur_disk_byte to track which range has been added
      to the bio, we can also use it to calculate the wait condition, so there
      is no need for @pending_bios.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
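The two stripe-boundary commits above share one idea: compute the boundary once when a bio is allocated, then only compare a cursor against it while adding ranges. Below is a rough userspace sketch of that loop. It assumes fixed-size stripes that start at byte 0 and sector-aligned inputs, which is a simplification; in the kernel the boundary comes from the chunk mapping. `next_stripe_start` here and `count_bios` are hypothetical helpers, not the kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Start of the stripe following the one that contains bytenr (simplified model). */
static uint64_t next_stripe_start(uint64_t bytenr, uint64_t stripe_len)
{
	assert(stripe_len > 0);
	return (bytenr / stripe_len + 1) * stripe_len;
}

/*
 * Walk [start, start + len) one sector at a time and return how many bios
 * would be submitted. The boundary is computed once per bio, at "allocation"
 * time, instead of being re-checked with a mapping lookup for every page.
 */
static unsigned int count_bios(uint64_t start, uint64_t len,
			       uint32_t sectorsize, uint64_t stripe_len)
{
	uint64_t cur = start;
	uint64_t end = start + len;
	uint64_t boundary = 0;
	unsigned int nr_bios = 0;
	bool have_bio = false;

	while (cur < end) {
		if (!have_bio) {
			/* "Allocate" a bio and remember where its stripe ends. */
			boundary = next_stripe_start(cur, stripe_len);
			have_bio = true;
		}
		cur += sectorsize;	/* add one sector to the current bio */
		if (cur >= boundary || cur >= end) {
			/* Reached the stripe boundary or the last range: submit. */
			nr_bios++;
			have_bio = false;
		}
	}
	return nr_bios;
}
```

In this model the cursor doubles as the "are we the last bio" indicator, in the same spirit as @cur_disk_byte in the commit messages.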
    • btrfs: introduce alloc_compressed_bio() for compression · 22c306fe
      Committed by Qu Wenruo
      Just aggregate the bio allocation code into one helper, so that we can
      replace 4 call sites.
      
      There is one special note for zoned write.
      
      Currently btrfs_submit_compressed_write() will only allocate the first
      bio using ZONE_APPEND.  If we have to submit current bio due to stripe
      boundary, the new bio allocated will not use ZONE_APPEND.
      
      In theory this should be a bug, but considering zoned mode currently
      only supports the SINGLE profile, which doesn't have any stripe boundary
      limit, it should never be a problem and we have assertions in place.
      
      This function will provide a good entry point for any work which needs
      to be done at bio allocation time, like determining the stripe boundary.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: introduce submit_compressed_bio() for compression · 2d4e0b84
      Committed by Qu Wenruo
      The new helper, submit_compressed_bio(), will aggregate the following
      work:
      
      - Increase compressed_bio::pending_bios
      - Remap the endio function
      - Map and submit the bio
      
      This slightly reorders calls to btrfs_csum_one_bio or
      btrfs_lookup_bio_sums, but none of them does anything regarding IO
      submission, so this is effectively no change. We mainly care about the
      order of
      
      - atomic_inc
      - btrfs_bio_wq_end_io
      - btrfs_map_bio
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: handle errors properly inside btrfs_submit_compressed_write() · 6853c64a
      Committed by Qu Wenruo
      Just like btrfs_submit_compressed_read(), there are quite some BUG_ON()s
      inside btrfs_submit_compressed_write() for the bio submission path.
      
      Fix them using the same method:
      
      - For the last bio, just endio the bio
        In that case, one of the endio functions of the submitted bios
        will be able to free the compressed_bio
      
      - For a half-submitted bio, wait and finish the compressed_bio manually
        In this case, once all other bios finish, we're the only one
        referring to the compressed bio, and can manually finish it.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: handle errors properly inside btrfs_submit_compressed_read() · 86ccbb4d
      Committed by Qu Wenruo
      There are quite a few BUG_ON()s inside btrfs_submit_compressed_read(),
      namely all errors inside the for() loop rely on BUG_ON() to handle
      -ENOMEM.
      
      Handle these errors properly by:
      
      - Wait for submitted bios to finish first
        Using wait_var_event() APIs to wait without introducing extra memory
        overhead inside compressed_bio.
        This allows us to wait for any submitted bio to finish, while still
        keeping the compressed_bio from being freed.
      
      - Introduce finish_compressed_bio_read() to finish the compressed_bio
      
      - Properly end the bio and finish compressed_bio when error happens
      
      Now in btrfs_submit_compressed_read() even when the bio submission
      failed, we can properly handle the error without triggering BUG_ON().
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: subpage: add bitmap for PageChecked flag · e4f94347
      Committed by Qu Wenruo
      Although in btrfs we have very limited usage of the PageChecked flag, it
      is still a page flag that is not yet subpage compatible.
      
      Fix it by introducing btrfs_subpage::checked_offset to do the conversion.
      
      Most call sites, especially free-space cache, COW fixup and
      btrfs_invalidatepage(), work in full page mode anyway.
      
      The other call sites work in subpage compatible mode.
      
      Some call sites need extra modification:
      
      - btrfs_drop_pages()
        Needs extra parameter to get the real range we need to clear checked
        flag.
      
        Also since btrfs_drop_pages() will accept pages beyond the dirtied
        range, update btrfs_subpage_clamp_range() to handle such case
        by setting @len to 0 if the page is beyond target range.
      
      - btrfs_invalidatepage()
        We need to call subpage helper before calling __btrfs_releasepage(),
        or it will trigger ASSERT() as page->private will be cleared.
      
      - btrfs_verify_data_csum()
        In theory we don't need the io_bio->csum check anymore, but it
        won't hurt.  Just change the comment.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: introduce compressed_bio::pending_sectors to trace compressed bio · 6ec9765d
      Committed by Qu Wenruo
      For btrfs_submit_compressed_read() and btrfs_submit_compressed_write(),
      we have a pretty weird dance around compressed_bio::pending_bios:
      
        btrfs_submit_compressed_read/write()
        {
      	cb = kmalloc()
      	refcount_set(&cb->pending_bios, 0);
      	bio = btrfs_alloc_bio();
      
      	/* NOTE here, we haven't yet submitted any bio */
      	refcount_set(&cb->pending_bios, 1);
      
      	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
      		if (submit) {
      			/* Here we submit bio, but we always have one
      			 * extra pending_bios */
      			refcount_inc(&cb->pending_bios);
      			ret = btrfs_map_bio();
      		}
      	}
      
      	/* Submit the last bio */
      	ret = btrfs_map_bio();
        }
      
      There are two reasons why we do this:
      
      - compressed_bio::pending_bios is a refcount
        Thus if it's reduced to 0, it can not be increased again.
      
      - To ensure the compressed_bio is not freed by some submitted bios
        If the submitted bio is finished before the next bio is submitted,
        we can free the compressed_bio completely.
      
      But the above code is sometimes confusing, and we can do it better by
      introducing a new member, compressed_bio::pending_sectors.
      
      Now we use compressed_bio::pending_sectors to indicate whether we have
      any pending sectors under IO or not yet submitted.
      
      If pending_sectors == 0, we're definitely the last bio of compressed_bio,
      and it is OK to release the compressed bio.
      
      Now the workflow looks like this:
      
        btrfs_submit_compressed_read/write()
        {
      	cb = kmalloc()
      	atomic_set(&cb->pending_bios, 0);
      	refcount_set(&cb->pending_sectors,
      		     compressed_len >> sectorsize_bits);
      	bio = btrfs_alloc_bio();
      
      	for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
      		if (submit) {
      			atomic_inc(&cb->pending_bios);
      			ret = btrfs_map_bio();
      		}
      	}
      
      	/* Submit the last bio */
      	atomic_inc(&cb->pending_bios);
      	ret = btrfs_map_bio();
        }
      
      For now we still need pending_bios for later error handling, but will
      remove pending_bios eventually after properly handling the errors.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
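The pending_sectors idea from this commit can be sketched in userspace as follows. This is only an illustration under assumptions: `struct cb_sectors_sketch`, `cb_sectors_init` and `cb_sectors_end_io` are hypothetical names, and a C11 atomic stands in for the kernel counter. The counter is set to the total number of sectors up front, so it covers sectors that are under IO as well as those not yet submitted, and only the end I/O call that brings it to zero may release the compressed bio.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pending_sectors member of compressed_bio. */
struct cb_sectors_sketch {
	atomic_uint pending_sectors;	/* sectors under IO or not yet submitted */
};

/* Account for every sector of the compressed extent before any bio is submitted. */
static void cb_sectors_init(struct cb_sectors_sketch *cb,
			    uint32_t compressed_len,
			    unsigned int sectorsize_bits)
{
	atomic_init(&cb->pending_sectors, compressed_len >> sectorsize_bits);
}

/*
 * Called when one lower bio of bio_len bytes finishes. Returns true if this
 * was the last pending range, i.e. the caller may release the compressed bio.
 */
static bool cb_sectors_end_io(struct cb_sectors_sketch *cb, uint32_t bio_len,
			      unsigned int sectorsize_bits)
{
	unsigned int nr = bio_len >> sectorsize_bits;
	unsigned int old = atomic_fetch_sub(&cb->pending_sectors, nr);

	assert(old >= nr);
	return old == nr;
}
```

Since unsubmitted sectors keep the counter above zero, an early-finishing bio cannot release the structure while the submission loop is still running, which is what the extra pending_bios reference achieved in the old workflow.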
    • btrfs: subpage: make add_ra_bio_pages() compatible · 6a404910
      Committed by Qu Wenruo
      [BUG]
      If we remove the subpage limitation in add_ra_bio_pages(), then read a
      compressed extent which has part of its range in next page, like the
      following inode layout:
      
      	0	32K	64K	96K	128K
      	|<--------------|-------------->|
      
      Btrfs will trigger ASSERT() in endio function:
      
        assertion failed: atomic_read(&subpage->readers) >= nbits
        ------------[ cut here ]------------
        kernel BUG at fs/btrfs/ctree.h:3431!
        Internal error: Oops - BUG: 0 [#1] SMP
        Workqueue: btrfs-endio btrfs_work_helper [btrfs]
        Call trace:
         assertfail.constprop.0+0x28/0x2c [btrfs]
         btrfs_subpage_end_reader+0x148/0x14c [btrfs]
         end_page_read+0x8c/0x100 [btrfs]
         end_bio_extent_readpage+0x320/0x6b0 [btrfs]
         bio_endio+0x15c/0x1dc
         end_workqueue_fn+0x44/0x64 [btrfs]
         btrfs_work_helper+0x74/0x250 [btrfs]
         process_one_work+0x1d4/0x47c
         worker_thread+0x180/0x400
         kthread+0x11c/0x120
         ret_from_fork+0x10/0x30
        ---[ end trace c8b7b552d3bb408c ]---
      
      [CAUSE]
      When we read the page range [0, 64K), we find it's a compressed extent,
      and we will try to add extra pages in add_ra_bio_pages() to avoid
      reading the same compressed extent.
      
      But when we add such page into the read bio, it doesn't follow the
      behavior of btrfs_do_readpage() to properly set subpage::readers.
      
      This means, for page [64K, 128K), its subpage::readers is still 0.
      
      And when endio is executed on both pages, since page [64K, 128K) has 0
      subpage::readers, it triggers the above ASSERT().
      
      [FIX]
      Function add_ra_bio_pages() is far from subpage compatible: it always
      assumes PAGE_SIZE == sectorsize, thus when it skips to the next range it
      always just skips PAGE_SIZE.
      
      Make it subpage compatible by:
      
      - Skip to next page properly when needed
        If we find there is already a page cache, we need to skip to next page.
        For that case, we shouldn't just skip PAGE_SIZE bytes, but use
        @pg_index to calculate the next bytenr and continue.
      
      - Only add the page range covered by current extent map
        We need to calculate which range is covered by current extent map and
        only add that part into the read bio.
      
      - Update subpage::readers before submitting the bio
      
      - Use a proper cursor rather than the confusing @last_offset
      
      - Calculate the missed threshold based on sector size
        It's no longer using missed pages, as for 64K page size, we have at
        most 3 pages to skip. (If aligned only 2 pages)
      
      - Add ASSERT() to make sure our bytenr is always aligned
      
      - Add comment for the function
        Add a special note for subpage case, as the function won't really
        work well for subpage cases.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: remove unused parameter nr_pages in add_ra_bio_pages() · cd9255be
      Committed by Qu Wenruo
      Variable @nr_pages only gets increased but never used.  Remove it.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
    • btrfs: rename struct btrfs_io_bio to btrfs_bio · c3a3b19b
      Committed by Qu Wenruo
      Previously we had "struct btrfs_bio", which records IO context for
      mirrored IO and RAID56, and "struct btrfs_io_bio", which records extra
      btrfs specific info for logical bytenr bio.
      
      With "btrfs_bio" renamed to "btrfs_io_context", we are safe to rename
      "btrfs_io_bio" to "btrfs_bio" which is a more suitable name now.
      
      The struct btrfs_bio changes meaning by this commit. There was a
      suggested name like btrfs_logical_bio but it's a bit long and we'd
      prefer to use a shorter name.
      
      This could be a concern for backports to older kernels where the
      different meaning could possibly cause confusion or bugs. Comparing the
      new and old structures, there's no overlap among the struct members so a
      build would break in case of incorrect backport.
      
      We haven't had many backports to bio code anyway so this is more of a
      theoretical cause of bugs and a matter of precaution but we'll need to
      keep the semantic change in mind.
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>