1. 16 Feb 2023, 1 commit
  2. 29 Nov 2022, 1 commit
    • iomap: write iomap validity checks · d7b64041
      Authored by Dave Chinner
      A recent multithreaded write data corruption has been uncovered in
      the iomap write code. The core of the problem is partial folio
      writes can be flushed to disk while a new racing write can map it
      and fill the rest of the page:
      
      writeback			new write
      
      allocate blocks
        blocks are unwritten
      submit IO
      .....
      				map blocks
      				iomap indicates UNWRITTEN range
      				loop {
      				  lock folio
      				  copyin data
      .....
      IO completes
        runs unwritten extent conv
          blocks are marked written
      				  <iomap now stale>
      				  get next folio
      				}
      
      Now add memory pressure such that memory reclaim evicts the
      partially written folio that has already been written to disk.
      
      When the new write finally gets to the last partial page of the new
      write, it does not find it in cache, so it instantiates a new page,
      sees the iomap is unwritten, and zeros the part of the page that
      it does not have data from. This overwrites the data on disk that
      was originally written.
      
      The full description of the corruption mechanism can be found here:
      
      https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/
      
      To solve this problem, we need to check whether the iomap is still
      valid after we lock each folio during the write. We have to do it
      after we lock the page so that we don't end up with state changes
      occurring while we wait for the folio to be locked.
      
      Hence we need a mechanism to be able to check that the cached iomap
      is still valid (similar to what we already do in buffered
      writeback), and we need a way for ->begin_write to back out and
      tell the high level iomap iterator that we need to remap the
      remaining write range.
      
The iomap needs to grow some storage for the validity cookie that
the filesystem provides to travel with the iomap. XFS, in
particular, also needs to know more about what the iomap maps
(attribute extents rather than file data extents) so that the
validity cookie can cover all the types of iomaps we might need
to validate.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
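The validity-cookie mechanism described above can be sketched in plain userspace C. All of the names here (`fs_inode`, `cached_iomap`, the helpers) are illustrative, not the kernel's actual iomap API: the idea is that the filesystem samples a per-inode sequence number into the cached mapping when it is built, and the write path rechecks that cookie after locking each folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: these are hypothetical names, not the
 * kernel's iomap structures. */

struct fs_inode {
	uint64_t seq;	/* bumped on every extent-tree change, e.g.
			 * unwritten extent conversion at IO completion */
};

struct cached_iomap {
	uint64_t validity_cookie;	/* inode seq sampled at map time */
};

/* ->iomap_begin analogue: stamp the cookie when the mapping is built. */
static void map_blocks(const struct fs_inode *ip, struct cached_iomap *map)
{
	map->validity_cookie = ip->seq;
}

/* Recheck after locking each folio; a mismatch means the extent map
 * changed underneath us and the caller must back out and remap the
 * remaining write range. */
static bool iomap_still_valid(const struct fs_inode *ip,
			      const struct cached_iomap *map)
{
	return map->validity_cookie == ip->seq;
}
```

The check must run after the folio lock is taken, exactly as the commit message says, so no state change can slip in while waiting for the lock.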
  3. 23 Nov 2022, 1 commit
    • xfs,iomap: move delalloc punching to iomap · 9c7babf9
      Authored by Dave Chinner
      Because that's what Christoph wants for this error handling path
      only XFS uses.
      
      It requires a new iomap export for handling errors over delalloc
ranges. This is basically the XFS code as it stands, but even though
Christoph wants this as iomap functionality, we still have
to call it from the filesystem-specific ->iomap_end callback, and
call into the iomap code with yet another filesystem-specific
callback to punch the delalloc extent within the defined ranges.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
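The shape of such an export can be sketched in userspace C with hypothetical names (the real kernel interface differs): on an error over a delalloc range, the generic helper hands the portion of the mapped range that was never written back to a filesystem-supplied punch callback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signature for the per-filesystem punch callback. */
typedef void (*punch_fn)(void *inode, uint64_t offset, uint64_t length);

/* Record what was punched, so the behaviour can be observed. */
static uint64_t punched_off, punched_len;

static void record_punch(void *inode, uint64_t offset, uint64_t length)
{
	(void)inode;
	punched_off = offset;
	punched_len = length;
}

/* Generic helper, called from the filesystem's ->iomap_end on error:
 * punch out the delalloc reservation covering the part of the mapped
 * range beyond what was actually written. */
static void iomap_punch_failed_delalloc(void *inode,
					uint64_t written_end,
					uint64_t mapped_end,
					punch_fn punch)
{
	if (mapped_end > written_end)
		punch(inode, written_end, mapped_end - written_end);
}
```

This mirrors the structure the commit describes: generic iteration in iomap, with the actual extent punching delegated back to the filesystem through a callback.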
  4. 03 Aug 2022, 1 commit
  5. 23 Jul 2022, 1 commit
  6. 11 Jun 2022, 1 commit
  7. 16 May 2022, 2 commits
  8. 10 May 2022, 2 commits
  9. 15 Mar 2022, 2 commits
  10. 27 Jan 2022, 1 commit
    • xfs, iomap: limit individual ioend chain lengths in writeback · ebb7fb15
      Authored by Dave Chinner
      Trond Myklebust reported soft lockups in XFS IO completion such as
      this:
      
       watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106]
       CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1
       Workqueue: xfs-conv/md127 xfs_end_io [xfs]
       RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
       Call Trace:
        wake_up_page_bit+0x8a/0x110
        iomap_finish_ioend+0xd7/0x1c0
        iomap_finish_ioends+0x7f/0xb0
        xfs_end_ioend+0x6b/0x100 [xfs]
        xfs_end_io+0xb9/0xe0 [xfs]
        process_one_work+0x1a7/0x360
        worker_thread+0x1fa/0x390
        kthread+0x116/0x130
        ret_from_fork+0x35/0x40
      
      Ioends are processed as an atomic completion unit when all the
      chained bios in the ioend have completed their IO. Logically
      contiguous ioends can also be merged and completed as a single,
larger unit. Both of these things can be problematic because the
bio chains per ioend and the size of the merged ioends processed as
a single completion are unbound.
      
      If we have a large sequential dirty region in the page cache,
      write_cache_pages() will keep feeding us sequential pages and we
      will keep mapping them into ioends and bios until we get a dirty
page at a non-sequential file offset. These large sequential runs
will result in bio and ioend chaining to optimise the io
patterns. The pages under writeback are pinned within these chains
      until the submission chaining is broken, allowing the entire chain
      to be completed. This can result in huge chains being processed
      in IO completion context.
      
      We get deep bio chaining if we have large contiguous physical
      extents. We will keep adding pages to the current bio until it is
      full, then we'll chain a new bio to keep adding pages for writeback.
      Hence we can build bio chains that map millions of pages and tens of
      gigabytes of RAM if the page cache contains big enough contiguous
      dirty file regions. This long bio chain pins those pages until the
      final bio in the chain completes and the ioend can iterate all the
      chained bios and complete them.
      
      OTOH, if we have a physically fragmented file, we end up submitting
      one ioend per physical fragment that each have a small bio or bio
      chain attached to them. We do not chain these at IO submission time,
      but instead we chain them at completion time based on file
      offset via iomap_ioend_try_merge(). Hence we can end up with unbound
      ioend chains being built via completion merging.
      
      XFS can then do COW remapping or unwritten extent conversion on that
      merged chain, which involves walking an extent fragment at a time
      and running a transaction to modify the physical extent information.
      IOWs, we merge all the discontiguous ioends together into a
      contiguous file range, only to then process them individually as
      discontiguous extents.
      
This extent manipulation is computationally expensive and can run in
a tight loop, so merging logically contiguous but physically
discontiguous ioends gains us nothing except for hiding the fact
that we broke the ioends up into individual physical extents at
submission and then need to loop over those individual physical
extents at completion.
      
      Hence we need to have mechanisms to limit ioend sizes and
      to break up completion processing of large merged ioend chains:
      
      1. bio chains per ioend need to be bound in length. Pure overwrites
      go straight to iomap_finish_ioend() in softirq context with the
      exact bio chain attached to the ioend by submission. Hence the only
      way to prevent long holdoffs here is to bound ioend submission
      sizes because we can't reschedule in softirq context.
      
      2. iomap_finish_ioends() has to handle unbound merged ioend chains
      correctly. This relies on any one call to iomap_finish_ioend() being
      bound in runtime so that cond_resched() can be issued regularly as
      the long ioend chain is processed. i.e. this relies on mechanism #1
      to limit individual ioend sizes to work correctly.
      
      3. filesystems have to loop over the merged ioends to process
      physical extent manipulations. This means they can loop internally,
      and so we break merging at physical extent boundaries so the
      filesystem can easily insert reschedule points between individual
      extent manipulations.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reported-and-tested-by: Trond Myklebust <trondmy@hammerspace.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
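Mechanism #1 above can be modelled in userspace C. The cap value and the names are illustrative, not what the kernel chose: at submission, adding more pages to an ioend past the size limit forces a new, bounded ioend to be started.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative cap on how much data one ioend may cover. */
#define IOEND_MAX_BYTES ((uint64_t)4096 * 1024)

/* Mechanism #1: decide at submission whether adding `add` bytes to
 * the current ioend would exceed the cap, forcing a new ioend. */
static int ioend_needs_split(uint64_t ioend_bytes, uint64_t add)
{
	return ioend_bytes + add > IOEND_MAX_BYTES;
}

/* Count how many bounded ioends a large sequential dirty run becomes
 * when each page is added in file order, as write_cache_pages() would. */
static unsigned int submit_run(uint64_t run_bytes, uint64_t page_bytes)
{
	unsigned int ioends = 1;
	uint64_t cur = 0;
	uint64_t off;

	for (off = 0; off < run_bytes; off += page_bytes) {
		if (ioend_needs_split(cur, page_bytes)) {
			ioends++;	/* start a new bounded ioend */
			cur = 0;
		}
		cur += page_bytes;
	}
	return ioends;
}
```

With individual ioends bounded like this, mechanism #2 only needs to issue cond_resched() between ioends as iomap_finish_ioends() walks a merged chain, since each single iomap_finish_ioend() call is now bounded in runtime.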
  11. 18 Dec 2021, 1 commit
  12. 17 Dec 2021, 1 commit
  13. 05 Dec 2021, 1 commit
  14. 24 Oct 2021, 2 commits
  15. 18 Oct 2021, 2 commits
  16. 17 Aug 2021, 9 commits
  17. 04 Aug 2021, 1 commit
  18. 30 Jun 2021, 1 commit
  19. 04 May 2021, 1 commit
  20. 09 Feb 2021, 1 commit
  21. 24 Jan 2021, 2 commits
  22. 05 Nov 2020, 1 commit
    • iomap: support partial page discard on writeback block mapping failure · 763e4cdc
      Authored by Brian Foster
      iomap writeback mapping failure only calls into ->discard_page() if
      the current page has not been added to the ioend. Accordingly, the
      XFS callback assumes a full page discard and invalidation. This is
      problematic for sub-page block size filesystems where some portion
      of a page might have been mapped successfully before a failure to
      map a delalloc block occurs. ->discard_page() is not called in that
      error scenario and the bio is explicitly failed by iomap via the
      error return from ->prepare_ioend(). As a result, the filesystem
      leaks delalloc blocks and corrupts the filesystem block counters.
      
      Since XFS is the only user of ->discard_page(), tweak the semantics
      to invoke the callback unconditionally on mapping errors and provide
      the file offset that failed to map. Update xfs_discard_page() to
      discard the corresponding portion of the file and pass the range
      along to iomap_invalidatepage(). The latter already properly handles
      both full and sub-page scenarios by not changing any iomap or page
      state on sub-page invalidations.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
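The sub-page arithmetic behind this can be sketched in userspace C; the page size and names are illustrative. Given the file offset at which delalloc mapping failed, only the remainder of that page is discarded, leaving the successfully mapped head of the page alone:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u	/* illustrative page size */

/* Compute the range handed to invalidation: from the failing offset
 * to the end of its page, not the whole page. */
static void discard_page_range(uint64_t failed_pos,
			       uint64_t *off, uint64_t *len)
{
	/* round failed_pos up to the next page boundary */
	uint64_t page_end = (failed_pos | (PAGE_SIZE_BYTES - 1)) + 1;

	*off = failed_pos;
	*len = page_end - failed_pos;
}
```

When the mapping fails at the very start of a page this degenerates to a full-page discard, matching the note above that the invalidation path already handles both the full-page and sub-page cases.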
  23. 28 Sep 2020, 1 commit
  24. 04 Jun 2020, 1 commit
  25. 03 Jun 2020, 1 commit
  26. 25 May 2020, 1 commit