提交 · 8dd87f3c5283de7f95396a236e420487226f3951 · openeuler / Kernel

29 7月, 2022 2 次提交

dm: Allow dm_call_pr to be used for path searches · 8dd87f3c

由 Mike Christie 提交于 7月 17, 2022

The specs state that if you send a reserve down a path that is already
the holder success must be returned and if it goes down a path that
is not the holder reservation conflict must be returned. Windows
failover clustering will send a second reservation and expects that a
device returns success. The problem for multipathing is that for an
All Registrants reservation, we can send the reserve down any path but
for all other reservation types there is one path that is the holder.

To handle this we could add PR state to dm but that can get nasty.
Look at target_core_pr.c for an example of the type of things we'd
have to track. It will also get more complicated because other
initiators can change the state so we will have to add in async
event/sense handling.

This commit, and the 3 commits that follow, tries to keep dm simple
and keep just doing passthrough. This commit modifies dm_call_pr to be
able to find the first usable path that can execute our pr_op then
return. When dm_pr_reserve is converted to dm_call_pr in the next
commit for the normal case we will use the same path for every
reserve.
Signed-off-by: NMike Christie <michael.christie@oracle.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

8dd87f3c

dm: return early from dm_pr_call() if DM device is suspended · e120a5f1

由 Mike Snitzer 提交于 7月 22, 2022

Otherwise PR ops may be issued while the broader DM device is being
reconfigured, etc.

Fixes: 9c72bad1 ("dm: call PR reserve/unreserve on each underlying device")
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

e120a5f1

07 7月, 2022 3 次提交

dm table: audit all dm_table_get_target() callers · 564b5c54

由 Mike Snitzer 提交于 7月 05, 2022

All callers of dm_table_get_target() are expected to do proper bounds
checking on the index they pass.

Move dm_table_get_target() to dm-core.h to make it extra clear that only
DM core code should be using it. Switch it to be inlined while at it.

Standardize all DM core callers to use the same for loop pattern and
make associated variables as local as possible. Rename some variables
(e.g. s/table/t/ and s/tgt/ti/) along the way.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

564b5c54

dm table: remove dm_table_get_num_targets() wrapper · 2aec377a

由 Mike Snitzer 提交于 7月 05, 2022

More efficient and readable to just access table->num_targets directly.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

2aec377a

dm: add two stage requeue mechanism · 8b211aac

由 Ming Lei 提交于 6月 24, 2022

Commit 61b6e2e5 ("dm: fix BLK_STS_DM_REQUEUE handling when dm_io
represents split bio") reverted DM core's bio splitting back to using
bio_split()+bio_chain() because it was found that otherwise DM's
BLK_STS_DM_REQUEUE would trigger a live-lock waiting for bio
completion that would never occur.

Restore using bio_trim()+bio_inc_remaining(), like was done in commit
7dd76d1f ("dm: improve bio splitting and associated IO
accounting"), but this time with proper handling for the above
scenario that is covered in more detail in the commit header for
61b6e2e5.

Solve this issue by adding a two staged dm_io requeue mechanism that
uses the new dm_bio_rewind() via dm_io_rewind():

1) requeue the dm_io into the requeue_list added to struct
   mapped_device, and schedule it via new added requeue work. This
   workqueue just clones the dm_io->orig_bio (which DM saves and
   ensures its end sector isn't modified). dm_io_rewind() uses the
   sectors and sectors_offset members of the dm_io that are recorded
   relative to the end of orig_bio: dm_bio_rewind()+bio_trim() are
   then used to make that cloned bio reflect the subset of the
   original bio that is represented by the dm_io that is being
   requeued.

2) the 2nd stage requeue is same with original requeue, but
   io->orig_bio points to new cloned bio (which matches the requeued
   dm_io as described above).

This allows DM core to shift the need for bio cloning from bio-split
time (during IO submission) to the less likely BLK_STS_DM_REQUEUE
handling (after IO completes with that error).
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

8b211aac

30 6月, 2022 2 次提交

dm: improve BLK_STS_DM_REQUEUE and BLK_STS_AGAIN handling · 444fe04f

由 Ming Lei 提交于 6月 24, 2022

If either BLK_STS_DM_REQUEUE or BLK_STS_AGAIN is returned for POLLED
io, we requeue the original bio into deferred list and kick md->wq to
re-submit it to block layer.

Improve the handling in the following way:

1) Factor out dm_handle_requeue() for handling dm_io requeue.

2) Unify handling for BLK_STS_DM_REQUEUE and BLK_STS_AGAIN: clear
   REQ_POLLED for BLK_STS_DM_REQUEUE too, for the sake of simplicity,
   given BLK_STS_DM_REQUEUE is very unusual.

3) Queue md->wq explicitly in dm_handle_requeue(), so requeue handling
   becomes more robust.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

444fe04f

dm: refactor dm_md_mempool allocation · e810cb78

由 Christoph Hellwig 提交于 6月 08, 2022

The current split between dm_table_alloc_md_mempools and
dm_alloc_md_mempools is rather arbitrary, so merge the two
into one easy to follow function.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

e810cb78

28 6月, 2022 1 次提交

block: remove blk_cleanup_disk · 8b9ab626

由 Christoph Hellwig 提交于 6月 19, 2022

blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

8b9ab626

27 6月, 2022 1 次提交

dm: open code blk_max_size_offset in max_io_len · c3949322

由 Christoph Hellwig 提交于 6月 14, 2022

max_io_len always passes an explicitly non-zero chunk_sectors into
blk_max_size_offset. That means much of blk_max_size_offset is not
needed and can be open coded to simplify the code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/r/20220614090934.570632-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

c3949322

24 6月, 2022 1 次提交

dm: fix BLK_STS_DM_REQUEUE handling when dm_io represents split bio · 61b6e2e5

由 Ming Lei 提交于 6月 23, 2022

Commit 7dd76d1f ("dm: improve bio splitting and associated IO
accounting") removed using cloned bio when dm io splitting is needed.
Using bio_trim()+bio_inc_remaining() rather than bio_split()+bio_chain()
causes multiple dm_io instances to share the same original bio, and it
works fine if IOs are completed successfully.

But a regression was caused for the case when BLK_STS_DM_REQUEUE is
returned from any one of DM's cloned bios (whose dm_io share the same
orig_bio). In this BLK_STS_DM_REQUEUE case only the mapped subset of
the original bio for the current exact dm_io needs to be re-submitted.
However, since the original bio is shared among all dm_io instances,
the ->orig_bio actually only represents the last dm_io instance, so
requeue can't work as expected. Also when more than one dm_io is
requeued, the same original bio is requeued from all dm_io's
completion handler, then race is caused.

Fix this issue by still allocating one clone bio for completing io
only, then io accounting can rely on ->orig_bio being unmodified. This
is needed because the dm_io's sector_offset and sectors members are
recorded relative to an unmodified ->orig_bio.

In the future, we can go back to using bio_trim()+bio_inc_remaining()
for dm's io splitting but then delay needing a bio clone only when
handling BLK_STS_DM_REQUEUE, but that approach is a bit complicated
(so it needs a development cycle):
1) bio clone needs to be done in task context
2) a block interface for unwinding bio is required

Fixes: 7dd76d1f ("dm: improve bio splitting and associated IO accounting")
Reported-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

61b6e2e5

22 6月, 2022 1 次提交

dm: do not return early from dm_io_complete if BLK_STS_AGAIN without polling · 78ccef91

由 Mike Snitzer 提交于 6月 21, 2022

Commit 52919840 ("dm: fix bio polling to handle possibile
BLK_STS_AGAIN") inadvertently introduced an early return from
dm_io_complete() without first queueing the bio to DM if BLK_STS_AGAIN
occurs and bio-polling is _not_ being used.

Fix this by only returning early from dm_io_complete() if the bio has
first been properly queued to DM. Otherwise, the bio will never finish
via bio_endio.

Fixes: 52919840 ("dm: fix bio polling to handle possibile BLK_STS_AGAIN")
Cc: stable@vger.kernel.org
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

78ccef91

17 6月, 2022 2 次提交

dm: fix narrow race for REQ_NOWAIT bios being issued despite no support · 1ee88de3

由 Mikulas Patocka 提交于 6月 16, 2022

Starting with the commit 63a225c9fd20, device mapper has an optimization
that it will take cheaper table lock (dm_get_live_table_fast instead of
dm_get_live_table) if the bio has REQ_NOWAIT. The bios with REQ_NOWAIT
must not block in the target request routine, if they did, we would be
blocking while holding rcu_read_lock, which is prohibited.

The targets that are suitable for REQ_NOWAIT optimization (and that don't
block in the map routine) have the flag DM_TARGET_NOWAIT set. Device
mapper will test if all the targets and all the devices in a table
support nowait (see the function dm_table_supports_nowait) and it will set
or clear the QUEUE_FLAG_NOWAIT flag on its request queue according to
this check.

There's a test in submit_bio_noacct: "if ((bio->bi_opf & REQ_NOWAIT) &&
!blk_queue_nowait(q)) goto not_supported" - this will make sure that
REQ_NOWAIT bios can't enter a request queue that doesn't support them.

This mechanism works to prevent REQ_NOWAIT bios from reaching dm targets
that don't support the REQ_NOWAIT flag (and that may block in the map
routine) - except that there is a small race condition:

submit_bio_noacct checks if the queue has the QUEUE_FLAG_NOWAIT without
holding any locks. Immediatelly after this check, the device mapper table
may be reloaded with a table that doesn't support REQ_NOWAIT (for example,
if we start moving the logical volume or if we activate a snapshot).
However the REQ_NOWAIT bio that already passed the check in
submit_bio_noacct would be sent to device mapper, where it could be
redirected to a dm target that doesn't support REQ_NOWAIT - the result is
sleeping while we hold rcu_read_lock.

In order to fix this race, we double-check if the target supports
REQ_NOWAIT while we hold the table lock (so that the table can't change
under us).

Fixes: 563a225c ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

1ee88de3

dm: fix use-after-free in dm_put_live_table_bio · 5d7362d0

由 Mikulas Patocka 提交于 6月 16, 2022

dm_put_live_table_bio is called from the end of dm_submit_bio.
However, at this point, the bio may be already finished and the caller
may have freed the bio. Consequently, dm_put_live_table_bio accesses
the stale "bio" pointer.

Fix this bug by loading the bi_opf value and passing it to
dm_get_live_table_bio and dm_put_live_table_bio instead of the bio.

This bug was found by running the lvm2 testsuite with kasan.

Fixes: 563a225c ("dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio")
Signed-off-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

5d7362d0

15 6月, 2022 1 次提交

dm: fix race in dm_start_io_acct · 10eb3a0d

由 Benjamin Marzinski 提交于 6月 14, 2022

After commit 82f6cdcc ("dm: switch dm_io booleans over to proper
flags") dm_start_io_acct stopped atomically checking and setting
was_accounted, which turned into the DM_IO_ACCOUNTED flag. This opened
the possibility for a race where IO accounting is started twice for
duplicate bios. To remove the race, check the flag while holding the
io->lock.

Fixes: 82f6cdcc ("dm: switch dm_io booleans over to proper flags")
Cc: stable@vger.kernel.org
Signed-off-by: NBenjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

10eb3a0d

11 6月, 2022 1 次提交

dm: fix zoned locking imbalance due to needless check in clone_endio · dddf3056

由 Mike Snitzer 提交于 6月 10, 2022

After the commit ca522482 ("dm: pass NULL bdev to bio_alloc_clone"),
clone_endio() only calls dm_zone_endio() when DM targets remap the
clone bio's bdev to something other than the md->disk->part0 default.

However, if a DM target (e.g. dm-crypt) stacked ontop of a dm-zoned
does not remap the clone bio using bio_set_dev() then dm_zone_endio()
is not called at completion of the bios and zone locks are not
properly unlocked. This triggers a hang, in dm_zone_map_bio(), when
blktests block/004 is run for dm-crypt on zoned block devices. To
avoid the hang, simply remove the clone_endio() check that verifies
the target remapped the clone bio to a device other than the default.

Fixes: ca522482 ("dm: pass NULL bdev to bio_alloc_clone")
Reported-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

dddf3056

09 6月, 2022 1 次提交

dm: fix bio_set allocation · 29dec90a

由 Christoph Hellwig 提交于 6月 08, 2022

The use of bioset_init_from_src mean that the pre-allocated pools weren't
used for anything except parameter passing, and the integrity pool
creation got completely lost for the actual live mapped_device. Fix that
by assigning the actual preallocated dm_md_mempools to the mapped_device
and using that for I/O instead of creating new mempools.

Fixes: 2a2a4c51 ("dm: use bioset_init_from_src() to copy bio_set")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

29dec90a

17 5月, 2022 2 次提交

dax: add .recovery_write dax_operation · 047218ec

由 Jane Chu 提交于 4月 22, 2022

Introduce dax_recovery_write() operation. The function is used to
recover a dax range that contains poison. Typical use case is when
a user process receives a SIGBUS with si_code BUS_MCEERR_AR
indicating poison(s) in a dax range, in response, the user process
issues a pwrite() to the page-aligned dax range, thus clears the
poison and puts valid data in the range.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJane Chu <jane.chu@oracle.com>
Link: https://lore.kernel.org/r/20220422224508.440670-6-jane.chu@oracle.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

047218ec

dax: introduce DAX_RECOVERY_WRITE dax access mode · e511c4a3

由 Jane Chu 提交于 5月 13, 2022

Up till now, dax_direct_access() is used implicitly for normal
access, but for the purpose of recovery write, dax range with
poison is requested.  To make the interface clear, introduce
	enum dax_access_mode {
		DAX_ACCESS,
		DAX_RECOVERY_WRITE,
	}
where DAX_ACCESS is used for normal dax access, and
DAX_RECOVERY_WRITE is used for dax recovery write.
Suggested-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NJane Chu <jane.chu@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: NVivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by: NDan Williams <dan.j.williams@intel.com>

e511c4a3

12 5月, 2022 1 次提交

dm: pass NULL bdev to bio_alloc_clone · ca522482

由 Mike Snitzer 提交于 5月 11, 2022

Most DM targets will remap the clone bio passed to their ->map
function using bio_set_bdev(). So this change to pass NULL bdev to
bio_alloc_clone avoids clone-time work that sets up resources for a
bdev association that will not be used in practice (e.g. clone issued
to underlying device will not use DM device's blk-cgroups resources).

But clone->bi_bdev is still initialized following bio_alloc_clone to
preserve DM target expectations that clone->bi_bdev will be set.
Follow-up work is needed to audit DM targets to remove accesses to a
clone->bi_bdev that the target didn't initialize with bio_set_dev().

Depends-on: 7ecc56c6 ("block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone")
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

ca522482

06 5月, 2022 18 次提交

dm: improve abnormal bio processing · 4edadf6d

由 Mike Snitzer 提交于 4月 17, 2022

Read/write/flush are the most common operations, optimize switch in
is_abnormal_io() for those cases. Follows same pattern established in
block perf-wip commit ("block: optimise blk_may_split for normal rw")

Also, push is_abnormal_io() check and blk_queue_split() down from
dm_submit_bio() to dm_split_and_process_bio() and set new
'is_abnormal_io' flag in clone_info. Optimize __split_and_process_bio
and __process_abnormal_io by leveraging ci.is_abnormal_io flag.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

4edadf6d

dm: simplify bio-based IO accounting further · 9d20653f

由 Mike Snitzer 提交于 4月 15, 2022

Now that io splitting is recorded prior to, or during, ->map IO
accounting can happen immediately rather than defer until after
bio splitting in dm_split_and_process_bio().

Remove the DM_IO_START_ACCT flag and also remove dm_io's map_task
member because there is no longer any need to wait for splitting to
occur before accounting.

Also move dm_io struct's 'flags' member to consolidate struct holes.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

9d20653f

dm: put all polled dm_io instances into a single list · ec211631

由 Ming Lei 提交于 4月 12, 2022

Now that bio_split() isn't used by DM's bio splitting, it is a bit
overkill to link dm_io into an hlist given there is only single dm_io
in the list.

Convert to using a single list for holding all dm_io instances
associated with this bio.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

ec211631

dm: improve dm_io reference counting · 0f14d60a

由 Ming Lei 提交于 4月 12, 2022

Currently each dm_io's reference counter is grabbed before calling
__map_bio(), this way isn't efficient since we can move this grabbing
to initialization time inside alloc_io().

Meantime it becomes typical async io reference counter model: one is
for submission side, the other is for completion side, and the io won't
be completed until both sides are done.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

0f14d60a

dm: don't grab target io reference in dm_zone_map_bio · 2e803cd9

由 Ming Lei 提交于 4月 12, 2022

dm_zone_map_bio() is only called from __map_bio in which the io's
reference is grabbed already, and the reference won't be released
until the bio is submitted, so not necessary to do it dm_zone_map_bio
any more.
Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

2e803cd9

dm: improve bio splitting and associated IO accounting · 7dd76d1f

由 Ming Lei 提交于 4月 12, 2022

The current DM code (ab)uses late assignment of dm_io->orig_bio (after
__map_bio() returns and any bio splitting is complete) to indicate the
FS bio has been processed and can be accounted. This results in
awkward waiting until ->orig_bio is set in dm_submit_bio_remap().

Also the bio splitting was implemented using bio_split()+bio_chain()
-- a well-worn pattern but it requires bio cloning purely for the
benefit of more natural IO accounting.  The bio_split() result was
stored in ->orig_bio to represent the mapped part of the original FS
bio.

DM has switched to the bdev based IO accounting interface.  DM's IO
accounting can be implemented in terms of the original FS bio (now
stored early in ->orig_bio) via access to its sectors/bio_op. And
if/when splitting is needed, set a new DM_IO_WAS_SPLIT flag and use
new dm_io fields of .sector_offset & .sectors to allow IO accounting
for split bios _without_ needing to clone a new bio to store in
->orig_bio.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Co-developed-by: NMike Snitzer <snitzer@kernel.org>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

7dd76d1f

dm: switch to bdev based IO accounting interfaces · d3de6d12

由 Ming Lei 提交于 4月 12, 2022

DM splits flush with data into empty flush followed by bio with data
payload, switch dm_io_acct() to use bdev_{start,end}_io_acct() to do
this accoiunting more naturally (rather than temporarily changing the
bio's bi_size).

This will allow DM to more easily account bios that are split (in
following commit).
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

d3de6d12

dm: pass dm_io instance to dm_io_acct directly · e6926ad0

由 Ming Lei 提交于 4月 12, 2022

All the other 4 parameters are retrieved from the 'dm_io' instance, so
it's not necessary to pass all four to dm_io_acct().
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

e6926ad0

dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct · b992b40d

由 Ming Lei 提交于 4月 12, 2022

dm->orig_bio is always passed to __dm_start_io_acct and dm_end_io_acct,
so it isn't necessary to take one bio parameter for the two helpers.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

b992b40d

dm: use bio_sectors in dm_aceept_partial_bio · bdb34759

由 Mike Snitzer 提交于 4月 15, 2022

Rename 'bi_size' to 'bio_sectors' given bi_size is being stored in
sectors.  Also, use bio_sectors() rather than open-coding it.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

bdb34759

dm: conditionally enable branching for less used features · 442761fd

由 Mike Snitzer 提交于 3月 26, 2022

Use jump_labels to further reduce cost of unlikely branches for zoned
block devices, dm-stats and swap_bios throttling.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

442761fd

dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio · 563a225c

由 Mike Snitzer 提交于 3月 26, 2022

If a bio is marked REQ_NOWAIT optimize dm_submit_bio()'s dm_table RCU
usage to dm_{get,put}_live_table_fast.

DM core offers protection against blocking (via suspend) if REQ_NOWAIT.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

563a225c

M
dm: add local variables to clone_endio and __map_bio · 6cbce280
由 Mike Snitzer 提交于 3月 26, 2022
```
Avoid redundant dereferences in both functions.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>
```
6cbce280
M
dm: mark various branches unlikely · fe221db4
由 Mike Snitzer 提交于 3月 25, 2022
```
Signed-off-by: NMike Snitzer <snitzer@kernel.org>
```
fe221db4

dm: simplify dm_start_io_acct · 3b03f7c1

由 Mike Snitzer 提交于 3月 25, 2022

Pull common DM_IO_ACCOUNTED check out to beginning of dm_start_io_acct.
Also, use dm_tio_is_normal (and move it to dm-core.h).
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

3b03f7c1

M
dm: simplify dm_io access in dm_split_and_process_bio · 4857abf6
由 Mike Snitzer 提交于 3月 25, 2022
```
Use local variable instead of redudant access using ci.io
Signed-off-by: NMike Snitzer <snitzer@kernel.org>
```
4857abf6
M
dm: factor out dm_io_set_error and __dm_io_dec_pending · 84b98f4c
由 Mike Snitzer 提交于 3月 17, 2022
```
Also eliminate need to use errno_to_blk_status().
Signed-off-by: NMike Snitzer <snitzer@kernel.org>
```
84b98f4c

dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset · cfc97abc

由 Mike Snitzer 提交于 3月 24, 2022

A bioset's per-cpu alloc cache may have broader utility in the future
but for now constrain it to being tightly coupled to QUEUE_FLAG_POLL.

Also change dm_io_complete() to use bio_clear_polled() so that it
properly clears all associated bio state on requeue.

This commit improves DM's hipri bio polling (REQ_POLLED) perf by
7 - 20% depending on the system.
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

cfc97abc

18 4月, 2022 1 次提交

block: remove QUEUE_FLAG_DISCARD · 70200574

由 Christoph Hellwig 提交于 4月 15, 2022

Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.

The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>

70200574

16 4月, 2022 1 次提交

dm: fix bio length of empty flush · 92b914e2

由 Shin'ichiro Kawasaki 提交于 4月 15, 2022

The commit 92986f6b ("dm: use bio_clone_fast in alloc_io/alloc_tio")
removed bio_clone_fast() call from alloc_tio() when ci->io->tio is
available. In this case, ci->bio is not copied to ci->io->tio.clone.
This is fine since init_clone_info() sets same values to ci->bio and
ci->io->tio.clone.

However, when incoming bios have REQ_PREFLUSH flag, __send_empty_flush()
prepares a zero length bio on stack and set it to ci->bio. At this time,
ci->io->tio.clone still keeps non-zero length. When alloc_tio() chooses
this ci->io->tio.clone as the bio to map, it is passed to targets as
non-empty flush bio. It causes bio length check failure in dm-zoned and
unexpected operation such as dm_accept_partial_bio() call.

To avoid the non-empty flush bio, set zero length to ci->io->tio.clone
in __send_empty_flush().

Fixes: 92986f6b ("dm: use bio_clone_fast in alloc_io/alloc_tio")
Signed-off-by: NShin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

92b914e2

15 4月, 2022 1 次提交

dm: allow dm_accept_partial_bio() for dm_io without duplicate bios · 7dd06a25

由 Mike Snitzer 提交于 4月 14, 2022

The intent behind commit e6fc9f62 ("dm: flag clones created by
__send_duplicate_bios") was to formally disallow the use of
dm_accept_partial_bio() where it simply isn't possible -- due to
constraint that multiple bios cannot meaningfully update a shared
tio->len_ptr.

But that commit went too far and disallowed the case where "abormal"
IO (e.g. WRITE_ZEROES) is only using a single bio.  Fix this by
not marking a dm_io with a single dm_target_io (and bio), that happens
to be created by __send_duplicate_bios, as DM_TIO_IS_DUPLICATE_BIO.
Also remove 'unsigned *len' parameter from alloc_multiple_bios().

This commit fixes a dm_accept_partial_bio() BUG_ON() with dm-zoned
when a WRITE_ZEROES bio is issued.

Fixes: 655f3aad ("dm: switch dm_target_io booleans over to proper flags")
Reported-by: NShinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: NMike Snitzer <snitzer@kernel.org>

7dd06a25

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功