1. 30 September 2020 (6 commits)
  2. 25 September 2020 (1 commit)
  3. 22 September 2020 (2 commits)
    • dm: fix comment in dm_process_bio() · cf9c3786
      Committed by Mike Snitzer
      Refer to the correct function (->submit_bio instead of ->queue_bio).
      Also, add details about why using blk_queue_split() isn't needed for
      dm_wq_work()'s call to dm_process_bio().
      
      Fixes: c62b37d9 ("block: move ->make_request_fn to struct block_device_operations")
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: fix bio splitting and its bio completion order for regular IO · ee1dfad5
      Committed by Mike Snitzer
      dm_queue_split() is removed because __split_and_process_bio() _must_
      handle splitting bios to ensure proper bio submission and completion
      ordering as a bio is split.
      
      Otherwise, multiple recursive calls to ->submit_bio will cause multiple
      split bios to be allocated from the same ->bio_split mempool at the same
      time. This would result in deadlock in low memory conditions because no
      progress could be made (only one bio is available in ->bio_split
      mempool).
      
      This fix has been verified to still avoid the excessive splitting, and
      the associated loss of performance, that commit 120c9257 fixed.
      
      Fixes: 120c9257 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
      Cc: stable@vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
      Reported-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
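
      As a sketch of the ordering this enforces (illustrative only, not the
      actual dm code; bio_split(), bio_chain() and submit_bio_noacct() are
      real block-layer helpers): split at most once per submission, re-queue
      the remainder, and keep processing the front split, so only one
      allocation from the per-device ->bio_split mempool is outstanding at a
      time.

          static void sketch_split_and_process(struct bio *bio,
                                               struct bio_set *bs,
                                               unsigned int max_sectors)
          {
                  if (bio_sectors(bio) > max_sectors) {
                          struct bio *split = bio_split(bio, max_sectors,
                                                        GFP_NOIO, bs);

                          /* remainder completes only after the split does */
                          bio_chain(split, bio);
                          /* remainder is re-queued, not recursed into */
                          submit_bio_noacct(bio);
                          bio = split;
                  }
                  /* ... map and submit 'bio' directly here ... */
          }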
  4. 20 9月, 2020 1 次提交
    • dm/dax: Fix table reference counts · 02186d88
      Committed by Dan Williams
      A recent fix to the dm_dax_supported() flow uncovered a latent bug: when
      dm_get_live_table() fails, the caller is still required to drop the
      srcu_read_lock(). Without this change the lvm2 test-suite triggers this
      warning:
      
          # lvm2-testsuite --only pvmove-abort-all.sh
      
          WARNING: lock held when returning to user space!
          5.9.0-rc5+ #251 Tainted: G           OE
          ------------------------------------------------
          lvm/1318 is leaving the kernel with locks still held!
          1 lock held by lvm/1318:
           #0: ffff9372abb5a340 (&md->io_barrier){....}-{0:0}, at: dm_get_live_table+0x5/0xb0 [dm_mod]
      
      ...and later on this hang signature:
      
          INFO: task lvm:1344 blocked for more than 122 seconds.
                Tainted: G           OE     5.9.0-rc5+ #251
          "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
          task:lvm             state:D stack:    0 pid: 1344 ppid:     1 flags:0x00004000
          Call Trace:
           __schedule+0x45f/0xa80
           ? finish_task_switch+0x249/0x2c0
           ? wait_for_completion+0x86/0x110
           schedule+0x5f/0xd0
           schedule_timeout+0x212/0x2a0
           ? __schedule+0x467/0xa80
           ? wait_for_completion+0x86/0x110
           wait_for_completion+0xb0/0x110
           __synchronize_srcu+0xd1/0x160
           ? __bpf_trace_rcu_utilization+0x10/0x10
           __dm_suspend+0x6d/0x210 [dm_mod]
           dm_suspend+0xf6/0x140 [dm_mod]
      
      Fixes: 7bf7eac8 ("dax: Arrange for dax_supported check to span multiple devices")
      Cc: <stable@vger.kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Reported-by: Adrian Huang <ahuang12@lenovo.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Tested-by: Adrian Huang <ahuang12@lenovo.com>
      Link: https://lore.kernel.org/r/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
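
      A minimal sketch of the fixed pattern, assuming the dm.c helpers
      dm_get_live_table()/dm_put_live_table(): the get enters the
      md->io_barrier SRCU read side even when it returns NULL, so every exit
      path must drop it.

          static bool sketch_dax_supported(struct mapped_device *md)
          {
                  struct dm_table *map;
                  int srcu_idx;
                  bool ret = false;

                  map = dm_get_live_table(md, &srcu_idx);
                  if (!map)
                          goto out; /* a bare return here would leak the lock */

                  ret = true; /* the real code queries the live table here */
          out:
                  dm_put_live_table(md, srcu_idx); /* drops srcu_read_lock() */
                  return ret;
          }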
  5. 02 September 2020 (1 commit)
  6. 24 August 2020 (1 commit)
  7. 05 August 2020 (1 commit)
  8. 24 July 2020 (1 commit)
    • dm integrity: fix integrity recalculation that is improperly skipped · 5df96f2b
      Committed by Mikulas Patocka
      Commit adc0daad ("dm: report suspended
      device during destroy") broke integrity recalculation.
      
      The problem is dm_suspended() returns true not only during suspend,
      but also during resume. So this race condition could occur:
      1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
      2. integrity_recalc (&ic->recalc_work) preempts the current thread
      3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
      4. integrity_recalc exits and no recalculating is done.
      
      To fix this race condition, add a function dm_post_suspending that is
      only true during the postsuspend phase and use it instead of
      dm_suspended().
      
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Fixes: adc0daad ("dm: report suspended device during destroy")
      Cc: stable@vger.kernel.org # v4.18+
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
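
      A hedged sketch of the resulting check (ic->recalc_work, ic->ti and
      integrity_recalc() are names from dm-integrity.c; the body is heavily
      simplified):

          static void sketch_integrity_recalc(struct work_struct *w)
          {
                  struct dm_integrity_c *ic =
                          container_of(w, struct dm_integrity_c, recalc_work);

                  /* true only between presuspend and postsuspend,
                   * not during resume, unlike dm_suspended() */
                  if (unlikely(dm_post_suspending(ic->ti)))
                          return;

                  /* ... recalculate integrity tags ... */
          }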
  9. 09 July 2020 (3 commits)
  10. 08 July 2020 (2 commits)
    • dm: use bio_uninit instead of bio_disassociate_blkg · 382761dc
      Committed by Christoph Hellwig
      bio_uninit is the proper API to clean up a BIO that has been allocated
      on stack or inside a structure that doesn't come from the BIO allocator.
      Switch dm to use that instead of bio_disassociate_blkg, which really is
      an implementation detail.  Note that the bio_uninit calls are also moved
      to the two callers of __send_empty_flush, so that they better pair with
      the bio_init calls used to initialize them.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
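
      A minimal sketch of the pairing this establishes for a bio that lives
      on the stack (dm's empty-flush bio follows this shape; the helper below
      is illustrative):

          static void sketch_issue_flush(struct block_device *bdev)
          {
                  struct bio bio;

                  bio_init(&bio, NULL, 0); /* not from the bio allocator */
                  bio_set_dev(&bio, bdev);
                  bio.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
                  submit_bio_wait(&bio);
                  bio_uninit(&bio); /* proper teardown; pairs with bio_init() */
          }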
    • dm: do not use waitqueue for request-based DM · 85067747
      Committed by Ming Lei
      Given request-based DM now uses blk-mq's blk_mq_queue_inflight() to
      determine whether outstanding IO has completed (and DM has no control
      over the blk-mq state machine used to track outstanding IO), it is
      unsafe to wake up the waiter (dm_wait_for_completion) before blk-mq has
      cleared a request's state bits (e.g. MQ_RQ_IN_FLIGHT or
      MQ_RQ_COMPLETE).  As such, dm_wait_for_completion() could be left
      waiting indefinitely if no other requests complete.
      
      Fix this by eliminating request-based DM's use of waitqueue to wait
      for blk-mq requests to complete in dm_wait_for_completion.
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Depends-on: 3c94d83c ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
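
      A hedged sketch of the polling this switches to, assuming
      blk_mq_queue_inflight() as named in the Depends-on commit: blk-mq is
      polled directly instead of sleeping on a waitqueue it never wakes.

          static int sketch_wait_for_rq_completion(struct mapped_device *md)
          {
                  while (blk_mq_queue_inflight(md->queue)) {
                          if (signal_pending(current))
                                  return -ERESTARTSYS;
                          msleep(5); /* bounded poll, no waitqueue */
                  }
                  return 0;
          }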
  11. 02 July 2020 (1 commit)
    • dm: remove unused variable · b53ac8b8
      Committed by Jens Axboe
      Since merging the commit identified in Fixes below, we trigger this
      compile-time warning:
      
      drivers/md/dm.c: In function ‘__map_bio’:
      drivers/md/dm.c:1296:24: warning: unused variable ‘md’ [-Wunused-variable]
       1296 |  struct mapped_device *md = io->md;
             |                        ^~
      
      Remove the 'md' variable.
      
      Fixes: 5a6c35f9 ("block: remove direct_make_request")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  12. 01 July 2020 (5 commits)
  13. 29 June 2020 (1 commit)
  14. 20 June 2020 (1 commit)
  15. 27 May 2020 (1 commit)
  16. 21 May 2020 (1 commit)
    • dm: use DMDEBUG macros now that they use pr_debug variants · ac75b09f
      Committed by Mike Snitzer
      Now that DMDEBUG uses pr_debug and DMDEBUG_LIMIT uses
      pr_debug_ratelimited, clean up DM's two direct pr_debug callers to use
      them and get the benefit of consistent DM_FMT formatting of debugging
      messages.
      
      While doing so, dm-mpath.c:dm_report_EIO() was switched over to using
      DMDEBUG_LIMIT due to the potential for error handling floods in the IO
      completion path.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
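
      For reference, a sketch of the macro shapes involved (see
      include/linux/device-mapper.h of that era for the authoritative
      definitions):

          #define DM_FMT(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt "\n"

          #define DMDEBUG(fmt, ...) \
                  pr_debug(DM_FMT(fmt), ##__VA_ARGS__)
          #define DMDEBUG_LIMIT(fmt, ...) \
                  pr_debug_ratelimited(DM_FMT(fmt), ##__VA_ARGS__)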
  17. 19 May 2020 (1 commit)
    • blk-mq: allow blk_mq_make_request to consume the q_usage_counter reference · ac7c5675
      Committed by Christoph Hellwig
      blk_mq_make_request currently needs to grab a q_usage_counter
      reference when allocating a request.  This is because the block layer
      grabs one before calling blk_mq_make_request, but also releases it as
      soon as blk_mq_make_request returns.  Remove the blk_queue_exit call
      after blk_mq_make_request returns, and instead let it consume the
      reference.  This works perfectly fine for the block layer caller; only
      device mapper needs an extra reference, as the old problem still
      persists there.  Open code blk_queue_enter_live in device mapper,
      as there should be no other callers and this allows better documenting
      why we do a non-try get.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
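
      A sketch of the open-coded "live" enter described above (shape assumed
      from the description; legal only because dm already holds a reference
      that keeps the queue alive, so a non-try percpu_ref_get() cannot race
      with queue teardown):

          static blk_qc_t sketch_dm_make_request(struct request_queue *q,
                                                 struct bio *bio)
          {
                  /* extra reference; blk_mq_make_request() consumes it */
                  percpu_ref_get(&q->q_usage_counter);
                  return blk_mq_make_request(q, bio);
          }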
  18. 15 May 2020 (1 commit)
  19. 14 May 2020 (1 commit)
    • block: Inline encryption support for blk-mq · a892c8d5
      Committed by Satya Tangirala
      We must have some way of letting a storage device driver know what
      encryption context it should use for en/decrypting a request. However,
      it's the upper layers (like the filesystem/fscrypt) that know about and
      manage encryption contexts. As such, when the upper layer submits a bio
      to the block layer, and this bio eventually reaches a device driver with
      support for inline encryption, the device driver will need to have been
      told the encryption context for that bio.
      
      We want to communicate the encryption context from the upper layer to the
      storage device along with the bio, when the bio is submitted to the block
      layer. To do this, we add a struct bio_crypt_ctx to struct bio, which can
      represent an encryption context (note that we can't use the bi_private
      field in struct bio to do this because that field does not function to pass
      information across layers in the storage stack). We also introduce various
      functions to manipulate the bio_crypt_ctx and make the bio/request merging
      logic aware of the bio_crypt_ctx.
      
      We also make changes to blk-mq to make it handle bios with encryption
      contexts. blk-mq can merge many bios into the same request. These bios need
      to have contiguous data unit numbers (the necessary changes to blk-merge
      are also made to ensure this) - as such, it suffices to keep the data unit
      number of just the first bio, since that's all a storage driver needs to
      infer the data unit number to use for each data block in each bio in a
      request. blk-mq keeps track of the encryption context to be used for all
      the bios in a request with the request's rq_crypt_ctx. When the first bio
      is added to an empty request, blk-mq will program the encryption context
      of that bio into the request_queue's keyslot manager, and store the
      returned keyslot in the request's rq_crypt_ctx. All the functions to
      operate on encryption contexts are in blk-crypto.c.
      
      Upper layers only need to call bio_crypt_set_ctx with the encryption key,
      algorithm and data_unit_num; they don't have to worry about getting a
      keyslot for each encryption context, as blk-mq/blk-crypto handles that.
      Blk-crypto also makes it possible for request-based layered devices like
      dm-rq to make use of inline encryption hardware by cloning the
      rq_crypt_ctx and programming a keyslot in the new request_queue when
      necessary.
      
      Note that any user of the block layer can submit bios with an
      encryption context, such as filesystems, device-mapper targets, etc.
      Signed-off-by: Satya Tangirala <satyat@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
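
      A sketch of the upper-layer side, assuming a key already set up via
      blk_crypto_init_key(): attach the context, submit as usual, and let
      blk-mq/blk-crypto handle keyslot programming.

          static void sketch_submit_encrypted_bio(struct bio *bio,
                                                  const struct blk_crypto_key *key,
                                                  u64 first_dun)
          {
                  u64 dun[BLK_CRYPTO_DUN_ARRAY_SIZE] = { first_dun };

                  bio_crypt_set_ctx(bio, key, dun, GFP_NOIO);
                  submit_bio(bio);
          }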
  20. 25 April 2020 (1 commit)
  21. 03 April 2020 (3 commits)
    • Revert "dm: always call blk_queue_split() in dm_process_bio()" · 120c9257
      Committed by Mike Snitzer
      This reverts commit effd58c9.
      
      blk_queue_split() is causing excessive IO splitting -- because
      blk_max_size_offset() depends on 'chunk_sectors' limit being set and
      if it isn't (as is the case for DM targets!) it falls back to
      splitting on a 'max_sectors' boundary regardless of offset.
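
      For reference, a sketch of the blk_max_size_offset() behaviour described
      above (shape taken from the block headers of that era; treat as
      illustrative):

          static inline unsigned int sketch_max_size_offset(struct request_queue *q,
                                                            sector_t offset)
          {
                  /* no chunk_sectors (the DM case): flat max_sectors cap */
                  if (!q->limits.chunk_sectors)
                          return q->limits.max_sectors;

                  /* otherwise split at the next chunk boundary after offset */
                  return min(q->limits.max_sectors,
                             (unsigned int)(q->limits.chunk_sectors -
                                  (offset & (q->limits.chunk_sectors - 1))));
          }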
      
      "Fix" this by reverting back to _not_ using blk_queue_split() in
      dm_process_bio() for normal IO (reads and writes).  Long-term fix is
      still TBD but it should focus on training blk_max_size_offset() to
      call into a DM provided hook (to call DM's max_io_len()).
      
      Test results from simple misaligned IO test on 4-way dm-striped device
      with chunksize of 128K and stripesize of 512K:
      
      xfs_io -d -c 'pread -b 2m 224s 4072s' /dev/mapper/stripe_dev
      
      before this revert:
      
      253,0   21        1     0.000000000  2206  Q   R 224 + 4072 [xfs_io]
      253,0   21        2     0.000008267  2206  X   R 224 / 480 [xfs_io]
      253,0   21        3     0.000010530  2206  X   R 224 / 256 [xfs_io]
      253,0   21        4     0.000027022  2206  X   R 480 / 736 [xfs_io]
      253,0   21        5     0.000028751  2206  X   R 480 / 512 [xfs_io]
      253,0   21        6     0.000033323  2206  X   R 736 / 992 [xfs_io]
      253,0   21        7     0.000035130  2206  X   R 736 / 768 [xfs_io]
      253,0   21        8     0.000039146  2206  X   R 992 / 1248 [xfs_io]
      253,0   21        9     0.000040734  2206  X   R 992 / 1024 [xfs_io]
      253,0   21       10     0.000044694  2206  X   R 1248 / 1504 [xfs_io]
      253,0   21       11     0.000046422  2206  X   R 1248 / 1280 [xfs_io]
      253,0   21       12     0.000050376  2206  X   R 1504 / 1760 [xfs_io]
      253,0   21       13     0.000051974  2206  X   R 1504 / 1536 [xfs_io]
      253,0   21       14     0.000055881  2206  X   R 1760 / 2016 [xfs_io]
      253,0   21       15     0.000057462  2206  X   R 1760 / 1792 [xfs_io]
      253,0   21       16     0.000060999  2206  X   R 2016 / 2272 [xfs_io]
      253,0   21       17     0.000062489  2206  X   R 2016 / 2048 [xfs_io]
      253,0   21       18     0.000066133  2206  X   R 2272 / 2528 [xfs_io]
      253,0   21       19     0.000067507  2206  X   R 2272 / 2304 [xfs_io]
      253,0   21       20     0.000071136  2206  X   R 2528 / 2784 [xfs_io]
      253,0   21       21     0.000072764  2206  X   R 2528 / 2560 [xfs_io]
      253,0   21       22     0.000076185  2206  X   R 2784 / 3040 [xfs_io]
      253,0   21       23     0.000077486  2206  X   R 2784 / 2816 [xfs_io]
      253,0   21       24     0.000080885  2206  X   R 3040 / 3296 [xfs_io]
      253,0   21       25     0.000082316  2206  X   R 3040 / 3072 [xfs_io]
      253,0   21       26     0.000085788  2206  X   R 3296 / 3552 [xfs_io]
      253,0   21       27     0.000087096  2206  X   R 3296 / 3328 [xfs_io]
      253,0   21       28     0.000093469  2206  X   R 3552 / 3808 [xfs_io]
      253,0   21       29     0.000095186  2206  X   R 3552 / 3584 [xfs_io]
      253,0   21       30     0.000099228  2206  X   R 3808 / 4064 [xfs_io]
      253,0   21       31     0.000101062  2206  X   R 3808 / 3840 [xfs_io]
      253,0   21       32     0.000104956  2206  X   R 4064 / 4096 [xfs_io]
      253,0   21       33     0.001138823     0  C   R 4096 + 200 [0]
      
      after this revert:
      
      253,0   18        1     0.000000000  4430  Q   R 224 + 3896 [xfs_io]
      253,0   18        2     0.000018359  4430  X   R 224 / 256 [xfs_io]
      253,0   18        3     0.000028898  4430  X   R 256 / 512 [xfs_io]
      253,0   18        4     0.000033535  4430  X   R 512 / 768 [xfs_io]
      253,0   18        5     0.000065684  4430  X   R 768 / 1024 [xfs_io]
      253,0   18        6     0.000091695  4430  X   R 1024 / 1280 [xfs_io]
      253,0   18        7     0.000098494  4430  X   R 1280 / 1536 [xfs_io]
      253,0   18        8     0.000114069  4430  X   R 1536 / 1792 [xfs_io]
      253,0   18        9     0.000129483  4430  X   R 1792 / 2048 [xfs_io]
      253,0   18       10     0.000136759  4430  X   R 2048 / 2304 [xfs_io]
      253,0   18       11     0.000152412  4430  X   R 2304 / 2560 [xfs_io]
      253,0   18       12     0.000160758  4430  X   R 2560 / 2816 [xfs_io]
      253,0   18       13     0.000183385  4430  X   R 2816 / 3072 [xfs_io]
      253,0   18       14     0.000190797  4430  X   R 3072 / 3328 [xfs_io]
      253,0   18       15     0.000197667  4430  X   R 3328 / 3584 [xfs_io]
      253,0   18       16     0.000218751  4430  X   R 3584 / 3840 [xfs_io]
      253,0   18       17     0.000226005  4430  X   R 3840 / 4096 [xfs_io]
      253,0   18       18     0.000250404  4430  Q   R 4120 + 176 [xfs_io]
      253,0   18       19     0.000847708     0  C   R 4096 + 24 [0]
      253,0   18       20     0.000855783     0  C   R 4120 + 176 [0]
      
      Fixes: effd58c9 ("dm: always call blk_queue_split() in dm_process_bio()")
      Cc: stable@vger.kernel.org
      Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
      Tested-by: Barry Marson <bmarson@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dax: Move mandatory ->zero_page_range() check in alloc_dax() · 4e4ced93
      Committed by Vivek Goyal
      The zero_page_range() dax operation is mandatory for dax devices. Right
      now that check happens in the dax_zero_page_range() function. Dan thinks
      that's too late and it's better to do the check earlier, in alloc_dax().
      
      I also modified alloc_dax() to return a pointer with an error code in it
      in case of failure. Right now it returns NULL and the caller assumes the
      failure happened due to -ENOMEM. But with this ->zero_page_range() check,
      I need to return -EINVAL instead.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
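
      A sketch of the adjusted caller pattern, assuming dm.c's dm_dax_ops:
      failure is now signalled via ERR_PTR(), not NULL.

          static int sketch_setup_dax(struct mapped_device *md)
          {
                  struct dax_device *dax_dev;

                  dax_dev = alloc_dax(md, md->disk->disk_name, &dm_dax_ops, 0);
                  if (IS_ERR(dax_dev))
                          return PTR_ERR(dax_dev); /* e.g. -EINVAL */

                  md->dax_dev = dax_dev;
                  return 0;
          }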
    • dm,dax: Add dax zero_page_range operation · cdf6cdcd
      Committed by Vivek Goyal
      This patch adds support for dax zero_page_range operation to dm targets.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Link: https://lore.kernel.org/r/20200228163456.1587-5-vgoyal@redhat.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
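
      A hedged sketch of the forwarding this adds (dax_get_private() and
      dm_dax_get_live_target() are existing dm/dax helpers; error handling
      simplified):

          static int sketch_dm_dax_zero_page_range(struct dax_device *dax_dev,
                                                   pgoff_t pgoff, size_t nr_pages)
          {
                  struct mapped_device *md = dax_get_private(dax_dev);
                  sector_t sector = pgoff * PAGE_SECTORS;
                  struct dm_target *ti;
                  int srcu_idx, ret = -EIO;

                  ti = dm_dax_get_live_target(md, sector, &srcu_idx);
                  if (ti && ti->type->dax_zero_page_range)
                          ret = ti->type->dax_zero_page_range(ti, pgoff, nr_pages);
                  dm_put_live_table(md, srcu_idx);
                  return ret;
          }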
  22. 28 March 2020 (1 commit)
    • block: simplify queue allocation · 3d745ea5
      Committed by Christoph Hellwig
      Current make_request-based drivers use either blk_alloc_queue_node or
      blk_alloc_queue to allocate a queue, and then set up the make_request_fn
      function pointer and a few parameters using the blk_queue_make_request
      helper.  Simplify this by passing the make_request pointer to
      blk_alloc_queue, and while at it merge the _node variant into the main
      helper by always passing a node_id, and remove the superfluous gfp_mask
      parameter.  A lower-level __blk_alloc_queue is kept for the blk-mq case.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
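
      A sketch of the new calling convention, assuming dm's request function
      of that era (dm_make_request):

          static int sketch_alloc_queue(struct mapped_device *md, int numa_node_id)
          {
                  /* make_request_fn and NUMA node go straight to the allocator */
                  md->queue = blk_alloc_queue(dm_make_request, numa_node_id);
                  if (!md->queue)
                          return -ENOMEM;
                  return 0;
          }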
  23. 25 March 2020 (1 commit)
  24. 04 March 2020 (1 commit)
  25. 28 February 2020 (1 commit)
    • dm: report suspended device during destroy · adc0daad
      Committed by Mikulas Patocka
      The function dm_suspended returns true if the target is suspended.
      However, when the target is being suspended during unload, it returns
      false.
      
      An example where this is a problem: the test "!dm_suspended(wc->ti)" in
      writecache_writeback is not sufficient, because dm_suspended returns
      zero while writecache_suspend is in progress.  As is, without an
      enhanced dm_suspended, simply switching from flush_workqueue to
      drain_workqueue still emits warnings:
      workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries
      workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries
      workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries
      workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries
      workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries
      
      writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function
      flushes the current work. However, the workqueue may re-queue itself and
      flush_workqueue doesn't wait for re-queued works to finish. Because of
      this - the function writecache_writeback continues execution after the
      device was suspended and then concurrently with writecache_dtr, causing
      a crash in writecache_writeback.
      
      We must use drain_workqueue - that waits until the work and all re-queued
      works finish.
      
      As a prereq for switching to drain_workqueue, this commit fixes
      dm_suspended to return true after the presuspend hook and before the
      postsuspend hook - just like during a normal suspend. It allows
      simplifying the dm-integrity and dm-writecache targets so that they
      don't have to maintain suspended flags on their own.
      
      With this change, drain_workqueue() can be used effectively.  This
      change was tested with the lvm2 testsuite and the cryptsetup testsuite
      and there are no regressions.
      
      Fixes: 48debafe ("dm: add writecache target")
      Cc: stable@vger.kernel.org # 4.18+
      Reported-by: Corey Marthaler <cmarthal@redhat.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
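
      A hedged sketch of the writecache shape this enables (wc->writeback_wq
      is the real field; the body is heavily simplified):

          static void sketch_writecache_suspend(struct dm_target *ti)
          {
                  struct dm_writecache *wc = ti->private;

                  /* waits for re-queued writeback work too,
                   * which flush_workqueue() does not */
                  drain_workqueue(wc->writeback_wq);
          }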