- 01 November 2022, 2 commits
-
-
Submitted by Yu Kuai
Prepare to refactor the counting of 'num_groups_with_pending_reqs'. Add a counter in bfq_group and update it while tracking whether a bfqq has pending requests and when bfq_bfqq_move() is called.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
If an entity belongs to a bfqq, then entity->in_groups_with_pending_reqs is currently unused. This patch uses it to track whether the bfqq has pending requests, through the callers of weights_tree insertion and removal.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20220916071942.214222-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 31 October 2022, 3 commits
-
-
Submitted by Jinlong Chen
The calling relationship in blk_mq_destroy_queue() is as follows:

    blk_mq_destroy_queue()
        ...
        -> blk_queue_start_drain()
            -> blk_freeze_queue_start()     <- called
        ...
        -> blk_freeze_queue()
            -> blk_freeze_queue_start()     <- called again
            -> blk_mq_freeze_queue_wait()
        ...

So there is a redundant call to blk_freeze_queue_start(). Replace blk_freeze_queue() with blk_mq_freeze_queue_wait() to avoid the redundant call.
Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030083212.1251255-1-nickyc975@zju.edu.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
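For illustration, a minimal sketch of the resulting sequence in the destroy path, assuming only the helpers named above (blk_queue_start_drain(), blk_freeze_queue_start(), blk_mq_freeze_queue_wait()); it is not the verbatim patch:

```c
/* blk_queue_start_drain() has already kicked off the freeze via
 * blk_freeze_queue_start(), so the teardown only needs to wait for the
 * freeze to complete instead of starting it a second time. */
blk_queue_start_drain(q);      /* -> blk_freeze_queue_start(q) */
blk_mq_freeze_queue_wait(q);   /* wait only; no second freeze start */
```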
-
Submitted by Jinlong Chen
The only caller that needs the queue_is_mq() check is del_gendisk(), so move the check into it.
Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221030094730.1275463-1-nickyc975@zju.edu.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by David Jeffery
David Jeffery found a double ->queue_rq() issue. So far it can be triggered in VM use cases because of long vmexit latency, preempt latency of the vCPU pthread, or a long page fault in the vCPU pthread: a block IO request can time out before the request is queued to hardware but after blk_mq_start_request() has been called during ->queue_rq(). The timeout handler may then handle it by requeueing, which leads to a double ->queue_rq() and a kernel panic.

So far it is the driver's responsibility to cover the race between timeout and completion, so in theory this is supposed to be solved in the driver, given that the driver has enough knowledge. But it is really a common problem, lots of drivers could have a similar issue, and it would be hard to fix all affected drivers; it isn't even easy for a driver to handle the race. So David suggests this patch, which drains in-progress ->queue_rq() calls to solve the issue.
Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Keith Busch <kbusch@kernel.org> Cc: virtualization@lists.linux-foundation.org Cc: Bart Van Assche <bvanassche@acm.org> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221026051957.358818-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 26 October 2022, 3 commits
-
-
Submitted by Bart Van Assche
This patch removes a conditional jump from get_max_segment_size(). The x86-64 assembler code for this function without this patch is as follows:

    206         return min_not_zero(mask - offset + 1,
       0x0000000000000118 <+72>:    not    %rax
       0x000000000000011b <+75>:    and    0x8(%r10),%rax
       0x000000000000011f <+79>:    add    $0x1,%rax
       0x0000000000000123 <+83>:    je     0x138 <bvec_split_segs+104>
       0x0000000000000125 <+85>:    cmp    %rdx,%rax
       0x0000000000000128 <+88>:    mov    %rdx,%r12
       0x000000000000012b <+91>:    cmovbe %rax,%r12
       0x000000000000012f <+95>:    test   %rdx,%rdx
       0x0000000000000132 <+98>:    mov    %eax,%edx
       0x0000000000000134 <+100>:   cmovne %r12d,%edx

With this patch applied:

    206         return min(mask - offset, (unsigned long)lim->max_segment_size - 1) + 1;
       0x000000000000003f <+63>:    mov    0x28(%rdi),%ebp
       0x0000000000000042 <+66>:    not    %rax
       0x0000000000000045 <+69>:    and    0x8(%rdi),%rax
       0x0000000000000049 <+73>:    sub    $0x1,%rbp
       0x000000000000004d <+77>:    cmp    %rbp,%rax
       0x0000000000000050 <+80>:    cmova  %rbp,%rax
       0x0000000000000054 <+84>:    add    $0x1,%eax

Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <kbusch@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221025191755.1711437-4-bvanassche@acm.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
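As a standalone sanity check (plain userspace C, not kernel code; old_form/new_form are made-up names), the two formulations of the clamp can be shown to agree whenever the segment-size limit is non-zero, which is why dropping the zero test removes the conditional jump:

```c
#include <assert.h>
#include <stdio.h>

/* min_not_zero(mask - offset + 1, max_seg): needs a test for the wrapped-to-zero case */
static unsigned long old_form(unsigned long mask, unsigned long off, unsigned long max_seg)
{
	unsigned long len = mask - off + 1;	/* wraps to 0 when mask == ~0UL and off == 0 */

	if (len == 0)
		return max_seg;
	return len < max_seg ? len : max_seg;
}

/* min(mask - offset, max_seg - 1) + 1: no extra zero test needed */
static unsigned long new_form(unsigned long mask, unsigned long off, unsigned long max_seg)
{
	unsigned long a = mask - off, b = max_seg - 1;

	return (a < b ? a : b) + 1;
}

int main(void)
{
	unsigned long masks[] = { 0xfffUL, 0xffffffffUL, ~0UL };

	for (int i = 0; i < 3; i++)
		for (unsigned long off = 0; off < 16; off++)
			for (unsigned long max = 1; max <= 65536; max <<= 1)
				assert(old_form(masks[i], off, max) == new_form(masks[i], off, max));
	printf("both forms agree for non-zero max_seg\n");
	return 0;
}
```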
-
Submitted by Bart Van Assche
Document which functions do not modify the queue limits.
Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20221025191755.1711437-3-bvanassche@acm.org Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Christoph Hellwig
bio_start_io_acct_time is not actually used anywhere, so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221025155916.270303-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 25 October 2022, 1 commit
-
-
Submitted by Christoph Hellwig
The fact that blk_mq_destroy_queue also drops a queue reference leads to various places having to grab an extra reference. Move the call to blk_put_queue into the callers to allow removing the extra references.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221018135720.670094-2-hch@lst.de [axboe: fix fabrics_q vs admin_q conflict in nvme core.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>
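A two-line sketch of the calling convention after this change; both helpers exist in the block layer, and the explicit pairing is the point:

```c
/* Callers now drop the queue reference themselves instead of relying on
 * blk_mq_destroy_queue() to do it behind their back. */
blk_mq_destroy_queue(q);
blk_put_queue(q);
```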
-
- 24 October 2022, 14 commits
-
-
Submitted by Jinlong Chen
The current reference management logic of io scheduler modules contains refcnt problems. For example, blk_mq_init_sched may fail before or after the call to e->ops.init_sched. If it fails before the call, it does nothing to the reference to the io scheduler module. But if it fails after the call, it releases the reference by calling kobject_put(&eq->kobj).

As the callers of blk_mq_init_sched can't know exactly where the failure happens, they can't handle the reference to the io scheduler module properly: releasing the reference on failure results in a double release if blk_mq_init_sched has already released it, and not releasing the reference results in a ghost reference if blk_mq_init_sched did not release it either. The same problem also exists in io schedulers' init_sched implementations.

We can address the problem by adding releasing statements to the error handling procedures of blk_mq_init_sched and the init_sched implementations. But that is counterintuitive and requires modifications to existing io schedulers.

Instead, we make elevator_alloc get the io scheduler module references that will be released by elevator_release. And then, we match each elevator_get with an elevator_put. Therefore, each reference to an io scheduler module explicitly has its own getter and releaser, and we no longer need to worry about the refcnt problems.

The bugs and the patch can be validated with the tools here: https://github.com/nickyc975/linux_elv_refcnt_bug.git

[hch: split out a few bits into separate patches, use a non-try module_get in elevator_alloc]
Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
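A self-contained userspace analogue of the ownership rule being established (plain C with illustrative names, not the kernel API): the object that caches the scheduler-module pointer takes its own reference in its allocator and drops it in its release function, while every temporary get by a caller has its own matching put:

```c
#include <stdio.h>

struct sched_module { const char *name; int refcnt; };

static void module_get(struct sched_module *m) { m->refcnt++; }
static void module_put(struct sched_module *m) { m->refcnt--; }

struct elevator_queue { struct sched_module *type; };

/* elevator_alloc-like: takes the reference that elevator_release will drop */
static struct elevator_queue *elevator_alloc(struct sched_module *type)
{
	static struct elevator_queue eq;	/* stand-in for kzalloc() */

	module_get(type);			/* reference owned by eq */
	eq.type = type;
	return &eq;
}

/* elevator_release-like: drops exactly the reference taken in the allocator */
static void elevator_release(struct elevator_queue *eq)
{
	module_put(eq->type);
}

int main(void)
{
	struct sched_module bfq = { "bfq", 0 };

	module_get(&bfq);				/* caller's temporary get */
	struct elevator_queue *eq = elevator_alloc(&bfq);
	module_put(&bfq);				/* ...and its matching put */

	elevator_release(eq);				/* eq's own reference dropped */
	printf("%s refcnt = %d (balanced)\n", bfq.name, bfq.refcnt);
	return 0;
}
```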
-
Submitted by Jinlong Chen
No need to find the actual elevator_type struct for this comparison; the name is all that is needed.
Signed-off-by: Jinlong Chen <nickyc975@zju.edu.cn> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Christoph Hellwig
The stripped name should also be used for the "none" check. To do so, strip it in the caller and pass in the sanitized name. Drop the pointless __ prefix in the function name while we're at it. Based on a patch from Jinlong Chen <nickyc975@zju.edu.cn>.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Christoph Hellwig
Make sure we have helpers for all relevant module refcount operations on the elevator_type in elevator.h, and use them. Move the call to the get helper in blk_mq_elv_switch_none a bit so that its purpose is obvious with a less verbose comment.
Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020064819.1469928-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
Commit b5dc5d4d ("block,bfq: Disable writeback throttling") tries to disable wbt for bfq; it is done by calling wbt_disable_default() in bfq_init_queue(). However, wbt is still enabled if the default elevator is bfq:

    device_add_disk
      elevator_init_mq
        bfq_init_queue
          wbt_disable_default -> does nothing

      blk_register_queue
        wbt_enable_default -> wbt is enabled

Fix the problem by adding a new flag, ELEVATOR_FLAG_DISABLE_WBT: bfq sets the flag in bfq_init_queue, and a following wbt_enable_default() won't enable wbt while the flag is set.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-7-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
Currently there is only one flag, used to indicate that the elevator is registered. Prepare to add a flag to disable wbt if the default elevator is bfq.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-6-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
Currently, if wbt is initialized and then disabled by wbt_disable_default(), sysfs will still show a valid wbt_lat_usec, which can mislead users into thinking that wbt is still enabled. This patch shows wbt_lat_usec as zero if wbt is disabled.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reported-and-tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-5-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
Currently, if the user disables wbt through sysfs, 'enable_state' will be 'WBT_STATE_ON_MANUAL', which is confusing. Add a new state, 'WBT_STATE_OFF_MANUAL', to cover that case.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
If CONFIG_BLK_WBT_MQ is disabled, wbt_init() won't do anything.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221019121518.3865235-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
"elevator_queue *e" is already declared and initialized at the beginning of elv_unregister_queue().
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20221019121518.3865235-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
'ioc->params' is updated in ioc_refresh_params(), which is protected by 'ioc->lock'; however, ioc_timer_fn() reads the params outside the lock.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-5-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
This won't cause any severe problem currently; however, it doesn't seem appropriate:

1) 'ioc->params' is read from multiple places without holding 'ioc->lock', so unexpected values might be read if it is written concurrently.
2) If the configuration is changed while io is being throttled, the functionality might be affected. For example, if module params are updated and the cost becomes smaller, waiting on a timer that was calculated under the old configuration is not appropriate.

Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-4-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yu Kuai
ioc_qos_write() and ioc_cost_model_write() do the same thing:

1) hold the lock to read 'ioc->params' into a local variable;
2) update the params in the local variable without the lock;
3) hold the lock to write the local variable back to 'ioc->params'.

In theory, if users update params concurrently, an update might be lost:

    t1: update params a                 t2: update params b
    spin_lock_irq(&ioc->lock);
    memcpy(qos, ioc->params.qos, sizeof(qos))
    spin_unlock_irq(&ioc->lock);
    qos[a] = xxx;
                                        spin_lock_irq(&ioc->lock);
                                        memcpy(qos, ioc->params.qos, sizeof(qos))
                                        spin_unlock_irq(&ioc->lock);
                                        qos[b] = xxx;
    spin_lock_irq(&ioc->lock);
    memcpy(ioc->params.qos, qos, sizeof(qos));
    ioc_refresh_params(ioc, true);
    spin_unlock_irq(&ioc->lock);
                                        spin_lock_irq(&ioc->lock);
                                        // updates of a will be lost
                                        memcpy(ioc->params.qos, qos, sizeof(qos));
                                        ioc_refresh_params(ioc, true);
                                        spin_unlock_irq(&ioc->lock);

Although this is not a common case, the problem can be fixed easily by holding the lock through the whole read, update, write sequence.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-3-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
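A hedged sketch of the fixed shape (names taken from the race diagram above; idx/new_value are placeholders for the parsed input): the read, the modification, and the write-back of 'ioc->params' all happen in one critical section, so concurrent writers serialize instead of overwriting each other:

```c
spin_lock_irq(&ioc->lock);
memcpy(qos, ioc->params.qos, sizeof(qos));	/* read under the lock       */
qos[idx] = new_value;				/* modify under the lock     */
memcpy(ioc->params.qos, qos, sizeof(qos));	/* write back under the lock */
ioc_refresh_params(ioc, true);
spin_unlock_irq(&ioc->lock);
```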
-
Submitted by Yu Kuai
Commit b5dc5d4d ("block,bfq: Disable writeback throttling") disabled wbt for bfq, because different write-throttling heuristics should not work together. For the same reason, wbt and iocost should not work together either, unless the admin really wants that despite the performance impact.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20221012094035.390056-2-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 20 October 2022, 2 commits
-
-
Submitted by Pavel Begunkov
bio_put() with REQ_ALLOC_CACHE assumes that it is not executed from irq context. Let's add a warning if that invariant is not respected, especially since there are a couple of places removing REQ_POLLED by hand without also clearing REQ_ALLOC_CACHE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/558d78313476c4e9c233902efa0092644c3d420a.1666122465.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Yuwei Guan
It was defined in commit d0edc247, but it is used nowhere, so remove it.
Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com> Acked-by: Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20221018030139.159-1-Yuwei.Guan@zeekrlife.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 17 October 2022, 1 commit
-
-
Submitted by Yu Kuai
Our syzkaller reported a null pointer dereference; the root cause is the following:

    __blk_mq_alloc_map_and_rqs
      set->tags[hctx_idx] = blk_mq_alloc_map_and_rqs
        blk_mq_alloc_map_and_rqs
          blk_mq_alloc_rqs
            // failed due to oom
            alloc_pages_node
      // set->tags[hctx_idx] is still NULL
      blk_mq_free_rqs
        drv_tags = set->tags[hctx_idx];
        // null pointer dereference is triggered
        blk_mq_clear_rq_mapping(drv_tags, ...)

This is because commit 63064be1 ("blk-mq: Add blk_mq_alloc_map_and_rqs()") merged the two steps

1) set->tags[hctx_idx] = blk_mq_alloc_rq_map()
2) blk_mq_alloc_rqs(..., set->tags[hctx_idx])

into one step

    set->tags[hctx_idx] = blk_mq_alloc_map_and_rqs()

Since tags is not initialized yet in this case, fix the problem by checking whether tags is a NULL pointer in blk_mq_clear_rq_mapping().
Fixes: 63064be1 ("blk-mq: Add blk_mq_alloc_map_and_rqs()") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Link: https://lore.kernel.org/r/20221011142253.4015966-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
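A hedged sketch of the defensive check described above (the function and its two-tags arguments appear in the call chain; the rest of the body is elided):

```c
static void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
				    struct blk_mq_tags *tags)
{
	/* Driver tags for this hctx were never set up (e.g. OOM inside
	 * blk_mq_alloc_map_and_rqs()), so there is no stale mapping to clear. */
	if (!drv_tags)
		return;

	/* ... existing code that walks the requests and clears pointers ... */
}
```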
-
- 12 October 2022, 1 commit
-
-
Submitted by Jason A. Donenfeld
The prandom_bytes() function has been a deprecated inline wrapper around get_random_bytes() for several releases now, and compiles down to the exact same code. Replace the deprecated wrapper with a direct call to the real function. This was done as a basic find and replace.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> # powerpc Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
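The change at each call site is one-for-one, since both helpers fill a caller-supplied buffer with random bytes (illustrative snippet; buf/len stand for whatever the call site already passes):

```c
/* before */ prandom_bytes(buf, len);
/* after  */ get_random_bytes(buf, len);
```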
-
- 10 October 2022, 1 commit
-
-
Submitted by Christoph Hellwig
The major/minor of a hidden gendisk is not propagated to the block device because it is never registered using bdev_add. But the lack of bd_dev also causes the dynamic major/minor number not to be freed. Assign bd_dev manually to ensure the dynamic major/minor gets freed. Based on a patch by Keith Busch.
Fixes: 8ddcd653 ("block: introduce GENHD_FL_HIDDEN") Reported-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20221010131857.748129-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
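A hedged sketch of the shape of the fix in device_add_disk(); only the manual bd_dev assignment for the hidden case is the point, and the surrounding code is elided:

```c
	if (disk->flags & GENHD_FL_HIDDEN) {
		/* bdev_add() is skipped for hidden disks, so fill in bd_dev by
		 * hand; without it the dynamically allocated major/minor is
		 * never released when the disk is torn down. */
		disk->part0->bd_dev = MKDEV(disk->major, disk->first_minor);
	}
```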
-
- 09 October 2022, 1 commit
-
-
Submitted by Yu Kuai
Commit 8c5035df ("blk-wbt: call rq_qos_add() after wb_normal is initialized") moves wbt_set_write_cache() before rq_qos_add(), which is wrong because wbt_rq_qos() is still NULL there. Fix the problem by removing wbt_set_write_cache() and setting 'rwb->wc' directly. Note that this patch also removes the redundant setting of 'rwb->wc'.
Fixes: 8c5035df ("blk-wbt: call rq_qos_add() after wb_normal is initialized") Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/r/202210081045.77ddf59b-yujie.liu@intel.com Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20221009101038.1692875-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 06 October 2022, 1 commit
-
-
Submitted by Deming Wang
Remove the repeated word 'can' from the comments of bio_kmalloc.
Signed-off-by: Deming Wang <wangdeming@inspur.com> Link: https://lore.kernel.org/r/20221006084450.1513-1-wangdeming@inspur.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 04 October 2022, 2 commits
-
-
Submitted by Alexander Potapenko
KMSAN doesn't allow treating adjacent memory pages as a contiguous region if they were allocated by different alloc_pages() calls. The block layer, however, does so: adjacent pages end up being used together. To prevent this, make page_is_mergeable() return false under KMSAN.
Link: https://lkml.kernel.org/r/20220915150417.722975-29-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Suggested-by: Eric Biggers <ebiggers@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
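A hedged sketch of what "return false under KMSAN" can look like inside bio.c's page_is_mergeable() (IS_ENABLED() and CONFIG_KMSAN are real kernel facilities; the exact placement within the function is illustrative):

```c
	/* Pages from different alloc_pages() calls carry separate KMSAN
	 * metadata, so never report them as mergeable when KMSAN is on. */
	if (IS_ENABLED(CONFIG_KMSAN))
		return false;
```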
-
Submitted by Alexander Potapenko
KMSAN metadata for adjacent physical pages may not be adjacent; therefore, accessing such pages together may lead to metadata corruption. We disable merging pages in biovecs to prevent such corruption.
Link: https://lkml.kernel.org/r/20220915150417.722975-28-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- 30 September 2022, 8 commits
-
-
Submitted by Kanchan Joshi
Extend blk_rq_map_user_iov so that it can handle a bvec iterator, using the new blk_rq_map_user_bvec function. It maps the pages from the bvec iterator into a bio and places the bio into the request. This helper will be used by nvme for the uring-passthrough path when IO is done using pre-mapped buffers.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Suggested-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220930062749.152261-11-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Kanchan Joshi
Move the bio allocation logic from bio_map_user_iov to a new helper, blk_rq_map_bio_alloc. It is named so because its functionality is the opposite of what is done inside blk_mq_map_bio_put. This is a prep patch.
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/r/20220930062749.152261-10-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Anuj Gupta
This patch renames the existing bio_map_put function to blk_mq_map_bio_put.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220930062749.152261-9-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Anuj Gupta
Create a helper, blk_rq_map_user_io, for mapping both vectored and non-vectored requests. This helps avoid duplication of code at a few places in scsi and nvme.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220930062749.152261-4-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
With end_io handlers now being able to potentially pass ownership of the request upon completion, we can allow requests with end_io handlers in the batch completion handling.
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Co-developed-by: Stefan Roesch <shr@fb.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
Everything is just converted to returning RQ_END_IO_NONE, and there should be no functional changes with this patch. This is in preparation for allowing the end_io handler to pass ownership back to the block layer, rather than retaining ownership of the request.
Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
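A minimal sketch of what a converted handler looks like (the handler name and body are illustrative; the enum return type and RQ_END_IO_NONE are what the conversion introduces):

```c
static enum rq_end_io_ret my_end_io(struct request *rq, blk_status_t error)
{
	/* ...whatever the handler already did, e.g. wake a sync waiter... */
	complete((struct completion *)rq->end_io_data);

	/* "No functional change": keep ownership of the request, exactly as
	 * the old void-returning handlers implicitly did. */
	return RQ_END_IO_NONE;
}
```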
-
Submitted by Jens Axboe
The filesystem IO path can take advantage of allocating batches of requests if the underlying submitter tells the block layer about it through the blk_plug. For passthrough IO, the exported API is the blk_mq_alloc_request() helper, and that one does not allow for request caching. Wire up request caching for blk_mq_alloc_request(), which is generally done without having a bio available upfront.
Tested-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Submitted by Jens Axboe
We've never had any useful reports from this BUG_ON(), and in fact a number of the BUG_ON()s in the flush handling need to be turned into more graceful handling. In preparation for allowing batched completions of the end_io handling, where we can enter the flush completion with queuelist having been reused for the batch, get rid of this BUG_ON().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-