Commits · 4c928904ff771a8e830773b71a080047365324a5 · openeuler / Kernel

18 Oct, 2021 · 40 commits

block: move CONFIG_BLOCK guard to top Makefile · 4c928904

Authored by Masahiro Yamada on Sep 27, 2021

Every object under block/ depends on CONFIG_BLOCK.

Move the guard to the top Makefile since there is no point in
descending into block/ if CONFIG_BLOCK=n.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210927140000.866249-5-masahiroy@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4c928904

block: move menu "Partition type" to block/partitions/Kconfig · b8b98a62

Authored by Masahiro Yamada on Sep 27, 2021

Move the menu to the relevant place.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210927140000.866249-4-masahiroy@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b8b98a62

block: simplify Kconfig files · c50fca55

Authored by Masahiro Yamada on Sep 27, 2021

Everything under block/ depends on BLOCK. BLOCK_HOLDER_DEPRECATED is
selected from drivers/md/Kconfig, which is entirely dependent on BLOCK.

Extend the 'if BLOCK' ... 'endif' so it covers the whole block/Kconfig.

Also, clean up the definition of BLOCK_COMPAT and BLK_MQ_PCI because
COMPAT and PCI are boolean.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210927140000.866249-3-masahiroy@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c50fca55

block: remove redundant =y from BLK_CGROUP dependency · df252bde

Authored by Masahiro Yamada on Sep 27, 2021

CONFIG_BLK_CGROUP is a boolean option, that is, its value is 'y' or 'n'.
The comparison to 'y' is redundant.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210927140000.866249-2-masahiroy@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

df252bde

block: improve batched tag allocation · 349302da

Authored by Jens Axboe on Oct 09, 2021

Add a blk_mq_get_tags() helper, which uses the new sbitmap API for
allocating a batch of tags all at once. This both simplifies the block
code for batched allocation, and it is also more efficient than just
doing repeated calls into __sbitmap_queue_get().

This reduces the sbitmap overhead in peak runs from ~3% to ~1% and
yields a performance increase from 6.6M IOPS to 6.8M IOPS for a single
CPU core.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

349302da

sbitmap: add __sbitmap_queue_get_batch() · 9672b0d4

Authored by Jens Axboe on Oct 09, 2021

The block layer tag allocation batching still calls into sbitmap to get
each tag, but we can improve on that. Add __sbitmap_queue_get_batch(),
which returns a mask of tags all at once, along with an offset for
those tags.

An example return would be 0xff, where bits 0..7 are set, with
tag_offset == 128. The valid tags in this case would be 128..135.

A batch is specific to an individual sbitmap_map, hence it cannot be
larger than that. The requested number of tags is automatically reduced
to the max that can be satisfied with a single map.

On failure, 0 is returned. The caller should fall back to single tag
allocation at that point.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

9672b0d4
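
To make the mask-plus-offset return convention concrete, here is a minimal userspace sketch in C. It is not the kernel implementation: `decode_tag_batch()` and its parameters are hypothetical names introduced only for illustration. It expands a batch result such as the one described above (mask 0xff, tag_offset 128) into individual tag numbers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): expand a batch result, given as
 * a bit mask plus a base offset, into individual tag numbers.
 * Returns the number of tags written to 'tags'.
 */
size_t decode_tag_batch(unsigned long mask, unsigned int tag_offset,
			unsigned int *tags, size_t max_tags)
{
	size_t n = 0;
	unsigned int bit = 0;

	assert(tags != NULL);

	/* A mask of 0 means the batch allocation failed: nothing to decode. */
	while (mask && n < max_tags) {
		if (mask & 1UL)
			tags[n++] = tag_offset + bit;
		mask >>= 1;
		bit++;
	}
	return n;
}
```

With a mask of 0xff and an offset of 128 this yields the eight tags 128..135; a return of 0 corresponds to the failure case, where a caller would presumably fall back to allocating one tag at a time.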

blk-mq: optimise *end_request non-stat path · 8971a3b7

Authored by Pavel Begunkov on Oct 13, 2021

We already have a blk_mq_need_time_stamp() check in
__blk_mq_end_request() to get a timestamp, so hide all the statistics
accounting under it. It cuts some cycles for requests that don't need
stats, and is free otherwise.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/e0f2ea812e93a8adcd07101212e7d7e70ca304e7.1634115360.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8971a3b7

block: mark bio_truncate static · 4f7ab09a

Authored by Christoph Hellwig on Oct 12, 2021

bio_truncate is only used in bio.c, so mark it static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4f7ab09a

block: move bio_get_{first,last}_bvec out of bio.h · ff18d77b

Authored by Christoph Hellwig on Oct 12, 2021

bio_get_first_bvec and bio_get_last_bvec are only used in blk-merge.c,
so move them there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ff18d77b

block: mark __bio_try_merge_page static · 9774b391

Authored by Christoph Hellwig on Oct 12, 2021

Mark __bio_try_merge_page static and move it up a bit to avoid the need
for a forward declaration.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9774b391

block: move bio_full out of bio.h · 9a6083be

Authored by Christoph Hellwig on Oct 12, 2021

bio_full is only used in bio.c, so move it there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9a6083be

block: fold bio_cur_bytes into blk_rq_cur_bytes · b6559d8f

Authored by Christoph Hellwig on Oct 12, 2021

Fold bio_cur_bytes into the only caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b6559d8f

block: move bio_mergeable out of bio.h · 8addffd6

Authored by Christoph Hellwig on Oct 12, 2021

bio_mergeable is only needed by I/O schedulers, so move it to
blk-mq-sched.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8addffd6

block: don't include <linux/ioprio.h> in <linux/bio.h> · 11d9cab1

Authored by Christoph Hellwig on Oct 12, 2021

bio.h doesn't need any of the definitions from ioprio.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

11d9cab1

block: remove BIO_BUG_ON · 9e8c0d0d

Authored by Christoph Hellwig on Oct 12, 2021

BIO_DEBUG is always defined, so just switch the two instances to use
BUG_ON directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012161804.991559-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9e8c0d0d

blk-mq: inline hot part of __blk_mq_sched_restart · e9ea1596

Authored by Pavel Begunkov on Oct 09, 2021

Extract a fast check out of __blk_mq_sched_restart() and inline it for
performance reasons.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/894abaa0998e5999f2fe18f271e5efdfc2c32bd2.1633781740.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e9ea1596

block: inline hot paths of blk_account_io_*() · be6bfe36

Authored by Pavel Begunkov on Oct 09, 2021

Extract hot paths of __blk_account_io_start() and
__blk_account_io_done() into inline functions, so we don't always pay
for function calls.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b0662a636bd4cc7b4f84c9d0a41efa46a688ef13.1633781740.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

be6bfe36
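
The general pattern behind this kind of change can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the kernel code: `struct demo_req`, the counters, and both function names are hypothetical. The cheap test lives in a `static inline` wrapper so call sites pay for a function call only when accounting is actually needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request type; only the fields this sketch needs. */
struct demo_req {
	bool io_stat;		/* does this request want accounting? */
	unsigned long bytes;
};

/* Hypothetical global counters standing in for per-disk statistics. */
static unsigned long demo_ios, demo_bytes;

/* Slow path: kept out of line, reached only when accounting is enabled. */
void demo_account_io_done_slow(struct demo_req *rq)
{
	demo_ios++;
	demo_bytes += rq->bytes;
}

/* Hot path: a cheap flag test that the compiler can inline at call sites. */
static inline void demo_account_io_done(struct demo_req *rq)
{
	assert(rq != NULL);
	if (rq->io_stat)
		demo_account_io_done_slow(rq);
}
```

Requests with `io_stat` cleared never leave the inlined check, which is where the saving would come from in this sketch.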

block: merge block_ioctl into blkdev_ioctl · 8a709512

Authored by Christoph Hellwig on Oct 12, 2021

Simplify the ioctl path and match the code structure on the compat side.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8a709512

block: move the *blkdev_ioctl declarations out of blkdev.h · 84b8514b

Authored by Christoph Hellwig on Oct 12, 2021

These are only used inside of block/.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

84b8514b

block: unexport blkdev_ioctl · fea349b0

Authored by Christoph Hellwig on Oct 12, 2021

With the raw driver gone, there is no modular user left.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104450.659013-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fea349b0

block: don't dereference request after flush insertion · 4a60f360

Authored by Jens Axboe on Oct 16, 2021

We could have a race here, where the request gets freed before we call
into blk_mq_run_hw_queue(). If this happens, we cannot rely on the state
of the request.

Grab the hardware context before inserting the flush.

Fixes: 0f38d766 ("blk-mq: cleanup blk_mq_submit_bio")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4a60f360
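
The fix follows a common rule: read whatever you still need from an object before handing it to code that may free it. Below is a minimal userspace sketch of that rule. All names (`demo_req`, `demo_hctx`, `demo_insert_flush()`, and so on) are hypothetical stand-ins rather than block-layer APIs, and the "insert" step frees the request immediately to model the worst case of the race.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_hctx { int runs; };
struct demo_req { struct demo_hctx *hctx; };

/*
 * Hypothetical stand-in for flush insertion: ownership of 'rq' passes to
 * the callee, which may complete and free it before returning.
 */
void demo_insert_flush(struct demo_req *rq)
{
	free(rq);
}

void demo_run_hw_queue(struct demo_hctx *hctx)
{
	hctx->runs++;
}

void demo_submit_flush(struct demo_req *rq)
{
	/* Read the hardware context while 'rq' is still guaranteed valid. */
	struct demo_hctx *hctx = rq->hctx;

	assert(hctx != NULL);
	demo_insert_flush(rq);
	/* 'rq' must not be touched here; use the saved pointer instead. */
	demo_run_hw_queue(hctx);
}
```

Writing `demo_run_hw_queue(rq->hctx)` after the insert would be a use-after-free in this sketch, which mirrors the dereference the commit message warns about.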

blk-mq: cleanup blk_mq_submit_bio · 0f38d766

Authored by Christoph Hellwig on Oct 12, 2021

Move the blk_mq_alloc_data stack allocation only into the branch
that actually needs it, and use rq->mq_hctx instead of data.hctx
to refer to the hctx.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104045.658051-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0f38d766

blk-mq: cleanup and rename __blk_mq_alloc_request · b90cfaed

Authored by Christoph Hellwig on Oct 12, 2021

The newly added loop for the cached requests in __blk_mq_alloc_request
is a little too convoluted for my taste, so unwind it a bit. Also
rename the function to __blk_mq_alloc_requests now that it can allocate
more than a single request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211012104045.658051-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b90cfaed

block: pre-allocate requests if plug is started and is a batch · 47c122e3

Authored by Jens Axboe on Oct 06, 2021

The caller typically has a good (or even exact) idea of how many requests
it needs to submit. We can make the request/tag allocation a lot more
efficient if we just allocate N requests/tags upfront when we queue the
first bio from the batch.

Provide a new plug start helper that allows the caller to specify how many
IOs are expected. This sets plug->nr_ios, and we can use that for smarter
request allocation. The plug provides a holding spot for requests, and
request allocation will check it before calling into the normal request
allocation path.

When blk_finish_plug() is called, check if there are unused requests and
free them. This should not happen in normal operations. The exception is
merging, which may leave us with requests that need freeing when done.

This raises the per-core performance on my setup from ~5.8M to ~6.1M
IOPS.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

47c122e3
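
The idea of a plug that holds pre-allocated requests can be sketched in userspace C as below. This is a simplified model, not the kernel code: the `demo_*` types and functions are hypothetical, `malloc()` stands in for request/tag allocation, and resetting `nr_ios` to 1 after the first batch is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_req { struct demo_req *next; };

struct demo_plug {
	unsigned int nr_ios;		/* how many I/Os the caller expects */
	struct demo_req *cached;	/* pre-allocated, not yet used */
};

void demo_start_plug_nr_ios(struct demo_plug *plug, unsigned int nr_ios)
{
	plug->nr_ios = nr_ios ? nr_ios : 1;
	plug->cached = NULL;
}

/* Take a request from the plug cache, filling it in one batch if empty. */
struct demo_req *demo_alloc_request(struct demo_plug *plug)
{
	struct demo_req *rq;

	assert(plug != NULL);
	if (!plug->cached) {
		/* Allocate everything the caller said it needs up front. */
		for (unsigned int i = 0; i < plug->nr_ios; i++) {
			rq = malloc(sizeof(*rq));
			if (!rq)
				break;
			rq->next = plug->cached;
			plug->cached = rq;
		}
		/* Assumption: later refills allocate one request at a time. */
		plug->nr_ios = 1;
	}
	rq = plug->cached;
	if (rq)
		plug->cached = rq->next;
	return rq;
}

/* Free whatever was pre-allocated but never used; returns the count. */
unsigned int demo_finish_plug(struct demo_plug *plug)
{
	unsigned int freed = 0;

	while (plug->cached) {
		struct demo_req *rq = plug->cached;

		plug->cached = rq->next;
		free(rq);
		freed++;
	}
	return freed;
}
```

If a caller announces four I/Os but only submits two (for example because some were merged), `demo_finish_plug()` releases the two leftovers, which corresponds to the cleanup step described above.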

block: bump max plugged deferred size from 16 to 32 · ba0ffdd8

Authored by Jens Axboe on Oct 06, 2021

Particularly for NVMe with efficient deferred submission for many
requests, there are nice benefits to be seen by bumping the default max
plug count from 16 to 32. This is especially true for virtualized setups,
where the submit part is more expensive, but it can be noticed even on
native hardware.

Reduce the multiple queue factor from 4 to 2, since we're changing the
default size.

While changing it, move the defines into the block layer private header.
These aren't values that anyone outside of the block layer uses, or
should use.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

ba0ffdd8

block: inherit request start time from bio for BLK_CGROUP · 00067077

Authored by Jens Axboe on Oct 05, 2021

Doing high IOPS testing with blk-cgroups enabled spends ~15-20% of the
time just doing ktime_get_ns() -> readtsc. We essentially read and
set the start time twice, once for the bio and then again when that bio
is mapped to a request.

Given that the time between the two is very short, inherit the bio
start time instead of reading it again. This cuts 1/3rd of the overhead
of the time keeping.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

00067077
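
A small sketch of the "inherit instead of re-read" idea follows. It is an illustration only: the `demo_*` names are hypothetical, and a simulated clock (a counter) replaces the real time source so that the number of clock reads can be observed directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_bio { uint64_t start_time_ns; };
struct demo_req { uint64_t start_time_ns; };

/* Simulated clock: counts reads so the saving is observable. */
static unsigned int demo_clock_reads;
static uint64_t demo_fake_time_ns;

static uint64_t demo_now_ns(void)
{
	demo_clock_reads++;
	return ++demo_fake_time_ns;	/* never returns 0 in this sketch */
}

void demo_bio_init(struct demo_bio *bio)
{
	bio->start_time_ns = demo_now_ns();
}

/* Reuse the bio's timestamp if it has one; read the clock otherwise. */
void demo_req_init(struct demo_req *rq, const struct demo_bio *bio)
{
	assert(rq != NULL);
	if (bio && bio->start_time_ns)
		rq->start_time_ns = bio->start_time_ns;
	else
		rq->start_time_ns = demo_now_ns();
}
```

In this model a request built from a timestamped bio costs no additional clock read, while a request without a bio still gets its own timestamp.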

block: move blk-throtl fast path inline · a7b36ee6

Authored by Jens Axboe on Oct 05, 2021

Even if no policies are defined, we spend ~2% of the total IO time
checking. Move the fast path inline.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a7b36ee6

blk-mq: Change shared sbitmap naming to shared tags · 079a2e3e

Authored by John Garry on Oct 05, 2021

Now that shared sbitmap support really means shared tags, rename symbols
to match that.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633429419-228500-15-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

079a2e3e

blk-mq: Stop using pointers for blk_mq_tags bitmap tags · ae0f1a73

Authored by John Garry on Oct 05, 2021

Now that we use shared tags for shared sbitmap support, we don't require
the tags sbitmap pointers, so drop them.

This essentially reverts commit 222a5ae0 ("blk-mq: Use pointers for
blk_mq_tags bitmap tags").

Function blk_mq_init_bitmap_tags() is also removed, since it would only
be a wrapper for blk_mq_init_bitmaps().

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1633429419-228500-14-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ae0f1a73

blk-mq: Use shared tags for shared sbitmap support · e155b0c2

Authored by John Garry on Oct 05, 2021

Currently we use separate sbitmap pairs and active_queues atomic_t for
shared sbitmap support.

However, a full set of static requests is used per HW queue, which is
quite wasteful, considering that the total number of requests usable at
any given time across all HW queues is limited by the shared sbitmap depth.

As such, it is considerably more memory efficient in the case of shared
sbitmap to allocate a set of static rqs per tag set or request queue, and
not per HW queue.

So replace the sbitmap pairs and active_queues atomic_t with shared
tags per tagset and request queue, which will hold a set of shared static
rqs.

Since there is now no valid HW queue index to be passed to the blk_mq_ops
.init and .exit_request callbacks, pass an invalid index token. This
changes the semantics of the APIs, such that the callback would need to
validate the HW queue index before using it. Currently no user of shared
sbitmap actually uses the HW queue index (as would be expected).

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-13-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e155b0c2
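
The changed callback semantics can be illustrated with a hedged sketch of what "validate the HW queue index before using it" might look like in a driver-style callback. Everything here is hypothetical: `DEMO_NO_HCTX_IDX`, `DEMO_NR_HW_QUEUES`, `struct demo_req`, and `demo_init_request()` are illustration-only names and do not claim to match the kernel's actual token or callback signature.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical token meaning "no valid hardware queue index". */
#define DEMO_NO_HCTX_IDX	UINT_MAX
#define DEMO_NR_HW_QUEUES	4

struct demo_req { int queue_hint; };

/* Hypothetical driver callback: validate hctx_idx before using it. */
int demo_init_request(struct demo_req *rq, unsigned int hctx_idx)
{
	assert(rq != NULL);
	if (hctx_idx == DEMO_NO_HCTX_IDX) {
		/* Shared static requests: not tied to any hardware queue. */
		rq->queue_hint = -1;
		return 0;
	}
	if (hctx_idx >= DEMO_NR_HW_QUEUES)
		return -1;	/* out of range: reject */
	rq->queue_hint = (int)hctx_idx;
	return 0;
}
```

A callback written this way keeps working whether requests are allocated per hardware queue or shared across them.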

blk-mq: Refactor and rename blk_mq_free_map_and_{requests->rqs}() · 645db34e

Authored by John Garry on Oct 05, 2021

Refactor blk_mq_free_map_and_requests() such that it can be used at many
sites at which the tag map and rqs are freed.

Also rename to blk_mq_free_map_and_rqs(), which is shorter and matches the
alloc equivalent.

Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-12-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

645db34e

blk-mq: Add blk_mq_alloc_map_and_rqs() · 63064be1

Authored by John Garry on Oct 05, 2021

Add a function to combine allocating tags and the associated requests,
and factor out common patterns to use this new function.

Some functions only call blk_mq_alloc_map_and_rqs() now, but more
functionality will be added later.

Also make blk_mq_alloc_rq_map() and blk_mq_alloc_rqs() static since they
are only used in blk-mq.c, and finally rename some functions for
conciseness and consistency with other function names:
- __blk_mq_alloc_map_and_{request -> rqs}()
- blk_mq_alloc_{map_and_requests -> set_map_and_rqs}()

Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-11-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

63064be1

blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() · a7e7388d

Authored by John Garry on Oct 05, 2021

Put the functionality to update the sched shared sbitmap size in a common
function.

Since the same formula is always used to resize, and it can be derived
from the request queue argument, just pass the request queue pointer.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-10-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a7e7388d

blk-mq: Don't clear driver tags own mapping · 4f245d5b

Authored by John Garry on Oct 05, 2021

Function blk_mq_clear_rq_mapping() is required to clear the sched tags
mappings in driver tags rqs[].

But there is no need for driver tags to clear their own mapping, so skip
clearing the mapping in this scenario.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-9-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4f245d5b

blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() · f32e4eaf

Authored by John Garry on Oct 05, 2021

Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags
in future, so pass a driver tags pointer instead of the tagset container
and HW queue index.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-8-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f32e4eaf

blk-mq-sched: Rename blk_mq_sched_free_{requests -> rqs}() · 1820f4f0

Authored by John Garry on Oct 05, 2021

To be more concise and consistent in naming, rename
blk_mq_sched_free_requests() -> blk_mq_sched_free_rqs().

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/1633429419-228500-7-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

1820f4f0

blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_rqs}() · d99a6bb3

Authored by John Garry on Oct 05, 2021

Function blk_mq_sched_alloc_tags() does the same as
__blk_mq_alloc_map_and_request(), so give it a similar name to be
consistent.

Similarly rename label err_free_tags -> err_free_map_and_rqs.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-6-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d99a6bb3

blk-mq: Invert check in blk_mq_update_nr_requests() · f6adcef5

Authored by John Garry on Oct 05, 2021

It's easier to read:

if (x)
	X;
else
	Y;

over:

if (!x)
	Y;
else
	X;

No functional change intended.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-5-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f6adcef5

blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() · 8fa04464

Authored by John Garry on Oct 05, 2021

For shared sbitmap, if the call to blk_mq_tag_update_depth() was
successful for any hctx when hctx->sched_tags is not set, then it would be
successful for all (due to the nature in which blk_mq_tag_update_depth()
fails).

As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for
each hctx. So relocate the call to after the hctx iteration under the
!q->elevator check, which is equivalent (to !hctx->sched_tags).

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-4-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8fa04464

block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ · d2a27964

Authored by John Garry on Oct 05, 2021

It is a bit confusing that there is BLKDEV_MAX_RQ and MAX_SCHED_RQ, as
the name BLKDEV_MAX_RQ would imply the max requests always, which it is
not.

Rename BLKDEV_MAX_RQ to BLKDEV_DEFAULT_RQ, matching its usage - that being
the default number of requests assigned when allocating a request queue.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/1633429419-228500-3-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d2a27964

openeuler / Kernel: synced successfully over 1 year ago