Commits · cb8432d650fe3be58bb962bc8e602dc405510327 · openeuler / Kernel

02 Dec, 2020 13 commits

block: allocate struct hd_struct as part of struct bdev_inode · cb8432d6

Authored by Christoph Hellwig on Nov 26, 2020

Allocate hd_struct together with struct block_device to pre-load
the lifetime rule changes in preparation for merging the two structures.

Note that part0 was previously embedded into struct gendisk, but is
a separate allocation now, and already points to the block_device instead
of the hd_struct.  The lifetime of struct gendisk is still controlled by
the struct device embedded in the part0 hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move the policy field to struct block_device · 83950d35

Authored by Christoph Hellwig on Nov 23, 2020

Move the policy field to struct block_device and rename it to the
more descriptive bd_read_only.  Also turn the field into a bool as it
is used as such.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move make_it_fail to struct block_device · b309e993

Authored by Christoph Hellwig on Nov 23, 2020

Move the make_it_fail flag to struct block_device and turn it into a bool
in preparation for killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move holder_dir to struct block_device · 1bdd5ae0

Authored by Christoph Hellwig on Nov 23, 2020

Move the holder_dir field to struct block_device in preparation for
killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move the partition_meta_info to struct block_device · 231926db

Authored by Christoph Hellwig on Nov 24, 2020

Move the partition_meta_info to struct block_device in preparation for
killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move the start_sect field to struct block_device · 29ff57c6

Authored by Christoph Hellwig on Nov 24, 2020

Move the start_sect field to struct block_device in preparation
for killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: move disk stat accounting to struct block_device · 15e3d2c5

Authored by Christoph Hellwig on Nov 24, 2020

Move the dkstats and stamp fields to struct block_device in preparation
for killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove the nr_sects field in struct hd_struct · a782483c

Authored by Christoph Hellwig on Nov 26, 2020

Now that the hd_struct always has a block device attached to it, there is
no need for having two size fields that just get out of sync.

Additionally the field in hd_struct did not use proper serialization,
possibly allowing for torn writes.  By only using the block_device field
this problem also gets fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de>	[bcache]
Acked-by: Chao Yu <yuchao0@huawei.com>	[f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
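
To make the "torn writes" remark concrete: on a machine that stores a 64-bit value as two 32-bit halves, a reader can observe one old half and one new half unless access is serialized. The following is a minimal userspace sketch of one common remedy, a sequence counter, written with C11 atomics. It is not the kernel code; `struct guarded_size` and both helper names are hypothetical, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit size kept as two 32-bit halves. */
struct guarded_size {
	atomic_uint seq;         /* even: stable, odd: write in progress */
	_Atomic uint32_t lo, hi; /* the two halves of the value */
};

/* Writer side. Assumes one writer at a time (otherwise add a lock). */
static void guarded_size_write(struct guarded_size *s, uint64_t v)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0); /* no other write may be in flight */
	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both halves come from the same write. */
static uint64_t guarded_size_read(struct guarded_size *s)
{
	unsigned int before, after;
	uint32_t lo, hi;

	do {
		before = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while (before != after || (before & 1));

	return ((uint64_t)hi << 32) | lo;
}
```

A reader that overlaps a write sees either an odd counter or a changed counter and simply retries, so it never returns a half-updated value.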

block: simplify bdev/disk lookup in blkdev_get · 22ae8ce8

Authored by Christoph Hellwig on Nov 26, 2020

To simplify block device lookup and a few other upcoming areas, make sure
that we always have a struct block_device available for each disk and
each partition, and only find existing block devices in bdget. The only
downside of this is that each device and partition uses a little more
memory. The upside will be that a lot of code can be simplified.

With that all we need to look up the block device is to look up the inode
and do a few sanity checks on the gendisk, instead of the separate lookup
for the gendisk. For blk-cgroup which wants to access a gendisk without
opening it, a new blkdev_{get,put}_no_open low-level interface is added
to replace the previous get_gendisk use.

Note that the change to look up the block device directly instead of the
two step lookup using struct gendisk causes a subtle change in behavior:
accessing a non-existing partition on an existing block device can now
cause a call to request_module. That call is harmless, and in practice
no recent system will access these nodes as they aren't created by udev
and static /dev/ setups are unusual.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove i_bdev · 4e7b5671

Authored by Christoph Hellwig on Nov 23, 2020

Switch the block device lookup interfaces to directly work with a dev_t
so that struct block_device references are only acquired by the
blkdev_get variants (and the blk-cgroup special case).  This means that
we now don't need an extra reference in the inode and can generally
simplify handling of struct block_device to keep the lookups contained
in the core block layer code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Coly Li <colyli@suse.de>	[bcache]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: use put_device in put_disk · efdc41c8

Authored by Christoph Hellwig on Nov 10, 2020

Use put_device to put the device instead of poking into the internals
and using kobject_put.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: use disk_part_iter_exit in disk_part_iter_next · e79319af

Authored by Christoph Hellwig on Nov 10, 2020

Call disk_part_iter_exit in disk_part_iter_next instead of duplicating
the functionality.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove a superflous check in blkpg_do_ioctl · 3f50b95e

Authored by Christoph Hellwig on Nov 24, 2020

sector_t is now always a u64, so this check is not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

01 Dec, 2020 1 commit

block: wbt: Remove unnecessary invoking of wbt_update_limits in wbt_init · 5a20d073

Authored by Lei Chen on Nov 30, 2020

It's unnecessary to call wbt_update_limits explicitly within wbt_init,
because the subsequent call to wbt_queue_depth_changed already invokes it.

Signed-off-by: Lei Chen <lennychen@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

16 Nov, 2020 13 commits

block: remove the update_bdev parameter to set_capacity_revalidate_and_notify · 449f4ec9

Authored by Christoph Hellwig on Nov 16, 2020

The update_bdev argument is always set to true, so remove it.  Also
rename the function to the slightly less verbose set_capacity_and_notify,
as propagating the disk size to the block device isn't really
revalidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: fix the kerneldoc comment for __register_blkdev · e2b6b301

Authored by Christoph Hellwig on Nov 14, 2020

Switch the comment to talk about __register_blkdev instead of
register_blkdev and document the new probe parameter.

Fixes: 3da1a61e7046 ("block: add an optional probe callback to major_names")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: switch gendisk lookup to a simple xarray · e418de3a

Authored by Christoph Hellwig on Oct 29, 2020

Now that bdev_map is only used for finding gendisks, we can use
a simple xarray instead of the regions tracking structure for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: add an optional probe callback to major_names · a160c615

Authored by Christoph Hellwig on Oct 29, 2020

Add a callback to the major_names array that allows a driver to override
how to probe for a dev_t that doesn't currently have a gendisk registered.
This will help separate the lookup of the gendisk by dev_t from the probe
action for a not currently registered dev_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: rework requesting modules for unclaimed devices · bd8eff3b

Authored by Christoph Hellwig on Oct 29, 2020

Instead of reusing the ranges in bdev_map, add a new helper that is
called if no range was found.  This is a first step to unpeel and
eventually remove the complex ranges structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: split block_class_lock · e49fbbbf

Authored by Christoph Hellwig on Oct 29, 2020

Split the block_class_lock mutex into one each to protect bdev_map
and major_names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: open code kobj_map into in block/genhd.c · 62b508f8

Authored by Christoph Hellwig on Oct 29, 2020

Copy and paste the kobj_map functionality in the block code in preparation
for completely rewriting it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: cleanup del_gendisk a bit · 6b3ba976

Authored by Christoph Hellwig on Oct 29, 2020

Merge three hidden gendisk checks into one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove __blkdev_driver_ioctl · a7cb3d2f

Authored by Christoph Hellwig on Nov 03, 2020

Just open code it in the few callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: remove set_device_ro · 98f49b63

Authored by Christoph Hellwig on Nov 03, 2020

Fold set_device_ro into its only remaining caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: don't call into the driver for BLKROSET · 732e12d8

Authored by Christoph Hellwig on Nov 03, 2020

Now that all drivers that want to hook into setting or clearing the
read-only flag use the set_read_only method, this code can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: add a new set_read_only method · e00adcad

Authored by Christoph Hellwig on Nov 03, 2020

Add a new method to allow for driver-specific processing when setting or
clearing the block device read-only state.  This allows replacing the
cumbersome and error-prone override of the whole ioctl implementation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
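
The pattern described here, an optional per-driver hook consulted by core code instead of a full ioctl override, can be sketched in plain C as below. This is a userspace illustration only: `struct blk_dev`, `struct blk_dev_ops`, `blk_set_read_only` and `demo_set_read_only` are hypothetical names, and the exact kernel signature is not taken from this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct blk_dev;

/* Hypothetical, trimmed-down operations table. */
struct blk_dev_ops {
	/* Optional hook: may veto or react to a read-only state change. */
	int (*set_read_only)(struct blk_dev *dev, bool ro);
};

struct blk_dev {
	const struct blk_dev_ops *ops;
	bool read_only;
	bool hw_write_protected; /* illustrative driver-private state */
};

/* Core side: ask the driver hook first (if any), then update the flag. */
static int blk_set_read_only(struct blk_dev *dev, bool ro)
{
	assert(dev != NULL);
	if (dev->ops && dev->ops->set_read_only) {
		int ret = dev->ops->set_read_only(dev, ro);

		if (ret)
			return ret; /* driver refused; leave state untouched */
	}
	dev->read_only = ro;
	return 0;
}

/* Example driver hook: refuse read-write on write-protected media. */
static int demo_set_read_only(struct blk_dev *dev, bool ro)
{
	if (!ro && dev->hw_write_protected)
		return -EACCES;
	return 0;
}

static const struct blk_dev_ops demo_ops = {
	.set_read_only = demo_set_read_only,
};
```

A driver that has nothing special to do leaves the hook unset and the core path updates the flag on its own, which is what makes the narrow hook less error-prone than intercepting the whole ioctl.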

block: don't call into the driver for BLKFLSBUF · 4a9d6d66

Authored by Christoph Hellwig on Nov 03, 2020

BLKFLSBUF is entirely contained in the block core, and there is no
good reason to give the driver a hook into processing it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

13 Nov, 2020 1 commit

block: add a return value to set_capacity_revalidate_and_notify · 7e890c37

Authored by Christoph Hellwig on Nov 12, 2020

Return whether the function ended up sending a uevent or not.

Cc: stable@vger.kernel.org # v5.9
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

30 Oct, 2020 1 commit

blk-mq: mark flush request as IDLE in flush_end_io() · 65ff5cd0

Authored by Ming Lei on Oct 30, 2020

Mark flush request as IDLE in its .end_io(), aligning it with how normal
requests behave. The flush request stays in in-flight tags if we're not
using an IO scheduler, so we need to change its state into IDLE.
Otherwise, we will hang in blk_mq_tagset_wait_completed_request() during
error recovery because the flush request state is kept as COMPLETED.

Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Chao Leng <lengchao@huawei.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

28 Oct, 2020 1 commit

block: advance iov_iter on bio_add_hw_page failure · 4977d121

Authored by Naohiro Aota on Oct 28, 2020

When the bio's size reaches max_append_sectors, bio_add_hw_page returns
0 and then __bio_iov_append_get_pages returns -EINVAL. This is an expected
result of building a small enough bio not to be split in the IO path.
However, iov_iter is not advanced in this case, causing the same pages
to be filled into the bio again and again.

Fix the case by properly advancing the iov_iter for already processed
pages.

Fixes: 0512a75b ("block: Introduce REQ_OP_ZONE_APPEND")
Cc: stable@vger.kernel.org # 5.8+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
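
The shape of this fix can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel function: `struct iter`, `struct bio_model`, `bio_add_chunk` and `append_chunks` are hypothetical stand-ins that track byte counts only. The point it shows is that the iterator is advanced by whatever was queued, on the error path as well.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for iov_iter and bio; only byte counts are kept. */
struct iter { size_t pos, count; };     /* consumed offset, bytes remaining */
struct bio_model { size_t size, max; }; /* bytes queued, append size limit */

static void iter_advance(struct iter *it, size_t n)
{
	assert(n <= it->count);
	it->pos += n;
	it->count -= n;
}

/* Returns len on success, 0 if the chunk would exceed the limit. */
static size_t bio_add_chunk(struct bio_model *bio, size_t len)
{
	if (bio->size + len > bio->max)
		return 0;
	bio->size += len;
	return len;
}

/* Queue up to 'size' bytes from the iterator in 'chunk'-sized pieces. */
static int append_chunks(struct bio_model *bio, struct iter *it,
			 size_t size, size_t chunk)
{
	size_t left = size;
	int ret = 0;

	while (left > 0) {
		size_t len = left < chunk ? left : chunk;

		if (bio_add_chunk(bio, len) != len) {
			ret = -EINVAL; /* limit reached: stop, report it */
			break;
		}
		left -= len;
	}
	/* Advance past what was queued, on the error path too. */
	iter_advance(it, size - left);
	return ret;
}
```

With a limit of 8192 bytes and 4096-byte chunks, a 16384-byte request queues two chunks, returns -EINVAL, and leaves the iterator 8192 bytes further along, so a retry continues with fresh data instead of re-adding the same pages.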

26 Oct, 2020 2 commits

blk-cgroup: Pre-allocate tree node on blkg_conf_prep · f255c19b

Authored by Gabriel Krisman Bertazi on Oct 22, 2020

Similarly to commit 457e490f ("blkcg: allocate struct blkcg_gq
outside request queue spinlock"), blkg_create can also trigger
occasional -ENOMEM failures at the radix insertion because any
allocation inside blkg_create has to be non-blocking, making it more
likely to fail.  This causes trouble for userspace tools trying to
configure io weights, which need to deal with this condition.

This patch reduces the occurrence of -ENOMEMs on this path by preloading
the radix tree element in a GFP_KERNEL context, such that we guarantee
the later non-blocking insertion won't fail.

A similar solution exists in blkcg_init_queue for the same situation.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
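
The preloading idea can be modeled in userspace as follows: allocate while blocking is still allowed, then do the insertion inside the critical section without allocating at all. This is only a sketch of the pattern. `struct table`, its `locked` flag (a stand-in for holding a spinlock) and the helper names are hypothetical, and a linked list replaces the radix tree.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct node { int key; struct node *next; };

struct table {
	bool locked;       /* stand-in for "spinlock held, must not block" */
	struct node *head;
};

/* Allocation that may block: only legal outside the critical section. */
static struct node *node_alloc_may_block(const struct table *t)
{
	assert(!t->locked);
	return malloc(sizeof(struct node));
}

/* Insert under the "lock" using memory the caller set aside earlier,
 * so this step cannot fail for lack of memory. */
static void table_insert_preloaded(struct table *t, struct node *n, int key)
{
	t->locked = true;
	n->key = key;
	n->next = t->head;
	t->head = n;
	t->locked = false;
}

static int table_insert(struct table *t, int key)
{
	/* "Preload": allocate first, where failure is rare and easy to report. */
	struct node *n = node_alloc_may_block(t);

	if (!n)
		return -ENOMEM;
	table_insert_preloaded(t, n, key);
	return 0;
}
```

The only place that can return -ENOMEM is now the blocking allocation before the critical section, which is far less likely to fail than a non-blocking one made while the lock is held.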

blk-cgroup: Fix memleak on error path · 52abfcbd

Authored by Gabriel Krisman Bertazi on Oct 22, 2020

If new_blkg allocation raced with blk_policy change and
blkg_lookup_check fails, new_blkg is leaked.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

24 Oct, 2020 1 commit

block: blk-mq: fix a kernel-doc markup · 24f7bb88

Authored by Mauro Carvalho Chehab on Oct 23, 2020

Fix a typo:
	blk_mq_run_hw_queue -> blk_mq_run_hw_queues

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

20 Oct, 2020 1 commit

blk-mq: remove the calling of local_memory_node() · 576e85c5

Authored by Xianting Tian on Oct 19, 2020

We don't need to check whether the node is a memoryless NUMA node before
calling the allocator interface. SLUB (and SLAB, SLOB) relies on the page
allocator to pick a node. The page allocator should deal with memoryless
nodes just fine. It has zonelists constructed for each possible node,
and it will automatically fall back to the node closest to the
requested node, as long as __GFP_THISNODE is not enforced of course.

The code comments of kmem_cache_alloc_node() of SLAB also say this:
 * Fallback to other node is possible if __GFP_THISNODE is not set.

blk-mq code doesn't set __GFP_THISNODE, so we can remove the call
to local_memory_node().

Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

15 Oct, 2020 2 commits

docs: bio: fix a kerneldoc markup · 5cd3ddc1

Authored by Mauro Carvalho Chehab on Sep 09, 2020

Fix this warning:

	./block/bio.c:1098: WARNING: Inline emphasis start-string without end-string.

The thing is that *iter is not a valid markup.

That seems to be a typo:
	*iter -> @iter

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

block: bio: fix a warning at the kernel-doc markups · 5b874af6

Authored by Mauro Carvalho Chehab on Aug 27, 2020

Using "@bio's parent" causes the following warning:
	./block/bio.c:10: WARNING: Inline emphasis start-string without end-string.

The main problem here is that this would be converted into:

	**bio**'s parent

by kernel-doc, which is not a valid notation. It would be
possible to use, instead, this kernel-doc markup:

	``bio's`` parent

Yet here it is probably simpler to just use alternative wording:

	the parent of @bio

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>

14 Oct, 2020 1 commit

block: add zone specific block statuses · 3b481d91

Authored by Keith Busch on Sep 24, 2020

A zoned device with limited resources to open or activate zones may
return an error when the host exceeds those limits. The same command may
be successful if retried later, but the host needs to wait for specific
zone states before it should expect a retry to succeed. Have the block
layer provide an appropriate status for these conditions so applications
can distinguish this error for special handling.

Cc: linux-api@vger.kernel.org
Cc: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

10 Oct, 2020 3 commits

blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue · 47ce030b

Authored by Yang Yang on Oct 09, 2020

blk_exit_queue will free elevator_data, while blk_mq_run_work_fn
will access it. Move cancel of hctx->run_work to the front of
blk_exit_queue to avoid use-after-free.

Fixes: 1b97871b ("blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release")
Signed-off-by: Yang Yang <yang.yang@vivo.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-mq: get rid of the dead flush handle code path · c7281524

Authored by Yufen Yu on Oct 08, 2020

After commit 923218f6 ("blk-mq: don't allocate driver tag upfront
for flush rq"), blk_mq_submit_bio() will call blk_insert_flush()
directly to handle a flush request rather than
blk_mq_sched_insert_request() when an elevator is in use.

Thus every flush request either has the RQF_FLUSH_SEQ flag set when
blk_mq_sched_insert_request() is called, or has already been inserted
into hctx->dispatch. So, remove the dead code path.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

block: get rid of unnecessary local variable · 0546858c

Authored by Yufen Yu on Oct 08, 2020

Since the whole elevator registration is protected by sysfs_lock, we
don't need the extra 'has_elevator' variable. Just use q->elevator
directly.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

openeuler / Kernel synced successfully more than 1 year ago