提交 · 387048bf67eeff8bdf7c2a41b03b48230a88b3d3 · openeuler / Kernel

24 3月, 2020 18 次提交

block: merge partition-generic.c and check.c · 387048bf

由 Christoph Hellwig 提交于 3月 24, 2020

Merge block/partition-generic.c and block/partitions/check.c into
a single block/partitions/core.c as the content is closely related
and both files are tiny.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

387048bf

block: move the various x86 Unix label formats out of genhd.h · 3f4fc59c

由 Christoph Hellwig 提交于 3月 24, 2020

All these are just used in block/partitions/msdos.c, so move them out of the
genhd.h driver included by every driver.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3f4fc59c

partitions/msdos: remove LINUX_SWAP_PARTITION · cb0ab526

由 Christoph Hellwig 提交于 3月 24, 2020

Just always use NEW_SOLARIS_X86_PARTITION and explain the situation,
as that is less confusing than two names for a single value.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cb0ab526

block: move the *_PARTITION enum out of genhd.h · 0226e9ea

由 Christoph Hellwig 提交于 3月 24, 2020

The enum containing the *_PARTITION symbolic names is only relevant
for the partition parser.  More specifically most values are MSDOS
partition table system indicators and thus should go straight into
msdos.c.  One value is only used by the sun partition parser, and the
sun and sgi partition parsers use the same value as the x86 Linux
RAID indicator to also indicate RAID autodetection.  Duplicate them
in sun.c and sgi.c given that the different partition types use
entirely different values otherwise.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0226e9ea

block: move struct partition out of genhd.h · 1442f76d

由 Christoph Hellwig 提交于 3月 24, 2020

struct partition is the on-disk format of a MSDOS partition table entry.
Move it out of genhd.h into a new msdos_partition.h header and give it
a msdos_ prefix to avoid confusion.
Also move the magic number from block/partitions/msdos.h to the new
header so that it can be used by the SCSI drivers looking at the DOS
partition tables.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

1442f76d

block: remove block/partitions/sun.h · cbb5cb3b

由 Christoph Hellwig 提交于 3月 24, 2020

Just move the two defines to block/partitions/sun.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cbb5cb3b

block: remove block/partitions/sgi.h · 95f77ef3

由 Christoph Hellwig 提交于 3月 24, 2020

Just move the single define to block/partitions/sgi.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

95f77ef3

block: remove block/partitions/osf.h · 3466f63a

由 Christoph Hellwig 提交于 3月 24, 2020

Just move the single define to block/partitions/osf.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3466f63a

block: remove block/partitions/karma.h · f6d17358

由 Christoph Hellwig 提交于 3月 24, 2020

Just move the single define to block/partitions/karma.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

f6d17358

block: declare all partition detection routines in check.h · 3f1b95ef

由 Christoph Hellwig 提交于 3月 24, 2020

There is no good reason to include one header per partition type in
core.c.  Instead move the prototypes for the detection routins to
check.h, and remove all now empty headers in block/partitions/.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3f1b95ef

block: remove warn_no_part · ffa9ed64

由 Christoph Hellwig 提交于 3月 24, 2020

The warn_no_part is initialized to 1 and never changed.  Remove
it and execute the code keyed off from it unconditionally.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ffa9ed64

block: cleanup how md_autodetect_dev is called · 74cc979c

由 Christoph Hellwig 提交于 3月 24, 2020

Add a new include/linux/raid/detect.h header to declare the
md_autodetect_dev prototype which can be shared between md and
the partition code.  Then use IS_BUILTIN to call it instead of the
ifdef magic.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

74cc979c

block: unexport read_dev_sector and put_dev_sector · 1a9fba3a

由 Christoph Hellwig 提交于 3月 24, 2020

read_dev_sector and put_dev_sector are now only used by the partition
parsing code.  Remove the export for read_dev_sector and merge it into
the only caller.  Clean the mess up a bit by using goto labels and
the SECTOR_SHIFT constant.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

1a9fba3a

block: remove alloc_part_info and free_part_info · f17c21c1

由 Christoph Hellwig 提交于 3月 24, 2020

There isn't any good reason not to simply open code the allocation and
freeing of the partition_meta_info structure.  Especially as one of
the branches in alloc_part_info is entirely dead code.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

f17c21c1

block: move sysfs methods shared by disks and partitions to genhd.c · 3ad5cee5

由 Christoph Hellwig 提交于 3月 24, 2020

Move the sysfs _show methods that are used both on the full disk and
partition nodes to genhd.c instead of hiding them in the partitioning
code.  Also move the declaration for these methods to block/blk.h so
that we don't expose them to drivers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3ad5cee5

block: move disk_name and related helpers out of partition-generic.c · 5cbd28e3

由 Christoph Hellwig 提交于 3月 24, 2020

Thes functions aren't really related to partition support, so move them
to a more suitable place.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5cbd28e3

block: remove __bdevname · ea3edd4d

由 Christoph Hellwig 提交于 3月 24, 2020

There is no good reason for __bdevname to exist.  Just open code
printing the string in the callers.  For three of them the format
string can be trivially merged into existing printk statements,
and in init/do_mounts.c we can at least do the scnprintf once at
the start of the function, and unconditional of CONFIG_BLOCK to
make the output for tiny configfs a little more helpful.

Acked-by: Theodore Ts'o <tytso@mit.edu> # for ext4
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ea3edd4d

block: remove the blk_lookup_devt export · d2332c5c

由 Christoph Hellwig 提交于 3月 24, 2020

This function is only used by init/do_mounts.c, which can't be modular.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d2332c5c

22 3月, 2020 5 次提交

block, bfq: invoke flush_idle_tree after reparent_active_queues in pd_offline · 4d38a87f

由 Paolo Valente 提交于 3月 21, 2020

In bfq_pd_offline(), the function bfq_flush_idle_tree() is invoked to
flush the rb tree that contains all idle entities belonging to the pd
(cgroup) being destroyed. In particular, bfq_flush_idle_tree() is
invoked before bfq_reparent_active_queues(). Yet the latter may happen
to add some entities to the idle tree. It happens if, in some of the
calls to bfq_bfqq_move() performed by bfq_reparent_active_queues(),
the queue to move is empty and gets expired.

This commit simply reverses the invocation order between
bfq_flush_idle_tree() and bfq_reparent_active_queues().

Tested-by: cki-project@redhat.com
Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

4d38a87f

block, bfq: make reparent_leaf_entity actually work only on leaf entities · 576682fa

由 Paolo Valente 提交于 3月 21, 2020

bfq_reparent_leaf_entity() reparents the input leaf entity (a leaf
entity represents just a bfq_queue in an entity tree). Yet, the input
entity is guaranteed to always be a leaf entity only in two-level
entity trees. In this respect, because of the error fixed by
commit 14afc593 ("block, bfq: fix overwrite of bfq_group pointer
in bfq_find_set_group()"), all (wrongly collapsed) entity trees happened
to actually have only two levels. After the latter commit, this does not
hold any longer.

This commit fixes this problem by modifying
bfq_reparent_leaf_entity(), so that it searches an active leaf entity
down the path that stems from the input entity. Such a leaf entity is
guaranteed to exist when bfq_reparent_leaf_entity() is invoked.

Tested-by: cki-project@redhat.com
Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

576682fa

block, bfq: turn put_queue into release_process_ref in __bfq_bic_change_cgroup · c8997736

由 Paolo Valente 提交于 3月 21, 2020

A bfq_put_queue() may be invoked in __bfq_bic_change_cgroup(). The
goal of this put is to release a process reference to a bfq_queue. But
process-reference releases may trigger also some extra operation, and,
to this goal, are handled through bfq_release_process_ref(). So, turn
the invocation of bfq_put_queue() into an invocation of
bfq_release_process_ref().

Tested-by: cki-project@redhat.com
Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c8997736

block, bfq: move forward the getting of an extra ref in bfq_bfqq_move · fd1bb3ae

由 Paolo Valente 提交于 3月 21, 2020

Commit ecedd3d7 ("block, bfq: get extra ref to prevent a queue
from being freed during a group move") gets an extra reference to a
bfq_queue before possibly deactivating it (temporarily), in
bfq_bfqq_move(). This prevents the bfq_queue from disappearing before
being reactivated in its new group.

Yet, the bfq_queue may also be expired (i.e., its service may be
stopped) before the bfq_queue is deactivated. And also an expiration
may lead to a premature freeing. This commit fixes this issue by
simply moving forward the getting of the extra reference already
introduced by commit ecedd3d7 ("block, bfq: get extra ref to
prevent a queue from being freed during a group move").

Reported-by: cki-project@redhat.com
Tested-by: cki-project@redhat.com
Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

fd1bb3ae

block, bfq: fix use-after-free in bfq_idle_slice_timer_body · 2f95fa5c

由 Zhiqiang Liu 提交于 3月 19, 2020

In bfq_idle_slice_timer func, bfqq = bfqd->in_service_queue is
not in bfqd-lock critical section. The bfqq, which is not
equal to NULL in bfq_idle_slice_timer, may be freed after passing
to bfq_idle_slice_timer_body. So we will access the freed memory.

In addition, considering the bfqq may be in race, we should
firstly check whether bfqq is in service before doing something
on it in bfq_idle_slice_timer_body func. If the bfqq in race is
not in service, it means the bfqq has been expired through
__bfq_bfqq_expire func, and wait_request flags has been cleared in
__bfq_bfqd_reset_in_service func. So we do not need to re-clear the
wait_request of bfqq which is not in service.

KASAN log is given as follows:
[13058.354613] ==================================================================
[13058.354640] BUG: KASAN: use-after-free in bfq_idle_slice_timer+0xac/0x290
[13058.354644] Read of size 8 at addr ffffa02cf3e63f78 by task fork13/19767
[13058.354646]
[13058.354655] CPU: 96 PID: 19767 Comm: fork13
[13058.354661] Call trace:
[13058.354667]  dump_backtrace+0x0/0x310
[13058.354672]  show_stack+0x28/0x38
[13058.354681]  dump_stack+0xd8/0x108
[13058.354687]  print_address_description+0x68/0x2d0
[13058.354690]  kasan_report+0x124/0x2e0
[13058.354697]  __asan_load8+0x88/0xb0
[13058.354702]  bfq_idle_slice_timer+0xac/0x290
[13058.354707]  __hrtimer_run_queues+0x298/0x8b8
[13058.354710]  hrtimer_interrupt+0x1b8/0x678
[13058.354716]  arch_timer_handler_phys+0x4c/0x78
[13058.354722]  handle_percpu_devid_irq+0xf0/0x558
[13058.354731]  generic_handle_irq+0x50/0x70
[13058.354735]  __handle_domain_irq+0x94/0x110
[13058.354739]  gic_handle_irq+0x8c/0x1b0
[13058.354742]  el1_irq+0xb8/0x140
[13058.354748]  do_wp_page+0x260/0xe28
[13058.354752]  __handle_mm_fault+0x8ec/0x9b0
[13058.354756]  handle_mm_fault+0x280/0x460
[13058.354762]  do_page_fault+0x3ec/0x890
[13058.354765]  do_mem_abort+0xc0/0x1b0
[13058.354768]  el0_da+0x24/0x28
[13058.354770]
[13058.354773] Allocated by task 19731:
[13058.354780]  kasan_kmalloc+0xe0/0x190
[13058.354784]  kasan_slab_alloc+0x14/0x20
[13058.354788]  kmem_cache_alloc_node+0x130/0x440
[13058.354793]  bfq_get_queue+0x138/0x858
[13058.354797]  bfq_get_bfqq_handle_split+0xd4/0x328
[13058.354801]  bfq_init_rq+0x1f4/0x1180
[13058.354806]  bfq_insert_requests+0x264/0x1c98
[13058.354811]  blk_mq_sched_insert_requests+0x1c4/0x488
[13058.354818]  blk_mq_flush_plug_list+0x2d4/0x6e0
[13058.354826]  blk_flush_plug_list+0x230/0x548
[13058.354830]  blk_finish_plug+0x60/0x80
[13058.354838]  read_pages+0xec/0x2c0
[13058.354842]  __do_page_cache_readahead+0x374/0x438
[13058.354846]  ondemand_readahead+0x24c/0x6b0
[13058.354851]  page_cache_sync_readahead+0x17c/0x2f8
[13058.354858]  generic_file_buffered_read+0x588/0xc58
[13058.354862]  generic_file_read_iter+0x1b4/0x278
[13058.354965]  ext4_file_read_iter+0xa8/0x1d8 [ext4]
[13058.354972]  __vfs_read+0x238/0x320
[13058.354976]  vfs_read+0xbc/0x1c0
[13058.354980]  ksys_read+0xdc/0x1b8
[13058.354984]  __arm64_sys_read+0x50/0x60
[13058.354990]  el0_svc_common+0xb4/0x1d8
[13058.354994]  el0_svc_handler+0x50/0xa8
[13058.354998]  el0_svc+0x8/0xc
[13058.354999]
[13058.355001] Freed by task 19731:
[13058.355007]  __kasan_slab_free+0x120/0x228
[13058.355010]  kasan_slab_free+0x10/0x18
[13058.355014]  kmem_cache_free+0x288/0x3f0
[13058.355018]  bfq_put_queue+0x134/0x208
[13058.355022]  bfq_exit_icq_bfqq+0x164/0x348
[13058.355026]  bfq_exit_icq+0x28/0x40
[13058.355030]  ioc_exit_icq+0xa0/0x150
[13058.355035]  put_io_context_active+0x250/0x438
[13058.355038]  exit_io_context+0xd0/0x138
[13058.355045]  do_exit+0x734/0xc58
[13058.355050]  do_group_exit+0x78/0x220
[13058.355054]  __wake_up_parent+0x0/0x50
[13058.355058]  el0_svc_common+0xb4/0x1d8
[13058.355062]  el0_svc_handler+0x50/0xa8
[13058.355066]  el0_svc+0x8/0xc
[13058.355067]
[13058.355071] The buggy address belongs to the object at ffffa02cf3e63e70#012 which belongs to the cache bfq_queue of size 464
[13058.355075] The buggy address is located 264 bytes inside of#012 464-byte region [ffffa02cf3e63e70, ffffa02cf3e64040)
[13058.355077] The buggy address belongs to the page:
[13058.355083] page:ffff7e80b3cf9800 count:1 mapcount:0 mapping:ffff802db5c90780 index:0xffffa02cf3e606f0 compound_mapcount: 0
[13058.366175] flags: 0x2ffffe0000008100(slab|head)
[13058.370781] raw: 2ffffe0000008100 ffff7e80b53b1408 ffffa02d730c1c90 ffff802db5c90780
[13058.370787] raw: ffffa02cf3e606f0 0000000000370023 00000001ffffffff 0000000000000000
[13058.370789] page dumped because: kasan: bad access detected
[13058.370791]
[13058.370792] Memory state around the buggy address:
[13058.370797]  ffffa02cf3e63e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb
[13058.370801]  ffffa02cf3e63e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370805] >ffffa02cf3e63f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370808]                                                                 ^
[13058.370811]  ffffa02cf3e63f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13058.370815]  ffffa02cf3e64000: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[13058.370817] ==================================================================
[13058.370820] Disabling lock debugging due to kernel taint

Here, we directly pass the bfqd to bfq_idle_slice_timer_body func.
--
V2->V3: rewrite the comment as suggested by Paolo Valente
V1->V2: add one comment, and add Fixes and Reported-by tag.

Fixes: aee69d78 ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Acked-by: NPaolo Valente <paolo.valente@linaro.org>
Reported-by: NWang Wang <wangwang2@huawei.com>
Signed-off-by: NZhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: NFeilong Lin <linfeilong@huawei.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

2f95fa5c

19 3月, 2020 1 次提交

block/genhd: Notify udev about capacity change · e598a72f

由 Balbir Singh 提交于 3月 13, 2020

Allow block/genhd to notify user space (via udev) about disk size changes
using a new helper set_capacity_revalidate_and_notify(), which is a wrapper
on top of set_capacity(). set_capacity_revalidate_and_notify() will only
notify via udev if the current capacity or the target capacity is not zero
and iff the capacity changes.
Suggested-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSomeswarudu Sangaraju <ssomesh@amazon.com>
Signed-off-by: NBalbir Singh <sblbir@amazon.com>
Reviewed-by: NBob Liu <bob.liu@oracle.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e598a72f

18 3月, 2020 1 次提交

block: Prevent hung_check firing during long sync IO · de6a78b6

由 Ming Lei 提交于 3月 18, 2020

submit_bio_wait() can be called from ioctl(BLKSECDISCARD), which
may take long time to complete, as Salman mentioned, 4K BLKSECDISCARD
takes up to 100 second on some devices. Also any block I/O operation
that occurs after the BLKSECDISCARD is submitted will also potentially
be affected by the hung task timeouts.

Another report is that task hang can be observed when running mkfs
over raid10 which takes a small max discard sectors limit because
of chunk size.

So prevent hung_check from firing by taking same approach used
in blk_execute_rq(), and the wake-up interval is set as half the
hung_check timer period, which keeps overhead low enough.

Cc: Salman Qazi <sqazi@google.com>
Cc: Jesse Barnes <jsbarnes@google.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Link: https://lkml.org/lkml/2020/2/12/1193Reported-by: NSalman Qazi <sqazi@google.com>
Reviewed-by: NJesse Barnes <jsbarnes@google.com>
Reviewed-by: NSalman Qazi <sqazi@google.com>
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

de6a78b6

12 3月, 2020 10 次提交

block, zoned: fix integer overflow with BLKRESETZONE et al · 11bde986

由 Alexey Dobriyan 提交于 2月 12, 2020

Check for overflow in addition before checking for end-of-block-device.

Steps to reproduce:

	#define _GNU_SOURCE 1
	#include <sys/ioctl.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>

	typedef unsigned long long __u64;

	struct blk_zone_range {
	        __u64 sector;
	        __u64 nr_sectors;
	};

	#define BLKRESETZONE    _IOW(0x12, 131, struct blk_zone_range)

	int main(void)
	{
	        int fd = open("/dev/nullb0", O_RDWR|O_DIRECT);
	        struct blk_zone_range zr = {4096, 0xfffffffffffff000ULL};
	        ioctl(fd, BLKRESETZONE, &zr);
	        return 0;
	}

BUG: KASAN: null-ptr-deref in submit_bio_wait+0x74/0xe0
Write of size 8 at addr 0000000000000040 by task a.out/1590

CPU: 8 PID: 1590 Comm: a.out Not tainted 5.6.0-rc1-00019-g359c92c0 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190711_202441-buildvm-armv7-10.arm.fedoraproject.org-2.fc31 04/01/2014
Call Trace:
 dump_stack+0x76/0xa0
 __kasan_report.cold+0x5/0x3e
 kasan_report+0xe/0x20
 submit_bio_wait+0x74/0xe0
 blkdev_zone_mgmt+0x26f/0x2a0
 blkdev_zone_mgmt_ioctl+0x14b/0x1b0
 blkdev_ioctl+0xb28/0xe60
 block_ioctl+0x69/0x80
 ksys_ioctl+0x3af/0xa50
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NAlexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

11bde986

blk-iocost: remove duplicated lines in comments · fa800d73

由 Weiping Zhang 提交于 2月 27, 2020

Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NWeiping Zhang <zhangweiping@didiglobal.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

fa800d73

block: sed-opal: Change the check condition for regular session validity · 88d6041d

由 Revanth Rajashekar 提交于 3月 03, 2020

This patch changes the check condition for the validity/authentication
of the session.

1. The Host Session Number(HSN) in the response should match the HSN for
the session.
2. The TPER Session Number(TSN) can never be less than 4096 for a regular
session.

Reference:
Section 3.2.2.1 of https://trustedcomputinggroup.org/wp-content/uploads/TCG_Storage_Opal_SSC_Application_Note_1-00_1-00-Final.pdf
Section 3.3.7.1.1 of https://trustedcomputinggroup.org/wp-content/uploads/TCG_Storage_Architecture_Core_Spec_v2.01_r1.00.pdfCo-developed-by: NAndrzej Jakowski <andrzej.jakowski@linux.intel.com>
Signed-off-by: NAndrzej Jakowski <andrzej.jakowski@linux.intel.com>
Signed-off-by: NRevanth Rajashekar <revanth.rajashekar@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

88d6041d

block: cleanup comment for blk_flush_complete_seq · ce24f736

由 Guoqing Jiang 提交于 3月 09, 2020

Remove the comment about return value, since it is not valid after
commit 404b8f5a ("block: cleanup kick/queued handling").
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ce24f736

block: remove unneeded argument from blk_alloc_flush_queue · 754a1572

由 Guoqing Jiang 提交于 3月 09, 2020

Remove 'q' from arguments since it is not used anymore after
commit 7e992f84 ("block: remove non mq parts from the
flush code").
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

754a1572

block: cleanup for _blk/blk_rq_prep_clone · 361301a2

由 Guoqing Jiang 提交于 3月 09, 2020

Both cmd and sense had been moved to scsi_request, so remove
the related comments to avoid confusion.

And as Bart suggested, move _blk_rq_prep_clone into the only
caller (blk_rq_prep_clone).
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

361301a2

block: remove redundant setting of QUEUE_FLAG_DYING · fc4cc772

由 Guoqing Jiang 提交于 3月 09, 2020

Previously, blk_cleanup_queue has called blk_set_queue_dying to set the
flag, no need to do it again.
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

fc4cc772

block: use bio_{wouldblock,io}_error in direct_make_request · 35ed78b3

由 Guoqing Jiang 提交于 3月 09, 2020

Use the two functions to simplify code.
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: NNikolay Borisov <nborisov@suse.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

35ed78b3

block: fix comment for blk_cloned_rq_check_limits · 0d720318

由 Guoqing Jiang 提交于 3月 09, 2020

Since the later description mentioned "checked against the new queue
limits", so make the change to avoid confusion.
Signed-off-by: NGuoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: NBart Van Assche <bvanassche@acm.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0d720318

block: Fix use-after-free issue accessing struct io_cq · 30a2da7b

由 Sahitya Tummala 提交于 3月 11, 2020

There is a potential race between ioc_release_fn() and
ioc_clear_queue() as shown below, due to which below kernel
crash is observed. It also can result into use-after-free
issue.

context#1:				context#2:
ioc_release_fn()			__ioc_clear_queue() gets the same icq
->spin_lock(&ioc->lock);		->spin_lock(&ioc->lock);
->ioc_destroy_icq(icq);
  ->list_del_init(&icq->q_node);
  ->call_rcu(&icq->__rcu_head,
  	icq_free_icq_rcu);
->spin_unlock(&ioc->lock);
					->ioc_destroy_icq(icq);
					  ->hlist_del_init(&icq->ioc_node);
					  This results into below crash as this memory
					  is now used by icq->__rcu_head in context#1.
					  There is a chance that icq could be free'd
					  as well.

22150.386550:   <6> Unable to handle kernel write to read-only memory
at virtual address ffffffaa8d31ca50
...
Call trace:
22150.607350:   <2>  ioc_destroy_icq+0x44/0x110
22150.611202:   <2>  ioc_clear_queue+0xac/0x148
22150.615056:   <2>  blk_cleanup_queue+0x11c/0x1a0
22150.619174:   <2>  __scsi_remove_device+0xdc/0x128
22150.623465:   <2>  scsi_forget_host+0x2c/0x78
22150.627315:   <2>  scsi_remove_host+0x7c/0x2a0
22150.631257:   <2>  usb_stor_disconnect+0x74/0xc8
22150.635371:   <2>  usb_unbind_interface+0xc8/0x278
22150.639665:   <2>  device_release_driver_internal+0x198/0x250
22150.644897:   <2>  device_release_driver+0x24/0x30
22150.649176:   <2>  bus_remove_device+0xec/0x140
22150.653204:   <2>  device_del+0x270/0x460
22150.656712:   <2>  usb_disable_device+0x120/0x390
22150.660918:   <2>  usb_disconnect+0xf4/0x2e0
22150.664684:   <2>  hub_event+0xd70/0x17e8
22150.668197:   <2>  process_one_work+0x210/0x480
22150.672222:   <2>  worker_thread+0x32c/0x4c8

Fix this by adding a new ICQ_DESTROYED flag in ioc_destroy_icq() to
indicate this icq is once marked as destroyed. Also, ensure
__ioc_clear_queue() is accessing icq within rcu_read_lock/unlock so
that icq doesn't get free'd up while it is still using it.
Signed-off-by: NSahitya Tummala <stummala@codeaurora.org>
Co-developed-by: NPradeep P V K <ppvk@codeaurora.org>
Signed-off-by: NPradeep P V K <ppvk@codeaurora.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

30a2da7b

10 3月, 2020 2 次提交

blk-mq: Fix a recently introduced regression in blk_mq_realloc_hw_ctxs() · d0930bb8

由 Bart Van Assche 提交于 3月 09, 2020

q->nr_hw_queues must only be updated once it is known that
blk_mq_realloc_hw_ctxs() has succeeded. Otherwise it can happen that
reallocation fails and that q->nr_hw_queues is larger than the number of
allocated hardware queues. This patch fixes the following crash if
increasing the number of hardware queues fails:

BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x775/0x810
Write of size 8 at addr 0000000000000118 by task check/977

CPU: 3 PID: 977 Comm: check Not tainted 5.6.0-rc1-dbg+ #8
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0xa5/0xe6
 __kasan_report.cold+0x65/0x99
 kasan_report+0x16/0x20
 check_memory_region+0x140/0x1b0
 memset+0x28/0x40
 blk_mq_map_swqueue+0x775/0x810
 blk_mq_update_nr_hw_queues+0x468/0x710
 nullb_device_submit_queues_store+0xf7/0x1a0 [null_blk]
 configfs_write_file+0x1c4/0x250 [configfs]
 __vfs_write+0x4c/0x90
 vfs_write+0x145/0x2c0
 ksys_write+0xd7/0x180
 __x64_sys_write+0x47/0x50
 do_syscall_64+0x6f/0x2f0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: ac0d6b92 ("block: Reduce the amount of memory required per request queue")
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d0930bb8

blk-mq: Keep set->nr_hw_queues and set->map[].nr_queues in sync · 6e66b493

由 Bart Van Assche 提交于 3月 09, 2020

blk_mq_map_queues() and multiple .map_queues() implementations expect that
set->map[HCTX_TYPE_DEFAULT].nr_queues is set to the number of hardware
queues. Hence set .nr_queues before calling these functions. This patch
fixes the following kernel warning:

WARNING: CPU: 0 PID: 2501 at include/linux/cpumask.h:137
Call Trace:
 blk_mq_run_hw_queue+0x19d/0x350 block/blk-mq.c:1508
 blk_mq_run_hw_queues+0x112/0x1a0 block/blk-mq.c:1525
 blk_mq_requeue_work+0x502/0x780 block/blk-mq.c:775
 process_one_work+0x9af/0x1740 kernel/workqueue.c:2269
 worker_thread+0x98/0xe40 kernel/workqueue.c:2415
 kthread+0x361/0x430 kernel/kthread.c:255

Fixes: ed76e329 ("blk-mq: abstract out queue map") # v5.0
Reported-by: syzbot+d44e1b26ce5c3e77458d@syzkaller.appspotmail.com
Signed-off-by: NBart Van Assche <bvanassche@acm.org>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

6e66b493

06 3月, 2020 1 次提交

block, bfq: fix overwrite of bfq_group pointer in bfq_find_set_group() · 14afc593

由 Carlo Nonato 提交于 3月 06, 2020

The bfq_find_set_group() function takes as input a blkcg (which represents
a cgroup) and retrieves the corresponding bfq_group, then it updates the
bfq internal group hierarchy (see comments inside the function for why
this is needed) and finally it returns the bfq_group.
In the hierarchy update cycle, the pointer holding the correct bfq_group
that has to be returned is mistakenly used to traverse the hierarchy
bottom to top, meaning that in each iteration it gets overwritten with the
parent of the current group. Since the update cycle stops at root's
children (depth = 2), the overwrite becomes a problem only if the blkcg
describes a cgroup at a hierarchy level deeper than that (depth > 2). In
this case the root's child that happens to be also an ancestor of the
correct bfq_group is returned. The main consequence is that processes
contained in a cgroup at depth greater than 2 are wrongly placed in the
group described above by BFQ.

This commits fixes this problem by using a different bfq_group pointer in
the update cycle in order to avoid the overwrite of the variable holding
the original group reference.
Reported-by: NKwon Je Oh <kwonje.oh2@gmail.com>
Signed-off-by: NCarlo Nonato <carlo.nonato95@gmail.com>
Signed-off-by: NPaolo Valente <paolo.valente@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

14afc593

02 3月, 2020 1 次提交

block: Remove used kblockd_schedule_work_on() · e959e540

由 Daniel Wagner 提交于 3月 02, 2020

Commit ee63cfa7 ("block: add kblockd_schedule_work_on()")
introduced the helper in 2016. Remove it because since then no caller
was added.

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NDaniel Wagner <dwagner@suse.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e959e540

27 2月, 2020 1 次提交

blk-mq: Remove some unused function arguments · cae740a0

由 John Garry 提交于 2月 26, 2020

The struct blk_mq_hw_ctx pointer argument in blk_mq_put_tag(),
blk_mq_poll_nsecs(), and blk_mq_poll_hybrid_sleep() is unused, so remove
it.

Overall obj code size shows a minor reduction, before:
   text	   data	    bss	    dec	    hex	filename
  27306	   1312	      0	  28618	   6fca	block/blk-mq.o
   4303	    272	      0	   4575	   11df	block/blk-mq-tag.o

after:
  27282	   1312	      0	  28594	   6fb2	block/blk-mq.o
   4311	    272	      0	   4583	   11e7	block/blk-mq-tag.o
Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
--
This minor patch had been carried as part of the blk-mq shared tags RFC,
I'd rather not carry it anymore as it required rebasing, so now or never..
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cae740a0

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功