提交 · 0d20dcc277cfb50ec1de4db0d758f30e8f597d30 · openeuler / Kernel

01 8月, 2020 4 次提交

block: genhd: delete duplicated words · 0d20dcc2

由 Randy Dunlap 提交于 7月 30, 2020

Drop the repeated word "to" in multiple places.
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0d20dcc2

block: elevator: delete duplicated word and fix typos · 5b8f65e1

由 Randy Dunlap 提交于 7月 30, 2020

Drop the repeated word "the".
Fix typos of "features" and "specified".
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5b8f65e1

block: bio: delete duplicated words · 3cf14889

由 Randy Dunlap 提交于 7月 30, 2020

Drop the repeated words "a" and "the".
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3cf14889

block: bfq-iosched: fix duplicated word · f06678af

由 Randy Dunlap 提交于 7月 30, 2020

Change "at at" to "at a".
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: NJens Axboe <axboe@kernel.dk>

f06678af

31 7月, 2020 2 次提交

iocost_monitor: start from the oldest usage index · 1bf6ece5

由 Chengming Zhou 提交于 7月 30, 2020

iocg usage_idx is the latest usage index, we should start from the
oldest usage index to show the consecutive NR_USAGE_SLOTS usages.
Signed-off-by: NChengming Zhou <zhouchengming@bytedance.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

1bf6ece5

iocost: Fix check condition of iocg abs_vdebt · d9012a59

由 Chengming Zhou 提交于 7月 30, 2020

We shouldn't skip iocg when its abs_vdebt is not zero.

Fixes: 0b80f986 ("iocost: protect iocg->abs_vdebt with iocg->waitq.lock")
Signed-off-by: NChengming Zhou <zhouchengming@bytedance.com>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d9012a59

29 7月, 2020 1 次提交

block: Remove callback typedefs for blk_mq_ops · 0516c2f6

由 Daniel Wagner 提交于 7月 28, 2020

No need to define typedefs for the callbacks, because there is not a
single user except blk_mq_ops.
Signed-off-by: NDaniel Wagner <dwagner@suse.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0516c2f6

28 7月, 2020 1 次提交

block: Use non _rcu version of list functions for tag_set_list · 08c875cb

由 Daniel Wagner 提交于 7月 28, 2020

tag_set_list is only accessed under the tag_set_lock lock. There is
no need for using the _rcu list functions.

The _rcu list function were introduced to allow read access to the
tag_set_list protected under RCU, see 705cda97 ("blk-mq: Make it
safe to use RCU to iterate over blk_mq_tag_set.tag_list") and
05b79413 ("Revert "blk-mq: don't handle TAG_SHARED in restart"").
Those changes got reverted later but the cleanup commit missed a
couple of places to undo the changes.

Fixes: 97889f9a ("blk-mq: remove synchronize_rcu() from blk_mq_del_queue_tag_set()"
Signed-off-by: NDaniel Wagner <dwagner@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

08c875cb

18 7月, 2020 2 次提交

blk-cgroup: show global disk stats in root cgroup io.stat · ef45fe47

由 Boris Burkov 提交于 6月 01, 2020

In order to improve consistency and usability in cgroup stat accounting,
we would like to support the root cgroup's io.stat.

Since the root cgroup has processes doing io even if the system has no
explicitly created cgroups, we need to be careful to avoid overhead in
that case.  For that reason, the rstat algorithms don't handle the root
cgroup, so just turning the file on wouldn't give correct statistics.

To get around this, we simulate flushing the iostat struct by filling it
out directly from global disk stats. The result is a root cgroup io.stat
file consistent with both /proc/diskstats and io.stat.

Note that in order to collect the disk stats, we needed to iterate over
devices. To facilitate that, we had to change the linkage of a disk_type
to external so that it can be used from blk-cgroup.c to iterate over
disks.
Suggested-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NBoris Burkov <boris@bur.io>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ef45fe47

blk-cgroup: make iostat functions visible to stat printing · cd1fc4b9

由 Boris Burkov 提交于 6月 01, 2020

Previously, the code which printed io.stat only needed access to the
generic rstat flushing code, but since we plan to write some more
specific code for preparing root cgroup stats, we need to manipulate
iostat structs directly. Since declaring static functions ahead does not
seem like common practice in this file, simply move the iostat functions
up. We only plan to use blkg_iostat_set, but it seems better to keep them
all together.
Signed-off-by: NBoris Burkov <boris@bur.io>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cd1fc4b9

17 7月, 2020 6 次提交

block: improve discard bio alignment in __blkdev_issue_discard() · 9b15d109

由 Coly Li 提交于 7月 17, 2020

This patch improves discard bio split for address and size alignment in
__blkdev_issue_discard(). The aligned discard bio may help underlying
device controller to perform better discard and internal garbage
collection, and avoid unnecessary internal fragment.

Current discard bio split algorithm in __blkdev_issue_discard() may have
non-discarded fregment on device even the discard bio LBA and size are
both aligned to device's discard granularity size.

Here is the example steps on how to reproduce the above problem.
- On a VMWare ESXi 6.5 update3 installation, create a 51GB virtual disk
  with thin mode and give it to a Linux virtual machine.
- Inside the Linux virtual machine, if the 50GB virtual disk shows up as
  /dev/sdb, fill data into the first 50GB by,
        # dd if=/dev/zero of=/dev/sdb bs=4096 count=13107200
- Discard the 50GB range from offset 0 on /dev/sdb,
        # blkdiscard /dev/sdb -o 0 -l 53687091200
- Observe the underlying mapping status of the device
        # sg_get_lba_status /dev/sdb -m 1048 --lba=0
  descriptor LBA: 0x0000000000000000  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000000000800  blocks: 16773120  deallocated
  descriptor LBA: 0x0000000000fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000001000000  blocks: 8386560  deallocated
  descriptor LBA: 0x00000000017ff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000001800000  blocks: 8386560  deallocated
  descriptor LBA: 0x0000000001fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000002000000  blocks: 8386560  deallocated
  descriptor LBA: 0x00000000027ff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000002800000  blocks: 8386560  deallocated
  descriptor LBA: 0x0000000002fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000003000000  blocks: 8386560  deallocated
  descriptor LBA: 0x00000000037ff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000003800000  blocks: 8386560  deallocated
  descriptor LBA: 0x0000000003fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000004000000  blocks: 8386560  deallocated
  descriptor LBA: 0x00000000047ff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000004800000  blocks: 8386560  deallocated
  descriptor LBA: 0x0000000004fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000005000000  blocks: 8386560  deallocated
  descriptor LBA: 0x00000000057ff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000005800000  blocks: 8386560  deallocated
  descriptor LBA: 0x0000000005fff800  blocks: 2048  mapped (or unknown)
  descriptor LBA: 0x0000000006000000  blocks: 6291456  deallocated
  descriptor LBA: 0x0000000006600000  blocks: 0  deallocated

Although the discard bio starts at LBA 0 and has 50<<30 bytes size which
are perfect aligned to the discard granularity, from the above list
these are many 1MB (2048 sectors) internal fragments exist unexpectedly.

The problem is in __blkdev_issue_discard(), an improper algorithm causes
an improper bio size which is not aligned.

 25 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 26                 sector_t nr_sects, gfp_t gfp_mask, int flags,
 27                 struct bio **biop)
 28 {
 29         struct request_queue *q = bdev_get_queue(bdev);
   [snipped]
 56
 57         while (nr_sects) {
 58                 sector_t req_sects = min_t(sector_t, nr_sects,
 59                                 bio_allowed_max_sectors(q));
 60
 61                 WARN_ON_ONCE((req_sects << 9) > UINT_MAX);
 62
 63                 bio = blk_next_bio(bio, 0, gfp_mask);
 64                 bio->bi_iter.bi_sector = sector;
 65                 bio_set_dev(bio, bdev);
 66                 bio_set_op_attrs(bio, op, 0);
 67
 68                 bio->bi_iter.bi_size = req_sects << 9;
 69                 sector += req_sects;
 70                 nr_sects -= req_sects;
   [snipped]
 79         }
 80
 81         *biop = bio;
 82         return 0;
 83 }
 84 EXPORT_SYMBOL(__blkdev_issue_discard);

At line 58-59, to discard a 50GB range, req_sects is set as return value
of bio_allowed_max_sectors(q), which is 8388607 sectors. In the above
case, the discard granularity is 2048 sectors, although the start LBA
and discard length are aligned to discard granularity, req_sects never
has chance to be aligned to discard granularity. This is why there are
some still-mapped 2048 sectors fragment in every 4 or 8 GB range.

If req_sects at line 58 is set to a value aligned to discard_granularity
and close to UNIT_MAX, then all consequent split bios inside device
driver are (almostly) aligned to discard_granularity of the device
queue. The 2048 sectors still-mapped fragment will disappear.

This patch introduces bio_aligned_discard_max_sectors() to return the
the value which is aligned to q->limits.discard_granularity and closest
to UINT_MAX. Then this patch replaces bio_allowed_max_sectors() with
this new routine to decide a more proper split bio length.

But we still need to handle the situation when discard start LBA is not
aligned to q->limits.discard_granularity, otherwise even the length is
aligned, current code may still leave 2048 fragment around every 4GB
range. Therefore, to calculate req_sects, firstly the start LBA of
discard range is checked (including partition offset), if it is not
aligned to discard granularity, the first split location should make
sure following bio has bi_sector aligned to discard granularity. Then
there won't be still-mapped fragment in the middle of the discard range.

The above is how this patch improves discard bio alignment in
__blkdev_issue_discard(). Now with this patch, after discard with same
command line mentiond previously, sg_get_lba_status returns,
descriptor LBA: 0x0000000000000000  blocks: 106954752  deallocated
descriptor LBA: 0x0000000006600000  blocks: 0  deallocated

We an see there is no 2048 sectors segment anymore, everything is clean.
Reported-and-tested-by: NAcshai Manoj <acshai.manoj@microfocus.com>
Signed-off-by: NColy Li <colyli@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NXiao Ni <xni@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Enzo Matsumiya <ematsumiya@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9b15d109

block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers · ecdef9f4

由 Coly Li 提交于 7月 17, 2020

Currently REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are defined as
even numbers 6 and 8, such zone reset bios are treated as READ bios by
bio_data_dir(), which is obviously misleading.

The macro bio_data_dir() is defined in include/linux/bio.h as,
 55 #define bio_data_dir(bio) \
 56         (op_is_write(bio_op(bio)) ? WRITE : READ)

And op_is_write() is defined in include/linux/blk_types.h as,
397 static inline bool op_is_write(unsigned int op)
398 {
399         return (op & 1);
400 }

The convention of op_is_write() is when there is data transfer then the
op code should be odd number, and treat as a write op. bio_data_dir()
treats all bio direction as READ if op_is_write() reports false, and
WRITE if op_is_write() reports true.

Because REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are even numbers,
although they don't transfer data but reporting them as READ bio by
bio_data_dir() is misleading and might be wrong. Because these two
commands will reset the writer pointers of the resetting zones, and all
content after the reset write pointer will be invalid and unaccessible,
obviously they are not READ bios in any means.

This patch changes REQ_OP_ZONE_RESET from 6 to 15, and changes
REQ_OP_ZONE_RESET_ALL from 8 to 17. Now bios with these two op code
can be treated as WRITE by bio_data_dir(). Although they don't transfer
data, now we keep them consistent with REQ_OP_DISCARD and
REQ_OP_WRITE_ZEROES with the ituition that they change on-media content
and should be WRITE request.
Signed-off-by: NColy Li <colyli@suse.de>
Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@fb.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ecdef9f4

block: defer flush request no matter whether we have elevator · b5718d6c

由 Yufen Yu 提交于 7月 16, 2020

Commit 7520872c ("block: don't defer flushes on blk-mq + scheduling")
tried to fix deadlock for cycled wait between flush requests and data
request into flush_data_in_flight. The former holded all driver tags
and wait for data request completion, but the latter can not complete
for waiting free driver tags.

After commit 923218f6 ("blk-mq: don't allocate driver tag upfront
for flush rq"), flush requests will not get driver tag before queuing
into flush queue.

* With elevator, flush request just get sched_tags before inserting
  flush queue. It will not get driver tag until issue them to driver.
  data request on list fq->flush_data_in_flight will complete in
  the end.

* Without elevator, each flush request will get a driver tag when
  allocate request. Then data request on fq->flush_data_in_flight
  don't worry about lacking driver tag.

In both of these cases, cycled wait cannot be true. So we may allow
to defer flush request.
Signed-off-by: NYufen Yu <yuyufen@huawei.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b5718d6c

block: make blk_timeout_init() static · 943c4d90

由 Wei Yongjun 提交于 7月 17, 2020

The sparse tool complains as follows:

block/blk-timeout.c:93:12: warning:
 symbol 'blk_timeout_init' was not declared. Should it be static?

Function blk_timeout_init() is not used outside of blk-timeout.c, so
mark it static.

Fixes: 9054650f ("block: relax jiffies rounding for timeouts")
Reported-by: NHulk Robot <hulkci@huawei.com>
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

943c4d90

block: remove retry loop in ioc_release_fn() · ab96bbab

由 John Ogness 提交于 6月 19, 2020

The reverse-order double lock dance in ioc_release_fn() is using a
retry loop. This is a problem on PREEMPT_RT because it could preempt
the task that would release q->queue_lock and thus live lock in the
retry loop.

RCU is already managing the freeing of the request queue and icq. If
the trylock fails, use RCU to guarantee that the request queue and
icq are not freed and re-acquire the locks in the correct order,
allowing forward progress.
Signed-off-by: NJohn Ogness <john.ogness@linutronix.de>
Reviewed-by: NDaniel Wagner <dwagner@suse.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ab96bbab

block: remove unnecessary ioc nested locking · a43f085f

由 John Ogness 提交于 6月 19, 2020

The legacy CFQ IO scheduler could call put_io_context() in its exit_icq()
elevator callback. This led to a lockdep warning, which was fixed in
commit d8c66c5d ("block: fix lockdep warning on io_context release
put_io_context()") by using a nested subclass for the ioc spinlock.
However, with commit f382fb0b ("block: remove legacy IO schedulers")
the CFQ IO scheduler no longer exists.

The BFQ IO scheduler also implements the exit_icq() elevator callback but
does not call put_io_context().

The nested subclass for the ioc spinlock is no longer needed. Since it
existed as an exception and no longer applies, remove the nested subclass
usage.
Signed-off-by: NJohn Ogness <john.ogness@linutronix.de>
Reviewed-by: NDaniel Wagner <dwagner@suse.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

a43f085f

16 7月, 2020 4 次提交

block: integrate bd_start_claiming into __blkdev_get · 5b642d8b

由 Christoph Hellwig 提交于 7月 16, 2020

bd_start_claiming duplicates a lot of the work done in __blkdev_get.
Integrate the two functions to avoid the duplicate work, and to do the
right thing for the md -ERESTARTSYS corner case.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

5b642d8b

block: use bd_prepare_to_claim directly in the loop driver · ecbe6bc0

由 Christoph Hellwig 提交于 7月 16, 2020

The arcane magic in bd_start_claiming is only needed to be able to claim
a block_device that hasn't been fully set up.  Switch the loop driver
that claims from the ioctl path with a fully set up struct block_device
to just use the much simpler bd_prepare_to_claim directly.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ecbe6bc0

block: refactor bd_start_claiming · 58e46ed9

由 Christoph Hellwig 提交于 7月 16, 2020

Move the locking and assignment of bd_claiming from bd_start_claiming to
bd_prepare_to_claim.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

58e46ed9

block: simplify the restart case in __blkdev_get · c5638ab4

由 Christoph Hellwig 提交于 7月 16, 2020

Insted of duplicating all the cleanup logic jump to the code that cleans
up anyway, and restart after that.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c5638ab4

15 7月, 2020 3 次提交

Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait." · e791ee68

由 Jens Axboe 提交于 7月 15, 2020

This reverts commit 826f2f48.

Qian Cai reports that this commit causes stalls with swap. Revert until
the reason can be figured out.
Reported-by: NQian Cai <cai@lca.pw>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

e791ee68

block: always remove partitions from blk_drop_partitions() · d0f0f1b4

由 Ming Lei 提交于 7月 15, 2020

In theory, when GENHD_FL_NO_PART_SCAN is set, no partitions can be created
on one disk. However, ioctl(BLKPG, BLKPG_ADD_PARTITION) doesn't check
GENHD_FL_NO_PART_SCAN, so partitions still can be added even though
GENHD_FL_NO_PART_SCAN is set.

So far blk_drop_partitions() only removes partitions when disk_part_scan_enabled()
return true. This way can make ghost partition on loop device after changing/clearing
FD in case that PARTSCAN is disabled, such as partitions can be added
via 'parted' on loop disk even though GENHD_FL_NO_PART_SCAN is set.

Fix this issue by always removing partitions in blk_drop_partitions(), and
this way is correct because the current code supposes that no partitions
can be added in case of GENHD_FL_NO_PART_SCAN.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d0f0f1b4

block: relax jiffies rounding for timeouts · 9054650f

由 Jens Axboe 提交于 7月 13, 2020

In doing high IOPS testing, blk-mq is generally pretty well optimized.
There are a few things that stuck out as using more CPU than what is
really warranted, and one thing is the round_jiffies_up() that we do
twice for each request. That accounts for about 0.8% of the CPU in
my testing.

We can make this cheaper by avoiding an integer division, by just adding
a rough HZ mask that we can AND with instead. The timeouts are only on a
second granularity already, we don't have to be that accurate here and
this patch barely changes that. All we care about is nice grouping.
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9054650f

10 7月, 2020 2 次提交

blk-mq: remove redundant validation in __blk_mq_end_request() · 87890092

由 Baolin Wang 提交于 7月 04, 2020

We've already validated the 'q->elevator' before calling
->ops.completed_request() in blk_mq_sched_completed_request(), thus no
need to validate rq->internal_tag again. Rmove it.
Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

87890092

blk-mq: Remove unnecessary local variable · 106e71c5

由 Baolin Wang 提交于 7月 04, 2020

Remove unnecessary local variable 'ret' in blk_mq_dispatch_hctx_list().
Signed-off-by: NBaolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

106e71c5

09 7月, 2020 12 次提交

writeback: remove bdi->congested_fn · 21cf8661

由 Christoph Hellwig 提交于 7月 01, 2020

Except for pktdvd, the only places setting congested bits are file
systems that allocate their own backing_dev_info structures.  And
pktdvd is a deprecated driver that isn't useful in stack setup
either.  So remove the dead congested_fn stacking infrastructure.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NSong Liu <song@kernel.org>
Acked-by: NDavid Sterba <dsterba@suse.com>
[axboe: fixup unused variables in bcache/request.c]
Signed-off-by: NJens Axboe <axboe@kernel.dk>

21cf8661

writeback: remove struct bdi_writeback_congested · 8c911f3d

由 Christoph Hellwig 提交于 7月 01, 2020

We never set any congested bits in the group writeback instances of it.
And for the simpler bdi-wide case a simple scalar field is all that
that is needed.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8c911f3d

writeback: remove {set,clear}_wb_congested · 492d76b2

由 Christoph Hellwig 提交于 7月 01, 2020

Just merge them into their only callers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

492d76b2

drbd: remove a bogus bdi_rw_congested call · d5c69838

由 Christoph Hellwig 提交于 7月 01, 2020

bdi_rw_congested returns congestion state, so calling it without
looking at the return value doesn't make much sense.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

d5c69838

mmc: remove the call to check_disk_change · 9eb994ec

由 Christoph Hellwig 提交于 7月 08, 2020

The mmc driver doesn't support event notifications, which means
that check_disk_change is a no-op.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9eb994ec

xtensa/simdisk: remove the call to check_disk_change · 3d3a6a20

由 Christoph Hellwig 提交于 7月 08, 2020

The simdisk driver doesn't support event notifications, which means
that check_disk_change is a no-op.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

3d3a6a20

isofs: remove a stale comment · 13ab6488

由 Christoph Hellwig 提交于 7月 08, 2020

check_disk_change isn't for consumers of the block layer, so remove
the comment mentioning it.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

13ab6488

block: remove flush_disk · 9a3ffbbc

由 Christoph Hellwig 提交于 7月 08, 2020

flush_disk has only two callers, so open code it there.  That also helps
clarifying the error message for the particular case, and allows to remove
setting bd_invalidated in check_disk_size_change, which will be cleared
again instantly.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

9a3ffbbc

cdrom: remove the unused cdrom_media_changed function · 8c22eb3a

由 Christoph Hellwig 提交于 7月 08, 2020

As well as the ->media_changed method.  All these are left over from
before the drivers were switched over to the check_events scheme.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

8c22eb3a

md: switch to ->check_events for media change notifications · a564e23f

由 Christoph Hellwig 提交于 7月 08, 2020

md is the last driver using the legacy media_changed method.  Switch
it over to (not so) new ->clear_events approach, which also removes the
need for the ->revalidate_disk method.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
[axboe: remove unused 'bdops' variable in disk_clear_events()]
Signed-off-by: NJens Axboe <axboe@kernel.dk>

a564e23f

blk-mq: centralise related handling into blk_mq_get_driver_tag · 568f2700

由 Ming Lei 提交于 7月 06, 2020

Move .nr_active update and request assignment into blk_mq_get_driver_tag(),
all are good to do during getting driver tag.

Meantime blk-flush related code is simplified and flush request needn't
to update the request table manually any more.
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

568f2700

blk-mq: streamline handling of q->mq_ops->queue_rq result · 7bf13729

由 Ming Lei 提交于 7月 01, 2020

Current handling of q->mq_ops->queue_rq result is a bit ugly:

- two branches which needs to 'continue' have to check if the
dispatch local list is empty, otherwise one bad request may
be retrieved via 'rq = list_first_entry(list, struct request, queuelist);'

- the branch of 'if (unlikely(ret != BLK_STS_OK))' isn't easy
to follow, since it is actually one error branch.

Streamline this handling, so the code becomes more readable, meantime
potential kernel oops can be avoided in case that the last request in
local dispatch list is failed.

Fixes: fc17b653 ("blk-mq: switch ->queue_rq return value to blk_status_t")
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

7bf13729

08 7月, 2020 1 次提交

block: remove a bogus warning in __submit_bio_noacct_mq · 0e6e255e

由 Christoph Hellwig 提交于 7月 07, 2020

If blk_mq_submit_bio flushes the plug list, bios for other disks can
show up on current->bio_list.  As that doesn't involve any stacking of
block device it is entirely harmless and we should not warn about
this case.

Fixes: ff93ea0c ("block: shortcut __submit_bio_noacct for blk-mq drivers")
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0e6e255e

03 7月, 2020 1 次提交

block: initialize current->bio_list[1] in __submit_bio_noacct_mq · 7c792f33

由 Christoph Hellwig 提交于 7月 02, 2020

bio_alloc_bioset references current->bio_list[1], so we need to
initialize it for the blk-mq submission path as well.

Fixes: ff93ea0c ("block: shortcut __submit_bio_noacct for blk-mq drivers")
Reported-by: NQian Cai <cai@lca.pw>
Reported-by: NNaresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Tested-by: NNaresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

7c792f33

02 7月, 2020 1 次提交

Revert "blk-mq: put driver tag when this request is completed" · 4e2f62e5

由 Jens Axboe 提交于 7月 01, 2020

This reverts commits the following commits:

	37f4a24c
	723bf178
	36a3df5a

The last one is the culprit, but we have to go a bit deeper to get this
to revert cleanly. There's been a report that this breaks some MMC
setups [1], and also causes an issue with swap [2]. Until this can be
figured out, revert the offending commits.

[1] https://lore.kernel.org/linux-block/57fb09b1-54ba-f3aa-f82c-d709b0e6b281@samsung.com/
[2] https://lore.kernel.org/linux-block/20200702043721.GA1087@lca.pw/Reported-by: NMarek Szyprowski <m.szyprowski@samsung.com>
Reported-by: NQian Cai <cai@lca.pw>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

4e2f62e5

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功