- 17 8月, 2021 5 次提交
-
-
由 Christoph Hellwig 提交于
__ebs_rw_bvec use page_address on the submitted bios data, and thus can't deal with highmem. Disable the target on highmem configs. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210804095634.460779-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use the bvec_virt helper to clean up the bio integrity processing a little bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@kernel.org> Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210804095634.460779-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to get the virtual address for a bvec. This avoids that all callers need to know about the page + offset representation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@kernel.org> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210804095634.460779-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
inode_detach_wb references the "main" bdi of the inode. With the recent change to move the bdi from the request_queue to the gendisk this causes a guaranteed use after free when using certain cgroup configurations. The big itself is older through as any non-default inode reference (e.g. an open file descriptor) could have injected this use after free even before that. Fixes: 52ebea74 ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Reported-by: NQian Cai <quic_qiancai@quicinc.com> Reported-by: Nsyzbot <syzbot+1fb38bb7d3ce0fa3e1c4@syzkaller.appspotmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210816122614.601358-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The dev_t is used as the inode hash, so we should only released it once then block device inode is gone from the inode cache. Move it to bdev_free_inode to ensure that. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210816122614.601358-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 8月, 2021 1 次提交
-
-
由 Chunguang Xu 提交于
After patch 54efd50b (block: make generic_make_request handle arbitrarily sized bios), the IO through io-throttle may be larger, and these IOs may be further split into more small IOs. However, IOPS throttle does not seem to be aware of this change, which makes the calculation of IOPS of large IOs incomplete, resulting in disk-side IOPS that does not meet expectations. Maybe we should fix this problem. We can reproduce it by set max_sectors_kb of disk to 128, set blkio.write_iops_throttle to 100, run a dd instance inside blkio and use iostat to watch IOPS: dd if=/dev/zero of=/dev/sdb bs=1M count=1000 oflag=direct As a result, without this change the average IOPS is 1995, with this change the IOPS is 98. Signed-off-by: NChunguang Xu <brookxu@tencent.com> Acked-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/65869aaad05475797d63b4c3fed4f529febe3c26.1627876014.git.brookxu@tencent.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 13 8月, 2021 12 次提交
-
-
由 Christoph Hellwig 提交于
bdev_resize_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bdev_del_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
bdev_add_partition can only operate on the whole device. Make that clear by passing a gendisk instead of a block_device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210810154512.1809898-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Partition scanning only happens on the whole device, so pass a struct gendisk instead of the whole device block_device to the scanners. This allows to simplify printing the device name in various places as the disk name is available in disk->name. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NStefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210810154512.1809898-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just check inode_unhashed on the whole device bdev inode instead, and provide a helper to check for that information. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Let the callers call del_gendisk so that we can check if add_disk has been called properly for the cached device case instead of relying on the block layer internal GENHD_FL_UP flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210809064028.1198327-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Except for the IDA none of the allocations in bcache_device_init is unwound on error, fix that. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NColy Li <colyli@suse.de> Link: https://lore.kernel.org/r/20210809064028.1198327-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove usage of the block layer internal GENHD_FL_UP by just looking at the host state. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use the nvme-internal NVME_NSHEAD_DISK_LIVE flag instead of abusing the block layer state. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Early probe failure never reaches nvme_ns_remove, so GENHD_FL_UP must be set at this point. Remove the check. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210809064028.1198327-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Restructure mmc_blk_probe to avoid a failure path with a half created disk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NUlf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210809064028.1198327-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Pass the attribute group for the attributes on the gendisk to device_add_disk so that they are created atomically with the disk creation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NUlf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210809064028.1198327-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 12 8月, 2021 1 次提交
-
-
由 Guoqing Jiang 提交于
Move them (PAGE_SECTORS_SHIFT, PAGE_SECTORS and SECTOR_MASK) to the generic header file to remove redundancy. Signed-off-by: NGuoqing Jiang <jiangguoqing@kylinos.cn> Link: https://lore.kernel.org/r/20210721025315.1729118-1-guoqing.jiang@linux.devSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 10 8月, 2021 15 次提交
-
-
由 Christoph Hellwig 提交于
Fix the !CONFIG_BLOCK build after the recent cleanup. Fixes: 5ed964f8 ("mm: hide laptop_mode_wb_timer entirely behind the BDI API") Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Ming Lei 提交于
When merging one bio to request, if they are discard IO and the queue supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE because both block core and related drivers(nvme, virtio-blk) doesn't handle mixed discard io merge(traditional IO merge together with discard merge) well. Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation, so both blk-mq and drivers just need to handle multi-range discard. Reported-by: NOleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: NMing Lei <ming.lei@redhat.com> Tested-by: NOleksandr Natalenko <oleksandr@natalenko.name> Fixes: 2705dfb2 ("block: fix discard request merge") Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Just retrieve the bdi from the disk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The backing device information only makes sense for file system I/O, and thus belongs into the gendisk and not the lower level request_queue structure. Move it there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add a helper to check if a gendisk is associated with a request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
.. and rename the function to disk_update_readahead. This is in preparation for moving the BDI from the request_queue to the gendisk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Don't leak the detaіls of the timer into the block layer, instead initialize the timer in bdi_alloc and delete it in bdi_unregister. Note that this means the timer is initialized (but not armed) for non-block queues as well now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20210809141744.1203023-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Now that device mapper has been changed to register the disk once it is fully ready all this code is unused. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
device mapper is currently the only outlier that tries to call register_disk after add_disk, leading to fairly inconsistent state of these block layer data structures. Instead change device-mapper to just register the gendisk later now that the holder mechanism can cope with that. Note that this introduces a user visible change: the dm kobject is now only visible after the initial table has been loaded. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move setting md->type from both callers into dm_setup_md_queue. This ensures that md->type is only set to a valid value after the queue has been fully setup, something we'll rely on future changes. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
md->queue is now always set when md->disk is set, so simplify the conditionals a bit. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
device mapper needs to register holders before it is ready to do I/O. Currently it does so by registering the disk early, which can leave the disk and queue in a weird half state where the queue is registered with the disk, except for sysfs and the elevator. And this state has been a bit promlematic before, and will get more so when sorting out the responsibilities between the queue and the disk. Support registering holders on an initialized but not registered disk instead by delaying the sysfs registration until the disk is registered. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Invert they way the holder relations are tracked. This very slightly reduces the memory overhead for partitioned devices. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210804094147.459763-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Since commit 0d02129e ("block: merge struct block_device and struct hd_struct") there is no way for the bdev to go away as long as there is a holder, so remove the extra references. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move the block holder code into a separate file as it is not in any way related to the other block_dev.c code, and add a new selectable config option for it so that we don't have to build it without any remapped drivers selected. The Kconfig symbol contains a _DEPRECATED suffix to match the comments added in commit 49731baa ("block: restore multiple bd_link_disk_holder() support"). Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 8月, 2021 2 次提交
-
-
由 Bart Van Assche 提交于
We noticed that the user interface of Android devices becomes very slow under memory pressure. This is because Android uses the zram driver on top of the loop driver for swapping, because under memory pressure the swap code alternates reads and writes quickly, because mq-deadline is the default scheduler for loop devices and because mq-deadline delays writes by five seconds for such a workload with default settings. Fix this by making the kernel select I/O scheduler 'none' from inside add_disk() for loop devices. This default can be overridden at any time from user space, e.g. via a udev rule. This approach has an advantage compared to changing the I/O scheduler from userspace from 'mq-deadline' into 'none', namely that synchronize_rcu() does not get called. This patch changes the default I/O scheduler for loop devices from 'mq-deadline' into 'none'. Additionally, this patch reduces the Android boot time on my test setup with 0.5 seconds compared to configuring the loop I/O scheduler from user space. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Martijn Coenen <maco@android.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210805174200.3250718-3-bvanassche@acm.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Bart Van Assche 提交于
elevator_get_default() uses the following algorithm to select an I/O scheduler from inside add_disk(): - In case of a single hardware queue or if sharing hardware queues across multiple request queues (BLK_MQ_F_TAG_HCTX_SHARED), use mq-deadline. - Otherwise, use 'none'. This is a good choice for most but not for all block drivers. Make it possible to override the selection of mq-deadline with a new flag, namely BLK_MQ_F_NO_SCHED_BY_DEFAULT. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Martijn Coenen <maco@android.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210805174200.3250718-2-bvanassche@acm.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 8月, 2021 4 次提交
-
-
由 Damien Le Moal 提交于
In block/blk-mq-sysfs.c, struct blk_mq_ctx_sysfs_entry is not used to define any attribute since the "mq" sysfs directory contains only sub-directories (no attribute files). As a result, blk_mq_sysfs_show(), blk_mq_sysfs_store(), and struct sysfs_ops blk_mq_sysfs_ops are all unused and unnecessary. Remove all this unused code. Signed-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Link: https://lore.kernel.org/r/20210713081837.524422-1-damien.lemoal@wdc.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Matteo Croce 提交于
Make the loop device raise a DISK_MEDIA_CHANGE event on attach or detach. # udevadm monitor -up |grep -e DISK_MEDIA_CHANGE -e DEVNAME & # losetup -f zero [ 7.454235] loop0: detected capacity change from 0 to 16384 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop0 DEVNAME=/dev/loop0 DEVNAME=/dev/loop0 # losetup -f zero [ 10.205245] loop1: detected capacity change from 0 to 16384 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop1 DEVNAME=/dev/loop1 DEVNAME=/dev/loop1 # losetup -f zero2 [ 13.532368] loop2: detected capacity change from 0 to 40960 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop2 DEVNAME=/dev/loop2 # losetup -D DEVNAME=/dev/loop1 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop1 DEVNAME=/dev/loop2 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop2 DEVNAME=/dev/loop0 DISK_MEDIA_CHANGE=1 DEVNAME=/dev/loop0 Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Tested-by: NLuca Boccassi <bluca@debian.org> Link: https://lore.kernel.org/r/20210712230530.29323-7-mcroce@linux.microsoft.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Matteo Croce 提交于
Refactor disk_check_events() and move some code into disk_event_uevent(). Then add disk_force_media_change(), a helper which will be used by devices to force issuing a DISK_EVENT_MEDIA_CHANGE event. Co-developed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Tested-by: NLuca Boccassi <bluca@debian.org> Link: https://lore.kernel.org/r/20210712230530.29323-6-mcroce@linux.microsoft.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Matteo Croce 提交于
Add a new sysfs handle to export the new diskseq value. Place it in <sysfs>/block/<disk>/diskseq and document it. $ grep . /sys/class/block/*/diskseq /sys/class/block/loop0/diskseq:13 /sys/class/block/loop1/diskseq:14 /sys/class/block/loop2/diskseq:5 /sys/class/block/loop3/diskseq:6 /sys/class/block/ram0/diskseq:1 /sys/class/block/ram1/diskseq:2 /sys/class/block/vda/diskseq:7 Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NMatteo Croce <mcroce@microsoft.com> Tested-by: NLuca Boccassi <bluca@debian.org> Link: https://lore.kernel.org/r/20210712230530.29323-5-mcroce@linux.microsoft.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-