- 10 8月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Move the block holder code into a separate file as it is not in any way related to the other block_dev.c code, and add a new selectable config option for it so that we don't have to build it without any remapped drivers selected. The Kconfig symbol contains a _DEPRECATED suffix to match the comments added in commit 49731baa ("block: restore multiple bd_link_disk_holder() support"). Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20210804094147.459763-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 8月, 2021 4 次提交
-
-
由 Christoph Hellwig 提交于
Now that we've stopped using inode references for anything meaninful in the block layer get rid of the helper to put it and just open code the call to iput on the block_device inode. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NChaitanya Kulkarni <ckulkarnilinux@gmail.com> Link: https://lore.kernel.org/r/20210722075402.983367-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All callers are gone, and no one should grab a pure inode reference to a block device anymore. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210722075402.983367-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Instead of acquiring an inode reference on open make sure partitions always hold device model references to the disk while alive, and switch open to grab only a device model reference to the opened block device. If that is a partition the disk reference is transitively held by the partition already. Link: https://lore.kernel.org/r/20210722075402.983367-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Unhash the whole device inode early in del_gendisk. This allows to remove the first GENHD_FL_UP check in the open path as we simply won't find a just removed inode. The second non-racy check after taking open_mutex is still kept. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210722075402.983367-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 7月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
blkdev_get_no_open acquires a reference to the block_device through the block device inode and then tries to acquire a device model reference to the gendisk. But at this point the disk migh already be freed (although the race is free). Fix this by only freeing the gendisk from the whole device bdevs ->free_inode callback as well. Fixes: 22ae8ce8 ("block: simplify bdev/disk lookup in blkdev_get") Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJosef Bacik <josef@toxicpanda.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210722075402.983367-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 30 6月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Remove the CONFIG_BLOCK default to __set_page_dirty_buffers and just wire that method up for the missing instances. [hch@lst.de: ecryptfs: add a ->set_page_dirty cludge] Link: https://lkml.kernel.org/r/20210624125250.536369-1-hch@lst.de Link: https://lkml.kernel.org/r/20210614061512.3966143-4-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NJan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Tyler Hicks <code@tyhicks.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 25 6月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
bdev_disk_changed can only operate on whole devices. Make that clear by passing a gendisk instead of the struct block_device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move bdev_disk_changed to block/partitions/core.c, together with the rest of the partition scanning code. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210624123240.441814-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 6月, 2021 1 次提交
-
-
由 Zhang Yi 提交于
After remove the unique user of sop->bdev_try_to_free_page() callback, we could remove the callback and the corresponding blkdev_releasepage() at all. Signed-off-by: NZhang Yi <yi.zhang@huawei.com> Reviewed-by: NJan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210610112440.3438139-9-yi.zhang@huawei.comSigned-off-by: NTheodore Ts'o <tytso@mit.edu>
-
- 04 6月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
The raw driver used to provide direct unbuffered access to block devices before O_DIRECT was invented. It has been obsolete for more than a decade. Acked-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: NArnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/lkml/Pine.LNX.4.64.0703180754060.6605@CPE00045a9c397f-CM001225dbafb6/Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210531072526.97052-1-hch@lst.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 01 6月, 2021 6 次提交
-
-
由 Christoph Hellwig 提交于
The bd_part_count value only makes sense for whole devices, so move it to struct gendisk and give it a more descriptive name. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210525061301.2242282-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Split __blkdev_put into one helper for the whole device, and one for partitions as well as another shared helper for flushing the block device inode mapping. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210525061301.2242282-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Keep in the callers and thus remove the for_part argument. This mirrors what is done on the blkdev_get side and slightly simplifies blkdev_get_part as well. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@rehat.com> Link: https://lore.kernel.org/r/20210525061301.2242282-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Replace the per-block device bd_mutex with a per-gendisk open_mutex, thus simplifying locking wherever we deal with partitions. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Acked-by: NRoger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20210525061301.2242282-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Do the early unlocked syncing even earlier to move more code out of the recursive path. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210525061301.2242282-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Split __blkdev_get into one helper for the whole device, and one for opening partitions. This removes the (bounded) recursion when opening a partition. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210525061301.2242282-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 5月, 2021 2 次提交
-
-
由 Gulam Mohamed 提交于
When BLKRRPART is called concurrently with del_gendisk, the partitions rescan can create a stale partition that will never be be cleaned up. Fix this by checking the the disk is up before rescanning partitions while under bd_mutex. Signed-off-by: NGulam Mohamed <gulam.mohamed@oracle.com> [hch: split from a larger patch] Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210514131842.1600568-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
As an artifact of how gendisk lookup used to work in earlier kernels, GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock is used to prevent opens from succeeding while del_gendisk is tearing down the gendisk. Switch to clearing the flag early and under bd_mutex so that callers can use bd_mutex to stabilize the flag, which removes the need for the global mutex. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210514131842.1600568-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 5月, 2021 2 次提交
-
-
由 yangerkun 提交于
We get a bug: BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139 Read of size 8 at addr ffff0000d3fb11f8 by task CPU: 0 PID: 12582 Comm: syz-executor.2 Not tainted 5.10.0-00843-g352c8610ccd2 #2 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132 show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x110/0x164 lib/dump_stack.c:118 print_address_description+0x78/0x5c8 mm/kasan/report.c:385 __kasan_report mm/kasan/report.c:545 [inline] kasan_report+0x148/0x1e4 mm/kasan/report.c:562 check_memory_region_inline mm/kasan/generic.c:183 [inline] __asan_load8+0xb4/0xbc mm/kasan/generic.c:252 iov_iter_revert+0x11c/0x404 lib/iov_iter.c:1139 io_read fs/io_uring.c:3421 [inline] io_issue_sqe+0x2344/0x2d64 fs/io_uring.c:5943 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326 io_submit_sqe fs/io_uring.c:6395 [inline] io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline] __se_sys_io_uring_enter fs/io_uring.c:8960 [inline] __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common arch/arm64/kernel/syscall.c:158 [inline] do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670 Allocated by task 12570: stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121 kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461 kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475 __kmalloc+0x23c/0x334 mm/slub.c:3970 kmalloc include/linux/slab.h:557 [inline] __io_alloc_async_data+0x68/0x9c fs/io_uring.c:3210 io_setup_async_rw fs/io_uring.c:3229 [inline] io_read fs/io_uring.c:3436 [inline] io_issue_sqe+0x2954/0x2d64 fs/io_uring.c:5943 __io_queue_sqe+0x19c/0x520 fs/io_uring.c:6260 io_queue_sqe+0x2a4/0x590 fs/io_uring.c:6326 io_submit_sqe fs/io_uring.c:6395 [inline] io_submit_sqes+0x4c0/0xa04 fs/io_uring.c:6624 __do_sys_io_uring_enter fs/io_uring.c:9013 [inline] __se_sys_io_uring_enter fs/io_uring.c:8960 [inline] __arm64_sys_io_uring_enter+0x190/0x708 fs/io_uring.c:8960 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common arch/arm64/kernel/syscall.c:158 [inline] do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:227 el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367 el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383 el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670 Freed by task 12570: stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121 kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x38/0x6c mm/kasan/common.c:56 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355 __kasan_slab_free+0x124/0x150 mm/kasan/common.c:422 kasan_slab_free+0x10/0x1c mm/kasan/common.c:431 slab_free_hook mm/slub.c:1544 [inline] slab_free_freelist_hook mm/slub.c:1577 [inline] slab_free mm/slub.c:3142 [inline] kfree+0x104/0x38c mm/slub.c:4124 io_dismantle_req fs/io_uring.c:1855 [inline] __io_free_req+0x70/0x254 fs/io_uring.c:1867 io_put_req_find_next fs/io_uring.c:2173 [inline] __io_queue_sqe+0x1fc/0x520 fs/io_uring.c:6279 __io_req_task_submit+0x154/0x21c fs/io_uring.c:2051 io_req_task_submit+0x2c/0x44 fs/io_uring.c:2063 task_work_run+0xdc/0x128 kernel/task_work.c:151 get_signal+0x6f8/0x980 kernel/signal.c:2562 do_signal+0x108/0x3a4 arch/arm64/kernel/signal.c:658 do_notify_resume+0xbc/0x25c arch/arm64/kernel/signal.c:722 work_pending+0xc/0x180 blkdev_read_iter can truncate iov_iter's count since the count + pos may exceed the size of the blkdev. This will confuse io_read that we have consume the iovec. And once we do the iov_iter_revert in io_read, we will trigger the slab-out-of-bounds. Fix it by reexpand the count with size has been truncated. blkdev_write_iter can trigger the problem too. Signed-off-by: Nyangerkun <yangerkun@huawei.com> Acked-by: NPavel Begunkov <asml.silencec@gmail.com> Link: https://lore.kernel.org/r/20210401071807.3328235-1-yangerkun@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Matthew Wilcox (Oracle) 提交于
Patch series "Remove nrexceptional tracking", v2. We actually use nrexceptional for very little these days. It's a minor pain to keep in sync with nrpages, but the pain becomes much bigger with the THP patches because we don't know how many indices a shadow entry occupies. It's easier to just remove it than keep it accurate. Also, we save 8 bytes per inode which is nothing to sneeze at; on my laptop, it would improve shmem_inode_cache from 22 to 23 objects per 16kB, and inode_cache from 26 to 27 objects. Combined, that saves a megabyte of memory from a combined usage of 25MB for both caches. Unfortunately, ext4 doesn't cross a magic boundary, so it doesn't save any memory for ext4. This patch (of 4): Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty. Link: https://lkml.kernel.org/r/20201026151849.24232-1-willy@infradead.org Link: https://lkml.kernel.org/r/20201026151849.24232-2-willy@infradead.orgSigned-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Tested-by: NVishal Verma <vishal.l.verma@intel.com> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 4月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
Now that md has been cleaned up we can get rid of this hack. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Avoid a potentially initialized variabe in the invalidate case. Fixes: d3c4a43d ("block: refactor blk_drop_partitions") Reported-by: Nkernel test robot <lkp@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20210408194140.1816537-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 4月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Move the busy check and disk-wide sync into the only caller, so that the remainder can be shared with del_gendisk. Also pass the gendisk instead of the bdev as that is all that is needed. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210406062303.811835-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 4月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
If IOCB_NOWAIT is set on submission, then that needs to get propagated to REQ_NOWAIT on the block side. Otherwise we completely lose this information, and any issuer of IOCB_NOWAIT IO will potentially end up blocking on eg request allocation on the storage side. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 29 3月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
No implementations left. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210308074550.422714-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 23 3月, 2021 1 次提交
-
-
由 Chris Chiu 提交于
The GD_NEED_PART_SCAN is set by bdev_check_media_change to initiate a partition scan while removing a block device. It should be cleared after blk_drop_paritions because blk_drop_paritions could return -EBUSY and then the consequence __blkdev_get has no chance to do delete_partition if GD_NEED_PART_SCAN already cleared. It causes some problems on some card readers. Ex. Realtek card reader 0bda:0328 and 0bda:0158. The device node of the partition will not disappear after the memory card removed. Thus the user applications can not update the device mapping correctly. BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1920874Signed-off-by: NChris Chiu <chris.chiu@canonical.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210323085219.24428-1-chris.chiu@canonical.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 11 3月, 2021 1 次提交
-
-
由 Christoph Hellwig 提交于
Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been horribly confusingly misnamed. Rename it to BIO_MAX_VECS to stop confusing users of the bio API. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 3月, 2021 1 次提交
-
-
由 Jan Kara 提交于
Commit 384d87ef ("block: Do not discard buffers under a mounted filesystem") made paths issuing discard or zeroout requests to the underlying device try to grab block device in exclusive mode. If that failed we returned EBUSY to userspace. This however caused unexpected fallout in userspace where e.g. FUSE filesystems issue discard requests from userspace daemons although the device is open exclusively by the kernel. Also shrinking of logical volume by LVM issues discard requests to a device which may be claimed exclusively because there's another LV on the same PV. So to avoid these userspace regressions, fall back to invalidate_inode_pages2_range() instead of returning EBUSY to userspace and return EBUSY only of that call fails as well (meaning that there's indeed someone using the particular device range we are trying to discard). Link: https://bugzilla.kernel.org/show_bug.cgi?id=211167 Fixes: 384d87ef ("block: Do not discard buffers under a mounted filesystem") CC: stable@vger.kernel.org Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 2月, 2021 1 次提交
-
-
由 Matthew Wilcox (Oracle) 提交于
It's often inconvenient to use BIO_MAX_PAGES due to min() requiring the sign to be the same. Introduce bio_max_segs() and change BIO_MAX_PAGES to be unsigned to make it easier for the users. Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 2月, 2021 1 次提交
-
-
由 Randy Dunlap 提交于
Delete duplicate words in fs/*.c. The doubled words that are being dropped are: that, be, the, in, and, for Link: https://lkml.kernel.org/r/20201224052810.25315-1-rdunlap@infradead.orgSigned-off-by: NRandy Dunlap <rdunlap@infradead.org> Reviewed-by: NMatthew Wilcox (Oracle) <willy@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 1月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
There is no point in allocating memory for a synchronous flush. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Maxim Mikityanskiy 提交于
The cited commit introduced a serious regression with SATA write speed, as found by bisecting. This patch reverts this commit, which restores write speed back to the values observed before this commit. The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs (WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a single HDD, the rest are different RAID levels built over the first partitions of 4 HDDs. Test results are in MB/s, R is read, W is write. | Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6 ----------------+--------+-------+-----------+-----------+-------- 9011495c | R:256 | R:313 | R:276 | R:313 | R:323 (before faulty) | W:254 | W:253 | W:195 | W:204 | W:117 ----------------+--------+-------+-----------+-----------+-------- 5ff9f192 | R:257 | R:398 | R:312 | R:344 | R:391 (faulty commit) | W:154 | W:122 | W:67.7 | W:66.6 | W:67.2 ----------------+--------+-------+-----------+-----------+-------- 5.10.10 | R:256 | R:401 | R:312 | R:356 | R:375 unpatched | W:149 | W:123 | W:64 | W:64.1 | W:61.5 ----------------+--------+-------+-----------+-----------+-------- 5.10.10 | R:255 | R:396 | R:312 | R:340 | R:393 patched | W:247 | W:274 | W:220 | W:225 | W:121 Applying this patch doesn't hurt read performance, while improves the write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed is restored back to the state before the faulty commit, and even a bit higher in RAID tests (which aren't HDD-bound on this device) - that is likely related to other optimizations done between the faulty commit and 5.10.10 which also improved the read speed. Signed-off-by: NMaxim Mikityanskiy <maxtram95@gmail.com> Fixes: 5ff9f192 ("block: simplify set_init_blocksize") Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Acked-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 1月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
truncate_bdev_range is only used in always built-in block layer code, so remove the export and the !CONFIG_BLOCK stub. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jan Kara 提交于
blkdev_fallocate() tries to detect whether a discard raced with an overlapping write by calling invalidate_inode_pages2_range(). However this check can give both false negatives (when writing using direct IO or when writeback already writes out the written pagecache range) and false positives (when write is not actually overlapping but ends in the same page when blocksize < pagesize). This actually causes issues for qemu which is getting confused by EBUSY errors. Fix the problem by removing this conflicting write detection since it is inherently racy and thus of little use anyway. Reported-by: NMaxim Levitsky <mlevitsk@redhat.com> CC: "Darrick J. Wong" <darrick.wong@oracle.com> Link: https://lore.kernel.org/qemu-devel/20201111153913.41840-1-mlevitsk@redhat.comSigned-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NMaxim Levitsky <mlevitsk@redhat.com> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 1月, 2021 1 次提交
-
-
由 Pavel Begunkov 提交于
Add a helper function calculating the number of bvec segments we need to allocate to construct a bio. It doesn't change anything functionally, but will be used to not duplicate special cases in the future. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 08 1月, 2021 2 次提交
-
-
由 Christoph Hellwig 提交于
bdev_evict_inode and bdev_free_inode are also called for the root inode of bdevfs, for which bdev_alloc is never called. Move the zeroing o f struct block_device and the initialization of the bd_bdi field into bdev_alloc_inode to make sure they are initialized for the root inode as well. Fixes: e6cb5382 ("block: initialize struct block_device in bdev_alloc") Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Tested-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Satya Tangirala 提交于
freeze/thaw_bdev() currently use bdev->bd_fsfreeze_count to infer whether or not bdev->bd_fsfreeze_sb is valid (it's valid iff bd_fsfreeze_count is non-zero). thaw_bdev() doesn't nullify bd_fsfreeze_sb. But this means a freeze_bdev() call followed by a thaw_bdev() call can leave bd_fsfreeze_sb with a non-null value, while bd_fsfreeze_count is zero. If freeze_bdev() is called again, and this time get_active_super() returns NULL (e.g. because the FS is unmounted), we'll end up with bd_fsfreeze_count > 0, but bd_fsfreeze_sb is *untouched* - it stays the same (now garbage) value. A subsequent thaw_bdev() will decide that the bd_fsfreeze_sb value is legitimate (since bd_fsfreeze_count > 0), and attempt to use it. Fix this by always setting bd_fsfreeze_sb to NULL when bd_fsfreeze_count is successfully decremented to 0 in thaw_sb(). Alternatively, we could set bd_fsfreeze_sb to whatever get_active_super() returns in freeze_bdev() whenever bd_fsfreeze_count is successfully incremented to 1 from 0 (which can be achieved cleanly by moving the line currently setting bd_fsfreeze_sb to immediately after the "sync:" label, but it might be a little too subtle/easily overlooked in future). This fixes the currently panicking xfstests generic/085. Fixes: 040f04bd ("fs: simplify freeze_bdev/thaw_bdev") Signed-off-by: NSatya Tangirala <satyat@google.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 30 12月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Fix new kernel-doc warnings in fs/block_dev.c: ../fs/block_dev.c:1066: warning: Excess function parameter 'whole' description in 'bd_abort_claiming' ../fs/block_dev.c:1837: warning: Function parameter or member 'dev' not described in 'lookup_bdev' Fixes: 4e7b5671 ("block: remove i_bdev") Fixes: 37c3fc9a ("block: simplify the block device claiming interface") Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-fsdevel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 22 12月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
Update copyrights for files that have gotten some major rewrites lately. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-