Commit d54cb0be, authored by Jan Kara, committed by Zheng Zengkai

bdev: Do not return EBUSY if bdev discard races with write

mainline inclusion
from mainline-5.12-rc1
commit 	767630c6
category: bugfix
bugzilla: 107770
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=767630c63bb23acf022adb265574996ca39a4645

-------------------------------------------------

blkdev_fallocate() tries to detect whether a discard raced with an
overlapping write by calling invalidate_inode_pages2_range(). However
this check can give both false negatives (when writing using direct IO
or when writeback already writes out the written pagecache range) and
false positives (when write is not actually overlapping but ends in the
same page when blocksize < pagesize). This actually causes issues for
qemu which is getting confused by EBUSY errors.

Fix the problem by removing this conflicting write detection since it is
inherently racy and thus of little use anyway.
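
The false positive described above can be shown with plain arithmetic. The following is a minimal sketch, not kernel code: the helper names `ranges_overlap` and `ranges_share_page` are hypothetical, and byte ranges are modelled as half-open intervals. It shows how a write and a discard that touch disjoint bytes can still land in the same page when the block size is smaller than the page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): true when the half-open
 * byte ranges [a_start, a_end) and [b_start, b_end) have a byte in common.
 */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
                           uint64_t b_start, uint64_t b_end)
{
    return a_start < b_end && b_start < a_end;
}

/*
 * Hypothetical helper: true when the two non-empty byte ranges touch at
 * least one common page. A page-granular check can only see this, which
 * is why it may report a conflict for writes that do not overlap.
 */
static bool ranges_share_page(uint64_t a_start, uint64_t a_end,
                              uint64_t b_start, uint64_t b_end,
                              uint64_t page_size)
{
    assert(page_size > 0 && a_start < a_end && b_start < b_end);

    /* Index of the first and last page touched by each range. */
    uint64_t a_first = a_start / page_size;
    uint64_t a_last = (a_end - 1) / page_size;
    uint64_t b_first = b_start / page_size;
    uint64_t b_last = (b_end - 1) / page_size;

    return a_first <= b_last && b_first <= a_last;
}
```

With an assumed 512-byte block size and 4096-byte page size, a write to bytes [0, 512) and a discard of bytes [512, 1024) give `ranges_overlap(0, 512, 512, 1024) == false` but `ranges_share_page(0, 512, 512, 1024, 4096) == true`: the bytes are disjoint, yet both fall in page 0.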

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
CC: "Darrick J. Wong" <darrick.wong@oracle.com>
Link: https://lore.kernel.org/qemu-devel/20201111153913.41840-1-mlevitsk@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Parent 164b8c99
@@ -2057,13 +2057,11 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 		return error;
 
 	/*
-	 * Invalidate again; if someone wandered in and dirtied a page,
-	 * the caller will be given -EBUSY.  The third argument is
-	 * inclusive, so the rounding here is safe.
+	 * Invalidate the page cache again; if someone wandered in and dirtied
+	 * a page, we just discard it - userspace has no way of knowing whether
+	 * the write happened before or after discard completing...
 	 */
-	return invalidate_inode_pages2_range(bdev->bd_inode->i_mapping,
-					     start >> PAGE_SHIFT,
-					     end >> PAGE_SHIFT);
+	return truncate_bdev_range(bdev, file->f_mode, start, end);
 }
 
 const struct file_operations def_blk_fops = {