- 15 7月, 2022 1 次提交
-
-
由 Christoph Hellwig 提交于
Replace the remaining calls of bdevname with snprintf using the %pg format specifier. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220713055317.1888500-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 06 7月, 2022 10 次提交
-
-
由 Christoph Hellwig 提交于
Move the zone related fields that are currently stored in struct request_queue to struct gendisk as these are part of the highlevel block layer API and are only used for non-passthrough I/O that requires the gendisk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-17-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Always use bdev_zone_sectors instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-16-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Pass a block_device instead of a request_queue as that is what most callers have at hand. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Acked-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-12-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Switch to a gendisk based API in preparation for moving all zone related fields from the request_queue to the gendisk. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-11-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Always use the bdev based helpers instead. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Prepare for storing the zone related field in struct gendisk instead of struct request_queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
It doesn't hurt to always have the blk_zone_cond_str prototype, and the two inlines can also be defined unconditionally. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220706070350.1703384-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 John Garry 提交于
We no longer use the 'reserved' arg in busy_tag_iter_fn for any iter function so it may be dropped. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> #nvme Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/1657109034-206040-6-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 John Garry 提交于
With new API blk_mq_is_reserved_rq() we can tell if a request is from the reserved pool, so stop passing 'reserved' arg. There is actually only a single user of that arg for all the callback implementations, which can use blk_mq_is_reserved_rq() instead. This will also allow us to stop passing the same 'reserved' around the blk-mq iter functions next. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NHannes Reinecke <hare@suse.de> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/1657109034-206040-4-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 John Garry 提交于
Add a flag for reserved requests so that drivers may know this for any special handling. Signed-off-by: NJohn Garry <john.garry@huawei.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/1657109034-206040-3-git-send-email-john.garry@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 29 6月, 2022 3 次提交
-
-
由 Christoph Hellwig 提交于
Independent access ranges only matter for file system I/O and are only valid with a registered gendisk, so move them there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220629062013.1331068-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Pass a gendisk to the sysfs register/unregister functions and give them descriptive names. Also move the unregistration helper next to the one doing the registration. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220628171850.1313069-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Add the trace attributes to the default gendisk attributes, just like we already do for partitions. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220628171850.1313069-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 6月, 2022 3 次提交
-
-
由 Christoph Hellwig 提交于
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for all disks that do not have separately allocated queues, and thus remove the need to call blk_cleanup_queue for them. Rename blk_cleanup_disk to blk_mq_destroy_queue to make it clear that this function is intended only for separately allocated blk-mq queues. This saves an extra queue freeze for devices without a separately allocated queue. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Disallow setting the blk-mq state on any queue that is already dying as setting the state even then is a bad idea, and remove the now unused QUEUE_FLAG_DEAD flag. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NHannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 27 6月, 2022 11 次提交
-
-
由 Jan Kara 提交于
Nobody outside of block/ioprio.c uses it. Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220623074840.5960-4-jack@suse.czSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jan Kara 提交于
get_current_ioprio() operates only on current task. We will need the same functionality for other tasks as well. Generalize get_current_ioprio() for that and also move the bulk out of the header file because it is large enough. Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220623074840.5960-3-jack@suse.czSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jan Kara 提交于
get_current_ioprio() is used to initialize IO priority of various requests. As such it should be returning the effective IO priority of the task (i.e., reflecting the fact that unset IO priority should get set based on task's CPU priority) so that the conversion is concentrated in one place. Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220623074840.5960-2-jack@suse.czSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jan Kara 提交于
Commit e70344c0 ("block: fix default IO priority handling") introduced an inconsistency in get_current_ioprio() that tasks without IO context return IOPRIO_DEFAULT priority while tasks with freshly allocated IO context will return 0 (IOPRIO_CLASS_NONE/0) IO priority. Tasks without IO context used to be rare before 5a9d041b ("block: move io_context creation into where it's needed") but after this commit they became common because now only BFQ IO scheduler setups task's IO context. Similar inconsistency is there for get_task_ioprio() so this inconsistency is now exposed to userspace and userspace will see different IO priority for tasks operating on devices with BFQ compared to devices without BFQ. Furthemore the changes done by commit e70344c0 change the behavior when no IO priority is set for BFQ IO scheduler which is also documented in ioprio_set(2) manpage: "If no I/O scheduler has been set for a thread, then by default the I/O priority will follow the CPU nice value (setpriority(2)). In Linux kernels before version 2.6.24, once an I/O priority had been set using ioprio_set(), there was no way to reset the I/O scheduling behavior to the default. Since Linux 2.6.24, specifying ioprio as 0 can be used to reset to the default I/O scheduling behavior." So make sure we default to IOPRIO_CLASS_NONE as used to be the case before commit e70344c0. Also cleanup alloc_io_context() to explicitely set this IO priority for the allocated IO context to avoid future surprises. Note that we tweak ioprio_best() to maintain ioprio_get(2) behavior and make this commit easily backportable. CC: stable@vger.kernel.org Fixes: e70344c0 ("block: fix default IO priority handling") Reviewed-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220623074840.5960-1-jack@suse.czSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
blk_queue_get_max_sectors is private to the block layer, so move it out of blkdev.h. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220614090934.570632-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Now that blk_max_size_offset has a single caller left, fold it into that and clean up the naming convention for the local variables there. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NPankaj Raghav <p.raghav@samsung.com> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220614090934.570632-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Factor out a helper from blk_max_size_offset so that it can be reused independently. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NPankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20220614090934.570632-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
Use the address alignment requirements from the block_device for direct io instead of requiring addresses be aligned to the block size. User space can discover the alignment requirements from the dma_alignment queue attribute. User space can specify any hardware compatible DMA offset for each segment, but every segment length is still required to be a multiple of the block size. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220610195830.3574005-11-kbusch@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
Provide a convenient function for this repeatable coding pattern. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220610195830.3574005-10-kbusch@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The existing iov_iter_alignment() function returns the logical OR of address and length. For cases where address and length need to be considered separately, introduce a helper function that a caller can specificy length and address masks that indicate if the iov is unaligned. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220610195830.3574005-9-kbusch@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
Preparing for upcoming dma_alignment users that have a block_device, but don't need the request_queue. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NJohannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220610195830.3574005-5-kbusch@fb.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 24 6月, 2022 5 次提交
-
-
由 Akira Yokosawa 提交于
Commit 48ec13d3 ("gpio: Properly document parent data union") is supposed to have fixed a warning from "make htmldocs" regarding kernel-doc comments to union members. However, the same warning still remains [1]. Fix the issue by following the example found in section "Nested structs/unions" of Documentation/doc-guide/kernel-doc.rst. Signed-off-by: NAkira Yokosawa <akiyks@gmail.com> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Fixes: 48ec13d3 ("gpio: Properly document parent data union") Link: https://lore.kernel.org/r/20220606093302.21febee3@canb.auug.org.au/ [1] Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Bartosz Golaszewski <brgl@bgdev.pl> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Tested-by: NStephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: NMauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: NBartosz Golaszewski <brgl@bgdev.pl>
-
由 Petr Mladek 提交于
This reverts commit 2bb2b7b5. The testing of 5.19 release candidates revealed missing synchronization between early and regular console functionality. It would be possible to start the console kthreads later as a workaround. But it is clear that console lock serialized console drivers between each other. It opens a big area of possible problems that were not considered by people involved in the development and review. printk() is crucial for debugging kernel issues and console output is very important part of it. The number of consoles is huge and a proper review would take some time. As a result it need to be reverted for 5.19. Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alleySigned-off-by: NPetr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220623145157.21938-7-pmladek@suse.com
-
由 Petr Mladek 提交于
This reverts commit 09c5ba0a. This reverts commit b87f0230. The testing of 5.19 release candidates revealed missing synchronization between early and regular console functionality. It would be possible to start the console kthreads later as a workaround. But it is clear that console lock serialized console drivers between each other. It opens a big area of possible problems that were not considered by people involved in the development and review. printk() is crucial for debugging kernel issues and console output is very important part of it. The number of consoles is huge and a proper review would take some time. As a result it need to be reverted for 5.19. Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alleySigned-off-by: NPetr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220623145157.21938-6-pmladek@suse.com
-
由 Petr Mladek 提交于
This reverts commit 8e274732. The testing of 5.19 release candidates revealed missing synchronization between early and regular console functionality. It would be possible to start the console kthreads later as a workaround. But it is clear that console lock serialized console drivers between each other. It opens a big area of possible problems that were not considered by people involved in the development and review. printk() is crucial for debugging kernel issues and console output is very important part of it. The number of consoles is huge and a proper review would take some time. As a result it need to be reverted for 5.19. Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alleySigned-off-by: NPetr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220623145157.21938-5-pmladek@suse.com
-
由 Petr Mladek 提交于
This reverts commit b87f0230. The testing of 5.19 release candidates revealed missing synchronization between early and regular console functionality. It would be possible to start the console kthreads later as a workaround. But it is clear that console lock serialized console drivers between each other. It opens a big area of possible problems that were not considered by people involved in the development and review. printk() is crucial for debugging kernel issues and console output is very important part of it. The number of consoles is huge and a proper review would take some time. As a result it need to be reverted for 5.19. Link: https://lore.kernel.org/r/YrBdjVwBOVgLfHyb@alleySigned-off-by: NPetr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20220623145157.21938-2-pmladek@suse.com
-
- 23 6月, 2022 1 次提交
-
-
由 Joel Granados 提交于
Adjust the values of NVME_CAP_CRMS_CRIMS and NVME_CAP_CRMS_CRWMS masks as they are different from the ones in TP4084 - Time-to-ready. Fixes: 354201c5 ("nvme: add support for TP4084 - Time-to-Ready Enhancements"). Signed-off-by: NJoel Granados <j.granados@samsung.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NChaitanya Kulkarni <kch@nvidia.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 20 6月, 2022 2 次提交
-
-
由 Damien Le Moal 提交于
The request queue pointer in struct blk_independent_access_range is unused. Remove it. Signed-off-by: NDamien Le Moal <damien.lemoal@opensource.wdc.com> Fixes: 41e46b3c ("block: Fix potential deadlock in blk_ia_range_sysfs_show()") Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220603053529.76405-1-damien.lemoal@opensource.wdc.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Jason A. Donenfeld 提交于
random.c ratelimits how much it warns about uninitialized urandom reads using __ratelimit(). When the RNG is finally initialized, it prints the number of missed messages due to ratelimiting. It has been this way since that functionality was introduced back in 2018. Recently, cc1e127b ("random: remove ratelimiting for in-kernel unseeded randomness") put a bit more stress on the urandom ratelimiting, which teased out a bug in the implementation. Specifically, when under pressure, __ratelimit() will print its own message and reset the count back to 0, making the final message at the end less useful. Secondly, it does so as a pr_warn(), which apparently is undesirable for people's CI. Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly for this purpose, so we set the flag. Fixes: 4e00b339 ("random: rate limit unseeded randomness warnings") Cc: stable@vger.kernel.org Reported-by: NJon Hunter <jonathanh@nvidia.com> Reported-by: NRon Economos <re@w6rz.net> Tested-by: NRon Economos <re@w6rz.net> Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
-
- 17 6月, 2022 4 次提交
-
-
由 Christoph Hellwig 提交于
Various places like I/O schedulers or the QOS infrastructure try to register debugfs files on demans, which can race with creating and removing the main queue debugfs directory. Use the existing debugfs_mutex to serialize all debugfs operations that rely on q->debugfs_dir or the directories hanging off it. To make the teardown code a little simpler declare all debugfs dentry pointers and not just the main one uncoditionally in blkdev.h. Move debugfs_mutex next to the dentries that it protects and document what it is used for. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220614074827.458955-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 zhenwei pi 提交于
Currently unpoison_memory(unsigned long pfn) is designed for soft poison(hwpoison-inject) only. Since 17fae129, the KPTE gets cleared on a x86 platform once hardware memory corrupts. Unpoisoning a hardware corrupted page puts page back buddy only, the kernel has a chance to access the page with *NOT PRESENT* KPTE. This leads BUG during accessing on the corrupted KPTE. Suggested by David&Naoya, disable unpoison mechanism when a real HW error happens to avoid BUG like this: Unpoison: Software-unpoisoned page 0x61234 BUG: unable to handle page fault for address: ffff888061234000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G M OE 5.18.0.bm.1-amd64 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ... RIP: 0010:clear_page_erms+0x7/0x10 Code: ... RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001 FS: 00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> prep_new_page+0x151/0x170 get_page_from_freelist+0xca0/0xe20 ? sysvec_apic_timer_interrupt+0xab/0xc0 ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 __alloc_pages+0x17e/0x340 __folio_alloc+0x17/0x40 vma_alloc_folio+0x84/0x280 __handle_mm_fault+0x8d4/0xeb0 handle_mm_fault+0xd5/0x2a0 do_user_addr_fault+0x1d0/0x680 ? kvm_read_and_reset_apf_flags+0x3b/0x50 exc_page_fault+0x78/0x170 asm_exc_page_fault+0x27/0x30 Link: https://lkml.kernel.org/r/20220615093209.259374-2-pizhenwei@bytedance.com Fixes: 847ce401 ("HWPOISON: Add unpoisoning support") Fixes: 17fae129 ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned") Signed-off-by: Nzhenwei pi <pizhenwei@bytedance.com> Acked-by: NDavid Hildenbrand <david@redhat.com> Acked-by: NNaoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: NMiaohe Lin <linmiaohe@huawei.com> Reviewed-by: NOscar Salvador <osalvador@suse.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> [5.8+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Alex Williamson 提交于
The commit referenced below subtly and inadvertently changed the logic to disallow pinning of zero pfns. This breaks device assignment with vfio and potentially various other users of gup. Exclude the zero page test from the negation. Link: https://lkml.kernel.org/r/165490039431.944052.12458624139225785964.stgit@omen Fixes: 1c563432 ("mm: fix is_pinnable_page against a cma page") Signed-off-by: NAlex Williamson <alex.williamson@redhat.com> Acked-by: NMinchan Kim <minchan@kernel.org> Acked-by: NDavid Hildenbrand <david@redhat.com> Reported-by: NYishai Hadas <yishaih@nvidia.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: John Dias <joaodias@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Zhangfei Gao <zhangfei.gao@linaro.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Yi Liu <yi.l.liu@intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
-
由 Ming Lei 提交于
q->elevator is referred in blk_mq_has_sqsched() without any protection, no .q_usage_counter is held, no queue srcu and rcu read lock is held, so potential use-after-free may be triggered. Fix the issue by adding one queue flag for checking if the elevator uses single queue style dispatch. Meantime the elevator feature flag of ELEVATOR_F_MQ_AWARE isn't needed any more. Cc: Jan Kara <jack@suse.cz> Signed-off-by: NMing Lei <ming.lei@redhat.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220616014401.817001-3-ming.lei@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-