- 15 7月, 2021 3 次提交
-
-
git://git.infradead.org/nvme由 Jens Axboe 提交于
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.14 - fix various races in nvme-pci when shutting down just after probing (Casey Chen) - fix a net_device leak in nvme-tcp (Prabhakar Kushwaha)" * tag 'nvme-5.14-2021-07-15' of git://git.infradead.org/nvme: nvme-pci: do not call nvme_dev_remove_admin from nvme_remove nvme-pci: fix multiple races in nvme_setup_io_queues nvme-tcp: use __dev_get_by_name instead dev_get_by_name for OPT_HOST_IFACE
-
由 Wang Qing 提交于
We must release the queue before freeing the tagset. Fixes: 4af5f2e0 ("nbd: use blk_mq_alloc_disk and blk_cleanup_disk") Reported-and-tested-by: syzbot+9ca43ff47167c0ee3466@syzkaller.appspotmail.com Signed-off-by: NWang Qing <wangqing@vivo.com> Signed-off-by: NGuoqing Jiang <jiangguoqing@kylinos.cn> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210706040016.1360412-1-guoqing.jiang@linux.devSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Guoqing Jiang 提交于
We must release the queue before freeing the tagset. Fixes: 262d431f ("pd: use blk_mq_alloc_disk and blk_cleanup_disk") Signed-off-by: NGuoqing Jiang <jiangguoqing@kylinos.cn> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210706010734.1356066-1-guoqing.jiang@linux.devSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 13 7月, 2021 3 次提交
-
-
由 Casey Chen 提交于
nvme_dev_remove_admin could free dev->admin_q and the admin_tagset while they are being accessed by nvme_dev_disable(), which can be called by nvme_reset_work via nvme_remove_dead_ctrl. Commit cb4bfda6 ("nvme-pci: fix hot removal during error handling") intended to avoid requests being stuck on a removed controller by killing the admin queue. But the later fix c8e9e9b7 ("nvme-pci: unquiesce admin queue on shutdown"), together with nvme_dev_disable(dev, true) right before nvme_dev_remove_admin() could help dispatch requests and fail them early, so we don't need nvme_dev_remove_admin() any more. Fixes: cb4bfda6 ("nvme-pci: fix hot removal during error handling") Signed-off-by: NCasey Chen <cachen@purestorage.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Casey Chen 提交于
Below two paths could overlap each other if we power off a drive quickly after powering it on. There are multiple races in nvme_setup_io_queues() because of shutdown_lock missing and improper use of NVMEQ_ENABLED bit. nvme_reset_work() nvme_remove() nvme_setup_io_queues() nvme_dev_disable() ... ... A1 clear NVMEQ_ENABLED bit for admin queue lock retry: B1 nvme_suspend_io_queues() A2 pci_free_irq() admin queue B2 nvme_suspend_queue() admin queue A3 pci_free_irq_vectors() nvme_pci_disable() A4 nvme_setup_irqs(); B3 pci_free_irq_vectors() ... unlock A5 queue_request_irq() for admin queue set NVMEQ_ENABLED bit ... nvme_create_io_queues() A6 result = queue_request_irq(); set NVMEQ_ENABLED bit ... fail to allocate enough IO queues: A7 nvme_suspend_io_queues() goto retry If B3 runs in between A1 and A2, it will crash if irqaction haven't been freed by A2. B2 is supposed to free admin queue IRQ but it simply can't fulfill the job as A1 has cleared NVMEQ_ENABLED bit. Fix: combine A1 A2 so IRQ get freed as soon as the NVMEQ_ENABLED bit gets cleared. After solved #1, A2 could race with B3 if A2 is freeing IRQ while B3 is checking irqaction. A3 also could race with B2 if B2 is freeing IRQ while A3 is checking irqaction. Fix: A2 and A3 take lock for mutual exclusion. A3 could race with B3 since they could run free_msi_irqs() in parallel. Fix: A3 takes lock for mutual exclusion. A4 could fail to allocate all needed IRQ vectors if A3 and A4 are interrupted by B3. Fix: A4 takes lock for mutual exclusion. If A5/A6 happened after B2/B1, B3 will crash since irqaction is not NULL. They are just allocated by A5/A6. Fix: Lock queue_request_irq() and setting of NVMEQ_ENABLED bit. A7 could get chance to pci_free_irq() for certain IO queue while B3 is checking irqaction. Fix: A7 takes lock. nvme_dev->online_queues need to be protected by shutdown_lock. Since it is not atomic, both paths could modify it using its own copy. Co-developed-by: NYuanyuan Zhong <yzhong@purestorage.com> Signed-off-by: NCasey Chen <cachen@purestorage.com> Reviewed-by: NKeith Busch <kbusch@kernel.org> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Prabhakar Kushwaha 提交于
dev_get_by_name() finds network device by name but it also increases the reference count. If a nvme-tcp queue is present and the network device driver is removed before nvme_tcp, we will face the following continuous log: "kernel:unregister_netdevice: waiting for <eth> to become free. Usage count = 2" And rmmod further halts. Similar case arises during reboot/shutdown with nvme-tcp queue present and both never completes. To fix this, use __dev_get_by_name() which finds network device by name without increasing any reference counter. Fixes: 3ede8f72 ("nvme-tcp: allow selecting the network interface for connections") Signed-off-by: NOmkar Kulkarni <okulkarni@marvell.com> Signed-off-by: NShai Malin <smalin@marvell.com> Signed-off-by: NPrabhakar Kushwaha <pkushwaha@marvell.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> [hch: remove the ->ndev member entirely] Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 07 7月, 2021 3 次提交
-
-
由 Yu Kuai 提交于
We run a test that create millions of cgroups and blkgs, and then trigger blkg_destroy_all(). blkg_destroy_all() will hold spin lock for a long time in such situation. Thus release the lock when a batch of blkgs are destroyed. blkcg_activate_policy() and blkcg_deactivate_policy() might have the same problem, however, as they are basically only called from module init/exit paths, let's leave them alone for now. Signed-off-by: NYu Kuai <yukuai3@huawei.com> Acked-by: NTejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20210707015649.1929797-1-yukuai3@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Chunguang Xu 提交于
On the IO submission path, blk_account_io_start() may interrupt the system interruption. When the interruption returns, the value of part->stamp may have been updated by other cores, so the time value collected before the interruption may be less than part-> stamp. So when this happens, we should do nothing to make io_ticks more accurate? For kernels less than 5.0, this may cause io_ticks to become smaller, which in turn may cause abnormal ioutil values. Signed-off-by: NChunguang Xu <brookxu@tencent.com> Reviewed-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/1625521646-1069-1-git-send-email-brookxu.cn@gmail.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
git://git.infradead.org/nvme由 Jens Axboe 提交于
Pull single NVMe fix from Christoph. * 'nvme-5.14' of git://git.infradead.org/nvme: nvme-tcp: can't set sk_user_data without write_lock
-
- 05 7月, 2021 1 次提交
-
-
由 Maurizio Lombardi 提交于
The sk_user_data pointer is supposed to be modified only while holding the write_lock "sk_callback_lock", otherwise we could race with other threads and crash the kernel. we can't take the write_lock in nvmet_tcp_state_change() because it would cause a deadlock, but the release_work queue will set the pointer to NULL later so we can simply remove the assignment. Fixes: b5332a9f ("nvmet-tcp: fix incorrect locking in state_change sk callback") Signed-off-by: NMaurizio Lombardi <mlombard@redhat.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 02 7月, 2021 3 次提交
-
-
由 Tetsuo Handa 提交于
Commit 0384264e ("block: pass a gendisk to bdev_disk_changed") changed to pass lo->lo_disk instead of lo->lo_device. Fixes: 0384264e ("block: pass a gendisk to bdev_disk_changed") Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/20210702152714.7978-1-penguin-kernel@I-love.SAKURA.ne.jpSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
There is no need to hold a bdev reference when removing the partition. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210701081638.246552-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Sending uevents requires the struct device to be alive. To ensure that grab the device refcount instead of just an inode reference. Fixes: bc359d03 ("block: add a disk_uevent helper") Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210701081638.246552-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
- 01 7月, 2021 27 次提交
-
-
由 Kees Cook 提交于
In preparation for FORTIFY_SOURCE performing compile-time and run-time field array bounds checking for memcpy(), memmove(), and memset(), avoid intentionally reading across neighboring array fields. Add a wrapping structure to serve as the memcpy() source, so the compiler can do appropriate bounds checking, avoiding this future warning: In function '__fortify_memcpy', inlined from 'create_uid' at drivers/s390/block/dasd_eckd.c:749:2: ./include/linux/fortify-string.h:246:4: error: call to '__read_overflow2_field' declared with attribute error: detected read beyond size of field (2nd parameter) Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NStefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210701142221.3408680-3-sth@linux.ibm.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
dasd_set_target_state is only used inside of dasd_mod.ko, so don't export it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NStefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20210701142221.3408680-2-sth@linux.ibm.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Yufen Yu 提交于
If disk have been deleted, we should return fail for ioctl BLKPG_DEL_PARTITION. Otherwise, the directory /sys/class/block may remain invalid symlinks file. The race as following: blkdev_open del_gendisk disk->flags &= ~GENHD_FL_UP; blk_drop_partitions blkpg_ioctl bdev_add_partition add_partition device_add device_add_class_symlinks ioctl may add_partition after del_gendisk() have tried to delete partitions. Then, symlinks file will be created. Reviewed-by: NJan Kara <jack@suse.cz> Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NYufen Yu <yuyufen@huawei.com> Link: https://lore.kernel.org/r/20210610023241.3646241-1-yuyufen@huawei.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Remove some leftovers of the fake major number parsing that cause complains from some compilers. Fixes: 2933a1b2c6f3 ("ubd: remove the code to register as the legacy IDE driver") Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210628093937.1325608-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
We don't have an nvme status to report if the driver's .queue_rq() returns an error without dispatching the requested nvme command. Check the return value from blk_execute_rq() for all passthrough commands so the caller may know their command was not successful. If the command is from the target passthrough interface and fails to dispatch, synthesize the response back to the host as a internal target error. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-5-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The synchronous blk_execute_rq() had not provided a way for its callers to know if its request was successful or not. Return the blk_status_t result of the request. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-4-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
The generic blk_execute_rq() knows how to handle polled completions. Use that instead of implementing an nvme specific handler. Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-3-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Keith Busch 提交于
Poll for completions if the request's hctx is a polling type. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NKeith Busch <kbusch@kernel.org> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210610214437.641245-2-kbusch@kernel.orgSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
With the legacy IDE driver gone drivers now use either REQ_OP_DRV_* or REQ_OP_SCSI_*, so unify the two concepts of passthrough requests into a single one. Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
All driver uses are gone now. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NBart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20210624081012.256464-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use idr_for_each_entry to simplify removing all devices. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-10-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
loop_lookup has two callers - one wants to do the a find by index and the other wants any unbound loop device. Open code the respective functionality in each caller. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210623145908.92973-9-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Passing a negative index to loop_lookup while return any unbound device. Doing that for a delete does not make much sense, so add check to explicitly reject that case. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210623145908.92973-8-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Move acquiring and releasing loop_ctl_mutex from the callers into loop_add. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-7-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Split loop_control_ioctl into a helper for each command. This keeps the code nicely separated for the upcoming locking changes. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-6-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
loop_add returns the right error if the slot wasn't available. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210623145908.92973-5-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
None of the callers cares about the allocated struct loop_device. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-4-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
loop_ctl_mutex is only needed to iterate the IDR for removing the loop devices, so reduce the coverage. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Unregister the misc and blockdevice first to prevent further access, and only then iterate to remove the devices. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210623145908.92973-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fix a let hunk from the blk_mq_alloc_disk conversion. Fixes: 281ea6a5bfdc ("mmc: switch to blk_mq_alloc_disk") Reported-by: NMarek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NMarek Szyprowski <m.szyprowski@samsung.com> Acked-by: NUlf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210621080144.3655131-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use the blk_mq_alloc_disk to allocate the request_queue and gendisk together. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210616053934.880951-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
The gendisk already acquires a reference to the queue when add_disk is called, which dropped on put_disk. So remove the superflous extra refcounting. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NUlf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210616053934.880951-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Prasanna Kumar Kalever 提交于
Problem: On reconfigure of device, there is no way to defend if the backend storage is matching with the initial backend storage. Say, if an initial connect request for backend "pool1/image1" got mapped to /dev/nbd0 and the userspace process is terminated. A next reconfigure request within NBD_ATTR_DEAD_CONN_TIMEOUT is allowed to use /dev/nbd0 for a different backend "pool1/image2" For example, an operation like below could be dangerous: $ sudo rbd-nbd map --try-netlink rbd-pool/ext4-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="bfc444b4-64b1-418f-8b36-6e0d170cfc04" TYPE="ext4" $ sudo pkill -9 rbd-nbd $ sudo rbd-nbd attach --try-netlink --device /dev/nbd0 rbd-pool/xfs-image /dev/nbd0 $ sudo blkid /dev/nbd0 /dev/nbd0: UUID="d29bf343-6570-4069-a9ea-2fa156ced908" TYPE="xfs" Solution: Provide a way for userspace processes to keep some metadata to identify between the device and the backend, so that when a reconfigure request is made, we can compare and avoid such dangerous operations. With this solution, as part of the initial connect request, backend path can be stored in the sysfs per device config, so that on a reconfigure request it's easy to check if the backend path matches with the initial connect backend path. Please note, ioctl interface to nbd will not have these changes, as there won't be any reconfigure. Signed-off-by: NPrasanna Kumar Kalever <prasanna.kalever@redhat.com> Reviewed-by: NXiubo Li <xiubli@redhat.com> Reviewed-by: NMing Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210429102828.31248-1-prasanna.kalever@redhat.comSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and request_queue allocation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210614060759.3965724-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
With the legacy IDE driver long deprecated, and modern userspace being much more flexible about dev_t assignments there is no reason to fake a registration as the legacy IDE driver in ubd. This registeration is a little problematic as it registers the same request_queue for multiple gendisks, so just remove it. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-By: NAnton Ivanov <anton.ivanov@cambridgegreys.com> Link: https://lore.kernel.org/r/20210614060759.3965724-2-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Fix up the recent blk_alloc_disk conversion. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210614060231.3965278-1-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Christoph Hellwig 提交于
Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and request_queue allocation. Signed-off-by: NChristoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210614060343.3965416-3-hch@lst.deSigned-off-by: NJens Axboe <axboe@kernel.dk>
-