- 31 May 2022, 2 commits
-
-
By Niklas Cassel
The NVM Express Base Specification 2.0 specifies in the description of the CC – Controller Configuration register: "Host software shall set the Arbitration Mechanism Selected (CC.AMS), the Memory Page Size (CC.MPS), and the I/O Command Set Selected (CC.CSS) to valid values prior to enabling the controller by setting CC.EN to '1'." While we haven't seen any controller misbehave when all bits are set in a single write, let's do it in the order it is written in the spec, as there could potentially be controllers that are implemented to rely on the configuration bits being set before enabling the controller.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
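A minimal sketch of the two-step enable sequence described above, written in the Linux NVMe driver's style; the exact function body is an illustration of the idea, not the literal patch.

```c
/* Sketch: write AMS/MPS/CSS (and the other config fields) first, then
 * enable the controller with a second register write. */
static int nvme_enable_ctrl_sketch(struct nvme_ctrl *ctrl)
{
	int ret;

	ctrl->ctrl_config = NVME_CC_CSS_NVM;
	ctrl->ctrl_config |= (NVME_CTRL_PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT;
	ctrl->ctrl_config |= NVME_CC_AMS_RR | NVME_CC_SHN_NONE;
	ctrl->ctrl_config |= NVME_CC_IOSQES | NVME_CC_IOCQES;

	/* first write: configuration only, CC.EN still 0 */
	ret = ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
	if (ret)
		return ret;

	/* second write: now set CC.EN */
	ctrl->ctrl_config |= NVME_CC_ENABLE;
	return ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
}
```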
-
By Christoph Hellwig
The MAXIO MAP1001 controllers report completely bogus Namespace identifiers that even change across suspend cycles. Disable using the Identifiers entirely.
Reported-by: Arman Hajishafieha <arman.hajishafieha@hotmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Tested-by: Arman Hajishafieha <arman.hajishafieha@hotmail.com>
-
- 20 May 2022, 1 commit
-
-
By Chaitanya Kulkarni
In the current implementation we set the non-MDTS limits by calling nvme_init_non_mdts_limits() from nvme_init_ctrl_finish(). This also tries to set the limits for the discovery controller, which has no I/O queues, resulting in the warning message reported by nvme_log_error() when running blktests nvme/002:

[ 2005.155946] run blktests nvme/002 at 2022-04-09 16:57:47
[ 2005.192223] loop: module loaded
[ 2005.196429] nvmet: adding nsid 1 to subsystem blktests-subsystem-0
[ 2005.200334] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
<------------------------------SNIP---------------------------------->
[ 2008.958108] nvmet: adding nsid 1 to subsystem blktests-subsystem-997
[ 2008.962082] nvmet: adding nsid 1 to subsystem blktests-subsystem-998
[ 2008.966102] nvmet: adding nsid 1 to subsystem blktests-subsystem-999
[ 2008.973132] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN testhostnqn.
*[ 2008.973196] nvme1: Identify(0x6), Invalid Field in Command (sct 0x0 / sc 0x2) MORE DNR*
[ 2008.974595] nvme nvme1: new ctrl: "nqn.2014-08.org.nvmexpress.discovery"
[ 2009.103248] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"

Move the call to nvme_init_non_mdts_limits() to nvme_scan_work(), after we verify that I/O queues are created, since that is a converging point for each transport where these limits are actually used:

1. FC:
   nvme_fc_create_association()
     ...
     nvme_fc_create_io_queues(ctrl);
     ...
     nvme_start_ctrl()
       nvme_scan_queue()
         nvme_scan_work()

2. PCIe:
   nvme_reset_work()
     ...
     nvme_setup_io_queues()
       nvme_create_io_queues()
         nvme_alloc_queue()
     ...
     nvme_start_ctrl()
       nvme_scan_queue()
         nvme_scan_work()

3. RDMA:
   nvme_rdma_setup_ctrl()
     ...
     nvme_rdma_configure_io_queues()
     ...
     nvme_start_ctrl()
       nvme_scan_queue()
         nvme_scan_work()

4. TCP:
   nvme_tcp_setup_ctrl()
     ...
     nvme_tcp_configure_io_queues()
     ...
     nvme_start_ctrl()
       nvme_scan_queue()
         nvme_scan_work()

* nvme_scan_work()
    ...
    nvme_validate_or_alloc_ns()
      nvme_alloc_ns()
        nvme_update_ns_info()
          nvme_update_disk_info()
            nvme_config_discard() <---
            blk_queue_max_write_zeroes_sectors() <---

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
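A rough sketch of the relocation described above: the non-MDTS limits setup runs at the top of the scan work, which is only scheduled once I/O queues exist. The error handling shown is an assumption for illustration.

```c
/* Sketch only: call the limits helper from scan work instead of from
 * nvme_init_ctrl_finish(), so discovery controllers never hit it. */
static void nvme_scan_work_sketch(struct work_struct *work)
{
	struct nvme_ctrl *ctrl =
		container_of(work, struct nvme_ctrl, scan_work);
	int ret;

	/* previously called from nvme_init_ctrl_finish() */
	ret = nvme_init_non_mdts_limits(ctrl);
	if (ret < 0) {
		dev_warn(ctrl->device,
			 "reading non-mdts-limits failed: %d\n", ret);
		return;
	}

	/* ... existing namespace scanning continues here ... */
}
```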
-
- 19 May 2022, 1 commit
-
-
By Christoph Hellwig
Add support for using longer timeouts during controller initialization and for letting the controller come up with namespaces that are not ready for I/O yet. We skip these not-ready namespaces during scanning and only bring them online once another scan is kicked off by the AEN that is sent when the NRDY bit gets set in the I/O Command Set Independent Identify Namespace Data Structure. This asynchronous probing avoids blocking the kernel boot when controllers take a very long time to recover after unclean shutdowns (up to minutes).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
-
- 16 May 2022, 8 commits
-
-
By Chaitanya Kulkarni
The RDMA and TCP transports both complete a timed-out request in the same manner, so the code is duplicated. Add and use the helper nvmf_complete_timed_out_request() to remove the duplicate code.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
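A plausible shape for such a shared fabrics helper, based on how the transports abort timed-out requests; the status code and checks are assumptions drawn from the surrounding driver rather than a quote of the patch.

```c
/* Sketch: mark the command as host-aborted and complete it if the
 * block layer still considers it in flight. */
static void nvmf_complete_timed_out_request(struct request *rq)
{
	if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq)) {
		nvme_req(rq)->status = NVME_SC_HOST_ABORTED_CMD;
		blk_mq_complete_request(rq);
	}
}
```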
-
By Stefan Roese
On our ZynqMP system we observe that an NVMe drive that resets itself while doing a firmware update causes a kernel crash like this:

[   67.720772] pcieport 0000:02:02.0: pciehp: Slot(2): Link Down
[   67.720783] pcieport 0000:02:02.0: pciehp: Slot(2): Card not present
[   67.720795] nvme 0000:04:00.0: PME# disabled
[   67.720849] Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP
[   67.720853] nwl-pcie fd0e0000.pcie: Slave error

Analysis: When nvme_dev_disable() is called because of this PCIe hotplug event, pci_is_enabled() is still true. Accessing the NVMe drive, which is currently unavailable because it is rebooting, causes this "synchronous external abort" on this ARM64 platform.

This patch adds a pci_device_is_present() check as well, which returns false in this "Card not present" hot-plug case. With this change, the NVMe driver does not try to access the NVMe registers any more and the FW update finishes without any problems.

Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
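A hedged sketch of the extra presence check in the PCIe shutdown path; the exact placement and surrounding logic inside nvme_dev_disable() are assumptions.

```c
/* Sketch: only touch controller registers if the PCI function is both
 * enabled and still physically present; otherwise treat it as dead. */
static void nvme_dev_disable_sketch(struct nvme_dev *dev, bool shutdown)
{
	struct pci_dev *pdev = to_pci_dev(dev->dev);
	bool dead = true;

	if (pci_is_enabled(pdev) && pci_device_is_present(pdev)) {
		u32 csts = readl(dev->bar + NVME_REG_CSTS);

		dead = !!((csts & NVME_CSTS_CFS) || !(csts & NVME_CSTS_RDY));
	}

	/* ... if dead, quiesce and tear down without further register access ... */
}
```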
-
In nvme_alloc_admin_tags, the admin_q can be set to an error (typically -ENOMEM) if the blk_mq_init_queue call fails to set up the queue, which is checked immediately after the call. However, when we return the error up the stack to nvme_reset_work, the error takes us to:

nvme_remove_dead_ctrl()
  nvme_dev_disable()
    nvme_suspend_queue(&dev->queues[0])

Here we only check that the admin_q is non-NULL, rather than not an error or NULL, and begin quiescing a queue that never existed, leading to a bad / NULL pointer dereference.

Signed-off-by: Kyle Smith <kyles@hpe.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
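One way the mismatch can be resolved, sketched from the description above: reset the pointer to NULL on failure so the later NULL check stays valid. The exact function body is an assumption.

```c
/* Sketch: if queue creation fails, don't leave an ERR_PTR in admin_q;
 * put it back to NULL so nvme_dev_disable()'s NULL check keeps working. */
static int nvme_alloc_admin_tags_sketch(struct nvme_dev *dev)
{
	dev->ctrl.admin_q = blk_mq_init_queue(&dev->admin_tagset);
	if (IS_ERR(dev->ctrl.admin_q)) {
		blk_mq_free_tag_set(&dev->admin_tagset);
		dev->ctrl.admin_q = NULL;
		return -ENOMEM;
	}
	return 0;
}
```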
-
By Chaitanya Kulkarni
Most of the internal passthru commands use the __nvme_submit_sync_cmd() interface. There are a few places where we open-code the request submission:

1. nvme_keep_alive_work(struct work_struct *work)
2. nvme_timeout(struct request *req, bool reserved)
3. nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)

Mark these internal passthru requests quiet so that we can skip the verbose error message from nvme_log_error() in the nvme_end_req() completion path; this is consistent with what we have in __nvme_submit_sync_cmd().

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
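As a sketch of the idea, an open-coded internal submission sets the block layer's quiet flag before the request is issued; the surrounding allocation details below are illustrative and partly elided.

```c
/* Sketch: flag an internally built request as quiet so a failure does
 * not trigger the verbose nvme_log_error() output on completion. */
static int nvme_submit_quiet_cmd_sketch(struct request_queue *q)
{
	struct request *rq;

	rq = blk_mq_alloc_request(q, REQ_OP_DRV_OUT, BLK_MQ_REQ_NOWAIT);
	if (IS_ERR(rq))
		return PTR_ERR(rq);

	rq->rq_flags |= RQF_QUIET;	/* skip nvme_log_error() on failure */
	/* ... fill in the NVMe command and execute the request as before ... */
	return 0;
}
```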
-
By Max Gurtovoy
No usage of blkdev.h elements, so the include can be dropped.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Max Gurtovoy
Log a few more path-related status codes.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Keith Busch
The NVMe specification only requires qword alignment for segment descriptors, and the driver already guarantees that. The spec has always allowed user data to be dword aligned, which is what the queue's attribute is for, so relax the alignment requirement to that value. While we could allow byte alignment for some controllers when using SGLs, we still need to support PRP, and that only allows dword.
Fixes: 3b2a1ebc ("nvme: set dma alignment to qword")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
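For illustration, the queue DMA alignment is expressed as a mask, so dword alignment corresponds to a mask of 3 (and the earlier qword setting to 7); where exactly the driver applies this is assumed.

```c
/* Sketch: relax the user-data DMA alignment from qword to dword.
 * blk_queue_dma_alignment() takes a mask, so 3 means 4-byte alignment. */
static void nvme_set_queue_dma_alignment_sketch(struct request_queue *q)
{
	blk_queue_dma_alignment(q, 3);	/* was 7 (qword) per commit 3b2a1ebc */
}
```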
-
By Tom Yan
DMRSL is in the unit of logical blocks, while max_discard_sectors is in the unit of "linux sectors" (512 bytes). Convert accordingly.
Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
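A sketch of the unit conversion implied above, using the driver's LBA-to-sector helper; the guard and exact field names are assumptions for this illustration.

```c
/* Sketch: DMRSL counts logical blocks, but the block layer limit is in
 * 512-byte sectors, so convert via the namespace's LBA size. */
static void nvme_config_discard_sketch(struct nvme_ns *ns,
				       struct request_queue *q)
{
	struct nvme_ctrl *ctrl = ns->ctrl;

	if (ctrl->dmrsl && ctrl->dmrsl <= nvme_sect_to_lba(ns, UINT_MAX))
		blk_queue_max_discard_sectors(q,
				nvme_lba_to_sect(ns, ctrl->dmrsl));
}
```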
-
- 04 May 2022, 1 commit
-
-
By Christoph Hellwig
The nvme driver never sets a discard_alignment, so it also doesn't need to clear it to zero.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 18 April 2022, 3 commits
-
-
By Christoph Hellwig
Secure erase is a very different operation from discard in that it is a data integrity operation rather than a hint. Fully split the limits and helper infrastructure to make the separation more clear.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs2]
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
By Christoph Hellwig
Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only place that needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
By Christoph Hellwig
Add a helper to check the max supported sectors for zone append based on the block_device, instead of having to poke into the block layer's internal request_queue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 15 April 2022, 4 commits
-
-
By Christoph Hellwig
Qemu unconditionally reports a UUID, which depending on the qemu version is either all-null (which is incorrect but harmless) or contains a single bit set for all controllers. In addition it can also optionally report an eui64, which needs to be manually set. Disable namespace identifiers for Qemu controllers entirely, even if in some cases they could be set correctly through manual intervention.
Reported-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
By Christoph Hellwig
The MAXIO MAP1002/1202 controllers report completely bogus Namespace identifiers that even change across suspend cycles. Disable using the Identifiers entirely.
Reported-by: 金韬 <me@kingtous.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: 金韬 <me@kingtous.cn>
-
By Christoph Hellwig
Add a quirk to disable using and exporting namespace identifiers for controllers where they are broken beyond repair. The most directly visible problem with non-unique namespace identifiers is that they break the /dev/disk/by-id/ links, with the link for a supposedly unique identifier now pointing to one of multiple possible namespaces that share the same ID, and a somewhat random selection deciding which one actually shows up.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
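To illustrate how such a quirk is typically wired up in the PCIe driver and consumed when reading identifiers; the vendor/device IDs and the field handling below are assumptions for this sketch, not the patch itself.

```c
/* Sketch: flag affected devices in the PCI ID table ... */
static const struct pci_device_id nvme_id_table_sketch[] = {
	{ PCI_DEVICE(0x1e4b, 0x1002),	/* example device */
		.driver_data = NVME_QUIRK_BOGUS_NID, },
	{ 0, }
};

/* ... and skip copying the reported identifiers when the quirk is set,
 * leaving eui64/nguid/uuid zeroed. */
static void nvme_report_ns_ids_sketch(struct nvme_ctrl *ctrl,
				      struct nvme_id_ns *id,
				      struct nvme_ns_ids *ids)
{
	if (ctrl->quirks & NVME_QUIRK_BOGUS_NID)
		return;

	memcpy(ids->eui64, id->eui64, sizeof(ids->eui64));
	memcpy(ids->nguid, id->nguid, sizeof(ids->nguid));
}
```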
-
By Chaitanya Kulkarni
Use the RQF_QUIET flag to skip the newly added verbose error reporting, and set the flag in __nvme_submit_sync_cmd, which is used for most internal passthrough requests where we do expect errors (e.g. due to probing for optional functionality). This is similar to what the SCSI verbose error logging does.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Alan Adamson <alan.adamson@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Alan Adamson <alan.adamson@oracle.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 29 March 2022, 5 commits
-
-
By Anton Eidelman
nvme_mpath_init_identify(), invoked from nvme_init_identify(), fetches a fresh ANA log from the ctrl. This is essential to have up-to-date path states both for existing namespaces and for those scan_work may discover once the ctrl is up. This happens in the following cases:
1) A new ctrl is being connected.
2) An existing ctrl is successfully reconnected.
3) An existing ctrl is being reset.

While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces, and nvme_read_ana_log() may call nvme_update_ns_ana_state(). This results in a hang when the ANA state of an existing namespace changes and makes the disk live: nvme_mpath_set_live() issues IO to the namespace through the ctrl, which does NOT have IO queues yet. See the sample hang below.

Solution:
- nvme_update_ns_ana_state() calls set_live only if the ctrl is live;
- the nvme_read_ana_log() call from nvme_mpath_init_identify() therefore only fetches and parses the ANA log; any errors in this process will fail the ctrl setup as appropriate;
- a separate function nvme_mpath_update() is called in nvme_start_ctrl(); this parses the ANA log without fetching it. At this point the ctrl is live, therefore disks can be set live normally.

Sample failure:

nvme nvme0: starting error recovery
nvme nvme0: Reconnecting in 10 seconds...
block nvme0n6: no usable path - requeuing I/O
INFO: task kworker/u8:3:312 blocked for more than 122 seconds.
      Tainted: G            E     5.14.5-1.el7.elrepo.x86_64 #1
Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
Call Trace:
 __schedule+0x2a2/0x7e0
 schedule+0x4e/0xb0
 io_schedule+0x16/0x40
 wait_on_page_bit_common+0x15c/0x3e0
 do_read_cache_page+0x1e0/0x410
 read_cache_page+0x12/0x20
 read_part_sector+0x46/0x100
 read_lba+0x121/0x240
 efi_partition+0x1d2/0x6a0
 bdev_disk_changed.part.0+0x1df/0x430
 bdev_disk_changed+0x18/0x20
 blkdev_get_whole+0x77/0xe0
 blkdev_get_by_dev+0xd2/0x3a0
 __device_add_disk+0x1ed/0x310
 device_add_disk+0x13/0x20
 nvme_mpath_set_live+0x138/0x1b0 [nvme_core]
 nvme_update_ns_ana_state+0x2b/0x30 [nvme_core]
 nvme_update_ana_state+0xca/0xe0 [nvme_core]
 nvme_parse_ana_log+0xac/0x170 [nvme_core]
 nvme_read_ana_log+0x7d/0xe0 [nvme_core]
 nvme_mpath_init_identify+0x105/0x150 [nvme_core]
 nvme_init_identify+0x2df/0x4d0 [nvme_core]
 nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core]
 nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp]
 nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp]
 process_one_work+0x1bd/0x360
 worker_thread+0x50/0x3d0

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Chris Leech
Make nvme_ns_remove match the assumptions elsewhere:
1) !NVME_NS_READY needs to be srcu-synchronized to make sure nothing is running in __nvme_find_path or nvme_round_robin_path that will re-assign this ns to current_path.
2) Any matching current_path entries need to be cleared before removing from the siblings list, to prevent calling nvme_round_robin_path with an "old" ns that's off the list.
3) Finally the list_del_rcu can happen, and then synchronize again before releasing any reference counts.
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Sungup Moon
An NVMe subsystem with multiple controllers can have private namespaces that use the same NSID under some conditions:

"If Namespace Management, ANA Reporting, or NVM Sets are supported, the NSIDs shall be unique within the NVM subsystem. If the Namespace Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique."

Reference: Section 6.1.6 NSID and Namespace Usage; NVM Express 1.4c spec.

Make sure this specific setup is supported in Linux.

Fixes: 9ad1927a ("nvme: always search for namespace head")
Signed-off-by: Sungup Moon <sungup.moon@samsung.com>
[hch: refactored and fixed the controller vs subsystem based naming conflict]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
-
By Colin Ian King
The left shift is followed by a re-assignment back to cc_css; the assignment is redundant. Fix this by replacing the "<<=" operator with "<<" instead. This cleans up the clang scan-build warning:

drivers/nvme/target/core.c:1124:10: warning: Although the value stored to 'cc_css' is used in the enclosing expression, the value is never actually read from 'cc_css' [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
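A small illustration of the dead-store pattern the warning flags and of the fix; the macro name and switch body are assumptions reconstructed from the warning's context, not the actual hunk.

```c
/* Illustration only: the compound assignment stores a value that is
 * never read again, while the plain shift avoids the dead store. */
static void css_before(u32 cc_css)
{
	switch (cc_css <<= NVME_CC_CSS_SHIFT) {	/* stored value never read again */
	default:
		break;
	}
}

static void css_after(u32 cc_css)
{
	switch (cc_css << NVME_CC_CSS_SHIFT) {	/* plain shift, no dead store */
	default:
		break;
	}
}
```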
-
By Sagi Grimberg
Any attempt to flush kernel-global WQs has the possibility of deadlock, so we should simply stop using them. Instead, introduce nvmet_wq, the generic nvmet workqueue for work elements that don't explicitly require a dedicated workqueue (by the mere fact that they are using the system_wq).

Changes were done using the following replacements:
- s/schedule_work(/queue_work(nvmet_wq, /g
- s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g
- s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
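A minimal sketch of how such a module-private workqueue is typically created and used; the workqueue flags chosen here are an assumption.

```c
#include <linux/module.h>
#include <linux/workqueue.h>

/* Sketch: one nvmet-wide workqueue instead of the kernel-global one. */
struct workqueue_struct *nvmet_wq;

static int __init nvmet_init_sketch(void)
{
	/* WQ_MEM_RECLAIM is an assumption for this sketch */
	nvmet_wq = alloc_workqueue("nvmet-wq", WQ_MEM_RECLAIM, 0);
	if (!nvmet_wq)
		return -ENOMEM;
	return 0;
}

static void __exit nvmet_exit_sketch(void)
{
	destroy_workqueue(nvmet_wq);
}

/* Usage, matching the sed replacements above:
 *   schedule_work(&w)            ->  queue_work(nvmet_wq, &w)
 *   schedule_delayed_work(&d, t) ->  queue_delayed_work(nvmet_wq, &d, t)
 *   flush_scheduled_work()       ->  flush_workqueue(nvmet_wq)
 */
```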
-
- 23 March 2022, 3 commits
-
-
By Monish Kumar R
Add quirks to not fail the initialization and to have quick resume latency after cold/warm reboot.
Signed-off-by: Monish Kumar R <monish.kumar.r@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Xin Hao
Allow reading /sys/module/nvme/parameters/use_threaded_interrupts to see if the use_threaded_interrupts module parameter is in use.
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
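In practice this means giving the parameter read permissions in sysfs; a sketch, with the 0444 mode as the assumed change.

```c
#include <linux/module.h>

/* Sketch: a module parameter only shows up under
 * /sys/module/nvme/parameters/ when it has non-zero permissions.
 * 0444 makes it world-readable but not writable. */
static int use_threaded_interrupts;
module_param(use_threaded_interrupts, int, 0444);	/* was 0 (hidden) */
```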
-
By Pankaj Raghav
Commit 2f4c9ba2 ("nvme: export zoned namespaces without Zone Append support read-only") marks zoned namespaces without Zone Append support read-only. It does so by setting NVME_NS_FORCE_RO in ns->flags in nvme_update_zone_info and checking for that flag later in nvme_update_disk_info to mark the disk as read-only.

But commit 73d90386 ("nvme: cleanup zone information initialization") rearranged nvme_update_disk_info to be called before nvme_update_zone_info, so the disk is no longer marked read-only. The call order cannot simply be reverted because nvme_update_zone_info sets certain queue parameters, such as zone_write_granularity, that depend on the prior call to nvme_update_disk_info.

Remove the call to set_disk_ro in nvme_update_disk_info, and call set_disk_ro after nvme_update_zone_info and nvme_update_disk_info to set the permission for ZNS drives correctly. The same applies to the multipath disk path.

Fixes: 73d90386 ("nvme: cleanup zone information initialization")
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
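A rough sketch of the reordering described above: the read-only decision is taken only after both update calls have run. The exact read-only condition shown is an assumption based on the flags mentioned in the message.

```c
/* Sketch: nvme_update_disk_info() and nvme_update_zone_info() (which may
 * set NVME_NS_FORCE_RO) are assumed to have run just before this point;
 * only then is the read-only state applied. */
static void nvme_set_ns_ro_sketch(struct nvme_ns *ns, struct nvme_id_ns *id)
{
	set_disk_ro(ns->disk, (id->nsattr & NVME_NS_ATTR_RO) ||
			      test_bit(NVME_NS_FORCE_RO, &ns->flags));
}
```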
-
- 16 March 2022, 3 commits
-
-
By Christoph Hellwig
Start warning about exposing a namespace as multiple block devices, and set a fixed deprecation release.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
-
By Christoph Hellwig
Just open code the allocation + initialization in the callers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
-
By Christoph Hellwig
The way the disk name is assigned, and the commentary on why it is done that way, is split over core.c and multipath.c, which is rather confusing. Now that ns_head->disk always exists we can do all the work in core.c and have a single big comment explaining the issues.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
-
- 15 March 2022, 2 commits
-
-
By Hannes Reinecke
Revert commit 626851e9 ("nvmet: make discovery NQN configurable"); the interface was deemed incorrect and will be replaced with a different one.
Fixes: 626851e9 ("nvmet: make discovery NQN configurable")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Christoph Hellwig
nvmet_ns_changed states via lockdep that ns->subsys->lock must be held. The only caller of nvmet_ns_changed which does not acquire that lock is nvmet_ns_revalidate. nvmet_ns_revalidate has three callers, of which two do not acquire the lock: nvmet_execute_identify_cns_cs_ns and nvmet_execute_identify_ns. The other caller, nvmet_ns_revalidate_size_store, does acquire the lock.

Move the call to nvmet_ns_changed from nvmet_ns_revalidate into the callers so that they can perform the correct locking as needed. This issue was found using a static type-based analyser and manually verified.

Reported-by: Niels Dossche <dossche.niels@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
-
- 14 March 2022, 7 commits
-
-
By Chaitanya Kulkarni
Instead of using sprintf, use snprintf with the buffer size limited to PAGE_SIZE, just like the rest of the file.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
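For illustration, a configfs show callback bounded to its one-page buffer; the attribute and value below are hypothetical, only the sprintf-to-snprintf pattern is the point.

```c
#include <linux/configfs.h>

/* Sketch of the pattern: configfs show buffers are one page, so bound
 * the formatted output with snprintf. The attribute shown is made up. */
static ssize_t nvmet_example_attr_show(struct config_item *item, char *page)
{
	int example_value = 1;	/* placeholder for the real attribute */

	return snprintf(page, PAGE_SIZE, "%d\n", example_value);
}
```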
-
By Chaitanya Kulkarni
Don't fold a line that can fit within the 80-character limit. No functional change in this patch.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Chaitanya Kulkarni
This fixes the following kernel-doc warning:

drivers/nvme/target/rdma.c:1722: warning: expecting prototype for nvme_rdma_device_removal(). Prototype was for nvmet_rdma_device_removal() instead

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Chaitanya Kulkarni
This fixes the following kernel-doc warning:

drivers/nvme/target/fc.c:1619: warning: expecting prototype for nvme_fc_unregister_targetport(). Prototype was for nvmet_fc_unregister_targetport() instead

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Chaitanya Kulkarni
This fixes the following kernel-doc warning:

drivers/nvme/target/fc.c:1365: warning: expecting prototype for nvme_fc_register_targetport(). Prototype was for nvmet_fc_register_targetport() instead

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Chris Leech
Put NVMe/TCP sockets in their own class to avoid some lockdep warnings. Sockets created by nvme-tcp are not exposed to user-space, and will not trigger certain code paths that the general socket API exposes.

Lockdep complains about a circular dependency between the socket and filesystem locks, because setsockopt can trigger a page fault with a socket lock held, but nvme-tcp sends requests on the socket while file system locks are held.

======================================================
WARNING: possible circular locking dependency detected
5.15.0-rc3 #1 Not tainted
------------------------------------------------------
fio/1496 is trying to acquire lock:
(sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80

but task is already holding lock:
(&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]

which lock already depends on the new lock.

other info that might help us debug this:

chain exists of:
  sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5

Possible unsafe locking scenario:

      CPU0                    CPU1
      ----                    ----
 lock(&xfs_dir_ilock_class/5);
                              lock(sb_internal);
                              lock(&xfs_dir_ilock_class/5);
 lock(sk_lock-AF_INET);

*** DEADLOCK ***

6 locks held by fio/1496:
 #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20
 #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20
 #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs]
 #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
 #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0
 #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]

This annotation lets lockdep analyze nvme-tcp controlled sockets independently of what the user-space sockets API does.

Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/
Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
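A hedged sketch of how a kernel-internal socket can be given its own lockdep class right after creation; the key and class names are assumptions for illustration.

```c
#include <net/sock.h>

/* Sketch: give nvme-tcp's kernel-internal sockets their own lockdep
 * class so they are not lumped together with user-space AF_INET
 * sockets. Names below are illustrative. */
static struct lock_class_key nvme_tcp_sk_key;
static struct lock_class_key nvme_tcp_slock_key;

static void nvme_tcp_reclassify_socket_sketch(struct socket *sock)
{
	struct sock *sk = sock->sk;

	sock_lock_init_class_and_name(sk,
			"slock-AF_INET-NVMe", &nvme_tcp_slock_key,
			"sk_lock-AF_INET-NVMe", &nvme_tcp_sk_key);
}
```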
-
By Chaitanya Kulkarni
The call to nvme_tcp_alloc_queue() fits on one line without exceeding the 80-character limit, so don't split it.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-