- 14 Oct, 2019 5 commits
-
-
Committed by Keith Busch
Prevent simultaneous controller disabling/enabling tasks from interfering with each other by adding a function that waits until the task has successfully transitioned the controller to the RESETTING state. This ensures that disabling the controller will not be interrupted by another reset path; otherwise, a concurrent reset may leave the controller in the wrong state.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
-
Committed by Keith Busch
A paused controller is doing critical internal activation work in the background. Prevent subsequent controller resets from occurring during this period by setting the controller state to RESETTING first. A helper function, nvme_try_sched_reset_work(), is introduced for these paths so they may continue with scheduling the reset_work after they've completed their uninterruptible critical section.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
-
Committed by Keith Busch
A controller in the resetting state has not yet completed its recovery actions. The pci and fc transports were already handling this, so update the remaining transports to not attempt additional recovery in this state. Instead, just restart the request timer.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
-
Committed by Keith Busch
The admin only state was intended to fence off actions that don't apply to a non-IO capable controller. The only actual user of this is the scan_work, and pci was the only transport to ever set this state. The consequence of having this state is an additional burden on every other action that applies to both live and admin only controllers. Remove the admin only state and place the admin only burden on the only place that actually cares: scan_work. This also makes it easier to temporarily pause a LIVE state, since we no longer need to remember which state the controller had been in prior to the pause.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
-
Committed by Keith Busch
If a controller becomes degraded after a reset, we will not be able to perform any IO. We currently tear down previously created request queues and namespaces, but we had kept the unusable tagset. Free it after all queues using it have been released.
Tested-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
-
- 05 Oct, 2019 2 commits
-
-
Committed by Ard Biesheuvel
Commit 7fd8930f ("nvme: add a common helper to read Identify Controller data") has re-introduced an issue that we have attempted to work around in the past, in commit a310acd7 ("NVMe: use split lo_hi_{read,write}q"). The problem is that some PCIe NVMe controllers do not implement 64-bit outbound accesses correctly, which is why the commit above switched to using lo_hi_[read|write]q for all 64-bit BAR accesses occurring in the code. In the meantime, the NVMe subsystem has been refactored, and now calls into the PCIe support layer for NVMe via a .reg_read64() method, which fails to use lo_hi_readq(), and thus reintroduces the problem that the workaround above aimed to address. Given that, at the moment, .reg_read64() is only used to read the capability register [which is known to tolerate split reads], let's switch .reg_read64() to lo_hi_readq() as well. This fixes a boot issue on some ARM boxes with NVMe behind a Synopsys DesignWare PCIe host controller.
Fixes: 7fd8930f ("nvme: add a common helper to read Identify Controller data")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
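The split-read idea in this message can be illustrated with a minimal userspace sketch. It is not the kernel's lo_hi_readq() implementation: the `sketch_` names are hypothetical, and plain volatile loads stand in for readl() on a mapped BAR. It only shows one 64-bit value being assembled from two 32-bit accesses, low word first.

```c
#include <stdint.h>

/* Hypothetical stand-in for readl(): a single 32-bit register load. */
static inline uint32_t sketch_read32(const volatile uint32_t *addr)
{
	return *addr;
}

/*
 * Read a 64-bit register as two 32-bit accesses, low word first,
 * then combine them. No 64-bit bus access is ever issued.
 */
static inline uint64_t sketch_lo_hi_read64(const volatile void *addr)
{
	const volatile uint32_t *p = addr;
	uint32_t low = sketch_read32(p);       /* bits 31..0  */
	uint32_t high = sketch_read32(p + 1);  /* bits 63..32 */

	return ((uint64_t)high << 32) | low;
}
```

The two halves are not read atomically, which is why the message stresses that the capability register is known to tolerate split reads.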
-
Committed by Sagi Grimberg
nvme_update_formats may fail to revalidate the namespace and attempt to remove the namespace. This may lead to a deadlock, as nvme_ns_remove will attempt to acquire the subsystem lock, which is already held by the passthru command with effects. Move the invalid namespace removal to after the passthru command releases the subsystem lock.
Reported-by: Judy Brock <judy.brock@samsung.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 28 Sep, 2019 1 commit
-
-
Committed by Sagi Grimberg
If the connect times out, we may have already destroyed the queue in the timeout handler, so test if the queue is still allocated in the connect error handler.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 27 Sep, 2019 1 commit
-
-
Committed by Keith Busch
This isn't specific to fabrics.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 26 Sep, 2019 9 commits
-
-
Committed by James Smart
Current controller interrogation requires a lot of guesswork on how many io queues were created and what the io sq size is. The numbers are dependent upon core/fabric defaults, connect arguments, and target responses. Add sysfs attributes for queue_count and sqsize.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Marta Rybczynska
It is not possible to get 64-bit results from the passthru commands, which prevents reading the Capabilities (CAP) property value. As a result, it is not possible to implement IOL's NVMe Conformance test 4.3 Case 1 for Fabrics targets [1] (page 123). This issue has already been discussed [2], but without a solution. This patch solves the problem by adding new ioctls with a new passthru structure, including 64-bit results. The older ioctls stay unchanged.
[1] https://www.iol.unh.edu/sites/default/files/testsuites/nvme/UNH-IOL_NVMe_Conformance_Test_Suite_v11.0.pdf
[2] http://lists.infradead.org/pipermail/linux-nvme/2018-June/018791.html
Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Jian-Hong Pan
Kingston NVMe SSD with firmware version E8FK11.T raises no interrupt after resuming from suspend to idle. This patch applies the NVME_QUIRK_SIMPLE_SUSPEND quirk to fix this issue.
Fixes: d916b1be ("nvme-pci: use host managed power state for suspend")
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204887
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
Now that sgl_free is null safe, drop the superfluous check.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Gabriel Craciunescu
Booting with default_ps_max_latency_us >6000 makes the device fail. Also, SUBNQN is NULL and gives a warning on each boot/resume.
  $ nvme id-ctrl /dev/nvme0 | grep ^subnqn
  subnqn : (null)
I use this device with an Acer Nitro 5 (AN515-43-R8BF) laptop. To be sure it is not only a laptop issue, I tested the device on my server board with the same results (with 2x and 4x links on the board and a 4x link on a PCI-E card).
Signed-off-by: Gabriel Craciunescu <nix.or.die@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Max Gurtovoy
By default, the NVMe/RDMA driver should support a max io_size of 1MiB (or up to the maximum size supported by the HCA). Currently, one will see that /sys/class/block/<bdev>/queue/max_hw_sectors_kb is 1020 instead of 1024. A non power of 2 value can cause performance degradation due to unnecessary splitting of IO requests and unoptimized allocation units. The number of pages per MR has been fixed here, so there is no longer any need to reduce max_sectors by 1.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
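The 1020 versus 1024 figures can be reproduced with a little arithmetic. The sketch below assumes 4 KiB pages and 256 pages per MR; neither number is stated in the message, they are chosen because they are consistent with the values it reports. The helper name is hypothetical.

```c
#include <stdint.h>

#define SKETCH_SECTOR_SHIFT 9   /* 512-byte sectors */
#define SKETCH_PAGE_SHIFT   12  /* assumed 4 KiB pages */

/* Convert a page count into a max_hw_sectors_kb-style value in KiB. */
static inline uint32_t sketch_pages_to_kb(uint32_t pages)
{
	/* pages -> 512-byte sectors */
	uint32_t sectors = pages << (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT);

	/* two sectors per KiB */
	return sectors >> 1;
}
```

Under these assumptions, reducing the page count by one (255 pages) gives 1020 KiB, while the full 256 pages gives the power-of-2 value 1024 KiB.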
-
Committed by Dan Carpenter
"ret" should be a negative error code here, but it's either success or possibly uninitialized.
Fixes: 32fd90c4 ("nvme: change locking for the per-subsystem controller list")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Mario Limonciello
The action of saving the PCI state will cause numerous PCI configuration space reads which, depending upon the vendor implementation, may cause the drive to exit the deepest NVMe state. In these cases ASPM will typically resolve the PCIe link state and APST may resolve the NVMe power state. However, it has also been observed that this register access after the device has been quiesced will cause PC10 failure on some device combinations. To resolve this, move the PCI state saving to before SetFeatures has been called. This has been proven to resolve the issue across a 5000 sample test on previously failing disk/system combinations.
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Wunderlich, Mark
Allow the do/while statement to continue if the current time is not after the proposed time 'deadline'. The intent is to allow the loop to proceed for a specific time period. Currently the loop, as coded, will exit after the first pass.
Signed-off-by: Mark Wunderlich <mark.wunderlich@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
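The loop condition described here can be sketched in userspace as follows. This is an illustration only, not the driver code: `sketch_time_after()` is modeled on the kernel's time_after() comparison, and the simulated clock that advances one tick per pass is an assumption made so the example is deterministic.

```c
#include <stdbool.h>

/* Wraparound-safe "a is later than b", modeled on the kernel's time_after(). */
static inline bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Count how many passes a do/while loop makes when it keeps running
 * until the (simulated) clock has moved past the deadline.
 */
static inline unsigned int sketch_poll_passes(unsigned long now,
					      unsigned long deadline)
{
	unsigned int passes = 0;

	do {
		passes++;  /* one unit of work */
		now++;     /* simulated clock tick */
	} while (!sketch_time_after(now, deadline)); /* continue while not past deadline */

	return passes;
}
```

Dropping the negation in the loop condition would end the loop after a single pass whenever the deadline is still in the future, which is the behavior the message describes as the bug.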
-
- 24 Sep, 2019 2 commits
-
-
Committed by Balbir Singh
User space programs like udevd may try to read partitions at the same time the driver detects a namespace is unusable, and may deadlock if revalidate_disk() is called while such a process is waiting to enter the frozen queue. On detecting a dead namespace, move the disk revalidate after unblocking dispatchers that may be holding bd_mutex.
changelog Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Balbir Singh <sblbir@amzn.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by John Pittman
In nvmet_bdev_set_limits() the number of logical blocks per physical block is calculated, but the opposite is mentioned in the associated comment and reflected in the variable name. Correct the comment and adjust the variable name to reflect the calculation done.
Signed-off-by: John Pittman <jpittman@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
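To make the direction of the ratio concrete, here is a minimal sketch of the calculation the message names. The function name is hypothetical and the sketch assumes a non-zero logical block size; it is not taken from nvmet_bdev_set_limits().

```c
#include <stdint.h>

/* Number of logical blocks contained in one physical block. */
static inline uint32_t sketch_logical_per_physical(uint32_t physical_block_size,
						   uint32_t logical_block_size)
{
	return physical_block_size / logical_block_size;
}
```

For a device with 4096-byte physical blocks and 512-byte logical blocks, this yields 8 logical blocks per physical block, not the reverse.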
-
- 18 Sep, 2019 1 commit
-
-
Committed by Max Gurtovoy
Currently the t10_pi_prepare/t10_pi_complete functions are called during command preparation/completion in the NVMe and SCSI layers, but they belong in the block layer, since T10-PI is a general data integrity feature that is used by block storage protocols. Introduce .prepare_fn and .complete_fn callbacks within the integrity profile that each type can implement according to its needs.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Fixed to not call queue integrity functions if BLK_DEV_INTEGRITY isn't defined in the config.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 12 Sep, 2019 18 commits
-
-
Committed by Amit
When the command data_len cannot hold all the controller errors, we should simply return as many errors as we can fit instead of failing the command.
Signed-off-by: Amit Engel <amit.engel@dell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
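The "return as much as fits" behavior can be sketched generically. The function and parameter names below are hypothetical and do not come from the target code; the sketch assumes a non-zero, fixed entry size.

```c
#include <stddef.h>

/*
 * Number of whole log entries to copy: as many as fit in the caller's
 * buffer, capped by how many entries actually exist.
 */
static inline size_t sketch_entries_to_copy(size_t data_len, size_t entry_size,
					    size_t entries_available)
{
	size_t fit = data_len / entry_size;

	return fit < entries_available ? fit : entries_available;
}
```

A buffer that is too small for every entry then yields a truncated but valid result rather than a command failure.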
-
Committed by Sagi Grimberg
If the controller supports discovery log page change events, we want to enable them. When we see a discovery log change event, we will send it up to userspace and expect it to handle it.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
When we send uevents to userspace, add controller specific environment variables to uniquely identify the controller beyond its device name. This will be useful for handling discovery log change events, by verifying that the discovery controller is indeed the same as the device that generated the event.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
AENs in general are not related to the presence of I/O queues, so enable them regardless. The only exception is that a discovery controller will not support any of the requested AENs; nvme_enable_aen will respect that and return, so it is still safe to enable regardless. It is also safe to enable AENs even before the initial namespace scanning, as the scan operation runs in a workqueue context.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
This modifies the behavior of discovery subsystems to accept a kato, in preparation for supporting discovery log change events. This also means that every discovery controller will now have a default kato value, and for non-persistent connections the host needs to pass in a zero kato value (keep_alive_tmo=0).
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Markus Elfring
Simplify this function implementation by using a known function.
Generated by: scripts/coccinelle/api/ptr_ret.cocci
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
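The message does not name the "known function". The ptr_ret.cocci script is the one that suggests the kernel's PTR_ERR_OR_ZERO() helper, so that is presumably what is meant here; treat that as an assumption. The userspace sketch below mimics the error-pointer convention with hypothetical `sketch_` names, to show what such a helper collapses.

```c
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (top 4095 addresses). */
static inline void *sketch_err_ptr(intptr_t error)
{
	return (void *)error;
}

/* True if the pointer value lies in the reserved error range. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/*
 * Replaces the open-coded pattern:
 *     if (is_err(ptr)) return ptr_err(ptr);
 *     return 0;
 */
static inline int sketch_ptr_err_or_zero(const void *ptr)
{
	return sketch_is_err(ptr) ? (int)(intptr_t)ptr : 0;
}
```

A valid pointer maps to 0 and an encoded error maps back to its negative errno value.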
-
Committed by Israel Rukshin
The cq vector is already assigned the correct value.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Keith Busch
The namespace disk names must be unique for the lifetime of the subsystem. This was accomplished by using their parent subsystems' instances, which were allocated independently from the controllers connected to that subsystem. This allowed name prefixes assigned to namespaces to match a controller from an unrelated subsystem, and has created confusion among users examining device nodes. Ensure a namespace's subsystem instance never clashes with a controller instance of another subsystem by transferring the instance ownership to the parent subsystem from the first controller discovered in that subsystem.
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Colin Ian King
The variable ret is being initialized with a value that is never read and is being re-assigned immediately afterwards. The assignment is redundant and hence can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Edmund Nadolski
nvme_sync_queues currently syncs all namespace queues, but should also sync the admin queue, if present.
Signed-off-by: Edmund Nadolski <edmund.nadolski@intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by James Smart
Current code matches subnqn and collapses all controllers with the same subnqn into a single subsystem structure. This is good for recognizing multiple controllers for the same subsystem. But with the well-known discovery subnqn, the subsystems aren't truly the same subsystem. As such, subsystem specific rules, such as no overlap of controller id, do not apply. With today's behavior, the check for overlap of controller id can fail, preventing the new discovery controller from being created. When searching for a matching subsystem nqn, exclude the discovery nqn from matching. This will result in each discovery controller being attached to a unique subsystem structure.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
If a controller reset is racing with a namespace revalidation, the revalidation (admin) I/O will surely fail, but we should not remove the namespace, as we will execute the I/O when the controller is back up. The same goes for spurious allocation errors (return -ENOMEM). Fix this by checking the specific error code in nvme_revalidate_disk: if it is a transient error (for example a non-DNR nvme status, or a negative ENOMEM from an allocation failure), do not remove the namespace, as it will either recover when the controller is back up and schedules a subsequent scan, or the controller is going away and the namespaces will be removed anyway. This fixes a hang when namespace scanning races with a controller reset, and also spurious I/O errors in path failover conditions where the controller reset races with the namespace scan work with multipath enabled.
Reported-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
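The transient-error test described in this message can be sketched as a small predicate. This is not the body of nvme_revalidate_disk(): the function name is hypothetical, and the value 0x4000 for the "do not retry" bit is an assumption matching the NVMe status layout as exposed by the Linux headers.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NVME_SC_DNR 0x4000 /* assumed "do not retry" status bit */

/*
 * Return true when a revalidation result should NOT lead to namespace
 * removal: an allocation failure, or a positive NVMe status without DNR.
 */
static inline bool sketch_is_transient(int ret)
{
	if (ret == -ENOMEM)
		return true;  /* allocation failure may succeed on a later scan */

	/* positive values are NVMe status codes; DNR clear means retryable */
	return ret > 0 && !(ret & SKETCH_NVME_SC_DNR);
}
```

Success (0), other negative errno values, and NVMe statuses with DNR set all fall through as non-transient in this sketch.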
-
Committed by Sagi Grimberg
Make the callers check the return status and propagate it back accordingly (casting to errno from a positive nvme status). Also print the return status in nvme_report_ns_ids.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
Right now callers of nvme_identify_ns only know that it failed, but don't know why. Make nvme_identify_ns propagate the error back. Because nvme_submit_sync_cmd may return a positive status code, we make nvme_identify_ns receive the id by reference and return that status up the call chain, but make sure not to leak positive nvme status codes to the upper layers.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
No need for the full blown request structure.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by James Smart
NVME_SC_INTERNAL should indicate internal controller errors and not host transport errors. These errors will propagate to upper layers (essentially nvme core) and be interpreted as transport errors, which should not be taken into account for namespace state or condition.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
This is a more appropriate error status for a transport error detected by us (the host).
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
Committed by Sagi Grimberg
NVME_SC_ABORT_REQ means that the request was aborted due to an abort command received. In our case, this is a transport cancellation, so a host pathing error is much more appropriate. Also, convert NVME_SC_HOST_PATH_ERROR to BLK_STS_TRANSPORT so that callers can understand that the status is a transport related error. This will be used by the ns scanning code to understand whether it got an error from the controller or the controller happens to be unreachable by the transport.
Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-
- 30 Aug, 2019 1 commit
-
-
Committed by Israel Rukshin
Remove code duplication.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
-