- 19 12月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
Preparation for polling support for fabrics. Polling support means that our completion queues are not generating any interrupts which means we need to poll for the nvmf io queue connect as well. Reviewed by Steve Wise <swise@opengridcomputing.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 12月, 2018 1 次提交
-
-
由 Hannes Reinecke 提交于
Instead of directly poking into the struct device add a new numa_node field to struct nvme_ctrl. This allows fabrics drivers where ctrl->dev is a virtual device to support NUMA affinity as well. Also expose the field as a sysfs attribute, and populate it for the RDMA and FC transports. Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 11月, 2018 1 次提交
-
-
由 Ewan D. Milne 提交于
__nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which NULLed-out the nvme_req(req)->ctrl field previously set by nvme_fc_init_request(). This apparently was not referenced until commit faf4a44fff ("nvme: support traffic based keep-alive") which now results in a crash in nvme_complete_rq(): [ 8386.897130] RIP: 0010:panic+0x220/0x26c [ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e [ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006 [ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0 [ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8 [ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c [ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 8386.970613] oops_end+0xd1/0xe0 [ 8386.974116] no_context+0x1b2/0x3c0 [ 8386.978006] do_page_fault+0x32/0x140 [ 8386.982090] page_fault+0x1e/0x30 [ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core] [ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000 [ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246 [ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001 [ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280 [ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890 [ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990 [ 8387.058782] ? swiotlb_unmap_page+0x40/0x40 [ 8387.063448] nvme_fc_complete_rq+0x2d/0x70 [nvme_fc] [ 8387.068986] blk_done_softirq+0xa1/0xd0 [ 8387.073264] __do_softirq+0xd6/0x2a9 [ 8387.077251] run_ksoftirqd+0x26/0x40 [ 8387.081238] smpboot_thread_fn+0x10e/0x160 [ 8387.085807] kthread+0xf8/0x130 [ 8387.089309] ? sort_range+0x20/0x20 [ 8387.093198] ? kthread_stop+0x110/0x110 [ 8387.097475] ret_from_fork+0x35/0x40 [ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]--- Fixes: faf4a44fff ("nvme: support traffic based keep-alive") Signed-off-by: NEwan D. Milne <emilne@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 20 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
It's specifically looking for a given request, which we will not be supporting going forward. Also kill the qla2xxx poll implementation as that's the only user of the nvme-fc poll, and the now unused ->poll_queue() hook. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 11月, 2018 1 次提交
-
-
由 James Smart 提交于
If an io error occurs on an io issued while connecting, recovery of the io falls flat as the state checking ends up nooping the error handler. Create an err_work work item that is scheduled upon an io error while connecting. The work thread terminates all io on all queues and marks the queues as not connected. The termination of the io will return back to the callee, which will then back out of the connection attempt and will reschedule, if possible, the connection attempt. The changes: - in case there are several commands hitting the error handler, a state flag is kept so that the error work is only scheduled once, on the first error. The subsequent errors can be ignored. - The calling sequence to stop keep alive and terminate the queues and their io is lifted from the reset routine. Made a small service routine used by both reset and err_work. - During debugging, found that the teardown path can reference an uninitialized pointer, resulting in a NULL pointer oops. The aen_ops weren't initialized yet. Add validation on their initialization before calling the teardown routine. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 09 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
We have this functionality in sbitmap, but we don't export it in blk-mq for users of the tags busy iteration. This can be useful for stopping the iteration, if the caller doesn't need to find more requests. Reviewed-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 11月, 2018 1 次提交
-
-
由 James Smart 提交于
The patch made to avoid Coverity reporting of out of bounds access on aen_op moved the assignment of a pointer, leaving it null when it was subsequently used to calculate a private pointer. Thus the private pointer was bad. Move/correct the private pointer initialization to be in sync with the patch. Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code") Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 10月, 2018 3 次提交
-
-
由 Bart Van Assche 提交于
Instead of setting and then clearing the first_sgl pointer for AEN requests, leave that pointer zero. This patch does not change how requests are initialized but avoids that Coverity reports the following complaint for nvme_fc_init_aen_ops(): CID 1418400 (#1 of 1): Out-of-bounds access (OVERRUN) 4. overrun-buffer-val: Overrunning buffer pointed to by aen_op of 312 bytes by passing it to a function which accesses it at byte offset 312. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Bart Van Assche 提交于
This patch does not change any functionality but makes the intent of the code more clear. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Bart Van Assche 提交于
This patch avoids that the kernel-doc tool complains about several multiple function headers when building with W=1. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 02 10月, 2018 2 次提交
-
-
由 James Smart 提交于
The fc transport device should allow for a rediscovery, as userspace might have lost the events. Example is udev events not handled during system startup. This patch add a sysfs entry 'nvme_discovery' on the fc class to have it replay all udev discovery events for all local port/remote port address pairs. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Milan P. Gandhi 提交于
Signed-off-by: NMilan P. Gandhi <mgandhi@redhat.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 24 7月, 2018 1 次提交
-
-
由 James Smart 提交于
The revised if_ready checks skipped over the case of returning error when the controller is being deleted. Instead it was returning BUSY, which caused the ios to retry, which caused the ns delete to hang waiting for the ios to drain. Stack trace of hang looks like: kworker/u64:2 D 0 74 2 0x80000000 Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core] Call Trace: ? __schedule+0x26d/0x820 schedule+0x32/0x80 blk_mq_freeze_queue_wait+0x36/0x80 ? remove_wait_queue+0x60/0x60 blk_cleanup_queue+0x72/0x160 nvme_ns_remove+0x106/0x140 [nvme_core] nvme_remove_namespaces+0x7e/0xa0 [nvme_core] nvme_delete_ctrl_work+0x4d/0x80 [nvme_core] process_one_work+0x160/0x350 worker_thread+0x1c3/0x3d0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_bind+0x10/0x10 ret_from_fork+0x1f/0x30 Extend nvmf_fail_nonready_command() to supply the controller pointer so that the controller state can be looked at. Fail any io to a controller that is deleting. Fixes: 3bc32bb1 ("nvme-fabrics: refactor queue ready check") Fixes: 35897b92 ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready") Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NEwan D. Milne <emilne@redhat.com> Reviewed-by: NEwan D. Milne <emilne@redhat.com>
-
- 23 7月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
We will need to reference the controller in the setup and completion time for tracing and future traffic based keep alive support. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 21 6月, 2018 1 次提交
-
-
由 James Smart 提交于
Rather than leaving io queues quiesced after tearing down an association, restart them. This allows ios to be replayed, with fastfail ios terminating and non-fastfail getting into loops of retry. This follows rdma's lead. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimber.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 15 6月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Smart <james.smart@broadcom.com>
-
- 14 6月, 2018 3 次提交
-
-
由 James Smart 提交于
The reconnect path is calling the init routines to clear a queue structure. But the queue structure has state that perhaps needs to persist as long as the controller is live. Remove the nvme_fc_init_queue() calls on reconnect. The nvme_fc_free_queue() calls will clear state bits and reset any relevant queue state for a new connection. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
The reinit_request routine is not necessary. Remove support for the op callback. As all that nvme_reinit_tagset() does is itterate and call the reinit routine, it too has no purpose. Remove the call. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Current code follows the framework that has been in the transports from the beginning where initial link-side controller connect occurs as part of "creating the controller". Thus that first connect fully talks to the controller and obtains values that can then be used in for blk-mq setup, etc. It also means that everything about the controller is fully know before the "create controller" call returns. This has several weaknesses: - The initial create_ctrl call made by the cli will block for a long time as wire transactions are performed synchronously. This delay becomes longer if errors occur or connectivity is lost and retries need to be performed. - Code wise, it means there is a separate connect path for initial controller connect vs the (same) steps used in the reconnect path. - And as there's separate paths, it means there's separate error handling and retry logic. It also plays havoc with the NEW state (should transition out of it after successful initial connect) vs the RESETTING and CONNECTING (reconnect) states that want to be transitioned to on error. - As there's separate paths, to recover from errors and disruptions, it requires separate recovery/retry paths as well and can severely convolute the controller state. This patch reworks the fc transport to use the same connect paths for the initial connection as it uses for reconnect. This makes a single path for error recovery and handling. This patch: - Removes the driving of the initial connect and replaces it with a state transition to CONNECTING and initiating the reconnect thread. A dummy state transition of RESETTING had to be traversed as a direct transtion of NEW->CONNECTING is not allowed. Given that the controller is "new", the RESETTING transition is a simple no-op. Once in the reconnecting thread, the normal behaviors of ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will apply before the controller is torn down. - Only if the state transitions couldn't be traversed and the reconnect thread not scheduled, will the controller be torn down while in create_ctrl. - The prior code used the controller state of NEW to indicate whether request queues had been initialized or not. For the admin queue, the request queue is always created, so there's no need to check a state. For IO queues, change to tracking whether a successful io request queue create has occurred (e.g. 1st successful connect). - The initial controller id is initialized to the dynamic controller id used in the initial connect message. It will be overwritten by the real controller id once the controller is connected on the wire. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 31 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
We already check for started commands in all callbacks, but we should also protect against already completed commands. Do this by taking the checks to common code. Acked-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 5月, 2018 1 次提交
-
-
由 James Smart 提交于
Current code will set DNR if the controller is deleting or there is an error during controller init. None of this is necessary. Remove the code that sets DNR Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 27 4月, 2018 1 次提交
-
-
由 Johannes Thumshirn 提交于
Provide a descriptive error in case an lport to rport association isn't found when creating the FC-NVME controller. Currently it's very hard to debug the reason for a failed connect attempt without a look at the source. Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NKeith Busch <keith.busch@intel.com>
-
- 12 4月, 2018 1 次提交
-
-
由 James Smart 提交于
The nvmf_check_if_ready() checks that were added are very simplistic. As such, the routine allows a lot of cases to fail ios during windows of reset or re-connection. In cases where there are not multi-path options present, the error goes back to the callee - the filesystem or application. Not good. The common routine was rewritten and calling syntax slightly expanded so that per-transport is_ready routines don't need to be present. The transports now call the routine directly. The routine is now a fabrics routine rather than an inline function. The routine now looks at controller state to decide the action to take. Some states mandate io failure. Others define the condition where a command can be accepted. When the decision is unclear, a generic queue-or-reject check is made to look for failfast or multipath ios and only fails the io if it is so marked. Otherwise, the io will be queued and wait for the controller state to resolve. Admin commands issued via ioctl share a live admin queue with commands from the transport for controller init. The ioctls could be intermixed with the initialization commands. It's possible for the ioctl cmd to be issued prior to the controller being enabled. To block this, the ioctl admin commands need to be distinguished from admin commands used for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to reflect this division and set it on ioctls requests. As the nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(), ensure that commands allocated by the ioctl path (actually anything in core.c) preps the nvme_req(req) before starting the io. This will preserve the USERCMD flag during execution and/or retry. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.e> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 3月, 2018 5 次提交
-
-
由 James Smart 提交于
When reattaching to a removed remoteport that has not yet been fully deleted as it's waiting for reconnect timeouts, be sure to re-set the ports nport id and role. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
Another abort race: An io request is started, becomes active, and is attempted to be started with the lldd. At the same time the controller is stopped/torndown and an itterator is run to abort the ios. As the io is active, it is added to the outstanding aborted io count. However on the original io request thread, the driver ends up rejecting the io due to the condition that induced the controller teardown. The driver reject path didn't check whether it was in the outstanding io count. This left the count outstanding stopping controller teardown. Correct by, in the driver reject case, setting the state to inactive and checking whether it was in the outstanding io count. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
The current nvme_fc code, when an io times out, will abort the io on the fc link, then call the error recovery routine to reset the controller. It is during the reset of the controller that the transport will wait for all ios to be aborted before sending a Disconnect LS to the target. However, the reset routine only waits for the io which it generates the abort for to complete. Any io that was aborted just prior to the reset isn't in it's list to wait for. Thus the Disconnect is getting sent before the aborts have completed. Correct by removing the abort in the timeout handler. The reset will generate the abort. At that point the timeout handler can be simplified to request the reset (via the error handler) and restart the timeout timer. Also fixes a small typo in a comment in the reset handler. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
If there are errors during initial controller create, the transport will teardown the partially initialized controller struct and free the ctlr memory. Trouble is - most of those errors can occur due to asynchronous events happening such io timeouts and subsystem connectivity failures. Those failures invoke async workq items to reset the controller and attempt reconnect. Those may be in progress as the main thread frees the ctrl memory, resulting in NULL ptr oops. Prevent this from happening by having the main ctrl failure thread changing state to DELETING followed by synchronously cancelling any pending queued work item. The change of state will prevent the scheduling of resets or reconnect events. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Max Gurtovoy 提交于
nvme_delete_ctrl can be called from various contexts in parallel, and cause duplicated information prints, even though the specific context doesn't perform the actual removal. Instead, print the information when the actual removal occurs. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 3月, 2018 1 次提交
-
-
由 James Smart 提交于
Corrected four outstanding issues in the transport around sqsize. 1: Create Connection LS is sending the 1's-based sqsize, should be sending the 0's-based value. 2: allocation of hw queue is using the 0's-base size. It should be using the 1's-based value. 3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based value). It should be MQES (0's-based value). 4: Missing clause to ensure queue_count not larger than ctrl->sqsize. Corrected by: Clean up routines that pass queue size around. The queue size value is the actual count (1's-based) value and determined from ctrl->sqsize + 1. Routines that send 0's-based value adapt from queue size. Sset ctrl->sqsize properly for MQES. Added clause to nsure queue_count not larger than ctrl->sqsize + 1. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKeith Busch <keith.busch@intel.com>
-
- 11 2月, 2018 2 次提交
-
-
由 James Smart 提交于
There was some old cold that dealt with complete_rq being called prior to the lldd returning the io completion. This is garbage code. The complete_rq routine was being called after eh_timeouts were called and it was due to eh_timeouts not being handled properly. The timeouts were fixed in prior patches so that in general, a timeout will initiate an abort and the reset timer restarted as the abort operation will take care of completing things. Given the reset timer restarted, the erroneous complete_rq calls were eliminated. So remove the work that was synchronizing complete_rq with io completion. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 James Smart 提交于
During reset handling, there is live io completing while the reset is taking place. The reset path attempts to abort all outstanding io, counting the number of ios that were reset. It then waits for those ios to be reclaimed from the lldd before continuing. The transport's logic on io state and flag setting was poor, allowing ios to complete simultaneous to the abort request. The completed ios were counted, but as the completion had already occurred, the completion never reduced the count. As the count never zeros, the reset/delete never completes. Tighten it up by unconditionally changing the op state to completed when the io done handler is called. The reset/abort path now changes the op state to aborted, but the abort only continues if the op state was live priviously. If complete, the abort is backed out. Thus proper counting of io aborts and their completions is working again. Also removed the TERMIO state on the op as it's redundant with the op's aborted state. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 09 2月, 2018 1 次提交
-
-
由 Max Gurtovoy 提交于
In pci transport, this state is used to mark the initialization process. This should be also used in other transports as well. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 31 1月, 2018 1 次提交
-
-
由 Ming Lei 提交于
This status is returned from driver to block layer if device related resource is unavailable, but driver can guarantee that IO dispatch will be triggered in future when the resource is available. Convert some drivers to return BLK_STS_DEV_RESOURCE. Also, if driver returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls. BLK_MQ_DELAY_QUEUE is 3 ms because both scsi-mq and nvmefc are using that magic value. If a driver can make sure there is in-flight IO, it is safe to return BLK_STS_DEV_RESOURCE because: 1) If all in-flight IOs complete before examining SCHED_RESTART in blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue is run immediately in this case by blk_mq_dispatch_rq_list(); 2) if there is any in-flight IO after/when examining SCHED_RESTART in blk_mq_dispatch_rq_list(): - if SCHED_RESTART isn't set, queue is run immediately as handled in 1) - otherwise, this request will be dispatched after any in-flight IO is completed via blk_mq_sched_restart() 3) if SCHED_RESTART is set concurently in context because of BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two cases and make sure IO hang can be avoided. One invariant is that queue will be rerun if SCHED_RESTART is set. Suggested-by: NJens Axboe <axboe@kernel.dk> Tested-by: NLaurence Oberman <loberman@redhat.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 18 1月, 2018 2 次提交
-
-
由 James Smart 提交于
When connectivity is lost to a device, the association is terminated and the blk-mq queues are quiesced/stopped. When connectivity is re-established, they are resumed. If connectivity is lost for a sufficient amount of time that the controller is then deleted, the delete path starts tearing down queues, and eventually calling nvme_ns_remove(). It appears that pending commands may cause blk_cleanup_queue() to never complete and the teardown stalls. Correct by starting the ns queues after transitioning to a DELETING state, allowing pending commands to be flushed with io failures. Thus the delete path is clear when reached. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
When connectivity is lost to a device, the association is terminated and the blk-mq queues are quiesced/stopped. When connectivity is re-established, they are resumed. If an admin command is received while connectivity is list, the ioctl queues the command on the admin_q and the command stalls (the thread issuing the ioctl hangs/waits). if the connectivity is lost long enough such that the controller is then deleted, the delete code makes its calls to initiate the delete, which then expects the core layer to call the transport when all references are removed and the controller can be freed. Unfortunately, nothing in this path dequeued the admin command, so a reference sits outstanding and things stop, hanging the delete indefinitely. Correct by unquiescing the admin queue in the delete association. This means any admin command (which should only be from an ioctl) issued after connectivity is lost will detect the controller is in a reconnecting state and will (fast) fail the command. Thus, a pending reference can no longer be created. Once connectivity is re-established, a new ioctl/admin command would see proper device state and function again. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 1月, 2018 1 次提交
-
-
由 Roy Shterman 提交于
NVMe transport driver module unload may (and usually does) trigger iteration over the active controllers and delete them all (sometimes under a mutex). However, a controller can be created concurrently with module unload which can lead to leakage of resources (most important char device node leakage) in case the controller creation occured after the unload delete and drain sequence. To protect against this, we take a module reference to guarantee that the nvme transport driver is not unloaded while creating a controller. Signed-off-by: NRoy Shterman <roys@lightbitslabs.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 15 12月, 2017 1 次提交
-
-
由 James Smart 提交于
There are two put references in the failure case of initial create_association. The first put actually frees the controller, thus the second put references freed memory. Remove the unnecessary 2nd put. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 25 11月, 2017 1 次提交
-
-
由 Jens Axboe 提交于
So far harmless, but it's confusing and a bug waiting to happen if the shifts grow larger than 4. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 11月, 2017 1 次提交
-
-
由 Sagi Grimberg 提交于
In case the queue is not LIVE (fully functional and connected at the nvmf level), we cannot allow any commands other than connect to pass through. Add a new queue state flag NVME_FC_Q_LIVE which is set after nvmf connect and cleared in queue teardown. Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 11 11月, 2017 1 次提交
-
-
由 Keith Busch 提交于
The driver can handle tracking only one AEN request, so this patch removes handling for multiple ones. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-