- 13 5月, 2019 1 次提交
-
-
由 Hannes Reinecke 提交于
When tearing down a controller the following warning is issued: WARNING: CPU: 0 PID: 30681 at ../kernel/workqueue.c:2418 check_flush_dependency This happens as the err_work workqueue item is scheduled on the system workqueue (which has WQ_MEM_RECLAIM not set), but is flushed from a workqueue which has WQ_MEM_RECLAIM set. Fix this by providing an FC-NVMe specific workqueue. Fixes: 4cff280a ("nvme-fc: resolve io failures during connect") Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 13 4月, 2019 1 次提交
-
-
由 Hannes Reinecke 提交于
Currently the FC-NVMe driver is leverating the SCSI FC transport class to access the remote ports. Which means that all FC-NVMe remote ports will be visible to the fc transport layer, but due to missing definitions the port roles will always be 'unknown'. This patch adds the missing definitions to the fc transport class to that the port roles are correctly displayed. Signed-off-by: NHannes Reinecke <hare@suse.de> Reviewed-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NEwan D. Milne <emilne@redhat.com> Reviewed-by: NGiridhar Malavali <gmalavali@marvell.com> Reviewed-by: NHimanshu Madhani <hmadhani@marvell.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 11 4月, 2019 1 次提交
-
-
由 James Smart 提交于
This patch fixes a long-standing bug that initialized the FC-NVME cmnd iu CSN value to 1. Early FC-NVME specs had the connection starting with CSN=1. By the time the spec reached approval, the language had changed to state a connection should start with CSN=0. This patch corrects the initialization value for FC-NVME connections. Additionally, in reviewing the transport, the CSN value is assigned to the new IU early in the start routine. It's possible that a later dma map request may fail, causing the command to never be sent to the controller. Change the location of the assignment so that it is immediately prior to calling the lldd. Add a comment block to explain the impacts if the lldd were to additionally fail sending the command. Signed-off-by: NDick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NEwan D. Milne <emilne@redhat.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 14 3月, 2019 3 次提交
-
-
由 James Smart 提交于
If: - A successful connect has occurred with an io queue count greater than zero and namespaces detected and running. - An error or something occurs which causes a termination of the prior association and then starts a reconnect, - The reconnect then creates a new controller, but for whatever reason, nvme_set_queue_count() results in io queue count set to zero. This will skip io queue and tag set changes. - But... the controller will transition to live, calling nvme_start_ctrl, which calls nvme_start_queues(), which then releases I/Os into the transport which then sends them to the driver. As there are no queues, things eventually hit the driver looking for a handle, which was cleared when the original controller was reset, and it can't proceed. Worst case, things progress, but everything fails. In the failing scenario, the nvme_set_features(NVME_FEAT_NUM_QUEUES) command actually failed with a NVME_SC_INTERNAL error. For some reason, although nvme_set_queue_count() saw the error and set io queue count to zero, it doesn't return a failure status to the transport, which allows the transport to continue using the controller. Fix the problem by simply rejecting the new association if at least 1 I/O queue can't be created. The association reject will fail the reconnect attempt and fall into the reconnect retry policy. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
A recent change added a numa_node field to the nvme controller and has the transport assign the node using dev_to_node(). However, fcloop registers with a NULL device struct, so the dev_to_node() call oops. Revise the assignment to assign no node when device struct is null. Fixes: 103e515e ("nvme: add a numa_node field to struct nvme_ctrl") Reported-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NMike Snitzer <snitzer@redhat.com> [hch: small coding style fixup] Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
For some nvme command, when issued by the nvme core layer, there is an internal buffer which can cause blk_rq_payload_bytes() to return a non-zero value yet there is no actual/real command payload and sg list. An example is the WRITE ZEROES command. To address this, when making choices on whether to dma map an sgl, use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes(). When there is a sgl, blk_rq_payload_bytes() will return the amount of data to be transferred by the sgl. Signed-off-by: NChaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 20 2月, 2019 1 次提交
-
-
由 Christoph Hellwig 提交于
Update license to use SPDX-License-Identifier instead of verbose license text. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
- 19 12月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
Preparation for polling support for fabrics. Polling support means that our completion queues are not generating any interrupts which means we need to poll for the nvmf io queue connect as well. Reviewed by Steve Wise <swise@opengridcomputing.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 08 12月, 2018 1 次提交
-
-
由 Hannes Reinecke 提交于
Instead of directly poking into the struct device add a new numa_node field to struct nvme_ctrl. This allows fabrics drivers where ctrl->dev is a virtual device to support NUMA affinity as well. Also expose the field as a sysfs attribute, and populate it for the RDMA and FC transports. Signed-off-by: NHannes Reinecke <hare@suse.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 28 11月, 2018 1 次提交
-
-
由 Ewan D. Milne 提交于
__nvme_fc_init_request() invokes memset() on the nvme_fcp_op_w_sgl structure, which NULLed-out the nvme_req(req)->ctrl field previously set by nvme_fc_init_request(). This apparently was not referenced until commit faf4a44fff ("nvme: support traffic based keep-alive") which now results in a crash in nvme_complete_rq(): [ 8386.897130] RIP: 0010:panic+0x220/0x26c [ 8386.901406] Code: 83 3d 6f ee 72 01 00 74 05 e8 e8 54 02 00 48 c7 c6 40 fd 5b b4 48 c7 c7 d8 8d c6 b3 31e [ 8386.922359] RSP: 0018:ffff99650019fc40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 8386.930804] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000006 [ 8386.938764] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8e325f8168b0 [ 8386.946725] RBP: ffff99650019fcb0 R08: 0000000000000000 R09: 00000000000004f8 [ 8386.954687] R10: 0000000000000000 R11: ffff99650019f9b8 R12: ffffffffb3c55f3c [ 8386.962648] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001 [ 8386.970613] oops_end+0xd1/0xe0 [ 8386.974116] no_context+0x1b2/0x3c0 [ 8386.978006] do_page_fault+0x32/0x140 [ 8386.982090] page_fault+0x1e/0x30 [ 8386.985786] RIP: 0010:nvme_complete_rq+0x65/0x1d0 [nvme_core] [ 8386.992195] Code: 41 bc 03 00 00 00 74 16 0f 86 c3 00 00 00 66 3d 83 00 41 bc 06 00 00 00 0f 85 e7 00 000 [ 8387.013147] RSP: 0018:ffff99650019fe18 EFLAGS: 00010246 [ 8387.018973] RAX: 0000000000000000 RBX: ffff8e322ae51280 RCX: 0000000000000001 [ 8387.026935] RDX: 0000000000000400 RSI: 0000000000000001 RDI: ffff8e322ae51280 [ 8387.034897] RBP: ffff8e322ae51280 R08: 0000000000000000 R09: ffffffffb2f0b890 [ 8387.042859] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 8387.050821] R13: 0000000000000100 R14: 0000000000000004 R15: ffff8e2b0446d990 [ 8387.058782] ? swiotlb_unmap_page+0x40/0x40 [ 8387.063448] nvme_fc_complete_rq+0x2d/0x70 [nvme_fc] [ 8387.068986] blk_done_softirq+0xa1/0xd0 [ 8387.073264] __do_softirq+0xd6/0x2a9 [ 8387.077251] run_ksoftirqd+0x26/0x40 [ 8387.081238] smpboot_thread_fn+0x10e/0x160 [ 8387.085807] kthread+0xf8/0x130 [ 8387.089309] ? sort_range+0x20/0x20 [ 8387.093198] ? kthread_stop+0x110/0x110 [ 8387.097475] ret_from_fork+0x35/0x40 [ 8387.101462] ---[ end trace 7106b0adf5e422f8 ]--- Fixes: faf4a44fff ("nvme: support traffic based keep-alive") Signed-off-by: NEwan D. Milne <emilne@redhat.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 20 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
It's specifically looking for a given request, which we will not be supporting going forward. Also kill the qla2xxx poll implementation as that's the only user of the nvme-fc poll, and the now unused ->poll_queue() hook. Reviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 15 11月, 2018 1 次提交
-
-
由 James Smart 提交于
If an io error occurs on an io issued while connecting, recovery of the io falls flat as the state checking ends up nooping the error handler. Create an err_work work item that is scheduled upon an io error while connecting. The work thread terminates all io on all queues and marks the queues as not connected. The termination of the io will return back to the callee, which will then back out of the connection attempt and will reschedule, if possible, the connection attempt. The changes: - in case there are several commands hitting the error handler, a state flag is kept so that the error work is only scheduled once, on the first error. The subsequent errors can be ignored. - The calling sequence to stop keep alive and terminate the queues and their io is lifted from the reset routine. Made a small service routine used by both reset and err_work. - During debugging, found that the teardown path can reference an uninitialized pointer, resulting in a NULL pointer oops. The aen_ops weren't initialized yet. Add validation on their initialization before calling the teardown routine. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 09 11月, 2018 1 次提交
-
-
由 Jens Axboe 提交于
We have this functionality in sbitmap, but we don't export it in blk-mq for users of the tags busy iteration. This can be useful for stopping the iteration, if the caller doesn't need to find more requests. Reviewed-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 02 11月, 2018 1 次提交
-
-
由 James Smart 提交于
The patch made to avoid Coverity reporting of out of bounds access on aen_op moved the assignment of a pointer, leaving it null when it was subsequently used to calculate a private pointer. Thus the private pointer was bad. Move/correct the private pointer initialization to be in sync with the patch. Fixes: 0d2bdf9f ("nvme-fc: rework the request initialization code") Signed-off-by: NJames Smart <jsmart2021@gmail.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 17 10月, 2018 3 次提交
-
-
由 Bart Van Assche 提交于
Instead of setting and then clearing the first_sgl pointer for AEN requests, leave that pointer zero. This patch does not change how requests are initialized but avoids that Coverity reports the following complaint for nvme_fc_init_aen_ops(): CID 1418400 (#1 of 1): Out-of-bounds access (OVERRUN) 4. overrun-buffer-val: Overrunning buffer pointed to by aen_op of 312 bytes by passing it to a function which accesses it at byte offset 312. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Bart Van Assche 提交于
This patch does not change any functionality but makes the intent of the code more clear. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Bart Van Assche 提交于
This patch avoids that the kernel-doc tool complains about several multiple function headers when building with W=1. Signed-off-by: NBart Van Assche <bvanassche@acm.org> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 02 10月, 2018 2 次提交
-
-
由 James Smart 提交于
The fc transport device should allow for a rediscovery, as userspace might have lost the events. Example is udev events not handled during system startup. This patch add a sysfs entry 'nvme_discovery' on the fc class to have it replay all udev discovery events for all local port/remote port address pairs. Signed-off-by: NJames Smart <jsmart2021@gmail.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 Milan P. Gandhi 提交于
Signed-off-by: NMilan P. Gandhi <mgandhi@redhat.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 24 7月, 2018 1 次提交
-
-
由 James Smart 提交于
The revised if_ready checks skipped over the case of returning error when the controller is being deleted. Instead it was returning BUSY, which caused the ios to retry, which caused the ns delete to hang waiting for the ios to drain. Stack trace of hang looks like: kworker/u64:2 D 0 74 2 0x80000000 Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core] Call Trace: ? __schedule+0x26d/0x820 schedule+0x32/0x80 blk_mq_freeze_queue_wait+0x36/0x80 ? remove_wait_queue+0x60/0x60 blk_cleanup_queue+0x72/0x160 nvme_ns_remove+0x106/0x140 [nvme_core] nvme_remove_namespaces+0x7e/0xa0 [nvme_core] nvme_delete_ctrl_work+0x4d/0x80 [nvme_core] process_one_work+0x160/0x350 worker_thread+0x1c3/0x3d0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_bind+0x10/0x10 ret_from_fork+0x1f/0x30 Extend nvmf_fail_nonready_command() to supply the controller pointer so that the controller state can be looked at. Fail any io to a controller that is deleting. Fixes: 3bc32bb1 ("nvme-fabrics: refactor queue ready check") Fixes: 35897b92 ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready") Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NEwan D. Milne <emilne@redhat.com> Reviewed-by: NEwan D. Milne <emilne@redhat.com>
-
- 23 7月, 2018 1 次提交
-
-
由 Sagi Grimberg 提交于
We will need to reference the controller in the setup and completion time for tracing and future traffic based keep alive support. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 21 6月, 2018 1 次提交
-
-
由 James Smart 提交于
Rather than leaving io queues quiesced after tearing down an association, restart them. This allows ios to be replayed, with fastfail ios terminating and non-fastfail getting into loops of retry. This follows rdma's lead. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimber.me> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 15 6月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJames Smart <james.smart@broadcom.com>
-
- 14 6月, 2018 3 次提交
-
-
由 James Smart 提交于
The reconnect path is calling the init routines to clear a queue structure. But the queue structure has state that perhaps needs to persist as long as the controller is live. Remove the nvme_fc_init_queue() calls on reconnect. The nvme_fc_free_queue() calls will clear state bits and reset any relevant queue state for a new connection. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
The reinit_request routine is not necessary. Remove support for the op callback. As all that nvme_reinit_tagset() does is itterate and call the reinit routine, it too has no purpose. Remove the call. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
由 James Smart 提交于
Current code follows the framework that has been in the transports from the beginning where initial link-side controller connect occurs as part of "creating the controller". Thus that first connect fully talks to the controller and obtains values that can then be used in for blk-mq setup, etc. It also means that everything about the controller is fully know before the "create controller" call returns. This has several weaknesses: - The initial create_ctrl call made by the cli will block for a long time as wire transactions are performed synchronously. This delay becomes longer if errors occur or connectivity is lost and retries need to be performed. - Code wise, it means there is a separate connect path for initial controller connect vs the (same) steps used in the reconnect path. - And as there's separate paths, it means there's separate error handling and retry logic. It also plays havoc with the NEW state (should transition out of it after successful initial connect) vs the RESETTING and CONNECTING (reconnect) states that want to be transitioned to on error. - As there's separate paths, to recover from errors and disruptions, it requires separate recovery/retry paths as well and can severely convolute the controller state. This patch reworks the fc transport to use the same connect paths for the initial connection as it uses for reconnect. This makes a single path for error recovery and handling. This patch: - Removes the driving of the initial connect and replaces it with a state transition to CONNECTING and initiating the reconnect thread. A dummy state transition of RESETTING had to be traversed as a direct transtion of NEW->CONNECTING is not allowed. Given that the controller is "new", the RESETTING transition is a simple no-op. Once in the reconnecting thread, the normal behaviors of ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will apply before the controller is torn down. - Only if the state transitions couldn't be traversed and the reconnect thread not scheduled, will the controller be torn down while in create_ctrl. - The prior code used the controller state of NEW to indicate whether request queues had been initialized or not. For the admin queue, the request queue is always created, so there's no need to check a state. For IO queues, change to tracking whether a successful io request queue create has occurred (e.g. 1st successful connect). - The initial controller id is initialized to the dynamic controller id used in the initial connect message. It will be overwritten by the real controller id once the controller is connected on the wire. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 31 5月, 2018 1 次提交
-
-
由 Christoph Hellwig 提交于
We already check for started commands in all callbacks, but we should also protect against already completed commands. Do this by taking the checks to common code. Acked-by: NJosef Bacik <josef@toxicpanda.com> Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 25 5月, 2018 1 次提交
-
-
由 James Smart 提交于
Current code will set DNR if the controller is deleting or there is an error during controller init. None of this is necessary. Remove the code that sets DNR Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NChristoph Hellwig <hch@lst.de>
-
- 27 4月, 2018 1 次提交
-
-
由 Johannes Thumshirn 提交于
Provide a descriptive error in case an lport to rport association isn't found when creating the FC-NVME controller. Currently it's very hard to debug the reason for a failed connect attempt without a look at the source. Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de> Reviewed-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NKeith Busch <keith.busch@intel.com>
-
- 12 4月, 2018 1 次提交
-
-
由 James Smart 提交于
The nvmf_check_if_ready() checks that were added are very simplistic. As such, the routine allows a lot of cases to fail ios during windows of reset or re-connection. In cases where there are not multi-path options present, the error goes back to the callee - the filesystem or application. Not good. The common routine was rewritten and calling syntax slightly expanded so that per-transport is_ready routines don't need to be present. The transports now call the routine directly. The routine is now a fabrics routine rather than an inline function. The routine now looks at controller state to decide the action to take. Some states mandate io failure. Others define the condition where a command can be accepted. When the decision is unclear, a generic queue-or-reject check is made to look for failfast or multipath ios and only fails the io if it is so marked. Otherwise, the io will be queued and wait for the controller state to resolve. Admin commands issued via ioctl share a live admin queue with commands from the transport for controller init. The ioctls could be intermixed with the initialization commands. It's possible for the ioctl cmd to be issued prior to the controller being enabled. To block this, the ioctl admin commands need to be distinguished from admin commands used for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to reflect this division and set it on ioctls requests. As the nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(), ensure that commands allocated by the ioctl path (actually anything in core.c) preps the nvme_req(req) before starting the io. This will preserve the USERCMD flag during execution and/or retry. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.e> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 26 3月, 2018 5 次提交
-
-
由 James Smart 提交于
When reattaching to a removed remoteport that has not yet been fully deleted as it's waiting for reconnect timeouts, be sure to re-set the ports nport id and role. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
Another abort race: An io request is started, becomes active, and is attempted to be started with the lldd. At the same time the controller is stopped/torndown and an itterator is run to abort the ios. As the io is active, it is added to the outstanding aborted io count. However on the original io request thread, the driver ends up rejecting the io due to the condition that induced the controller teardown. The driver reject path didn't check whether it was in the outstanding io count. This left the count outstanding stopping controller teardown. Correct by, in the driver reject case, setting the state to inactive and checking whether it was in the outstanding io count. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
The current nvme_fc code, when an io times out, will abort the io on the fc link, then call the error recovery routine to reset the controller. It is during the reset of the controller that the transport will wait for all ios to be aborted before sending a Disconnect LS to the target. However, the reset routine only waits for the io which it generates the abort for to complete. Any io that was aborted just prior to the reset isn't in it's list to wait for. Thus the Disconnect is getting sent before the aborts have completed. Correct by removing the abort in the timeout handler. The reset will generate the abort. At that point the timeout handler can be simplified to request the reset (via the error handler) and restart the timeout timer. Also fixes a small typo in a comment in the reset handler. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 James Smart 提交于
If there are errors during initial controller create, the transport will teardown the partially initialized controller struct and free the ctlr memory. Trouble is - most of those errors can occur due to asynchronous events happening such io timeouts and subsystem connectivity failures. Those failures invoke async workq items to reset the controller and attempt reconnect. Those may be in progress as the main thread frees the ctrl memory, resulting in NULL ptr oops. Prevent this from happening by having the main ctrl failure thread changing state to DELETING followed by synchronously cancelling any pending queued work item. The change of state will prevent the scheduling of resets or reconnect events. Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Max Gurtovoy 提交于
nvme_delete_ctrl can be called from various contexts in parallel, and cause duplicated information prints, even though the specific context doesn't perform the actual removal. Instead, print the information when the actual removal occurs. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 09 3月, 2018 1 次提交
-
-
由 James Smart 提交于
Corrected four outstanding issues in the transport around sqsize. 1: Create Connection LS is sending the 1's-based sqsize, should be sending the 0's-based value. 2: allocation of hw queue is using the 0's-base size. It should be using the 1's-based value. 3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based value). It should be MQES (0's-based value). 4: Missing clause to ensure queue_count not larger than ctrl->sqsize. Corrected by: Clean up routines that pass queue size around. The queue size value is the actual count (1's-based) value and determined from ctrl->sqsize + 1. Routines that send 0's-based value adapt from queue size. Sset ctrl->sqsize properly for MQES. Added clause to nsure queue_count not larger than ctrl->sqsize + 1. Signed-off-by: NJames Smart <james.smart@broadcom.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NKeith Busch <keith.busch@intel.com>
-
- 11 2月, 2018 2 次提交
-
-
由 James Smart 提交于
There was some old cold that dealt with complete_rq being called prior to the lldd returning the io completion. This is garbage code. The complete_rq routine was being called after eh_timeouts were called and it was due to eh_timeouts not being handled properly. The timeouts were fixed in prior patches so that in general, a timeout will initiate an abort and the reset timer restarted as the abort operation will take care of completing things. Given the reset timer restarted, the erroneous complete_rq calls were eliminated. So remove the work that was synchronizing complete_rq with io completion. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
由 James Smart 提交于
During reset handling, there is live io completing while the reset is taking place. The reset path attempts to abort all outstanding io, counting the number of ios that were reset. It then waits for those ios to be reclaimed from the lldd before continuing. The transport's logic on io state and flag setting was poor, allowing ios to complete simultaneous to the abort request. The completed ios were counted, but as the completion had already occurred, the completion never reduced the count. As the count never zeros, the reset/delete never completes. Tighten it up by unconditionally changing the op state to completed when the io done handler is called. The reset/abort path now changes the op state to aborted, but the abort only continues if the op state was live priviously. If complete, the abort is backed out. Thus proper counting of io aborts and their completions is working again. Also removed the TERMIO state on the op as it's redundant with the op's aborted state. Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 09 2月, 2018 1 次提交
-
-
由 Max Gurtovoy 提交于
In pci transport, this state is used to mark the initialization process. This should be also used in other transports as well. Signed-off-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NJames Smart <james.smart@broadcom.com> Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
-
- 31 1月, 2018 1 次提交
-
-
由 Ming Lei 提交于
This status is returned from driver to block layer if device related resource is unavailable, but driver can guarantee that IO dispatch will be triggered in future when the resource is available. Convert some drivers to return BLK_STS_DEV_RESOURCE. Also, if driver returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls. BLK_MQ_DELAY_QUEUE is 3 ms because both scsi-mq and nvmefc are using that magic value. If a driver can make sure there is in-flight IO, it is safe to return BLK_STS_DEV_RESOURCE because: 1) If all in-flight IOs complete before examining SCHED_RESTART in blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue is run immediately in this case by blk_mq_dispatch_rq_list(); 2) if there is any in-flight IO after/when examining SCHED_RESTART in blk_mq_dispatch_rq_list(): - if SCHED_RESTART isn't set, queue is run immediately as handled in 1) - otherwise, this request will be dispatched after any in-flight IO is completed via blk_mq_sched_restart() 3) if SCHED_RESTART is set concurently in context because of BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two cases and make sure IO hang can be avoided. One invariant is that queue will be rerun if SCHED_RESTART is set. Suggested-by: NJens Axboe <axboe@kernel.dk> Tested-by: NLaurence Oberman <loberman@redhat.com> Signed-off-by: NMing Lei <ming.lei@redhat.com> Signed-off-by: NMike Snitzer <snitzer@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-