提交 · 81b1dab45809234958653dca29d04d4161f0a476 · openeuler / Kernel

27 4月, 2018 1 次提交

nvme: fc: provide a descriptive error · 4fb135ad

由 Johannes Thumshirn 提交于 4月 19, 2018

Provide a descriptive error in case an lport to rport association
isn't found when creating the FC-NVME controller.

Currently it's very hard to debug the reason for a failed connect
attempt without a look at the source.
Signed-off-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NJames Smart  <james.smart@broadcom.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>

4fb135ad

12 4月, 2018 1 次提交

nvme: expand nvmf_check_if_ready checks · bb06ec31

由 James Smart 提交于 4月 12, 2018

The nvmf_check_if_ready() checks that were added are very simplistic.
As such, the routine allows a lot of cases to fail ios during windows
of reset or re-connection. In cases where there are not multi-path
options present, the error goes back to the callee - the filesystem
or application. Not good.

The common routine was rewritten and calling syntax slightly expanded
so that per-transport is_ready routines don't need to be present.
The transports now call the routine directly. The routine is now a
fabrics routine rather than an inline function.

The routine now looks at controller state to decide the action to
take. Some states mandate io failure. Others define the condition where
a command can be accepted. When the decision is unclear, a generic
queue-or-reject check is made to look for failfast or multipath ios and
only fails the io if it is so marked. Otherwise, the io will be queued
and wait for the controller state to resolve.

Admin commands issued via ioctl share a live admin queue with commands
from the transport for controller init. The ioctls could be intermixed
with the initialization commands. It's possible for the ioctl cmd to
be issued prior to the controller being enabled. To block this, the
ioctl admin commands need to be distinguished from admin commands used
for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to
reflect this division and set it on ioctls requests. As the
nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(),
ensure that commands allocated by the ioctl path (actually anything
in core.c) preps the nvme_req(req) before starting the io. This will
preserve the USERCMD flag during execution and/or retry.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.e>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

bb06ec31

26 3月, 2018 5 次提交

nvme_fc: on remoteport reuse, set new nport_id and role. · 0cdd5fca

由 James Smart 提交于 3月 05, 2018

When reattaching to a removed remoteport that has not yet been
fully deleted as it's waiting for reconnect timeouts, be sure to
re-set the ports nport id and role.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

0cdd5fca

nvme_fc: fix abort race on teardown with lld reject · b12740d3

由 James Smart 提交于 2月 28, 2018

Another abort race: An io request is started, becomes active,
and is attempted to be started with the lldd. At the same time
the controller is stopped/torndown and an itterator is run to
abort the ios. As the io is active, it is added to the outstanding
aborted io count.  However on the original io request thread, the
driver ends up rejecting the io due to the condition that induced
the controller teardown. The driver reject path didn't check whether
it was in the outstanding io count. This left the count outstanding
stopping controller teardown.

Correct by, in the driver reject case, setting the state to
inactive and checking whether it was in the outstanding io count.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b12740d3

nvme_fc: io timeout should defer abort to ctrl reset · 041018c6

由 James Smart 提交于 3月 12, 2018

The current nvme_fc code, when an io times out, will abort the io
on the fc link, then call the error recovery routine to reset the
controller. It is during the reset of the controller that the
transport will wait for all ios to be aborted before sending a
Disconnect LS to the target.

However, the reset routine only waits for the io which it generates
the abort for to complete. Any io that was aborted just prior to the
reset isn't in it's list to wait for. Thus the Disconnect is getting
sent before the aborts have completed.

Correct by removing the abort in the timeout handler. The reset will
generate the abort. At that point the timeout handler can be simplified
to request the reset (via the error handler) and restart the timeout
timer.

Also fixes a small typo in a comment in the reset handler.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

041018c6

nvme_fc: fix ctrl create failures racing with workq items · cf25809b

由 James Smart 提交于 3月 13, 2018

If there are errors during initial controller create, the transport
will teardown the partially initialized controller struct and free
the ctlr memory.  Trouble is - most of those errors can occur due
to asynchronous events happening such io timeouts and subsystem
connectivity failures. Those failures invoke async workq items to
reset the controller and attempt reconnect.  Those may be in progress
as the main thread frees the ctrl memory, resulting in NULL ptr oops.

Prevent this from happening by having the main ctrl failure thread
changing state to DELETING followed by synchronously cancelling any
pending queued work item. The change of state will prevent the
scheduling of resets or reconnect events.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

cf25809b

nvme: centralize ctrl removal prints · 77d0612d

由 Max Gurtovoy 提交于 3月 11, 2018

nvme_delete_ctrl can be called from various contexts in parallel,
and cause duplicated information prints, even though the specific
context doesn't perform the actual removal. Instead, print the
information when the actual removal occurs.
Signed-off-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

77d0612d

09 3月, 2018 1 次提交

nvme_fc: rework sqsize handling · d157e534

由 James Smart 提交于 3月 07, 2018

Corrected four outstanding issues in the transport around sqsize.

1: Create Connection LS is sending the 1's-based sqsize, should be
sending the 0's-based value.

2: allocation of hw queue is using the 0's-base size. It should be
using the 1's-based value.

3: normalization of ctrl.sqsize by MQES is using MQES+1 (1's-based
value). It should be MQES (0's-based value).

4: Missing clause to ensure queue_count not larger than ctrl->sqsize.

Corrected by:
Clean up routines that pass queue size around. The queue size value is
the actual count (1's-based) value and determined from ctrl->sqsize + 1.

Routines that send 0's-based value adapt from queue size.

Sset ctrl->sqsize properly for MQES.

Added clause to nsure queue_count not larger than ctrl->sqsize + 1.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NKeith Busch <keith.busch@intel.com>

d157e534

11 2月, 2018 2 次提交

nvme_fc: cleanup io completion · c3aedd22

由 James Smart 提交于 2月 06, 2018

There was some old cold that dealt with complete_rq being called
prior to the lldd returning the io completion. This is garbage code.
The complete_rq routine was being called after eh_timeouts were
called and it was due to eh_timeouts not being handled properly.
The timeouts were fixed in prior patches so that in general, a
timeout will initiate an abort and the reset timer restarted as
the abort operation will take care of completing things. Given the
reset timer restarted, the erroneous complete_rq calls were eliminated.

So remove the work that was synchronizing complete_rq with io
completion.
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

c3aedd22

nvme_fc: correct abort race condition on resets · 3efd6e8e

由 James Smart 提交于 2月 06, 2018

During reset handling, there is live io completing while the reset
is taking place. The reset path attempts to abort all outstanding io,
counting the number of ios that were reset. It then waits for those
ios to be reclaimed from the lldd before continuing.

The transport's logic on io state and flag setting was poor, allowing
ios to complete simultaneous to the abort request. The completed ios
were counted, but as the completion had already occurred, the
completion never reduced the count. As the count never zeros, the
reset/delete never completes.

Tighten it up by unconditionally changing the op state to completed
when the io done handler is called.  The reset/abort path now changes
the op state to aborted, but the abort only continues if the op
state was live priviously. If complete, the abort is backed out.
Thus proper counting of io aborts and their completions is working
again.

Also removed the TERMIO state on the op as it's redundant with the
op's aborted state.
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

3efd6e8e

09 2月, 2018 1 次提交

nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING · ad6a0a52

由 Max Gurtovoy 提交于 1月 31, 2018

In pci transport, this state is used to mark the initialization
process. This should be also used in other transports as well.
Signed-off-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>

ad6a0a52

31 1月, 2018 1 次提交

blk-mq: introduce BLK_STS_DEV_RESOURCE · 86ff7c2a

由 Ming Lei 提交于 1月 30, 2018

This status is returned from driver to block layer if device related
resource is unavailable, but driver can guarantee that IO dispatch
will be triggered in future when the resource is available.

Convert some drivers to return BLK_STS_DEV_RESOURCE.  Also, if driver
returns BLK_STS_RESOURCE and SCHED_RESTART is set, rerun queue after
a delay (BLK_MQ_DELAY_QUEUE) to avoid IO stalls.  BLK_MQ_DELAY_QUEUE is
3 ms because both scsi-mq and nvmefc are using that magic value.

If a driver can make sure there is in-flight IO, it is safe to return
BLK_STS_DEV_RESOURCE because:

1) If all in-flight IOs complete before examining SCHED_RESTART in
blk_mq_dispatch_rq_list(), SCHED_RESTART must be cleared, so queue
is run immediately in this case by blk_mq_dispatch_rq_list();

2) if there is any in-flight IO after/when examining SCHED_RESTART
in blk_mq_dispatch_rq_list():
- if SCHED_RESTART isn't set, queue is run immediately as handled in 1)
- otherwise, this request will be dispatched after any in-flight IO is
  completed via blk_mq_sched_restart()

3) if SCHED_RESTART is set concurently in context because of
BLK_STS_RESOURCE, blk_mq_delay_run_hw_queue() will cover the above two
cases and make sure IO hang can be avoided.

One invariant is that queue will be rerun if SCHED_RESTART is set.
Suggested-by: NJens Axboe <axboe@kernel.dk>
Tested-by: NLaurence Oberman <loberman@redhat.com>
Signed-off-by: NMing Lei <ming.lei@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

86ff7c2a

18 1月, 2018 2 次提交

nvme-fc: correct hang in nvme_ns_remove() · 0fd997d3

由 James Smart 提交于 1月 11, 2018

When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If connectivity is lost for a sufficient amount of time that the
controller is then deleted, the delete path starts tearing down queues,
and eventually calling nvme_ns_remove(). It appears that pending
commands may cause blk_cleanup_queue() to never complete and the
teardown stalls.

Correct by starting the ns queues after transitioning to a DELETING
state, allowing pending commands to be flushed with io failures. Thus
the delete path is clear when reached.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0fd997d3

nvme-fc: fix rogue admin cmds stalling teardown · d625d05e

由 James Smart 提交于 1月 11, 2018

When connectivity is lost to a device, the association is terminated
and the blk-mq queues are quiesced/stopped. When connectivity is
re-established, they are resumed.

If an admin command is received while connectivity is list, the ioctl
queues the command on the admin_q and the command stalls (the thread
issuing the ioctl hangs/waits). if the connectivity is lost long
enough such that the controller is then deleted, the delete code
makes its calls to initiate the delete, which then expects the core
layer to call the transport when all references are removed and the
controller can be freed. Unfortunately, nothing in this path dequeued
the admin command, so a reference sits outstanding and things stop,
hanging the delete indefinitely.

Correct by unquiescing the admin queue in the delete association. This
means any admin command (which should only be from an ioctl) issued
after connectivity is lost will detect the controller is in a
reconnecting state and will (fast) fail the command. Thus, a pending
reference can no longer be created. Once connectivity is re-established,
a new ioctl/admin command would see proper device state and function again.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

d625d05e

08 1月, 2018 1 次提交

nvme-fabrics: protect against module unload during create_ctrl · 0de5cd36

由 Roy Shterman 提交于 12月 25, 2017

NVMe transport driver module unload may (and usually does) trigger
iteration over the active controllers and delete them all (sometimes
under a mutex).  However, a controller can be created concurrently with
module unload which can lead to leakage of resources (most important char
device node leakage) in case the controller creation occured after the
unload delete and drain sequence.  To protect against this, we take a
module reference to guarantee that the nvme transport driver is not
unloaded while creating a controller.
Signed-off-by: NRoy Shterman <roys@lightbitslabs.com>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0de5cd36

15 12月, 2017 1 次提交

nvme-fc: remove double put reference if admin connect fails · 4596e752

由 James Smart 提交于 11月 29, 2017

There are two put references in the failure case of initial
create_association. The first put actually frees the controller, thus the
second put references freed memory.

Remove the unnecessary 2nd put.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

4596e752

25 11月, 2017 1 次提交

nvme-fc: don't use bit masks for set/test_bit() numbers · 26c0a26d

由 Jens Axboe 提交于 11月 24, 2017

So far harmless, but it's confusing and a bug waiting to happen if the
shifts grow larger than 4.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

26c0a26d

20 11月, 2017 1 次提交

nvme-fc: check if queue is ready in queue_rq · 9e0ed16a

由 Sagi Grimberg 提交于 10月 24, 2017

In case the queue is not LIVE (fully functional and connected at the nvmf
level), we cannot allow any commands other than connect to pass through.

Add a new queue state flag NVME_FC_Q_LIVE which is set after nvmf connect
and cleared in queue teardown.
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

9e0ed16a

11 11月, 2017 5 次提交

nvme: remove handling of multiple AEN requests · ad22c355

由 Keith Busch 提交于 11月 07, 2017

The driver can handle tracking only one AEN request, so this patch
removes handling for multiple ones.
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJames Smart  <james.smart@broadcom.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

ad22c355

nvme-fc: remove unused "queue_size" field · 08e15075

由 Keith Busch 提交于 11月 07, 2017

This was being saved in a structure, but never used anywhere. The queue
size is obtained through other means, so there's no reason to duplicate
this without a user for it.
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NGuan Junxiong <guanjunxiong@huawei.com>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

08e15075

nvme: centralize AEN defines · 38dabe21

由 Keith Busch 提交于 11月 07, 2017

All the transports were unnecessarilly duplicating the AEN request
accounting. This patch defines everything in one place.
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Reviewed-by: NGuan Junxiong <guanjunxiong@huawei.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

38dabe21

nvme-fc: decouple ns references from lldd references · 158bfb88

由 James Smart 提交于 11月 03, 2017

In the lldd api, a lldd may unregister a remoteport (loss of connectivity
or driver unload) or localport (driver unload). The lldd must wait for the
remoteport_delete or localport_delete before completing its actions post
the unregister. The xxx_deletes currently occur only when the xxxport
structure is fully freed after all references are removed. Thus the lldd
may be held hostage until an app or in-kernel entity that has a namespace
open finally closes so the namespace can be removed, the controller
removed, thus the transport objects, thus the lldd.

This patch decouples the transport and os-facing objects from the lldd
and the remoteport and localport. There is a point in all deletions
where the transport will no longer interact with the lldd on behalf of
a controller. That point centers around the association established
with the target/subsystem. It will access the lldd whenever it attempts
to create an association and while the association is active. New
associations may only be created if the remoteport is live (thus the
localport is live). It will not access the lldd after deleting the
association.

Therefore, the patch tracks the count of active controllers - those with
associations being created or that are active - on a remoteport. It also
tracks the number of remoteports that have active controllers, on a
a localport. When a remoteport is unregistered, as soon as there are no
active controllers, the lldd's remoteport_delete may be called and the
lldd may continue. Similarly, when a localport is unregistered, as soon
as there are no remoteports with active controllers, the localport_delete
callback may be made. This significantly speeds up unregistration with
the lldd.

The transport objects continue in suspended status with reconnect timers
running, and upon expiration, normal ref-counting will occur and the
objects will be freed. The transport object may still be held hostage
by the application/kernel module, but that is acceptable.

With this change, the lldd may be fully unloaded and reloaded, and
if registrations occur prior to the timeouts, the nvme controller and
namespaces will resume normally as if a link bounce.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

158bfb88

nvme-fc: fix localport resume using stale values · c5760f30

由 James Smart 提交于 11月 03, 2017

The localport resume was not updating the lldd ops structure. If the
lldd is unloaded and reloaded, the ops pointers will differ.

Additionally, as there are device references taken by the localport,
ensure that resume only resumes if the device matches as well.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

c5760f30

01 11月, 2017 9 次提交

nvme-fc: add dev_loss_tmo timeout and remoteport resume support · 2b632970

由 James Smart 提交于 10月 25, 2017

When a remoteport is unregistered (connectivity lost), the following
actions are taken:

 - the remoteport is marked DELETED
 - the time when dev_loss_tmo would expire is set in the remoteport
 - all controllers on the remoteport are reset.

After a controller resets, it will stall in a RECONNECTING state waiting
for one of the following:

 - the controller will continue to attempt reconnect per max_retries and
   reconnect_delay.  As no remoteport connectivity, the reconnect attempt
   will immediately fail.  If max reconnects has not been reached, a new
   reconnect_delay timer will be schedule.  If the current time plus
   another reconnect_delay exceeds when dev_loss_tmo expires on the remote
   port, then the reconnect_delay will be shortend to schedule no later
   than when dev_loss_tmo expires.  If max reconnect attempts are reached
   (e.g. ctrl_loss_tmo reached) or dev_loss_tmo ix exceeded without
   connectivity, the controller is deleted.
 - the remoteport is re-registered prior to dev_loss_tmo expiring.
   The resume of the remoteport will immediately attempt to reconnect
   each of its suspended controllers.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NHannes Reinecke <hare@suse.com>
[hch: updated to use nvme_delete_ctrl]
Signed-off-by: NChristoph Hellwig <hch@lst.de>

2b632970

nvme-fc: check connectivity before initiating reconnects · 96e24801

由 James Smart 提交于 10月 25, 2017

Check remoteport connectivity before initiating reconnects
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

96e24801

nvme-fc: add a dev_loss_tmo field to the remoteport · ac7fe82b

由 James Smart 提交于 10月 25, 2017

Add a dev_loss_tmo value, paralleling the SCSI FC transport, for device
connectivity loss.

The transport initializes the value in the nvme_fc_register_remoteport()
call. If the value is not set, a default of 60s is set.

Add a new routine to the api, nvme_fc_set_remoteport_devloss() routine,
which allows the lldd to dynamically update the value on an existing
remoteport.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

ac7fe82b

nvme-fc: change ctlr state assignments during reset/reconnect · 44c6ec77

由 James Smart 提交于 10月 25, 2017

Clean up some of the controller state checks and add the
RESETTING->RECONNECTING state transition.

Specifically:
- the movement of the RESETTING state change and schedule of reset_work
  to core doesn't work wiht nvme_fc_error_recovery setting state to
  RECONNECTING before attempting to reset.  Remove the state change as
  the reset request does it.
- In the rare cases where an error occurs right as we're transitioning
  to LIVE, defer the controller start actions.
- In error handling on teardown of associations while performing initial
  controller creation - avoid quiesce calls on the admin_q.  They are
  unneeded.
- Add the RESETTING->RECONNECTING transition in the reset handler.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

44c6ec77

nvme: flush reset_work before safely continuing with delete operation · 4054637c

由 Sagi Grimberg 提交于 10月 29, 2017

Prevent racing controller reset and delete flows. reset_work must not
ever self-requeue so flushing it suffices.
Reported-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NSagi Grimberg <sagi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

4054637c

nvme: consolidate common code from ->reset_work · 6cd53d14

由 Christoph Hellwig 提交于 10月 29, 2017

No change in behavior except that the FC code cancels two work items a
little later now.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

6cd53d14

nvme: move controller deletion to common code · c5017e85

由 Christoph Hellwig 提交于 10月 29, 2017

Move the ->delete_work and the associated helpers to common code instead
of duplicating them in every driver.  This also adds the missing reference
get/put for the loop driver.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

c5017e85

nvme-fc: merge __nvme_fc_schedule_delete_work into __nvme_fc_del_ctrl · 29c09648

由 Christoph Hellwig 提交于 10月 29, 2017

No need to have two functions doing the same thing.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

29c09648

nvme-fc: avoid workqueue flush stalls · 71c691fd

由 James Smart 提交于 9月 28, 2017

There's no need to wait for the full nvme_wq, which is now shared,
to flush. flush only the delete_work item.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NSagi Grimberg <sgi@grimberg.me>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

71c691fd

27 10月, 2017 3 次提交

nvme-fc: remove NVME_FC_MAX_SEGMENTS · ecad0d2c

由 James Smart 提交于 10月 23, 2017

The define is an arbitrary limit to the io size on the initiator,
capping the io to 1MB-4KB.

Remove the define from the transport. I/O size will solely be limited
by the LLDD sg limits.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

ecad0d2c

nvme-fc: add support for duplicate_connect option · 56d5f4f1

由 James Smart 提交于 10月 20, 2017

Adds support for the duplicate_connect option. When set to true,
checks whether there's an existing controller via the same host port
and target port for the same host (hostnqn, hostid) to the same
subsystem. Fails the connection request if an existing controller.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

56d5f4f1

nvme: switch controller refcounting to use struct device · d22524a4

由 Christoph Hellwig 提交于 10月 18, 2017

Instead of allocating a separate struct device for the character device
handle embedd it into struct nvme_ctrl and use it for the main controller
refcounting.  This removes double refcounting and gets us an automatic
reference for the character device operations.  We keep ctrl->device as a
pointer for now to avoid chaning printks all over, but in the future we
could look into message printing helpers that take a controller structure
similar to what other subsystems do.

Note the delete_ctrl operation always already has a reference (either
through sysfs due this change, or because every open file on the
/dev/nvme-fabrics node has a refernece) when it is entered now, so we
don't need to do the unless_zero variant there.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NHannes Reinecke <hare@suse.com>

d22524a4

20 10月, 2017 2 次提交

nvme-fc: correct io timeout behavior · 134aedc9

由 James Smart 提交于 10月 19, 2017

The transport io timeout behavior wasn't quite correct. It ignored
that the io error handler is supposed to be synchronous so it possibly
allowed the blk request to be restarted while the io associated was
still aborting. Timeouts on reserved commands, those used for
association create, were never timing out thus they hung out forever.

To correct:
If an io is times out while a remoteport is not connected, just
restart the io timer. The lack of connectivity will simultaneously
be resetting the controller, so the reset path will abort and terminate
the io.

If an io is times out while it was marked for transport abort, just
reset the io timer. The abort process is underway and will complete
the io.

Otherwise, if an io times out, abort the io. If the abort was
unsuccessful (unlikely) give up and return not handled.

If the abort was successful, as the abort process is underway it will
terminate the io, so rather than synchronously waiting, just restart
the io timer.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

134aedc9

nvme-fc: correct io termination handling · 0a02e39f

由 James Smart 提交于 10月 19, 2017

The io completion handling for i/o's that are failing due to
to a transport error or association termination had issues, causing
io failures (DNR set so retries didn't kick in) or long stalls.

Change the io completion handler for the following items:

When an io has been completed due to a transport abort (based on an
exchange error) or when marked as aborted as part of an association
termination (FCOP_FLAGS_TERMIO), set the NVME completion status to
NVME_SC_ABORTED. By default, do not set DNR on the status so that a
retry can be attempted after association recreate.

In cases where an io is failed (non-successful nvme status including
aborted), if the controller is being deleted (blk_queue_dying) or
the io was part of the ios used for association creation (ctrl state
is NEW or RECONNECTING), then additionally set the DNR bit so the io
will not be retried. If the failed io was part of association creation,
the failure will tear down the partially completioned association and
typically restart a new reconnect attempt (another create association
later).

Rearranged code flow to remove a largely unneeded local variable.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0a02e39f

19 10月, 2017 3 次提交

nvme-fc: Add BLK_MQ_F_NO_SCHED flag to admin tag set · 5a22e2bf

由 Israel Rukshin 提交于 10月 18, 2017

Since commit b86dd815
"block: get rid of blk-mq default scheduler choice Kconfig entries",
when setting nr_hw_queues to 1 the admin tag set uses mq-deadline scheduler.
This flag is useful for admin queues that aren't used for normal IO.
Signed-off-by: NIsrael Rukshin <israelr@mellanox.com>
Reviewed-by: NMax Gurtovoy <maxg@mellanox.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NJames Smart  <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

5a22e2bf

nvme-fc: retry initial controller connections 3 times · 17c4dc6e

由 James Smart 提交于 10月 09, 2017

Currently, if a frame is lost of command fails as part of initial
association create for a new controller, the new controller connection
request will immediately fail.

Add in an immediate 3 retry loop before giving up.
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

17c4dc6e

nvme-fc: fix iowait hang · 8a82dbf1

由 James Smart 提交于 10月 09, 2017

Add missing iowait head initialization.
Fix irqsave vs irq: wait_event_lock_irq() doesn't do irq save/restore

Fixes: 36715cf4 ("nvme_fc: replace ioabort msleep loop with completion”)
Cc: <stable@vger.kernel.org> # 4.13
Signed-off-by: NJames Smart <james.smart@broadcom.com>
Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
Reviewed-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Tested-by: NHimanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>

8a82dbf1

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功