1. 11 November 2017, 4 commits
    • nvme-fc: remove unused "queue_size" field · 08e15075
      Committed by Keith Busch
      This was being saved in a structure, but never used anywhere. The queue
      size is obtained through other means, so there's no reason to duplicate
      this without a user for it.
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Guan Junxiong <guanjunxiong@huawei.com>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme: centralize AEN defines · 38dabe21
      Committed by Keith Busch
      All the transports were unnecessarily duplicating the AEN request
      accounting. This patch defines everything in one place.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Reviewed-by: Guan Junxiong <guanjunxiong@huawei.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
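The centralization described above boils down to a small set of shared defines; a minimal sketch, assuming illustrative names and the usual admin queue depth of 32 (the actual identifiers live in the nvme core headers):

```c
/* Sketch only: one AEN command is carved out of the admin queue depth,
 * and every transport derives its blk-mq admin queue depth from the
 * same defines instead of keeping private copies. Names illustrative. */
#define NVME_AQ_DEPTH		32
#define NVME_NR_AEN_COMMANDS	1
#define NVME_AQ_BLK_MQ_DEPTH	(NVME_AQ_DEPTH - NVME_NR_AEN_COMMANDS)

/* The AEN request can then use a fixed tag just past the blk-mq range. */
#define NVME_AEN_TAG		NVME_AQ_BLK_MQ_DEPTH
```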
    • nvme-fc: decouple ns references from lldd references · 158bfb88
      Committed by James Smart
      In the lldd api, a lldd may unregister a remoteport (loss of connectivity
      or driver unload) or localport (driver unload). The lldd must wait for the
      remoteport_delete or localport_delete before completing its actions post
      the unregister. The xxx_delete callbacks currently occur only when the
      xxxport structure is fully freed after all references are removed. Thus
      the lldd may be held hostage until an app or in-kernel entity that has a
      namespace open finally closes it, so that the namespace can be removed,
      then the controller, then the transport objects, and finally the lldd.
      
      This patch decouples the transport and os-facing objects from the lldd
      and the remoteport and localport. There is a point in all deletions
      where the transport will no longer interact with the lldd on behalf of
      a controller. That point centers around the association established
      with the target/subsystem. It will access the lldd whenever it attempts
      to create an association and while the association is active. New
      associations may only be created if the remoteport is live (thus the
      localport is live). It will not access the lldd after deleting the
      association.
      
      Therefore, the patch tracks the count of active controllers - those with
      associations being created or that are active - on a remoteport. It also
      tracks the number of remoteports that have active controllers on a
      localport. When a remoteport is unregistered, as soon as there are no
      active controllers, the lldd's remoteport_delete may be called and the
      lldd may continue. Similarly, when a localport is unregistered, as soon
      as there are no remoteports with active controllers, the localport_delete
      callback may be made. This significantly speeds up unregistration with
      the lldd.
      
      The transport objects continue in suspended status with reconnect timers
      running, and upon expiration, normal ref-counting will occur and the
      objects will be freed. The transport object may still be held hostage
      by the application/kernel module, but that is acceptable.
      
      With this change, the lldd may be fully unloaded and reloaded, and
      if registrations occur prior to the timeouts, the nvme controller and
      namespaces will resume normally, as if a link bounce had occurred.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
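The counting scheme described above can be sketched in a few lines. This is an illustrative model, not the kernel code; the structure, field, and function names are invented, and the real remoteport_delete callback is modeled as a flag:

```c
#include <stdbool.h>

/* Illustrative model of the active-controller accounting: the lldd's
 * remoteport_delete callback (a flag here) fires as soon as the last
 * active association is gone, not when the last reference drops. */
struct model_rport {
	int act_ctrl_cnt;	/* controllers creating or holding associations */
	bool unregistering;
	bool lldd_released;	/* stands in for calling remoteport_delete */
};

static void rport_ctrl_activate(struct model_rport *rp)
{
	rp->act_ctrl_cnt++;
}

static void rport_ctrl_deactivate(struct model_rport *rp)
{
	if (--rp->act_ctrl_cnt == 0 && rp->unregistering)
		rp->lldd_released = true;	/* lldd may continue its unload */
}

static void rport_unregister(struct model_rport *rp)
{
	rp->unregistering = true;
	if (rp->act_ctrl_cnt == 0)
		rp->lldd_released = true;	/* nothing active: release now */
}
```

Namespace references still pin the transport objects, but (as the commit notes) the lldd no longer waits on them.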
    • nvme-fc: fix localport resume using stale values · c5760f30
      Committed by James Smart
      The localport resume was not updating the lldd ops structure. If the
      lldd is unloaded and reloaded, the ops pointers will differ.
      
      Additionally, as there are device references taken by the localport,
      ensure that resume only resumes if the device matches as well.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 01 November 2017, 9 commits
  3. 27 October 2017, 3 commits
  4. 20 October 2017, 2 commits
    • nvme-fc: correct io timeout behavior · 134aedc9
      Committed by James Smart
      The transport io timeout behavior wasn't quite correct. It ignored
      that the io error handler is supposed to be synchronous, so it possibly
      allowed the blk request to be restarted while the associated io was
      still aborting. Reserved commands, those used for association create,
      never timed out and thus hung forever.
      
      To correct:
      If an io times out while a remoteport is not connected, just
      restart the io timer. The lack of connectivity will simultaneously
      be resetting the controller, so the reset path will abort and terminate
      the io.
      
      If an io times out while it is marked for transport abort, just
      reset the io timer. The abort process is underway and will complete
      the io.
      
      Otherwise, if an io times out, abort the io. If the abort is
      unsuccessful (unlikely), give up and return not handled.
      
      If the abort was successful, as the abort process is underway it will
      terminate the io, so rather than synchronously waiting, just restart
      the io timer.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
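The cases above reduce to a small decision function. A hedged sketch with invented names: the caller is assumed to perform the actual abort and report whether it was started, and the enum mirrors the block layer's timeout verdicts:

```c
#include <stdbool.h>

/* Illustrative only: the timeout policy described in the commit text.
 * EH_RESET_TIMER / EH_NOT_HANDLED echo the block layer's verdicts. */
enum eh_verdict { EH_RESET_TIMER, EH_NOT_HANDLED };

static enum eh_verdict fc_io_timeout(bool rport_connected,
				     bool marked_for_abort,
				     bool abort_started)
{
	if (!rport_connected)
		return EH_RESET_TIMER;	/* controller reset will reap the io */
	if (marked_for_abort)
		return EH_RESET_TIMER;	/* abort already underway */
	if (!abort_started)
		return EH_NOT_HANDLED;	/* abort failed (unlikely): give up */
	return EH_RESET_TIMER;		/* abort in flight will complete the io */
}
```

The common thread is that the handler never waits synchronously; whenever completion is guaranteed by another path, it just restarts the timer.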
    • nvme-fc: correct io termination handling · 0a02e39f
      Committed by James Smart
      The io completion handling for i/o's that are failing due to
      a transport error or association termination had issues, causing
      io failures (DNR was set, so retries didn't kick in) or long stalls.
      
      Change the io completion handler for the following items:
      
      When an io has been completed due to a transport abort (based on an
      exchange error) or when marked as aborted as part of an association
      termination (FCOP_FLAGS_TERMIO), set the NVME completion status to
      NVME_SC_ABORTED. By default, do not set DNR on the status so that a
      retry can be attempted after association recreate.
      
      In cases where an io is failed (non-successful nvme status including
      aborted), if the controller is being deleted (blk_queue_dying) or
      the io was part of the ios used for association creation (ctrl state
      is NEW or RECONNECTING), then additionally set the DNR bit so the io
      will not be retried. If the failed io was part of association creation,
      the failure will tear down the partially completed association and
      typically restart a new reconnect attempt (another create association
      later).
      
      Rearranged code flow to remove a largely unneeded local variable.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
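The DNR rule above (a failed io is normally left retryable, with DNR added only when the controller is dying or the io belonged to association creation) can be sketched as follows; the helper name is invented, and NVME_SC_DNR is the Do Not Retry bit as carried in the kernel's status encoding:

```c
#include <stdbool.h>

/* Sketch of the completion-status policy; helper name invented.
 * NVME_SC_DNR is the Do Not Retry bit as encoded in the kernel's
 * nvme status (see include/linux/nvme.h). */
#define NVME_SC_DNR	0x4000

static unsigned int fc_final_status(unsigned int nvme_status,
				    bool ctrl_dying, bool assoc_create_io)
{
	unsigned int status = nvme_status;

	/* A failed io is normally left retryable so it can be re-issued
	 * after the association is recreated... */
	if (status && (ctrl_dying || assoc_create_io))
		status |= NVME_SC_DNR;	/* ...but not in these two cases. */
	return status;
}
```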
  5. 19 October 2017, 2 commits
  6. 05 October 2017, 1 commit
  7. 04 October 2017, 2 commits
    • nvme-fc: create fc class and transport device · 5f568556
      Committed by James Smart
      Added a new fc class and a device node for udev events under it.  I
      expect the fc class will eventually be the location where the FC SCSI and
      FC NVME merge in the future. Therefore names are kept somewhat generic.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-fc: add uevent for auto-connect · eaefd5ab
      Committed by James Smart
      To support auto-connecting to FC-NVME devices upon their dynamic
      appearance, add a uevent that can kick off connection scripts. The
      uevent is posted against the fc_udev device.
      
      The patch set was tested with the following rule, which kicks off an
      nvme-cli connect-all for the FC initiator and FC target ports. This is
      just an example for testing and not intended for real-life use.
      
      ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
          ENV{NVMEFC_HOST_TRADDR}=="*", ENV{NVMEFC_TRADDR}=="*", \
          RUN+="/bin/sh -c '/usr/local/sbin/nvme connect-all --transport=fc --host-traddr=$env{NVMEFC_HOST_TRADDR} --traddr=$env{NVMEFC_TRADDR} >> /tmp/nvme_fc.log'"
      
      I will post proposed udev/systemd scripts for possible kernel support.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  8. 25 September 2017, 2 commits
  9. 29 August 2017, 2 commits
    • nvme-fc: Reattach to localports on re-registration · 5533d424
      Committed by James Smart
      If the LLDD resets or detaches from an fc port, the LLDD will
      deregister all remoteports seen by the fc port and deregister the
      localport associated with the fc port. The teardown of the localport
      structure will be held off due to reference counting until all the
      remoteports are removed (and they are held off until all
      controllers/associations are terminated). Currently, if the fc port
      is reinit/reattached and registered again as a localport, it is
      treated as an independent entity from the prior localport and all
      prior remoteports and controllers cannot be revived. They are
      created as new and separate entities.
      
      This patch changes the localport registration to look at the known
      localports that are waiting to be torn down. If they are the same port
      based on wwn's, the localport is transitioned out of the teardown
      state.  This allows the remote ports and controller connections to
      be reestablished and resumed as long as the localport can also be
      reregistered within the timeout windows.
      
      The patch adds a new routine, nvme_fc_attach_to_unreg_lport(), with
      this functionality, and moves the lport get/put routines to avoid
      forward references.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
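A hedged sketch of the matching step described above. The structure, field, and function names are invented, and the kernel's nvme_fc_attach_to_unreg_lport() also handles locking and reference counts that this model omits:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model: on registration, scan localports pending teardown
 * and revive one whose wwnn/wwpn pair matches, instead of allocating a
 * fresh, unrelated localport. */
struct model_lport {
	uint64_t node_name;	/* wwnn */
	uint64_t port_name;	/* wwpn */
	bool tearing_down;
};

static struct model_lport *attach_to_unreg_lport(struct model_lport *ports,
						 size_t n,
						 uint64_t wwnn, uint64_t wwpn)
{
	for (size_t i = 0; i < n; i++) {
		if (ports[i].tearing_down &&
		    ports[i].node_name == wwnn &&
		    ports[i].port_name == wwpn) {
			ports[i].tearing_down = false;	/* leave teardown state */
			return &ports[i];
		}
	}
	return NULL;	/* no match: caller allocates a brand-new localport */
}
```

Because the match is by wwn's, a reloaded driver presenting the same fc port resumes the prior remoteports and controller connections rather than starting over.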
    • nvme: Add admin_tagset pointer to nvme_ctrl · 34b6c231
      Committed by Sagi Grimberg
      Will be used when we centralize control flows.
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  10. 18 August 2017, 1 commit
  11. 26 July 2017, 1 commit
    • nvme-fc: revise TRADDR parsing · 9c5358e1
      Committed by James Smart
      The FC-NVME spec hasn't locked down the format string for TRADDR.
      Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>",
      where the wwn's are hex values not prefixed by 0x.
      
      Most implementations so far expect a string format of
      "nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport
      uses the match_u64 parser which requires a leading 0x prefix to set
      the base properly. If it's not there, a match will either fail or return
      a base 10 value.
      
      The resolution in T11 is being pushed out. Therefore, to fix things now
      and to cover any eventuality and any implementations already in the
      field, this patch adds support for both formats.
      
      The change consists of replacing the token matching routine with a
      routine that validates the fixed string format, and then builds
      a local copy of the hex name with a 0x prefix before calling
      the system parser.
      
      Note: the same parser routine exists in both the initiator and target
      transports. Given this is about the only "shared" item, we chose to
      replicate it rather than create an interdependency on some shared code.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
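The fix described above (validate the fixed string layout, then hand the numeric parser a local copy with a guaranteed 0x prefix so the base is detected correctly) can be sketched roughly as follows; the helper name is invented, and strtoull's base-0 mode stands in for the kernel's match_u64:

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: accept a 16-hex-digit wwn with or without a 0x prefix,
 * rebuild it locally with the prefix, then parse with base 0 so the
 * "0x" forces hexadecimal instead of a bogus base-10 result. */
static int wwn_to_u64(const char *tok, unsigned long long *val)
{
	char buf[2 + 16 + 1];	/* "0x" + 16 hex digits + NUL */

	if (tok[0] == '0' && (tok[1] == 'x' || tok[1] == 'X'))
		tok += 2;	/* strip an existing prefix */
	if (strlen(tok) != 16)
		return -1;	/* wwn's are fixed at 16 hex digits */
	for (int i = 0; i < 16; i++)
		if (!isxdigit((unsigned char)tok[i]))
			return -1;
	snprintf(buf, sizeof(buf), "0x%s", tok);
	*val = strtoull(buf, NULL, 0);	/* base 0: honors the 0x prefix */
	return 0;
}
```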
  12. 25 July 2017, 1 commit
    • nvme-fc: address target disconnect race conditions in fcp io submit · 8b25f351
      Committed by James Smart
      There are cases where threads are in the process of submitting new
      io when the LLDD calls in to remove the remote port. In some cases,
      the next io actually goes to the LLDD, which knows the remoteport isn't
      present and rejects it. To properly recover/restart these i/o's, we
      don't want to hard-fail them; we want to treat them as temporary
      resource errors for which a delayed retry will work.
      
      Add a couple more checks on remoteport connectivity and commonize the
      busy response handling when it's seen.
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
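A hedged sketch of the commonized busy handling; the enum and helper are invented stand-ins for the block layer's queue-busy verdict, which requeues the request for a delayed retry rather than failing it:

```c
#include <stdbool.h>

/* Illustrative only: both the pre-submit connectivity check and the
 * lldd's rejection funnel into the same "busy, retry later" outcome. */
enum submit_verdict { SUBMIT_OK, SUBMIT_BUSY };

static enum submit_verdict fc_queue_rq(bool rport_connected,
				       bool lldd_accepted)
{
	if (!rport_connected)
		return SUBMIT_BUSY;	/* checked before touching the lldd */
	if (!lldd_accepted)
		return SUBMIT_BUSY;	/* lldd saw the port vanish mid-submit */
	return SUBMIT_OK;
}
```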
  13. 06 July 2017, 2 commits
  14. 04 July 2017, 3 commits
  15. 02 July 2017, 2 commits
  16. 28 June 2017, 3 commits