1. 25 Aug 2021, 3 commits
  2. 19 Jul 2021, 1 commit
    • scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completes · 06145683
      Committed by James Smart
      On an RSCN event, the nodes specified in the RSCN payload that are in
      MAPPED state are moved to NPR state in order to revalidate the login.
      This triggers an immediate unregister from the SCSI/NVMe backend, on the
      assumption that the node may be missing. Re-registration with the backend
      happens either after relogin (PLOGI/PRLI; if ADISC is disabled or the
      login is truly lost) or when ADISC completes successfully (rediscovery
      with ADISC enabled).
      
      However, the NVMe-FC standard provides for an RSCN to be triggered when
      the remote port supports a discovery controller and there has been a
      change in discovery log content. As the remote port typically also
      supports storage subsystems, this unregister causes all storage
      controller connections to fail and forces them to reconnect.
      
      Correct by reworking the code to ensure that the unregistration only occurs
      when a login state is truly terminated, thereby leaving the NVMe storage
      controllers in place.
      
      The changes made are:
      
       - Retain node state in ADISC_ISSUE when scheduling ADISC ELS retry.
      
       - Do not clear wwpn/wwnn values upon ADISC failure.
      
       - Move MAPPED nodes to NPR during RSCN processing, but do not unregister
         with the transport.  On GIDFT completion, identify missing nodes
         (those not marked NLP_NPR_2B_DISC) and unregister them (see the sketch
         after this list).
      
       - For nodes that go through ADISC processing, perform the unregistration
         only if the ADISC completion fails.
      
       - Successful ADISC completion will move node back to MAPPED state.
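
      The GIDFT-completion pass can be pictured with the following minimal
      sketch (standalone C with hypothetical type and flag names, not the
      actual lpfc code): nodes still in NPR that the name-server response did
      not mark for rediscovery are treated as missing and only then
      unregistered from the transports.

          #include <stdbool.h>

          enum node_state { NODE_MAPPED, NODE_NPR };
          #define FLAG_NPR_2B_DISC 0x1   /* set while GIDFT is processing the node */

          struct node {
                  enum node_state state;
                  unsigned int flags;
                  struct node *next;
          };

          /* Placeholder for the SCSI/NVMe transport unregister calls. */
          static void unregister_from_transports(struct node *n) { (void)n; }

          /* Called from GIDFT completion: reap nodes the fabric no longer reports. */
          static void reap_missing_nodes(struct node *list)
          {
                  for (struct node *n = list; n; n = n->next) {
                          if (n->state == NODE_NPR && !(n->flags & FLAG_NPR_2B_DISC))
                                  unregister_from_transports(n);  /* truly missing */
                  }
          }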
      
      Link: https://lore.kernel.org/r/20210707184351.67872-16-jsmart2021@gmail.com
      Co-developed-by: Justin Tee <justin.tee@broadcom.com>
      Signed-off-by: Justin Tee <justin.tee@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 10 Jun 2021, 1 commit
  4. 22 May 2021, 1 commit
  5. 13 Apr 2021, 2 commits
  6. 05 Mar 2021, 2 commits
  7. 08 Jan 2021, 2 commits
    • scsi: lpfc: Implement health checking when aborting I/O · a22d73b6
      Committed by James Smart
      Several errors have occurred where the adapter stops or fails but does
      not raise the register values that would let the driver detect the
      failure, so the driver is unaware of it. The failure typically results in
      I/O timeouts, the I/O timeout handler failing (after several seconds),
      and the error handler escalating the recovery policy and producing more
      errors. Eventually the driver reaches a point where things have spiraled:
      it cannot perform recovery because other recovery operations are still
      outstanding, and the adapter becomes unusable.
      
      Resolve the situation by having the I/O timeout handler (actually an ELS,
      SCSI I/O, NVMe LS, or NVMe I/O timeout), in addition to aborting the I/O,
      issue a mailbox command and look for a response from the hardware.  If
      the mailbox command fails, mark the adapter offline and then invoke the
      adapter reset handler to clean up.
      
      The new I/O timeout test is limited to one check every 5s. If multiple
      I/O timeouts occur concurrently, only the first generates the mailbox
      command. Further checks are issued only for a timeout that occurs more
      than 5s after the last mailbox command.
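
      A minimal sketch of the rate-limiting idea (standalone, single-threaded C
      with assumed names; the real driver would use jiffies and locking):

          #include <stdbool.h>
          #include <time.h>

          #define HEALTH_CHECK_INTERVAL 5         /* seconds between checks */

          static time_t last_health_check;        /* 0 = never checked */

          /* Called from each I/O timeout handler; returns true only if this
           * caller should issue the health-check mailbox command. */
          static bool health_check_due(void)
          {
                  time_t now = time(NULL);

                  if (last_health_check &&
                      now - last_health_check < HEALTH_CHECK_INTERVAL)
                          return false;   /* a recent timeout already checked */

                  last_health_check = now;
                  return true;
          }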
      
      Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com
      Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix NVMe recovery after mailbox timeout · 9ec58ec7
      Committed by James Smart
      If a mailbox command times out, the SLI port is deemed in error and the
      port is reset.  The HBA cleanup was not returning I/Os to the NVMe layer
      before the port was unregistered. This is because the HBA is marked
      offline (!SLI_ACTIVE) and cleanup is done by the mailbox timeout handler
      rather than a general adapter reset routine, and the mailbox timeout
      handler only cleaned up SCSI I/Os.
      
      Fix by reworking the mailbox handler to:
      
       - After handling the mailbox error, detect whether the board is already
         in failure (possibly due to another error) and, if so, leave cleanup
         to the other handler.
      
       - If the mailbox command timeout is the initial detector of the port
         error, continue with the board cleanup and mark the adapter offline
         (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine; the generic
         adapter reset routine that is subsequently invoked will clean up the
         I/Os.
      
       - Have the reset adapter routine flush all NVMe and SCSI I/Os if the
         adapter has been marked failed (!SLI_ACTIVE).
      
       - Rework the NVMe I/O terminate routine to take a status code with which
         to fail the I/O, and update it so that cleaned-up I/O calls the WQE
         completion routine. Currently it bypasses the WQE cleanup and calls
         the NVMe I/O completion directly. The WQE completion routine takes
         care of data structure and node cleanup and then calls the NVMe I/O
         completion handler (see the sketch after this list).
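
      A minimal sketch of the terminate-path change (hypothetical types and
      names, not the lpfc implementation): the terminate routine accepts the
      failure status and always funnels the I/O through the common WQE
      completion routine, which does the resource/node cleanup before the NVMe
      completion runs.

          struct io_ctx {
                  int  status;
                  void (*wqe_done)(struct io_ctx *io);   /* common WQE completion */
                  void (*nvme_done)(struct io_ctx *io);  /* upper-layer completion */
          };

          static void wqe_complete(struct io_ctx *io)
          {
                  /* data-structure and node cleanup would happen here ... */
                  io->nvme_done(io);              /* then notify the NVMe layer */
          }

          /* Terminate path: record the failure status, then reuse the normal
           * completion path instead of calling the NVMe completion directly. */
          static void terminate_io(struct io_ctx *io, int status)
          {
                  io->status = status;
                  io->wqe_done(io);               /* e.g. points at wqe_complete */
          }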
      
      Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com
      Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  8. 17 Nov 2020, 5 commits
  9. 03 Jul 2020, 1 commit
    • scsi: lpfc: Add an internal trace log buffer · 372c187b
      Committed by Dick Kennedy
      The current logging methods typically end up requesting a reproduction
      with a different logging level set in order to figure out what happened.
      This was largely by design, to avoid cluttering the kernel log with
      messages that were usually not interesting and that could themselves
      cause other issues.
      
      In looking to make a better system, it was seen that in many cases the
      point at which more data was wanted was when another message, usually at
      KERN_ERR level, was logged, and the additional logging that was then
      enabled typically covered the same areas. Most of these areas fell within
      the discovery state machine.
      
      Based on this summary, the following design has been put in place: The
      driver will maintain an internal log (256 elements of 256 bytes).  The
      "additional logging" messages that are usually enabled in a reproduction
      will be changed to now log all the time to the internal log.  A new logging
      level is defined - LOG_TRACE_EVENT.  When this level is set (it is not by
      default) and a message marked as KERN_ERR is logged, all the messages in
      the internal log will be dumped to the kernel log before the KERN_ERR
      message is logged.
      
      There is a timestamp on each message added to the internal log. However,
      this timestamp is not converted to wall time when logged. The value of the
      timestamp is solely to give a crude time reference for the messages.
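
      A minimal sketch of the mechanism (standalone C with the sizes taken from
      the description above; not the lpfc implementation): a fixed ring of 256
      entries of 256 bytes records the "additional logging" messages all the
      time, and the whole ring is dumped ahead of an error-level message when
      the LOG_TRACE_EVENT-style switch is set.

          #include <stdarg.h>
          #include <stdio.h>
          #include <time.h>

          #define TRACE_ENTRIES   256
          #define TRACE_ENTRY_LEN 256

          struct trace_entry {
                  unsigned long long ts;          /* crude timestamp, not wall time */
                  char msg[TRACE_ENTRY_LEN];
          };

          static struct trace_entry trace_buf[TRACE_ENTRIES];
          static unsigned int trace_idx;          /* next slot, wraps around */
          static int log_trace_event;             /* the new logging level switch */

          static void trace_log(const char *fmt, ...)
          {
                  struct trace_entry *e = &trace_buf[trace_idx++ % TRACE_ENTRIES];
                  va_list ap;

                  e->ts = (unsigned long long)clock();
                  va_start(ap, fmt);
                  vsnprintf(e->msg, sizeof(e->msg), fmt, ap);
                  va_end(ap);
          }

          static void log_err(const char *msg)
          {
                  if (log_trace_event) {          /* dump history before the error */
                          for (unsigned int i = 0; i < TRACE_ENTRIES; i++) {
                                  struct trace_entry *e =
                                          &trace_buf[(trace_idx + i) % TRACE_ENTRIES];
                                  if (e->msg[0])
                                          printf("[%llu] %s\n", e->ts, e->msg);
                          }
                  }
                  printf("ERROR: %s\n", msg);
          }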
      
      Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 10 May 2020, 3 commits
    • lpfc: nvmet: Add support for NVME LS request hosthandle · 4c2805aa
      Committed by James Smart
      As the nvmet layer does not have the concept of a remoteport object that
      can be used to identify the entity on the other end of the fabric that is
      to receive an LS, the hosthandle was introduced.  The driver passes the
      hosthandle, a value representative of the remote port, along with each LS
      request it receives; the LS request will create the association.  The
      transport will remember the hosthandle for the association and, if it
      needs to initiate an LS request to the remote port for the association,
      the hosthandle will be used. When the driver loses connectivity with the
      remote port, it needs to notify the transport that the hosthandle is no
      longer valid, allowing the transport to terminate associations related to
      the hosthandle.
      
      This patch adds support to the driver for the hosthandle. The driver will
      use the ndlp pointer of the remote port as the hosthandle in calls to
      nvmet_fc_rcv_ls_req().  The discovery engine is updated to invalidate the
      hosthandle whenever connectivity with the remote port is lost.
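
      A minimal sketch of the handle flow (hypothetical types and callback
      names, not the real lpfc/nvmet-fc interfaces): the driver hands the
      transport an opaque per-remote-port cookie with each received LS and
      later tells the transport when that cookie is no longer valid.

          struct remote_node;                     /* driver's remote-port object */

          struct ls_transport_ops {
                  int  (*rcv_ls_req)(void *hosthandle, void *ls_buf, int len);
                  void (*invalidate_host)(void *hosthandle);
          };

          /* On LS receive: use the driver's node pointer as the hosthandle. */
          static int deliver_ls(struct ls_transport_ops *ops,
                                struct remote_node *ndlp, void *buf, int len)
          {
                  return ops->rcv_ls_req(ndlp, buf, len);
          }

          /* On loss of connectivity: invalidate the handle so the transport
           * can terminate any associations that reference it. */
          static void connectivity_lost(struct ls_transport_ops *ops,
                                        struct remote_node *ndlp)
          {
                  ops->invalidate_host(ndlp);
          }
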
      Signed-off-by: Paul Ely <paul.ely@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • lpfc: Refactor NVME LS receive handling · 3a8070c5
      Committed by James Smart
      In preparation for supporting both initiator mode and target mode
      receiving NVME LS's, commonize the existing NVME LS request receive
      handling found in the base driver and in the nvmet side.
      
      Using the original lpfc_nvmet_unsol_ls_event() and
      lpfc_nvme_unsol_ls_buffer() routines as templates, commonize the
      reception of an NVME LS request. The common routine validates the LS
      request - checking that it was received from a logged-in node - and
      allocates an lpfc_async_xchg_ctx that is used to manage the LS request.
      The role of the port is then inspected to determine which handler is to
      receive the LS - nvme or nvmet. As such, the nvmet handler is tied back
      in, and a handler is created in nvme and stubbed out.
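
      A minimal sketch of the common receive-and-dispatch path (standalone C
      with hypothetical names and role flag, not the lpfc code):

          #include <stdlib.h>

          #define ROLE_NVME_TARGET 0x2            /* assumed role bit */

          struct xchg_ctx { void *ls_payload; int ls_len; };

          static int handle_ls_nvme(struct xchg_ctx *ctx)  { (void)ctx; return 0; }
          static int handle_ls_nvmet(struct xchg_ctx *ctx) { (void)ctx; return 0; }

          static int recv_unsol_ls(unsigned int port_roles, int node_logged_in,
                                   void *payload, int len)
          {
                  struct xchg_ctx *ctx;

                  if (!node_logged_in)
                          return -1;              /* drop an LS from an unknown node */

                  ctx = calloc(1, sizeof(*ctx));  /* context managing this LS */
                  if (!ctx)
                          return -1;
                  ctx->ls_payload = payload;
                  ctx->ls_len = len;

                  /* Hand the LS to the side matching the port's role. */
                  return (port_roles & ROLE_NVME_TARGET) ?
                          handle_ls_nvmet(ctx) : handle_ls_nvme(ctx);
          }
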
      Signed-off-by: Paul Ely <paul.ely@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • lpfc: Refactor nvmet_rcv_ctx to create lpfc_async_xchg_ctx · 7cacae2a
      Committed by James Smart
      To support FC-NVME-2 (actually FC-NVME (rev 1) with Amendment 1), both
      the nvme (host) and nvmet (controller/target) sides will need to be able
      to receive LS requests.  Currently, this support exists only on the nvmet
      side. To prepare for both sides supporting LS receive, rename
      lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition.
      Signed-off-by: Paul Ely <paul.ely@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  11. 30 Mar 2020, 1 commit
  12. 27 Mar 2020, 1 commit
  13. 18 Feb 2020, 1 commit
    • scsi: lpfc: add RDF registration and Link Integrity FPIN logging · df3fe766
      Committed by James Smart
      This patch modifies lpfc to register for Link Integrity events via the use
      of an RDF ELS and to perform Link Integrity FPIN logging.
      
      Specifically, the driver was modified to:
      
       - Format and issue the RDF ELS immediately following SCR registration.
         This registers the ability of the driver to receive FPIN ELS.
      
       - Add decoding of the FPIN ELS into its received descriptors, with
         logging of the Link Integrity event information. After decoding, the
         ELS is delivered to the SCSI FC transport, which delivers it to any
         user-space applications.
      
       - To aid in logging, simple helpers were added to create enum-to-name
         string lookup functions that utilize the initialization helpers from
         the fc_els.h header (see the sketch after this list).
      
       - Note: the base header definitions for the ELS's don't populate the
         descriptor payloads. As such, lpfc creates its own version of the
         structures, using the base definitions (mostly headers) and
         additionally declaring the descriptors that complete the population of
         the ELS.
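
      A minimal sketch of the enum-to-name lookup pattern (standalone C with an
      abbreviated, assumed event list rather than the full fc_els.h
      initializers):

          #include <stddef.h>

          enum li_event {
                  LI_LINK_FAILURE   = 0x1,
                  LI_LOSS_OF_SYNC   = 0x2,
                  LI_LOSS_OF_SIGNAL = 0x3,
          };

          static const struct {
                  enum li_event evt;
                  const char *name;
          } li_event_names[] = {
                  { LI_LINK_FAILURE,   "Link Failure" },
                  { LI_LOSS_OF_SYNC,   "Loss of Synchronization" },
                  { LI_LOSS_OF_SIGNAL, "Loss of Signal" },
          };

          /* Map a Link Integrity event code to a printable name for logging. */
          static const char *li_event_name(enum li_event evt)
          {
                  for (size_t i = 0;
                       i < sizeof(li_event_names) / sizeof(li_event_names[0]); i++)
                          if (li_event_names[i].evt == evt)
                                  return li_event_names[i].name;
                  return "Unknown";
          }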
      
      Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  14. 22 Dec 2019, 1 commit
  15. 13 Nov 2019, 1 commit
  16. 06 Nov 2019, 1 commit
    • scsi: lpfc: Add registration for CPU Offline/Online events · 93a4d6f4
      Committed by James Smart
      The recent affinitization didn't address CPU offlining/onlining.  If an
      interrupt vector is shared and the low-order CPU owning the vector is
      offlined then, as interrupts are managed, the vector is taken offline.
      This causes the other CPUs sharing the vector to hang, as they can't get
      I/O completions.
      
      Correct by registering callbacks with the system for Offline/Online
      events. When a CPU is taken offline, its EQ, which is tied to an
      interrupt vector, is found. If the CPU is the "owner" of the vector and
      the EQ/vector is shared by other CPUs, the EQ is placed into polled mode.
      Additionally, code paths that perform I/O submission on the "sharing
      CPUs" check the EQ state and poll for completions after submitting new
      I/O to a WQ that uses the EQ.
      
      Similarly, when a CPU comes back online and owns an offlined vector, the
      EQ is taken out of polled mode and rearmed to start driving interrupts
      for the EQ.
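
      A minimal sketch of the polled-EQ idea (hypothetical fields and names,
      not the lpfc data structures):

          #include <stdbool.h>

          struct eq {
                  bool polled;            /* true while the owning CPU is offline */
          };

          static void process_completions(struct eq *eq) { (void)eq; /* reap EQ/CQ */ }

          static void post_wqe(struct eq *eq, void *wqe)
          {
                  (void)wqe;
                  /* ... ring the work-queue doorbell for the new I/O ... */

                  if (eq->polled)
                          process_completions(eq);   /* no interrupts: poll now */
          }

          /* CPU hotplug callbacks for the EQ's owning CPU. */
          static void cpu_offline_notify(struct eq *eq) { eq->polled = true;  }
          static void cpu_online_notify(struct eq *eq)  { eq->polled = false; /* rearm */ }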
      
      Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  17. 01 Oct 2019, 1 commit
  18. 30 Aug 2019, 1 commit
  19. 20 Aug 2019, 2 commits
    • scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair · c00f62e6
      Committed by James Smart
      Currently, each hardware queue, typically allocated per CPU, consists of
      a WQ/CQ pair per protocol, meaning that if both SCSI and NVMe are
      supported, two WQ/CQ pairs exist for the hardware queue. Separate queues
      are unnecessary. The current implementation wastes memory backing the
      second set of queues, and using double the SLI-4 WQ/CQs means fewer
      hardware queues can be supported, so there may not always be enough to
      have a pair per CPU. If only one pair is needed per CPU, more CPUs can
      get their own WQ/CQ.
      
      Rework the implementation to use a single WQ/CQ pair shared by both
      protocols.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix hang when downloading fw on port enabled for nvme · 84f2ddf8
      Committed by James Smart
      As part of firmware download, the adapter is reset. On the adapter, the
      reset causes the function to stop and all outstanding I/O is terminated
      (without responses). The reset path then starts teardown of the adapter,
      beginning with deregistration of the remote ports from the nvme-fc
      transport. The local port is then deregistered and the driver waits for
      local port deregistration. This never finishes.
      
      The remote port deregistrations terminate the NVMe controllers, causing
      them to send aborts for all the outstanding I/O. The aborts are serviced
      in the driver but stall due to its state. The NVMe layer then waits to
      reclaim its outstanding I/O before continuing.  The I/O must be returned
      before the reset on the controller is deemed complete and the controller
      delete is performed.  The remote port deregistration won't complete until
      all the controllers are terminated, and the local port deregistration
      won't complete until all controllers and remote ports are terminated.
      Thus things hang.
      
      The issue is that the reset which stopped the adapter also stopped all
      the responses that would drive I/O completions, and the aborts that would
      otherwise drive I/O completions were stopped as well. When resetting the
      adapter like this, the driver needs to generate the completions itself as
      part of the adapter reset, so that I/Os complete (in error) and no aborts
      are queued.
      
      Fix by adding flush routines that run whenever the adapter port has been
      reset or discovered in error. The flush routines generate the completions
      for the outstanding SCSI and NVMe I/O. Abort I/Os, if waiting, are caught
      and flushed as well.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  20. 21 Jun 2019, 2 commits
  21. 19 Jun 2019, 1 commit
    • scsi: lpfc: Separate CQ processing for nvmet_fc upcalls · d74a89aa
      Committed by James Smart
      Currently the driver is notified of new command frame receipt by CQEs. As
      part of the CQE processing, the driver upcalls the nvmet_fc transport to
      deliver the command. nvmet_fc, as part of receiving the command, builds
      out a context for it, where one of the first steps is to allocate memory
      for the I/O.
      
      When running tests that do large I/Os (1MB), it was found on some systems
      that the total number of outstanding I/Os, at 1MB each, completely
      consumed the system's memory, so additional I/Os were getting blocked in
      the memory allocator.  Because this blocked the lpfc thread processing
      CQEs, lots of other received commands were held up, and since CQEs are
      processed serially, the delays for an I/O waiting behind the others
      accumulated - enough that the initiator hit timeouts for the I/Os.
      
      The basic fix is to avoid the direct upcall and instead schedule a work
      item for each I/O as it is received. This allows the CQ processing to
      complete very quickly, and each I/O can then run or block on its own.
      However, this general solution hurts latency when there are few I/Os.  As
      such, the fix is implemented so that the driver watches how many CQEs it
      has processed sequentially in one run. As long as the count is below a
      threshold, the direct nvmet_fc upcall is made; only when the threshold is
      exceeded does it revert to work scheduling.
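
      A minimal sketch of the thresholding (standalone C; the threshold value
      and names are assumptions, and the deferred path stands in for work-queue
      scheduling):

          #define DIRECT_UPCALL_LIMIT 64          /* assumed threshold */

          struct cmd_ctx { void *frame; };

          static void deliver_now(struct cmd_ctx *c)      { (void)c; /* direct upcall   */ }
          static void deliver_deferred(struct cmd_ctx *c) { (void)c; /* queue work item */ }

          static void process_cq_run(struct cmd_ctx **cmds, int ncqes)
          {
                  for (int i = 0; i < ncqes; i++) {
                          if (i < DIRECT_UPCALL_LIMIT)
                                  deliver_now(cmds[i]);       /* low load: low latency */
                          else
                                  deliver_deferred(cmds[i]);  /* high load: don't block CQ */
                  }
          }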
      
      Given that debugging this showed a surprisingly long delay in CQ
      processing, the I/O timer stats were updated to better reflect the
      processing at the different points.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  22. 06 Feb 2019, 6 commits