1. 13 Mar, 2018 1 commit
  2. 13 Feb, 2018 2 commits
  3. 05 Dec, 2017 1 commit
  4. 02 Nov, 2017 1 commit
  5. 25 Aug, 2017 1 commit
    • D
      scsi: lpfc: Fix MRQ > 1 context list handling · 66d7ce93
      Dick Kennedy committed
      Various oopses, including CPU lockups, were seen.
      
      For asynchronously received IUs where the driver must assign exchange
      resources, the resources were kept on a single get (free) list and a
      single put list (finished, waiting to be moved to the get list). As all
      CPUs share the lists, an interrupt for a received frame may have to wait
      for all the other CPUs to place their done work onto the put list before
      it can acquire the lock to pull from the list.
      
      Fix by breaking the resource lists into per-CPU lists (or at least more
      than 1 list, with CPUs sharing the lists). A CPU allocates from the free
      list for its own CPU and puts its done work on its own put list,
      avoiding the contention. As CPU load may vary, a CPU whose lists are
      empty may grab from another CPU, thereby changing resource distribution.
      But searching for a resource only occurs on 1 or a few CPUs until a
      single resource can be allocated. If the condition reoccurs, the search
      starts looking at a different CPU.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      66d7ce93
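
The per-CPU get/put scheme described in this commit can be illustrated with a minimal user-space sketch. It is not the driver's code: `struct ctx_pool`, `ctx_alloc()`, `ctx_free()` and `NUM_LISTS` are hypothetical names, and locking is left out entirely (a real implementation would presumably protect each list with its own lock, which is what removes the shared-lock contention).

```c
#include <assert.h>
#include <stddef.h>

#define NUM_LISTS 4   /* hypothetical: one get/put pair per CPU */

struct ctx {                    /* hypothetical exchange context */
    struct ctx *next;
    int id;
};

struct ctx_lists {
    struct ctx *get;            /* free contexts, ready to allocate */
    struct ctx *put;            /* finished contexts, not yet back on get */
};

struct ctx_pool {
    struct ctx_lists cpu[NUM_LISTS];
    int next_victim[NUM_LISTS]; /* where each CPU resumes its next search */
};

/* Take one context from a list pair, refilling get from put when empty. */
static struct ctx *take_from(struct ctx_lists *l)
{
    struct ctx *c;

    if (!l->get) {
        l->get = l->put;
        l->put = NULL;
    }
    c = l->get;
    if (c) {
        l->get = c->next;
        c->next = NULL;
    }
    return c;
}

/* Return finished work to the calling CPU's own put list. */
void ctx_free(struct ctx_pool *pool, int cpu, struct ctx *c)
{
    assert(cpu >= 0 && cpu < NUM_LISTS);
    c->next = pool->cpu[cpu].put;
    pool->cpu[cpu].put = c;
}

/*
 * Allocate from the caller's own lists first. Only when they are empty,
 * search other CPUs, stopping at the first context found. The search
 * position is remembered, so a later shortage starts at a different CPU.
 */
struct ctx *ctx_alloc(struct ctx_pool *pool, int cpu)
{
    struct ctx *c;
    int tries;

    assert(cpu >= 0 && cpu < NUM_LISTS);
    c = take_from(&pool->cpu[cpu]);
    for (tries = 0; !c && tries < NUM_LISTS; tries++) {
        int victim = pool->next_victim[cpu];

        pool->next_victim[cpu] = (victim + 1) % NUM_LISTS;
        if (victim != cpu)
            c = take_from(&pool->cpu[victim]);
    }
    return c;
}
```

A context taken from another CPU is later returned to the list of whichever CPU frees it, which is how the resource distribution drifts toward the busier CPUs.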
  6. 01 Jun, 2017 1 commit
  7. 17 May, 2017 3 commits
  8. 09 May, 2017 1 commit
    • J
      scsi: lpfc: Fix panic on BFS configuration · 4492b739
      James Smart committed
      To select the appropriate shost template, the driver issues a
      mailbox command to retrieve the WWN. It turns out that the command is
      sent before the function is reset. On SLI-4 adapters, this is
      inconsequential, as the mailbox command location is specified by DMA via
      the BMBX register. However, on SLI-3 adapters, the location of the
      mailbox command submission area changes. When the function is first
      powered on or reset, the command is submitted via PCI BAR memory. Later,
      the driver changes the function config to use host memory and DMA. The
      request to start a mailbox command is the same, a simple doorbell write,
      regardless of submission area. So if no boot driver has run against the
      adapter, the mailbox command works, as the defaults are OK. But if the
      boot driver has configured the card, and if no platform PCI
      function/slot reset occurs as the OS starts, the mailbox command will
      fail: the SLI-3 device will use the stale boot driver DMA location. This
      can cause PCI EEH errors.
      
      The fix is to reset the SLI-3 function before sending the mailbox
      command, thus synchronizing the function and driver on the mailbox
      location.
      
      Note: The fix uses routines that are typically invoked later in the call
      flow to reset the SLI-3 device. The issue in using those routines is
      that the normal (non-fix) flow does additional initialization, namely
      the allocation of the pport structure. So, rather than significantly
      reworking the initialization flow so that the pport is allocated first,
      pointer checks are added to work around it. Checks are limited to the
      routines invoked by an SLI-3 adapter (s3 routines), as this early call
      is only made on an SLI-3 adapter. Nothing changes after the early
      reset: subsequent initialization, and another adapter reset, still
      occur on both SLI-3 and SLI-4 adapters.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Fixes: 96418b5e ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
      Cc: stable@vger.kernel.org # v4.11+
      Reviewed-by: Ewan D. Milne <emilne@redhat.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      4492b739
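
The ordering problem and the pointer check described in this commit can be modeled with a small sketch. Everything here is an assumption made for illustration: `struct hba`, `struct port`, `enum mbx_area` and the three functions are hypothetical, and the "mailbox command" is reduced to a check that device and driver agree on the submission area.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mbx_area { MBX_AREA_BAR, MBX_AREA_HOST_DMA };

struct port {                  /* hypothetical stand-in for the pport */
    int link_events;
};

struct hba {
    enum mbx_area dev_area;    /* where the device looks for commands */
    enum mbx_area drv_area;    /* where the driver writes commands */
    struct port *pport;        /* still NULL during the early call */
};

/*
 * Reset the function: both sides fall back to the BAR submission area.
 * The pport may not exist yet when this runs early, hence the check.
 */
void hba_reset_s3(struct hba *hba)
{
    assert(hba != NULL);
    hba->dev_area = MBX_AREA_BAR;
    hba->drv_area = MBX_AREA_BAR;
    if (hba->pport)
        hba->pport->link_events = 0;
}

/* A mailbox command only succeeds when both sides use the same area. */
bool hba_issue_mbox(const struct hba *hba)
{
    return hba->dev_area == hba->drv_area;
}

/* Early WWN read: reset first so the stale boot-driver setup is gone. */
bool hba_read_wwn_early(struct hba *hba)
{
    hba_reset_s3(hba);
    return hba_issue_mbox(hba);
}
```

With a device left in `MBX_AREA_HOST_DMA` by a boot driver and a freshly loaded driver writing to `MBX_AREA_BAR`, `hba_issue_mbox()` fails on its own but `hba_read_wwn_early()` succeeds, even though `pport` is still `NULL`.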
  9. 24 Apr, 2017 2 commits
    • J
      Update ABORT processing for NVMET. · 86c67379
      James Smart committed
      The driver with nvme had this routine stubbed.
      
      Right now XRI_ABORTED_CQE is not handled and the FC NVMET
      Transport has a new API for the driver.
      
      Missing code path, new NVME abort API
      Update ABORT processing for NVMET
      
      There are 3 new FC NVMET Transport API / template routines for NVMET:
      
      lpfc_nvmet_xmt_fcp_release
      This NVMET template callback routine is called to release the context
      associated with an IO. This routine is ALWAYS called last, even
      if the IO was aborted or completed in error.
      
      lpfc_nvmet_xmt_fcp_abort
      This NVMET template callback routine is called to abort an exchange that
      has an IO in progress.
      
      nvmet_fc_rcv_fcp_req
      When the lpfc driver receives an ABTS, this NVME FC transport layer
      callback routine is called. For this case there are 2 paths through the
      driver: the driver either has an outstanding exchange / context for the
      XRI to be aborted or it does not. If not, a BA_RJT is issued; otherwise
      a BA_ACC is issued.
      
      NVMET Driver abort paths:
      
      There are 2 paths for aborting an IO. In the first, we receive an IO and
      decide not to process it because of a lack of resources. An unsolicited
      ABTS is immediately sent back to the initiator as a response.
      lpfc_nvmet_unsol_fcp_buffer
                  lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)
      
      The second one is we sent the IO up to the NVMET transport layer to
      process, and for some reason the NVME Transport layer decided to abort the
      IO before it completes all its phases. For this case there are 2 paths
      thru the driver:
      the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no
      outstanding WQEs are present for the exchange / context.
      lpfc_nvmet_xmt_fcp_abort
          if (LPFC_NVMET_IO_INP)
              lpfc_nvmet_sol_fcp_issue_abort  (ABORT_WQE)
                      lpfc_nvmet_sol_fcp_abort_cmp
          else
              lpfc_nvmet_unsol_fcp_issue_abort
                      lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)
                              lpfc_nvmet_unsol_fcp_abort_cmp
      
      Context flags:
      LPFC_NVMET_IO_INP - this flag signifies an IO is in progress on the
      exchange.
      LPFC_NVMET_XBUSY - this flag indicates the IO completed but the firmware
      is still busy with the corresponding exchange. The exchange should not be
      reused until after an XRI_ABORTED_CQE is received for that exchange.
      LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the
      exchange.
      LPFC_NVMET_CTX_RLS - this flag signifies a context free was requested,
      but we are deferring it due to an XBUSY or ABORT in progress.
      
      A ctxlock is added to the context structure and is used whenever these
      flags are set or read within the context of an IO.
      The LPFC_NVMET_CTX_RLS flag is only set in the defer_release routine when
      the transport has resolved all IO associated with the buffer. The flag is
      cleared when the CTX is associated with a new IO.
      
      An exchange can have both an LPFC_NVMET_XBUSY and an LPFC_NVMET_ABORT_OP
      condition active simultaneously. Both conditions must complete before the
      exchange is freed.
      When the abort callback (lpfc_nvmet_xmt_fcp_abort) is invoked:
      If there is an outstanding IO, the driver will issue an ABORT_WQE. This
      should result in 3 completions for the exchange:
      1) IO cmpl with XB bit set
      2) Abort WQE cmpl
      3) XRI_ABORTED_CQE cmpl
      For this scenario, after completion #1, the NVMET Transport IO rsp
      callback is called.  After completion #2, no action is taken with respect
      to the exchange / context.  After completion #3, the exchange context is
      free for re-use on another IO.
      
      If there is no outstanding activity on the exchange, the driver will send
      an ABTS to the Initiator. Upon completion of this WQE, the exchange /
      context is freed for re-use on another IO.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      86c67379
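
The interplay of the context flags and the deferred release in this commit can be sketched as a small state machine. This is an illustration under assumptions, not the driver's implementation: the `CTX_*` values, `struct ctx` and all function names are hypothetical, the ctxlock is omitted, and the ABTS path is reduced to a return value. The sketch also assumes the context is freed by whichever of the two pending conditions clears last.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the meanings follow the commit message. */
#define CTX_IO_INP    0x1u  /* IO in progress on the exchange */
#define CTX_ABORT_OP  0x2u  /* ABORT_WQE issued, completion pending */
#define CTX_XBUSY     0x4u  /* firmware still busy with the exchange */
#define CTX_RLS       0x8u  /* release requested but deferred */

enum abort_action { SENT_ABORT_WQE, SENT_ABTS };

struct ctx {
    unsigned int flags;
    bool free;              /* true once the context can be reused */
};

/* Associating the context with a new IO also clears a stale CTX_RLS. */
void ctx_start_io(struct ctx *c)
{
    c->flags = CTX_IO_INP;
    c->free = false;
}

/* Free only when release was requested and nothing is still pending. */
static void ctx_try_free(struct ctx *c)
{
    if ((c->flags & CTX_RLS) && !(c->flags & (CTX_XBUSY | CTX_ABORT_OP))) {
        c->flags = 0;
        c->free = true;
    }
}

/* Transport abort callback: ABORT_WQE if an IO is outstanding, else ABTS. */
enum abort_action ctx_abort(struct ctx *c)
{
    assert(!c->free);
    if (c->flags & CTX_IO_INP) {
        c->flags |= CTX_ABORT_OP;
        return SENT_ABORT_WQE;
    }
    return SENT_ABTS;
}

/* Completion #1: the IO finished; xb mirrors the XB bit. */
void ctx_io_cmpl(struct ctx *c, bool xb)
{
    c->flags &= ~CTX_IO_INP;
    if (xb)
        c->flags |= CTX_XBUSY;
}

/* Completion #2: the ABORT_WQE finished. */
void ctx_abort_cmpl(struct ctx *c)
{
    c->flags &= ~CTX_ABORT_OP;
    ctx_try_free(c);
}

/* Completion #3: XRI_ABORTED_CQE, the firmware is done with the exchange. */
void ctx_xri_aborted(struct ctx *c)
{
    c->flags &= ~CTX_XBUSY;
    ctx_try_free(c);
}

/* Transport release callback: free now, or defer via CTX_RLS. */
void ctx_release(struct ctx *c)
{
    c->flags |= CTX_RLS;
    ctx_try_free(c);
}
```

Walking the three-completion scenario through this model (abort, IO completion with XB set, release, abort completion, XRI aborted) leaves the context busy until the final step, while an IO that completed cleanly is freed directly by `ctx_release()`.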
    • J
      Fix crash after issuing lip reset · 9d3d340d
      James Smart committed
      When an RPI is not available, the driver sends a WQE with an invalid RPI
      value, which is rejected by the HBA:
      lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data:  x3/xa0320008
      and
      lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008
      
      In this case, the driver accesses the rpi_ids array out of bounds.
      
      Fix:
      Check the return value of lpfc_sli4_alloc_rpi(). Do not allocate a
      lpfc_nodelist entry if an RPI is not available.
      
      When an RPI is not available, we will get discovery timeouts and
      command drops for some of the vports, as seen below.
      
      lpfc :0273 Unexpected discovery timeout, vport State x0
      lpfc :0230 Unexpected timeout, hba link state x5
      lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      9d3d340d
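
The shape of the fix (check the allocator's return value before using it as an array index) can be shown with a minimal sketch. The names are hypothetical and deliberately not the driver's: `alloc_rpi()`, `node_init()`, `RPI_ALLOC_ERROR` and the tiny `MAX_RPI` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RPI          4        /* hypothetical, tiny on purpose */
#define RPI_ALLOC_ERROR  0xFFFFu  /* hypothetical "no RPI" sentinel */

struct hba {
    uint16_t rpi_ids[MAX_RPI];    /* logical RPI -> hardware RPI id */
    unsigned int rpi_used;
};

struct node {
    uint16_t rpi;
    uint16_t hw_rpi;
};

/* Hand out the next logical RPI, or the sentinel when none are left. */
uint16_t alloc_rpi(struct hba *hba)
{
    if (hba->rpi_used >= MAX_RPI)
        return RPI_ALLOC_ERROR;
    return (uint16_t)hba->rpi_used++;
}

/*
 * Set up a node only if an RPI was obtained. Without the check, the
 * sentinel would be used as an index into rpi_ids, far out of bounds.
 */
bool node_init(struct hba *hba, struct node *n)
{
    uint16_t rpi = alloc_rpi(hba);

    if (rpi == RPI_ALLOC_ERROR)
        return false;
    assert(rpi < MAX_RPI);
    n->rpi = rpi;
    n->hw_rpi = hba->rpi_ids[rpi];
    return true;
}
```

The caller sees a failed `node_init()` instead of a node carrying an invalid RPI, which matches the behavior described above: no nodelist entry is created when no RPI is available.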
  10. 07 Mar, 2017 1 commit
    • J
      scsi: lpfc: Fix eh_deadline setting for sli3 adapters. · 96418b5e
      James Smart committed
      A previous change unilaterally removed the hba reset entry point
      from the sli3 host template. This was done to keep tape devices
      being used for backup from being removed. Why was this done?
      When there was a non-responding device on the fabric, the error
      escalation policy would escalate to the reset handler. When the
      reset handler was called, it would reset the adapter, dropping
      link, thus logging out and terminating all i/o's - on any target.
      If there was a tape device on the same adapter that wasn't in
      error, it would kill the tape i/o's, effectively killing the
      tape device state.  With the reset entry point removed, the adapter
      reset and the resulting fabric logout were avoided, allowing the
      other devices to continue to operate unaffected. A hack - yes.
      Hint: we really need a transport I_T nexus reset callback added to
      the eh process (in between the SCSI target reset and hba reset
      points), so a fc logout could occur to the one bad target only and
      stop the error escalation process.
      
      This patch commonizes the approach so it can be used for sli3 and sli4
      adapters, but requires the admin to specifically identify, via module
      parameter, the adapters for which the resets are to be removed.
      Additionally, bus_reset, which sends Target Reset TMFs to all targets, is
      also removed from the template, as it has the same effect as the adapter
      reset.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Laurence Oberman <loberman@redhat.com>
      Tested-by: Laurence Oberman <loberman@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      96418b5e
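
Choosing between a full template and one without the bus and host reset entry points, based on an admin-supplied list, might look like the sketch below. It is only an illustration: `struct host_template` is a simplified stand-in rather than the kernel's SCSI host template, and `select_template()`, the two template objects and the WWN list parameter are hypothetical. Matching on the WWN is an assumption about how the adapters are identified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*eh_handler)(void *cmd);

/* Simplified stand-in for a host template: only the EH entry points. */
struct host_template {
    eh_handler eh_device_reset_handler;
    eh_handler eh_target_reset_handler;
    eh_handler eh_bus_reset_handler;
    eh_handler eh_host_reset_handler;
};

static int device_reset(void *cmd) { (void)cmd; return 0; }
static int target_reset(void *cmd) { (void)cmd; return 0; }
static int bus_reset(void *cmd)    { (void)cmd; return 0; }
static int host_reset(void *cmd)   { (void)cmd; return 0; }

/* Normal template: escalation may go all the way to an adapter reset. */
static const struct host_template template_full = {
    device_reset, target_reset, bus_reset, host_reset
};

/* Escalation stops after target reset: no bus reset, no host reset. */
static const struct host_template template_no_hr = {
    device_reset, target_reset, NULL, NULL
};

/*
 * Pick the reduced template only for adapters the admin listed
 * (hypothetically by WWN); every other adapter keeps the full template.
 */
const struct host_template *select_template(uint64_t wwn,
                                            const uint64_t *no_reset_wwns,
                                            size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (no_reset_wwns[i] == wwn)
            return &template_no_hr;
    return &template_full;
}
```

Keeping two constant templates and selecting one per adapter leaves unlisted adapters with the usual escalation path, so the behavior only changes where the admin asked for it.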
  11. 23 Feb, 2017 7 commits
  12. 05 Jan, 2017 1 commit
  13. 18 Nov, 2016 1 commit
  14. 09 Nov, 2016 1 commit
  15. 16 Jul, 2016 5 commits
  16. 22 Dec, 2015 2 commits
  17. 06 Jun, 2015 1 commit
  18. 10 Apr, 2015 3 commits
  19. 17 Sep, 2014 1 commit
  20. 03 Jun, 2014 3 commits
  21. 16 Mar, 2014 1 commit