1. 17 May, 2017 9 commits
  2. 09 May, 2017 1 commit
    • scsi: lpfc: Fix panic on BFS configuration · 4492b739
      Committed by James Smart
      To select the appropriate shost template, the driver is issuing a
      mailbox command to retrieve the wwn. Turns out the sending of the
      command precedes the reset of the function.  On SLI-4 adapters, this is
      inconsequential as the mailbox command location is specified by dma via
      the BMBX register. However, on SLI-3 adapters, the location of the
      mailbox command submission area changes. When the function is first
      powered on or reset, the cmd is submitted via PCI bar memory. Later the
      driver changes the function config to use host memory and DMA. The
      request to start a mailbox command is the same, a simple doorbell write,
      regardless of submission area.  So, if no boot driver has run against
      the adapter, the mailbox command works because the defaults are ok.
      But if the boot driver has configured the card, and no platform pci
      function/slot reset occurs as the os starts, the mailbox command will
      fail: the SLI-3 device will use the stale boot driver dma location.
      This can cause PCI eeh errors.
      
      Fix is to reset the sli-3 function before sending the mailbox command,
      thus synchronizing the function/driver on mailbox location.
      
      Note: The fix uses routines that are typically invoked later in the call
      flow to reset the sli-3 device. The issue in using those routines is
      that the normal (non-fix) flow does additional initialization, namely
      the allocation of the pport structure. So, rather than significantly
      reworking the initialization flow so that the pport is alloc'd first,
      pointer checks are added to work around it. Checks are limited to the
      routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
      is only invoked on a sli3 adapter. Nothing changes post the
      fix. Subsequent initialization, and another adapter reset, still occur -
      both on sli-3 and sli-4 adapters.
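The ordering problem described above can be modeled in a few lines. This is a minimal sketch under stated assumptions, not driver code: `struct toy_hba`, `toy_boot_driver_config`, `toy_reset`, and `toy_issue_mbox` are hypothetical names, and the NULL check on `pport` only mirrors the kind of pointer check the commit message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Where the adapter expects mailbox commands to be placed (toy model). */
enum mbox_area { MBOX_BAR_MEMORY, MBOX_HOST_DMA };

struct toy_pport { int resets_seen; };

struct toy_hba {
	enum mbox_area device_area;  /* area the SLI-3 function currently reads */
	enum mbox_area driver_area;  /* area the OS driver writes the command to */
	struct toy_pport *pport;     /* not yet allocated on the early call */
};

/* A boot driver that ran earlier leaves the function pointed at its DMA area. */
static void toy_boot_driver_config(struct toy_hba *hba)
{
	hba->device_area = MBOX_HOST_DMA;
}

/* Reset returns the function to BAR-memory submission. The pport may not
 * exist yet when this is called early, so it is only touched when present. */
static void toy_reset(struct toy_hba *hba)
{
	assert(hba != NULL);
	hba->device_area = MBOX_BAR_MEMORY;
	if (hba->pport)
		hba->pport->resets_seen++;
}

/* The doorbell write is the same either way; the command is only seen when
 * driver and device agree on the submission area. Returns 0 on success. */
static int toy_issue_mbox(const struct toy_hba *hba)
{
	return hba->device_area == hba->driver_area ? 0 : -1;
}
```

In this model, calling `toy_issue_mbox` after `toy_boot_driver_config` fails, and calling `toy_reset` first makes it succeed, whether or not `pport` has been allocated.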
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Fixes: 96418b5e ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
      Cc: stable@vger.kernel.org # v4.11+
      Reviewed-by: Ewan D. Milne <emilne@redhat.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  3. 24 April, 2017 5 commits
    • Update ABORT processing for NVMET. · 86c67379
      Committed by James Smart
      The driver with nvme had this routine stubbed.
      
      Right now XRI_ABORTED_CQE is not handled and the FC NVMET
      Transport has a new API for the driver.
      
      Missing code path, new NVME abort API
      Update ABORT processing for NVMET
      
      There are 3 new FC NVMET Transport API/ template routines for NVMET:
      
      lpfc_nvmet_xmt_fcp_release
      This NVMET template callback routine is called to release the context
      associated with an IO. This routine is ALWAYS called last, even
      if the IO was aborted or completed in error.
      
      lpfc_nvmet_xmt_fcp_abort
      This NVMET template callback routine is called to abort an exchange that
      has an IO in progress.
      
      nvmet_fc_rcv_fcp_req
      When the lpfc driver receives an ABTS, this NVME FC transport layer
      callback routine is called. For this case there are 2 paths through the
      driver: the driver either has an outstanding exchange / context for the
      XRI to be aborted or not.  If not, a BA_RJT is issued; otherwise a BA_ACC
      is issued.
      
      NVMET Driver abort paths:
      
      There are 2 paths for aborting an IO. In the first, we receive an IO and
      decide not to process it because of a lack of resources. An unsolicited
      ABTS is immediately sent back to the initiator as a response.
      lpfc_nvmet_unsol_fcp_buffer
                  lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)
      
      In the second, we sent the IO up to the NVMET transport layer to
      process, and for some reason the NVME Transport layer decided to abort the
      IO before it completed all its phases. For this case there are 2 paths
      through the driver:
      the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no
      outstanding WQEs are present for the exchange / context.
      lpfc_nvmet_xmt_fcp_abort
          if (LPFC_NVMET_IO_INP)
              lpfc_nvmet_sol_fcp_issue_abort  (ABORT_WQE)
                      lpfc_nvmet_sol_fcp_abort_cmp
          else
              lpfc_nvmet_unsol_fcp_issue_abort
                      lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)
                              lpfc_nvmet_unsol_fcp_abort_cmp
      
      Context flags:
      LPFC_NVMET_IO_INP - this flag signifies an IO is in progress on the exchange.
      LPFC_NVMET_XBUSY  - this flag indicates the IO completed but the firmware
      is still busy with the corresponding exchange. The exchange should not be
      reused until after a XRI_ABORTED_CQE is received for that exchange.
      LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the
      exchange.
      LPFC_NVMET_CTX_RLS  - this flag signifies a context free was requested,
      but we are deferring it due to an XBUSY or ABORT in progress.
      
      A ctxlock is added to the context structure and is held whenever these
      flags are set or read within the context of an IO.
      The LPFC_NVMET_CTX_RLS flag is only set in the defer_release routine when
      the transport has resolved all IO associated with the buffer. The flag is
      cleared when the CTX is associated with a new IO.
      
      An exchange can have both an LPFC_NVMET_XBUSY and an LPFC_NVMET_ABORT_OP
      condition active simultaneously. Both conditions must complete before the
      exchange is freed.
      When the abort callback (lpfc_nvmet_xmt_fcp_abort) is invoked:
      If there is an outstanding IO, the driver will issue an ABORT_WQE. This
      should result in 3 completions for the exchange:
      1) IO cmpl with XB bit set
      2) Abort WQE cmpl
      3) XRI_ABORTED_CQE cmpl
      For this scenario, after completion #1, the NVMET Transport IO rsp
      callback is called.  After completion #2, no action is taken with respect
      to the exchange / context.  After completion #3, the exchange context is
      free for re-use on another IO.
      
      If there is no outstanding activity on the exchange, the driver will send an
      ABTS to the Initiator. Upon completion of this WQE, the exchange / context
      is freed for re-use on another IO.
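The flag handling described above can be sketched as a small single-threaded state model. This is an illustration under assumptions, not the driver's code: the `CTX_*` values, `struct toy_ctx`, and every `toy_*` function are hypothetical, and the ctxlock is omitted because nothing here runs concurrently.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values are illustrative only. */
#define CTX_IO_INP   0x1u  /* IO in progress on the exchange */
#define CTX_ABORT_OP 0x2u  /* ABORT_WQE was issued on the exchange */
#define CTX_XBUSY    0x4u  /* firmware still busy with the exchange */
#define CTX_RLS      0x8u  /* release requested but deferred */
#define CTX_PENDING  (CTX_XBUSY | CTX_ABORT_OP)

struct toy_ctx {
	unsigned int flag;
	bool freed;
	int abts_sent;
};

static void toy_ctx_free(struct toy_ctx *ctx)
{
	assert(!(ctx->flag & CTX_PENDING));  /* never free a busy exchange */
	ctx->flag = 0;
	ctx->freed = true;
}

/* Free only when a release was requested and nothing is still pending. */
static void toy_free_if_idle(struct toy_ctx *ctx)
{
	if ((ctx->flag & CTX_RLS) && !(ctx->flag & CTX_PENDING))
		toy_ctx_free(ctx);
}

/* Transport release callback: frees now or defers via CTX_RLS. */
static void toy_release(struct toy_ctx *ctx)
{
	ctx->flag |= CTX_RLS;
	toy_free_if_idle(ctx);
}

/* Transport abort callback: pick the path based on outstanding IO. */
static void toy_xmt_fcp_abort(struct toy_ctx *ctx)
{
	if (ctx->flag & CTX_IO_INP)
		ctx->flag |= CTX_ABORT_OP;  /* stands in for issuing ABORT_WQE */
	else
		ctx->abts_sent++;           /* stands in for sending an ABTS */
}

/* Completion 1: IO completion, possibly with the XB bit set. */
static void toy_io_cmpl(struct toy_ctx *ctx, bool xb)
{
	ctx->flag &= ~CTX_IO_INP;
	if (xb)
		ctx->flag |= CTX_XBUSY;
}

/* Completion 2: the abort WQE completed. */
static void toy_abort_cmpl(struct toy_ctx *ctx)
{
	ctx->flag &= ~CTX_ABORT_OP;
	toy_free_if_idle(ctx);
}

/* Completion 3: XRI_ABORTED_CQE received. */
static void toy_xri_aborted(struct toy_ctx *ctx)
{
	ctx->flag &= ~CTX_XBUSY;
	toy_free_if_idle(ctx);
}
```

With an IO outstanding, the sequence abort, IO completion with XB, release, abort completion, XRI_ABORTED leaves the context unfreed until the last step, which matches the three-completion scenario in the message.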
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    • Fix crash after issuing lip reset · 9d3d340d
      Committed by James Smart
      When RPI is not available, the driver sends a WQE with an invalid RPI
      value, and the WQE is rejected by the HBA:
      lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data:  x3/xa0320008
      and
      lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008
      
      In this case, driver accesses rpi_ids array out of bounds.
      
      Fix:
      Check return value of lpfc_sli4_alloc_rpi(). Do not allocate
      lpfc_nodelist entry if RPI is not available.
      
      When RPI is not available, we will get discovery timeouts and
      command drops for some of the vports as seen below.
      
      lpfc :0273 Unexpected discovery timeout, vport State x0
      lpfc :0230 Unexpected timeout, hba link state x5
      lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0
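The fix described above amounts to checking the allocator's return value before creating a node that would later index the `rpi_ids` array. A minimal sketch, assuming a sentinel error value; `TOY_RPI_ALLOC_ERROR`, `toy_alloc_rpi`, `toy_node_alloc`, and the structures are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_MAX_RPI         4
#define TOY_RPI_ALLOC_ERROR 0xFFFF  /* sentinel; value assumed for this sketch */

struct toy_hba {
	uint16_t rpi_ids[TOY_MAX_RPI];  /* logical index -> hardware RPI */
	unsigned int rpi_used;
};

struct toy_node { uint16_t rpi; };

/* Hands out logical RPI indexes until the pool is exhausted. */
static uint16_t toy_alloc_rpi(struct toy_hba *hba)
{
	if (hba->rpi_used >= TOY_MAX_RPI)
		return TOY_RPI_ALLOC_ERROR;
	return (uint16_t)hba->rpi_used++;
}

/* Refuses to create a node when no RPI is available. */
static struct toy_node *toy_node_alloc(struct toy_hba *hba)
{
	uint16_t rpi = toy_alloc_rpi(hba);
	struct toy_node *node;

	if (rpi == TOY_RPI_ALLOC_ERROR)
		return NULL;

	node = malloc(sizeof(*node));
	if (!node) {
		hba->rpi_used--;  /* give back the index taken just above */
		return NULL;
	}
	node->rpi = rpi;
	return node;
}

/* Safe by construction: every existing node holds an in-range index. */
static uint16_t toy_node_hw_rpi(const struct toy_hba *hba,
				const struct toy_node *node)
{
	assert(node->rpi < TOY_MAX_RPI);
	return hba->rpi_ids[node->rpi];
}
```

Without the check in `toy_node_alloc`, a node carrying the sentinel would index far past the end of `rpi_ids`, which is the out-of-bounds access the message describes.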
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    • Fix driver usage of 128B WQEs when WQ_CREATE is V1. · 3f247de7
      Committed by James Smart
      There are two versions of a structure for queue creation and setup that the
      driver shares with FW. The driver was only treating it as version 0.
      
      Verify WQ_CREATE with 128B WQEs in V0 and V1.
      
      Code review of another bug showed the driver passing
      128B WQEs and 8 pages in WQ CREATE and V0.
      Code inspection/instrumentation showed that the driver
      uses V0 in WQ_CREATE and if the caller passes queue->entry_size
      128B, the driver sets the hdr_version to V1 so all is good.
      When I tested the V1 WQ_CREATE, the mailbox failed causing
      the driver to unload.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    • Fix driver unload/reload operation. · d1f525aa
      Committed by James Smart
      There are a couple of different load/unload issues fixed with this patch.
      One of the issues was reported by Junichi Nomura. A patch submitted by
      Johannes Thumshirn did fix one of the problems, but the fix in this
      patch separates the pring free from the queue free and does not set
      the parameter passed in to NULL.
      
      issues:
      (1) driver could not be unloaded and reloaded without some Oops or
       Panic occurring.
      (2) The driver was panicking because of a corruption in the Memory
      Manager when the iocb list was getting allocated.
      
      Root cause for the memory corruption was a double free of the Work Queue
      ring pointer memory - Freed once in the lpfc_sli4_queue_free when the CQ
      was destroyed and again in lpfc_sli4_queue_free when the WQ was destroyed.
      
      The pring free and the queue free were separated. The pring free was moved
      to the wq destroy routine because it is a better fit logically to delete
      the ring with the wq.
      
      Checkpatch flagged several alignment issues that were also corrected
      with this patch.
      
      The mboxq was never initialized correctly before it was used by the driver;
      this patch corrects that issue.
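The double free and its remedy can be sketched as follows. This is a toy model under assumptions: `struct toy_queue`, `toy_queue_free`, `toy_wq_destroy`, and `toy_cq_destroy` are hypothetical names, and the CQ holding an alias of the WQ's ring pointer is assumed only to reproduce the situation the message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_ring { int unused; };

struct toy_queue {
	struct toy_ring *pring;  /* owned by the work queue only */
};

static int toy_ring_frees;  /* counts ring frees so a double free is visible */

/* Frees the queue itself and deliberately leaves q->pring alone. */
static void toy_queue_free(struct toy_queue *q)
{
	free(q);
}

/* The ring is torn down together with the WQ, and only here. */
static void toy_wq_destroy(struct toy_queue *wq)
{
	assert(wq != NULL);
	if (wq->pring) {
		free(wq->pring);
		toy_ring_frees++;
	}
	toy_queue_free(wq);
}

/* Destroying a CQ never touches the ring, even if it holds an alias. */
static void toy_cq_destroy(struct toy_queue *cq)
{
	toy_queue_free(cq);
}
```

Keeping the ring free out of the generic queue free means the order in which the CQ and WQ are destroyed no longer matters for the ring's lifetime.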
      Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Tested-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
    • Fix spelling in comments. · 0ef69968
      Committed by James Smart
      Comment should have said Repost.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
  4. 07 March, 2017 4 commits
  5. 28 February, 2017 2 commits
  6. 23 February, 2017 9 commits
  7. 12 January, 2017 1 commit
  8. 05 January, 2017 4 commits
  9. 25 November, 2016 1 commit
    • scsi: lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put() · 2319f847
      Committed by Mauricio Faria de Oliveira
      The BUG_ON() recently introduced in lpfc_sli_ringtxcmpl_put() is hit in
      the lpfc_els_abort() > lpfc_sli_issue_abort_iotag() >
      lpfc_sli_abort_iotag_issue() function path [similar names], due to
      'piocb->vport == NULL':
      
      	BUG_ON(!piocb || !piocb->vport);
      
      This happens because lpfc_sli_abort_iotag_issue() doesn't set the
      'abtsiocbp->vport' pointer -- but this is not the problem.
      
      Previously, lpfc_sli_ringtxcmpl_put() accessed 'piocb->vport' only if
      'piocb->iocb.ulpCommand' is neither CMD_ABORT_XRI_CN nor
      CMD_CLOSE_XRI_CN, which are the only possible values for
      lpfc_sli_abort_iotag_issue():
      
          lpfc_sli_ringtxcmpl_put():
      
              if ((unlikely(pring->ringno == LPFC_ELS_RING)) &&
                 (piocb->iocb.ulpCommand != CMD_ABORT_XRI_CN) &&
                 (piocb->iocb.ulpCommand != CMD_CLOSE_XRI_CN) &&
                  (!(piocb->vport->load_flag & FC_UNLOADING)))
      
          lpfc_sli_abort_iotag_issue():
      
              if (phba->link_state >= LPFC_LINK_UP)
                      iabt->ulpCommand = CMD_ABORT_XRI_CN;
              else
                      iabt->ulpCommand = CMD_CLOSE_XRI_CN;
      
      So, this function path would not have hit this possible NULL pointer
      dereference before.
      
      In order to fix this regression, move the second part of the BUG_ON()
      check prior to the pointer dereference that it does check for.
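The reordering can be sketched as below. This is a simplified model, not the kernel function: `toy_ringtxcmpl_put`, the `TOY_*` constants, and the structures are hypothetical, `assert` stands in for `BUG_ON`, and the boolean return value stands in for arming the ELS timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_ELS_RING = 1 };
enum toy_cmd { TOY_CMD_ELS_REQUEST, TOY_CMD_ABORT_XRI_CN, TOY_CMD_CLOSE_XRI_CN };
#define TOY_FC_UNLOADING 0x2u  /* value assumed for this sketch */

struct toy_vport { unsigned int load_flag; };
struct toy_iocb { enum toy_cmd cmd; struct toy_vport *vport; };

/* Returns true when the ELS timer would be armed. */
static bool toy_ringtxcmpl_put(int ringno, const struct toy_iocb *piocb)
{
	assert(piocb != NULL);  /* first half of the original check stays on top */

	if (ringno == TOY_ELS_RING &&
	    piocb->cmd != TOY_CMD_ABORT_XRI_CN &&
	    piocb->cmd != TOY_CMD_CLOSE_XRI_CN) {
		/* The vport check sits right before the only dereference. */
		assert(piocb->vport != NULL);
		if (!(piocb->vport->load_flag & TOY_FC_UNLOADING))
			return true;
	}
	return false;
}
```

An abort request with a NULL `vport` now passes through untouched, while any command that would actually dereference `vport` is still checked.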
      
      For reference, this is the stack trace observed. The problem happened
      because an unsolicited event was received - a PLOGI was received after
      our PLOGI was issued but not yet complete, so the discovery state
      machine goes on to sw-abort our PLOGI.
      
          kernel BUG at drivers/scsi/lpfc/lpfc_sli.c:1326!
          Oops: Exception in kernel mode, sig: 5 [#1]
          <...>
          NIP [...] lpfc_sli_ringtxcmpl_put+0x1c/0xf0 [lpfc]
          LR  [...] __lpfc_sli_issue_iocb_s4+0x188/0x200 [lpfc]
          Call Trace:
          [...] [...] __lpfc_sli_issue_iocb_s4+0xb0/0x200 [lpfc] (unreliable)
          [...] [...] lpfc_sli_issue_abort_iotag+0x2b4/0x350 [lpfc]
          [...] [...] lpfc_els_abort+0x1a8/0x4a0 [lpfc]
          [...] [...] lpfc_rcv_plogi+0x6d4/0x700 [lpfc]
          [...] [...] lpfc_rcv_plogi_plogi_issue+0xd8/0x1d0 [lpfc]
          [...] [...] lpfc_disc_state_machine+0xc0/0x2b0 [lpfc]
          [...] [...] lpfc_els_unsol_buffer+0xcc0/0x26c0 [lpfc]
          [...] [...] lpfc_els_unsol_event+0xa8/0x220 [lpfc]
          [...] [...] lpfc_complete_unsol_iocb+0xb8/0x138 [lpfc]
          [...] [...] lpfc_sli4_handle_received_buffer+0x6a0/0xec0 [lpfc]
          [...] [...] lpfc_sli_handle_slow_ring_event_s4+0x1c4/0x240 [lpfc]
          [...] [...] lpfc_sli_handle_slow_ring_event+0x24/0x40 [lpfc]
          [...] [...] lpfc_do_work+0xd88/0x1970 [lpfc]
          [...] [...] kthread+0x108/0x130
          [...] [...] ret_from_kernel_thread+0x5c/0xbc
          <...>
      
      Cc: stable@vger.kernel.org # v4.8
      Fixes: 22466da5 ("lpfc: Fix possible NULL pointer dereference")
      Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
      Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  10. 09 November, 2016 2 commits
  11. 27 September, 2016 1 commit
    • scsi: lpfc: Mark symbols static where possible · bd4b3e5c
      Committed by Baoyou Xie
      We get a few warnings when building kernel with W=1:
      drivers/scsi/lpfc/lpfc_sli.c:5693:1: warning: no previous prototype for 'lpfc_set_features' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_sli.c:8972:1: warning: no previous prototype for 'lpfc_sli_calc_ring' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4621:1: warning: no previous prototype for 'lpfc_rdp_res_link_service' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4633:1: warning: no previous prototype for 'lpfc_rdp_res_sfp_desc' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4698:1: warning: no previous prototype for 'lpfc_rdp_res_link_error' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4727:1: warning: no previous prototype for 'lpfc_rdp_res_bbc_desc' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4752:1: warning: no previous prototype for 'lpfc_rdp_res_oed_temp_desc' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4780:1: warning: no previous prototype for 'lpfc_rdp_res_oed_voltage_desc' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4809:1: warning: no previous prototype for 'lpfc_rdp_res_oed_txbias_desc' [-Wmissing-prototypes]
      drivers/scsi/lpfc/lpfc_els.c:4838:1: warning: no previous prototype for 'lpfc_rdp_res_oed_txpower_desc' [-Wmissing-prototypes]
      ....
      
      In fact, these functions are only used in the file in which they are
      declared and don't need a declaration, but can be made static.  So this
      patch marks these functions with 'static'.
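As a generic illustration of the pattern (not code from the patch), a helper used in only one file can be given internal linkage with `static`, while an exported function is preceded by a declaration, as a header would provide. The names `toy_calc_ring` and `toy_select_ring` are hypothetical.

```c
#include <assert.h>

/* Internal helper: 'static' gives it file scope, so no prototype in a
 * header is expected and -Wmissing-prototypes does not fire for it. */
static int toy_calc_ring(int is_fcp)
{
	return is_fcp ? 0 : 1;
}

/* Exported entry point: declared first, as a header would do. */
int toy_select_ring(int is_fcp);

int toy_select_ring(int is_fcp)
{
	int ring = toy_calc_ring(is_fcp);

	assert(ring == 0 || ring == 1);
	return ring;
}
```

Dropping `static` from the helper without adding a prototype would produce the kind of warning quoted in the build log above.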
      Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  12. 02 August, 2016 1 commit