Commits · 6831ce129f1948f50f2d2a57995d2ebd7a6fa0b4 · openeuler / Kernel

16 Mar, 2022 4 commits

scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI path · 6831ce12

Committed by James Smart on Feb 24, 2022

The patch refactors the general ELS handling paths to migrate to SLI-4
structures or common element abstractions. The fabric login paths are
revised as part of this patch:

 - New generic lpfc_sli_prep_els_req_rsp jump table routine

 - Introduce ls_rjt_error_be and ulp_bde64_le unions to correct legacy
   endianness assignments

 - Conversion away from using SLI-3 iocb structures to set/access fields in
   common routines. Use the new generic get/set routines that were added.
   This move changes code from indirect structure references to using local
   variables with the generic routines.

 - Refactor routines when setting non-generic fields, to have both SLI3 and
   SLI4 specific sections. This replaces the set-as-SLI3 then translate to
   SLI4 behavior of the past.

 - Clean up poor indentation on some of the ELS paths

Link: https://lore.kernel.org/r/20220225022308.16486-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: SLI path split: Introduce lpfc_prep_wqe · 56134142

Committed by James Smart on Feb 24, 2022

Introduce lpfc_prep_wqe routine.

The lpfc_prep_wqe() routine is used with lpfc_sli_issue_iocb() and
lpfc_sli_issue_iocb_wait(). The routine performs additional SLI-4 wqe field
setting that the generic routines did not perform as they kept their
actions compatible with both SLI3 and SLI4.

Link: https://lore.kernel.org/r/20220225022308.16486-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4 · 1b64aa9e

Committed by James Smart on Feb 24, 2022

Convert the SLI4 fast and slow paths to use native SLI4 wqe constructs
instead of iocb SLI3-isms.

Includes the following:

 - Create simple get_xxx and set_xxx routines to wrap access to common
   elements in both SLI3 and SLI4 commands - allowing calling routines to
   avoid sli-rev-specific structures to access the elements.

 - using the wqe in the job structure as the primary element

 - use defines from SLI-4, not SLI-3

 - Removal of iocb to wqe conversion from fast and slow path

 - Add below routines to handle fast path
	lpfc_prep_embed_io - prepares the wqe for fast path
	lpfc_wqe_bpl2sgl   - manages bpl to sgl conversion
	lpfc_sli_wqe2iocb  - converts a WQE to IOCB for SLI-3 path

 - Add lpfc_sli3_iocb2wcqecmpl in completion path to convert an SLI-3
   iocb completion to wcqe completion

 - Refactor some of the code that works on both revs for clarity
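
The get_xxx/set_xxx idea in the list above can be illustrated with a minimal sketch. Every name here (`struct job`, `job_get_status`, `job_set_status`, the field names) is hypothetical and heavily simplified; none of it is the driver's actual layout or API. It only shows how an accessor pair can hide the SLI revision from calling code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real lpfc structures. */
enum sli_rev { SLI_REV3 = 3, SLI_REV4 = 4 };

struct sli3_iocb { uint32_t ulp_status; };	/* SLI-3 style command */
struct sli4_wcqe { uint32_t status; };		/* SLI-4 style completion */

struct job {
	enum sli_rev rev;
	union {
		struct sli3_iocb iocb;
		struct sli4_wcqe wcqe;
	} u;
};

/* Callers read the status without knowing which layout backs the job. */
static uint32_t job_get_status(const struct job *j)
{
	return j->rev == SLI_REV4 ? j->u.wcqe.status : j->u.iocb.ulp_status;
}

/* The setter mirrors the getter and rejects unknown revisions. */
static void job_set_status(struct job *j, uint32_t status)
{
	assert(j->rev == SLI_REV3 || j->rev == SLI_REV4);
	if (j->rev == SLI_REV4)
		j->u.wcqe.status = status;
	else
		j->u.iocb.ulp_status = status;
}
```

With accessors like these, common routines can keep the value in a local variable and never touch a revision-specific structure directly.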

Link: https://lore.kernel.org/r/20220225022308.16486-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: SLI path split: Refactor lpfc_iocbq · a680a929

Committed by James Smart on Feb 24, 2022

Currently, SLI3 and SLI4 data paths use the same lpfc_iocbq structure.
This is a "common" structure but many of the components refer to sli-rev
specific entities which can lead the developer astray as to what they
actually mean, should be set to, or when they should be used.

This first patch prepares the lpfc_iocbq structure so that elements common
to both SLI3 and SLI4 data paths are more appropriately named, making it
clear they apply generically.

Field names based on 'iocb' (sli3) or 'wqe' (sli4) which are actually
generic to the paths are renamed to 'cmd':

 - iocb_flag is renamed to cmd_flag

 - lpfc_vmid_iocb_tag is renamed to lpfc_vmid_tag

 - fabric_iocb_cmpl is renamed to fabric_cmd_cmpl

 - wait_iocb_cmpl is renamed to wait_cmd_cmpl

 - iocb_cmpl and wqe_cmpl are combined and renamed to cmd_cmpl

 - rsvd2 member is renamed to num_bdes due to pre-existing usage

The structure name itself will retain the iocb reference as changing to a
more relevant "job" or "cmd" title induces many hundreds of line changes
for only a name change.

lpfc_post_buffer is also renamed to lpfc_sli3_post_buffer to indicate use
in the SLI3 path only.

Link: https://lore.kernel.org/r/20220225022308.16486-2-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

21 Oct, 2021 2 commits

scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss · af984c87

Committed by James Smart on Oct 20, 2021

A link bounce to a slow fabric may observe FDISC response delays lasting
longer than devloss tmo. Current logic decrements the final fabric node
kref during a devloss tmo event. This results in a NULL ptr dereference
crash if the FDISC completes for that fabric node after devloss tmo.

Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when
devloss tmo triggers and we've noticed that fabric node recovery has
already started or finished in between the time lpfc_dev_loss_tmo_callbk
queues lpfc_dev_loss_tmo_handler. If fabric node recovery succeeds, then
the driver reverses the devloss tmo marked kref put with a kref get. If
fabric node recovery fails, then the final kref put relies on the ELS
timing out or the REG_LOGIN cmpl routine.

Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Correct sysfs reporting of loop support after SFP status change · 7a1dda94

Committed by James Smart on Oct 20, 2021

Applications determine loop support in part by querying the 'pls' sysfs
node. Reporting of 'pls' (Private Loop Support) is derived from the
descriptor returned by the COMMON_GET_SLI4_PARAMETERS mailbox command,
which is issued during initialization or after a reset.

The value of this field may change if there is a dynamic SFP change. The
driver currently will not pick up the change as there was no reset
scenario.

Rework to commonize the sending of the COMMON_GET_SLI4_PARAMETERS
command. Add the calling of the routine after receipt of an async event
indicating an SFP change.

Link: https://lore.kernel.org/r/20211020211417.88754-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

17 Oct, 2021 1 commit

scsi: lpfc: Switch to attribute groups · 08adfa75

Committed by Bart Van Assche on Oct 12, 2021

struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.

Link: https://lore.kernel.org/r/20211012233558.4066756-28-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

25 Aug, 2021 7 commits

scsi: lpfc: Add cmf_info sysfs entry · 74a7baa2

Committed by James Smart on Aug 16, 2021

Allow abbreviated cm framework status information to be obtained via sysfs.

Link: https://lore.kernel.org/r/20210816162901.121235-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add support for maintaining the cm statistics buffer · 7481811c

Committed by James Smart on Aug 16, 2021

Add the logic to move the congestion management and event information into
the cm statistics buffer maintained for the adapter. The update includes
rolling up values for the last minute, hour, and day information.

Link: https://lore.kernel.org/r/20210816162901.121235-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add support for the CM framework · 02243836

Committed by James Smart on Aug 16, 2021

Complete the enablement of the cm framework feature in the adapter. Perform
the following:

 - Detect the presence of the congestion management framework feature.

When the cm framework is present:

 - Issue the SET_FEATURE command to enable the feature.

 - Register the cm statistics buffer with the adapter.

 - Read the cm enablement buffer to determine the cm framework state for cm
   management.

When cm management is enabled:

 - Monitor all FPIN and congestion signalling events, incrementing
   counters.

 - Regularly sync with the adapter to communicate congestion events and to
   receive an rx request limit.

 - Monitor requests for rx data and ensure that no more than the
   adapter prescribed limit is issued on the link. If the limit is
   exceeded, SCSI and/or NVMe traffic is temporarily suspended.

 - Maintain the minute, hourly, daily statistics buffer.

 - Monitor for congestion enablement change events, causing a reread of the
   enablement buffer and acting on any change in enablement.

And:

 - Add teardown logic, including buffer deregistration, on adapter
   detachment or reset.
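
The rx request limit described in the list above can be sketched as a per-interval byte budget. This is a minimal illustration under stated assumptions: `struct cmf_window`, `cmf_admit` and `cmf_sync` are hypothetical names, and the policy shown (refuse the request that would exceed the limit, resume at the next sync) is a guess at one reasonable behavior, not the driver's actual logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-interval budget for requested rx data. */
struct cmf_window {
	uint64_t max_bytes;	/* limit prescribed by the adapter */
	uint64_t issued;	/* rx bytes requested so far in this interval */
	bool suspended;		/* traffic held until the next sync */
};

/* Returns true if an I/O requesting rx_bytes may be issued now. */
static bool cmf_admit(struct cmf_window *w, uint64_t rx_bytes)
{
	assert(w->issued <= w->max_bytes);
	if (w->suspended)
		return false;
	if (rx_bytes > w->max_bytes - w->issued) {
		w->suspended = true;	/* limit would be exceeded */
		return false;
	}
	w->issued += rx_bytes;
	return true;
}

/* Called at each sync with the adapter: take the new limit and resume. */
static void cmf_sync(struct cmf_window *w, uint64_t new_limit)
{
	w->max_bytes = new_limit;
	w->issued = 0;
	w->suspended = false;
}
```

A real implementation would also need locking or per-CPU accounting, since both SCSI and NVMe submission paths would share the budget.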

Link: https://lore.kernel.org/r/20210816162901.121235-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add cmfsync WQE support · daebf93f

Committed by James Smart on Aug 16, 2021

When congestion mgmt is enabled, cmf has the driver regularly issue a
command to synchronize reporting of congestion mgmt events such as fpin and
signal delivery.

This patch adds the definition of the CMF_SYNC WQE and its CQE fields as
well as support for issuing the command. The patch also adds the few
remaining cmf-related SLI additions, such as feature definition for
enablement of CMF and notifications to the driver if the cm enablement mode
changes.

Link: https://lore.kernel.org/r/20210816162901.121235-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add support for cm enablement buffer · 72df8a45

Committed by James Smart on Aug 16, 2021

As part of the cmf framework, the firmware maintains a table with
congestion related state information, specifically whether enabled and if
enabled, whether monitoring or actively managing congestion.

Add definition of the table and add support to read the table from the
adapter and determine if it is enabled. In support of this, the READ_OBJECT
mailbox command definition is added to the driver.

Link: https://lore.kernel.org/r/20210816162901.121235-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add cm statistics buffer support · 8c42a65c

Committed by James Smart on Aug 16, 2021

The cmf framework requires the driver to maintain a cm statistics table,
accessible inband, of congestion related statistics that are reported per
minute, rolled up to per hour, and rolled up again per day. Several days
worth may be maintained. The table is registered with the adapter when the
MIB feature is enabled.

Add definition of the table and add support to register the table with the
adapter. Includes definition and initialization of event counters that are
later added to the statistics table.
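
The minute, hour and day rollup described above can be sketched as follows. The structure and names (`struct cm_stats`, `cm_stats_minute_tick`) are hypothetical, the table holds a single counter, and keeping seven days is an assumption standing in for "several days"; the real table layout is defined by the adapter interface and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define MINS_PER_HOUR 60
#define HOURS_PER_DAY 24
#define DAYS_KEPT 7	/* assumption: stands in for "several days" */

/* Hypothetical single-counter statistics table. */
struct cm_stats {
	uint64_t minute[MINS_PER_HOUR];
	uint64_t hour[HOURS_PER_DAY];
	uint64_t day[DAYS_KEPT];
	unsigned int min_idx;
	unsigned int hour_idx;
	unsigned int day_idx;
};

static uint64_t sum_u64(const uint64_t *v, unsigned int n)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum;
}

/* Called once per minute with the events counted during that minute. */
static void cm_stats_minute_tick(struct cm_stats *s, uint64_t events)
{
	assert(s->min_idx < MINS_PER_HOUR && s->hour_idx < HOURS_PER_DAY);

	s->minute[s->min_idx++] = events;
	if (s->min_idx < MINS_PER_HOUR)
		return;

	/* A full hour has been collected: roll the minutes up. */
	s->min_idx = 0;
	s->hour[s->hour_idx++] = sum_u64(s->minute, MINS_PER_HOUR);
	if (s->hour_idx < HOURS_PER_DAY)
		return;

	/* A full day has been collected: roll the hours up. */
	s->hour_idx = 0;
	s->day[s->day_idx] = sum_u64(s->hour, HOURS_PER_DAY);
	s->day_idx = (s->day_idx + 1) % DAYS_KEPT;
}
```

The oldest day is overwritten once `DAYS_KEPT` days have been recorded, which keeps the table at a fixed size.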

Link: https://lore.kernel.org/r/20210816162901.121235-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add EDC ELS support · 9064aeb2

Committed by James Smart on Aug 16, 2021

When congestion management is enabled, issue EDC ELS to register congestion
signaling capabilities with the fabric. The response handling will process
the fabric parameters and set the reporting parameters.

Similarly, add support for receiving an EDC request from the fabric and
generating a corresponding response.

Implement handlers for congestion signals from the fabric and maintain
statistics for them.

Link: https://lore.kernel.org/r/20210816162901.121235-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

19 Jul, 2021 1 commit

scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completes · 06145683

Committed by James Smart on Jul 07, 2021

On an RSCN event, the nodes specified in RSCN payload and in MAPPED state
are moved to NPR state in order to revalidate the login. This triggers an
immediate unregister from SCSI/NVMe backend. The assumption is that the
node may be missing. The re-registration with the backend happens after
either relogin (PLOGI/PRLI; if ADISC is disabled or login truly lost) or
when ADISC completes successfully (rediscover with ADISC enabled).

However, the NVMe-FC standard provides for an RSCN to be triggered when
the remote port supports a discovery controller and there was a change
of discovery log content. As the remote port typically also supports
storage subsystems, this unregister causes all storage controller
connections to fail and require reconnect.

Correct by reworking the code to ensure that the unregistration only occurs
when a login state is truly terminated, thereby leaving the NVMe storage
controllers in place.

The changes made are:

- Retain node state in ADISC_ISSUE when scheduling ADISC ELS retry.

- Do not clear wwpn/wwnn values upon ADISC failure.

- Move MAPPED nodes to NPR during RSCN processing, but do not unregister
with transport. On GIDFT completion, identify missing nodes (not marked
NLP_NPR_2B_DISC) and unregister them.

- Perform unregistration for nodes that will go through ADISC processing
if ADISC completion fails.

- Successful ADISC completion will move node back to MAPPED state.

Link: https://lore.kernel.org/r/20210707184351.67872-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

10 Jun, 2021 1 commit

scsi: lpfc: vmid: Add datastructure for supporting VMID in lpfc · 02169e84

Committed by Gaurav Srivastava on Jun 08, 2021

Add the primary datastructures needed to implement VMID in the lpfc
driver. Maintain the capability, current state, and hash table for the
vmid/appid along with other information. This implementation supports the
two versions of vmid implementation (app header and priority tagging).

Link: https://lore.kernel.org/r/20210608043556.274139-5-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

22 May, 2021 1 commit

scsi: lpfc: Fix node handling for Fabric Controller and Domain Controller · fe83e3b9

Committed by James Smart on May 14, 2021

During link bounce testing, RPI counts were seen to differ from the number
of nodes. For fabric and domain controllers, a temporary RPI is assigned,
but the code isn't registering it. If the nodes do go away, such as on link
down, the temporary RPI isn't being released.

Change the way these two fabric services are managed so that they behave
like any other remote port. Register the RPI and register with the transport.
Never leave the nodes in a NPR or UNUSED state where their RPI is in limbo.
This allows them to follow normal dev_loss_tmo handling, RPI refcounting,
and normal removal rules. It also allows fabric I/Os to use the RPI for
traffic requests.

Note: There is some logic that still has a couple of exceptions for the
Domain controller (0xfffcXX). There are cases where the fabric won't have a
valid login but will send RDP. Other times, it will send a LOGO then an
RDP. It makes for ad-hoc behavior to manage the node. Exceptions are
documented in the code.

Link: https://lore.kernel.org/r/20210514195559.119853-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

13 Apr, 2021 2 commits

scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic · b62232ba

Committed by James Smart on Apr 11, 2021

SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3
does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have
code to issue the command. The command will always fail.

Remove the code for the mailbox command and leave only the resulting
"failure path" logic.

Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag · 078c68b8

Committed by James Smart on Apr 11, 2021

Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in
lpfc_els_free_iocb().

A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the
changes was to convert from building/sending an abort within the routine to
using a common routine. The reworked routine passes, without modification,
the pring ptr to the new common routine. The older routine had logic to
check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were
passing SLI-3 pointers even when on an SLI-4 adapter. The new routine
is missing this check and adaptation, so the SLI-3 ring pointers are being
used in SLI-4 paths.

Fix by cleaning up the calling routines. In review, there is no need to
pass the ring ptr argument to abort_iocb at all. The routine can look at
the adapter type itself and reference the proper ring.

Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com
Fixes: db7531d2 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers")
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

05 Mar, 2021 2 commits

scsi: lpfc: Update copyrights for 12.8.0.7 and 12.8.0.8 changes · 67073c69

Committed by James Smart on Mar 01, 2021

For the files modified in 2021 via the 12.8.0.7 and 12.8.0.8 patch sets,
update the copyright for 2021.

Link: https://lore.kernel.org/r/20210301171821.3427-23-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix dropped FLOGI during pt2pt discovery recovery · 9dd83f75

Committed by James Smart on Mar 01, 2021

When connected in pt2pt mode, there is a scenario where the remote port
significantly delays sending a response to our FLOGI, but acts on the FLOGI
it sent us and proceeds to PLOGI/PRLI. The FLOGI ends up timing out and
kicks off recovery logic. End result is a lot of unnecessary state changes
and lots of discovery messages being logged.

Fix by terminating the FLOGI and noop'ing its completion if we have already
accepted the remote port's FLOGI and are now processing PLOGI.

Link: https://lore.kernel.org/r/20210301171821.3427-13-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

08 Jan, 2021 2 commits

scsi: lpfc: Implement health checking when aborting I/O · a22d73b6

Committed by James Smart on Jan 04, 2021

Several errors have occurred where the adapter stops or fails but does not
raise the register values for the driver to detect failure. Thus the driver
is unaware of the failure. The failure typically results in I/O timeouts, the
I/O timeout handler failing (after several seconds), and the error handler
escalating recovery policy and resulting in more errors. Eventually, the
driver is in a position where things have spiraled and it can't do recovery
because other recovery ops are still outstanding and it becomes unusable.

Resolve the situation by having the I/O timeout handler (actually an els,
SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O,
perform a mailbox command and look for a response from the hardware. If
the mailbox command fails, it will mark the adapter offline and then invoke
the adapter reset handler to clean up.

The new I/O timeout test will be limited to a test every 5s. If there are
multiple I/O timeouts concurrently, only the 1st I/O timeout will generate
the mailbox command. Further testing will only occur once a timeout occurs
after a 5s delay from the last mailbox command has expired.
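
The 5s limit on health checks can be sketched as a simple time gate. The names (`struct health_state`, `health_check_due`) and the millisecond clock passed in by the caller are hypothetical; the real driver would use kernel time helpers and its own locking, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HEALTH_CHECK_INTERVAL_MS 5000

/* Hypothetical bookkeeping for the rate-limited health check. */
struct health_state {
	bool armed;		/* false until the first check is issued */
	uint64_t last_check_ms;	/* time of the last mailbox command */
};

/*
 * Returns true if this I/O timeout should issue the mailbox command.
 * Concurrent timeouts inside the 5s window get false, so only the
 * first one in a window triggers the check.
 */
static bool health_check_due(struct health_state *h, uint64_t now_ms)
{
	assert(!h->armed || now_ms >= h->last_check_ms);
	if (h->armed && now_ms - h->last_check_ms < HEALTH_CHECK_INTERVAL_MS)
		return false;
	h->armed = true;
	h->last_check_ms = now_ms;
	return true;
}
```

In this sketch a timeout at exactly 5000 ms after the previous check is allowed to issue a new mailbox command.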

Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix NVMe recovery after mailbox timeout · 9ec58ec7

Committed by James Smart on Jan 04, 2021

If a mailbox command times out, the SLI port is deemed in error and the
port is reset. The HBA cleanup is not returning I/Os to the NVMe layer
before the port is unregistered. This is due to the HBA being marked
offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler
rather than a general adapter reset routine. The mailbox timeout handler
only cleaned up SCSI I/Os.

Fix by reworking the mailbox handler to:

- After handling the mailbox error, detect the board is already in
failure (may be due to another error), and leave cleanup to the
other handler.

- If the mailbox command timeout is the initial detector of the port error,
continue with the board cleanup and marking the adapter offline
(!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic
reset adapter routine that is subsequently invoked, will clean up the
I/Os.

- Have the reset adapter routine flush all NVMe and SCSI I/Os if the
adapter has been marked failed (!SLI_ACTIVE).

- Rework the NVMe I/O terminate routine to take a status code to fail the
I/O with and update so that cleaned up I/O calls the wqe completion
routine. Currently it is bypassing the wqe cleanup and calling the NVMe
I/O completion directly. The wqe completion routine will take care of
data structure and node cleanup then call the NVMe I/O completion
handler.

Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

17 Nov, 2020 5 commits

scsi: lpfc: Update changed file copyrights for 2020 · 983f761c

Committed by James Smart on Nov 15, 2020

Update Copyright in files changed by the 12.8.0.6 patch set to 2020

Link: https://lore.kernel.org/r/20201115192646.12977-18-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers · db7531d2

Committed by James Smart on Nov 15, 2020

This patch reworks the abort interfaces such that SLI-3 retains the
iocb-based formatting and completions and SLI-4 now uses native WQEs and
completion routines.

The following changes are made:

- The code is refactored from a confusing 2 routine sequence of
xx_abort_iotag_issue(), which creates/formats an abort cmd, and
xx_issue_abort_tag(), which then issues and handles the completion of
the abort cmd - into a single interface of xx_issue_abort_iotag(). The
new interface will determine whether SLI-3 or SLI-4 and then call the
appropriate handler. A completion handler can now be specified to
address the differences in completion handling. Note: original code is
all iocb based, with SLI-4 converting to SLI-3 for the SCSI/ELS path,
and NVMe natively using wqes.

- The SLI-3 side is refactored:

The older iocb-based lpfc_sli_issue_abort_iotag() routine is combined
with the logic of lpfc_sli_abort_iotag_issue() as well as the
iocb-specific code in lpfc_abort_handler() and lpfc_sli_abort_iocb() to
create the new single SLI-3 abort routine that formats and issues the
iocb.

- The SLI-4 side is refactored and added to:

The native WQE abort code in NVMe is moved to the new SLI-4
issue_abort_iotag() routine. Items in SCSI that set fields not set by
NVMe are migrated into the new routine. Thus the routine supports NVMe
and SCSI initiators. The nvmet block (target) formats the abort slightly
differently (like the old NVMe initiator) thus it has its own prep routine
stolen from NVMe initiator and it retains the current code it has for
issuing the WQE (does not use the commonized routine the initiators
do). SLI-4 completion handlers were also added.

- lpfc_abort_handler now becomes a wrapper that determines whether
SLI-3 or SLI-4 and calls the proper abort handler.

Link: https://lore.kernel.org/r/20201115192646.12977-16-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Enable common send_io interface for SCSI and NVMe · 47ff4c51

Committed by James Smart on Nov 15, 2020

To set up common use by the SCSI and NVMe I/O paths, create a new routine
that issues FCP I/O commands which can be used by either protocol. The new
routine addresses SLI-3 vs SLI-4 differences within its implementation.

Replace the (SLI-3 centric) iocb routine in the SCSI path with this new
WQE-centric common routine.

Link: https://lore.kernel.org/r/20201115192646.12977-13-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Enable common wqe_template support for both SCSI and NVMe · 840a4701

Committed by James Smart on Nov 15, 2020

The driver is currently using SLI-4 WQE templates only for NVMe. Refactor
the template and the placement of the service routine so that it can be
used by both SCSI and NVMe.

Link: https://lore.kernel.org/r/20201115192646.12977-12-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Rework remote port ref counting and node freeing · 307e3380

Committed by James Smart on Nov 15, 2020

When a remote port is disconnected and disappears, its node structure
(ndlp) stays allocated and on a vport node list. While on the list it can
be matched, thus requires validation checks on state to be added in
numerous code paths. If the node comes back, it's possible for there to be
multiple node structures for the same device on the vport node list. There
is no reason to keep the node structure around after it is no longer in
existence, and the current implementation creates problems for itself
(multiple nodes) and lots of unnecessary code for state validation.

Additionally, the reference taking on the node structure didn't follow the
normal model used by the kernel kref api. It included lots of odd logic to
match state with reference count. The combination of this odd logic plus
the way it was implicitly used in the discovery engine made its reference
taking implementation suspect and extremely hard to follow.

Change the driver such that the reference taking routines are now normal
ref increments/decrements and callout on refcount=0.

With this in place, the rework can be done such that the node structure is
fully removed and deallocated when the remote port no longer exists and all
references are removed. This removal logic and the basic ref counting are
intrinsically tied, thus they are in a single patch.
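
The "normal ref increments/decrements and callout on refcount=0" model can be sketched in plain C. This follows the general shape of the kernel kref API (initialize to one, get, put with a release callback), but it is a single-threaded illustration with hypothetical names (`struct node`, `node_get`, `node_put`, `node_mark_released`) and a non-atomic counter, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical node with a plain, non-atomic reference count. */
struct node {
	int refcount;
	bool released;			/* set by the example release callback */
	void (*release)(struct node *n);
};

/* Example release callback; a real one would free the node. */
static void node_mark_released(struct node *n)
{
	n->released = true;
}

/* A new node starts with one reference held by its creator. */
static void node_init(struct node *n, void (*release)(struct node *n))
{
	n->refcount = 1;
	n->released = false;
	n->release = release;
}

static void node_get(struct node *n)
{
	assert(n->refcount > 0);	/* never revive a dead node */
	n->refcount++;
}

/* Drops one reference; returns 1 if this put released the node. */
static int node_put(struct node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		n->release(n);
		return 1;
	}
	return 0;
}
```

Because the release callout happens only on the final put, no caller needs to check node state against the count, which is the simplification the commit message describes.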

Link: https://lore.kernel.org/r/20201115192646.12977-2-james.smart@broadcom.com
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

03 Jul, 2020 1 commit

scsi: lpfc: Add an internal trace log buffer · 372c187b

Committed by Dick Kennedy on Jun 30, 2020

The current logging methods typically end up requesting a reproduction with
a different logging level set to figure out what happened. This was mainly
by design to not clutter the kernel log messages with things that were
typically not interesting and the messages themselves could cause other
issues.

When looking to make a better system, it was seen that in many cases more
data was wanted when another message, usually at KERN_ERR level, was
logged. And in most cases, the additional logging that was then enabled
covered the same typical areas. Most of these areas fell into the
discovery machine.

Based on this summary, the following design has been put in place: The
driver will maintain an internal log (256 elements of 256 bytes). The
"additional logging" messages that are usually enabled in a reproduction
will be changed to now log all the time to the internal log. A new logging
level is defined - LOG_TRACE_EVENT. When this level is set (it is not by
default) and a message marked as KERN_ERR is logged, all the messages in
the internal log will be dumped to the kernel log before the KERN_ERR
message is logged.

There is a timestamp on each message added to the internal log. However,
this timestamp is not converted to wall time when logged. The value of the
timestamp is solely to give a crude time reference for the messages.
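
The design above can be sketched as a fixed ring of 256 messages of 256 bytes that is flushed ahead of an error message. All names (`struct trc_log`, `trc_record`, `trc_log_err`) are hypothetical, the `dump_on_err` flag merely models LOG_TRACE_EVENT being set, timestamps are omitted, and draining the ring after a dump is an assumption rather than documented driver behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TRC_ENTRIES 256
#define TRC_MSG_LEN 256

/* Hypothetical in-memory trace ring. */
struct trc_log {
	char msg[TRC_ENTRIES][TRC_MSG_LEN];
	unsigned int head;	/* next slot to write */
	unsigned int count;	/* valid entries, at most TRC_ENTRIES */
	bool dump_on_err;	/* models LOG_TRACE_EVENT being set */
};

/* Always record the message; the oldest entry is overwritten when full. */
static void trc_record(struct trc_log *t, const char *text)
{
	snprintf(t->msg[t->head], TRC_MSG_LEN, "%s", text);
	t->head = (t->head + 1) % TRC_ENTRIES;
	if (t->count < TRC_ENTRIES)
		t->count++;
}

/*
 * Log an error. When dumping is enabled, the buffered messages are
 * written oldest first before the error itself. Returns the number of
 * buffered messages that were dumped.
 */
static unsigned int trc_log_err(struct trc_log *t, FILE *out, const char *text)
{
	unsigned int dumped = 0;
	unsigned int start, i;

	assert(t->count <= TRC_ENTRIES);
	if (t->dump_on_err) {
		start = (t->head + TRC_ENTRIES - t->count) % TRC_ENTRIES;
		for (i = 0; i < t->count; i++)
			fprintf(out, "trace: %s\n",
				t->msg[(start + i) % TRC_ENTRIES]);
		dumped = t->count;
		t->count = 0;	/* assumption: the ring is drained after a dump */
	}
	fprintf(out, "error: %s\n", text);
	return dumped;
}
```

The ring occupies 64 KiB, so a caller would normally allocate it statically or on the heap rather than on the stack.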

Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

372c187b
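
The design in 372c187b can be illustrated with a small user-space sketch: a fixed ring of 256 slots of 256 bytes that always records the verbose messages, and is flushed ahead of an error message only when the trace option is on. All names here (trc_log, trc_add, trc_error, dump_on_err) are hypothetical and not taken from the driver; clearing the ring after a dump is an assumption of this sketch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define TRC_ENTRIES 256   /* number of slots, per the commit text */
#define TRC_MSG_LEN 256   /* bytes per slot, per the commit text */

/* Hypothetical names: the real driver's structures may differ. */
struct trc_entry {
	unsigned long stamp;          /* crude time reference, not wall time */
	char msg[TRC_MSG_LEN];
};

struct trc_log {
	struct trc_entry slot[TRC_ENTRIES];
	unsigned int next;            /* messages recorded since the last dump */
	int dump_on_err;              /* stands in for LOG_TRACE_EVENT being set */
};

/* Record a verbose message in the ring; the oldest entry is overwritten. */
static void trc_add(struct trc_log *log, unsigned long stamp,
		    const char *fmt, ...)
{
	struct trc_entry *e = &log->slot[log->next % TRC_ENTRIES];
	va_list ap;

	e->stamp = stamp;
	va_start(ap, fmt);
	vsnprintf(e->msg, sizeof(e->msg), fmt, ap);
	va_end(ap);
	log->next++;
}

/*
 * Log an error. If dumping is enabled, print the ring (oldest first)
 * before the error itself. Returns the number of trace entries dumped.
 */
static unsigned int trc_error(struct trc_log *log, FILE *out,
			      const char *errmsg)
{
	unsigned int dumped = 0;

	if (log->dump_on_err) {
		unsigned int count = log->next < TRC_ENTRIES ?
				     log->next : TRC_ENTRIES;
		unsigned int start = log->next - count;

		for (unsigned int i = 0; i < count; i++) {
			struct trc_entry *e =
				&log->slot[(start + i) % TRC_ENTRIES];

			fprintf(out, "[%lu] %s\n", e->stamp, e->msg);
			dumped++;
		}
		log->next = 0;        /* assumption: entries are consumed */
	}
	fprintf(out, "ERR: %s\n", errmsg);
	return dumped;
}
```

A caller would invoke trc_add() wherever a verbose message used to be conditionally logged, and trc_error() at each KERN_ERR site.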

10 May, 2020 3 commits

lpfc: nvmet: Add support for NVME LS request hosthandle · 4c2805aa

Committed by James Smart on Mar 31, 2020

As the nvmet layer does not have the concept of a remoteport object, which
can be used to identify the entity on the other end of the fabric that is
to receive an LS, the hosthandle was introduced. The driver passes the
hosthandle, a value representative of the remote port, with an LS request
receive. The LS request will create the association. The transport will
remember the hosthandle for the association, and if there is a need to
initiate an LS request to the remote port for the association, the
hosthandle will be used. When the driver loses connectivity with the
remote port, it needs to notify the transport that the hosthandle is no
longer valid, allowing the transport to terminate associations related to
the hosthandle.

This patch adds support to the driver for the hosthandle. The driver will
use the ndlp pointer of the remote port for the hosthandle in calls to
nvmet_fc_rcv_ls_req(). The discovery engine is updated to invalidate the
hosthandle whenever connectivity with the remote port is lost.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

4c2805aa
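
The hosthandle life cycle described in 4c2805aa can be sketched roughly as follows. This is a toy model, not the nvmet transport: the structures and the rcv_ls_req() and invalidate_host() helpers are hypothetical, and the handle is treated as an opaque pointer, as the driver does with its ndlp pointer.

```c
#include <stddef.h>

#define MAX_ASSOC 8

/* Hypothetical association record kept by the transport. */
struct assoc {
	void *hosthandle;   /* opaque value naming the remote port */
	int active;
};

struct transport {
	struct assoc assoc[MAX_ASSOC];
};

/*
 * An LS request arrived with a hosthandle: create an association that
 * remembers it. Returns the association index, or -1 if the table is full.
 */
static int rcv_ls_req(struct transport *t, void *hosthandle)
{
	for (int i = 0; i < MAX_ASSOC; i++) {
		if (!t->assoc[i].active) {
			t->assoc[i].hosthandle = hosthandle;
			t->assoc[i].active = 1;
			return i;
		}
	}
	return -1;
}

/*
 * Connectivity to the remote port was lost: terminate every association
 * tied to the handle. Returns how many associations were terminated.
 */
static int invalidate_host(struct transport *t, void *hosthandle)
{
	int terminated = 0;

	for (int i = 0; i < MAX_ASSOC; i++) {
		if (t->assoc[i].active &&
		    t->assoc[i].hosthandle == hosthandle) {
			t->assoc[i].active = 0;
			t->assoc[i].hosthandle = NULL;
			terminated++;
		}
	}
	return terminated;
}
```

The transport never dereferences the handle in this model; it only stores and compares it, which is why a stale pointer must be invalidated before the driver frees the underlying node.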

lpfc: Refactor NVME LS receive handling · 3a8070c5

Committed by James Smart on Mar 31, 2020

In preparation for supporting both initiator mode and target mode
receiving NVME LS's, commonize the existing NVME LS request receive
handling found in the base driver and in the nvmet side.

Using the original lpfc_nvmet_unsol_ls_event() and
lpfc_nvme_unsol_ls_buffer() routines as templates, commonize the
reception of an NVME LS request. The common routine will validate the LS
request, that it was received from a logged-in node, and allocate a
lpfc_async_xchg_ctx that is used to manage the LS request. The role of
the port is then inspected to determine which handler is to receive the
LS - nvme or nvmet. As such, the nvmet handler is tied back in. A handler
is created in nvme and is stubbed out.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3a8070c5

lpfc: Refactor nvmet_rcv_ctx to create lpfc_async_xchg_ctx · 7cacae2a

Committed by James Smart on Mar 31, 2020

To support FC-NVME-2 (actually FC-NVME (rev 1) with Amendment 1),
both the nvme (host) and nvmet (controller/target) sides will need to be
able to receive LS requests. Currently, this support is in the nvmet side
only. To prepare for both sides supporting LS receive, rename
lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7cacae2a

30 Mar, 2020 1 commit

scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI · 2fcbc569

Committed by James Smart on Mar 22, 2020

Currently the driver ktime stats, which measure code paths, are NVME-specific.

Convert the stats routines such that the code paths are generic, providing
status for NVME and SCSI. Added ktime stat calls in SCSI queuecommand and
cmpl routines.

Link: https://lore.kernel.org/r/20200322181304.37655-11-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2fcbc569

27 Mar, 2020 1 commit

scsi: lpfc: Fix scsi host template for SLI3 vports · c90b4480

Committed by James Smart on Mar 22, 2020

The SCSI layer sends the driver IOs with more s/g segments than the driver
can handle. This results in "Too many sg segments from dma_map_sg. Config
64, seg_cnt 219" error messages from the lpfc_scsi_prep_dma_buf_s3() routine.

This was due to the driver using individual templates for pport and
vport, host reset enabled or not, nvme vs scsi, etc. In the end, there was
a combination for a vport that didn't match the pport.

Rather than enumerating more templates and more discretionary assignments,
revert to a base template that is copied to a template specific to the
pport/vport. Then, based on role, attributes and sli type, modify the
fields that are different for that port. Added a log message to
lpfc_create_port to validate values.

Link: https://lore.kernel.org/r/20200322181304.37655-5-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c90b4480
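
The "copy a base template, then adjust per port" approach of c90b4480 might look like the sketch below. The structures are drastically reduced stand-ins, and the field names, the port_cfg options and the specific adjustments are illustrative assumptions rather than the driver's actual rules.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for a SCSI host template. */
struct host_template {
	const char *name;
	int sg_tablesize;       /* max s/g segments the SCSI layer may send */
	bool has_host_reset;    /* whether a host reset handler is exposed */
};

enum sli_rev { SLI_REV3 = 3, SLI_REV4 = 4 };

/* Hypothetical per-port attributes that influence the template. */
struct port_cfg {
	enum sli_rev sli_rev;
	bool is_vport;
	bool enable_host_reset;
	int sg_seg_cnt;         /* configured segment limit for this adapter */
};

/* Single base template shared by every port. */
static const struct host_template base_template = {
	.name = "base",
	.sg_tablesize = 64,
	.has_host_reset = true,
};

/*
 * Build a port-specific template: start from a copy of the base, then
 * change only the fields that differ for this port.
 */
static struct host_template make_port_template(const struct port_cfg *cfg)
{
	struct host_template t = base_template;   /* struct copy */

	/* The segment limit follows the adapter config, pport or vport. */
	t.sg_tablesize = cfg->sg_seg_cnt;

	if (!cfg->enable_host_reset)
		t.has_host_reset = false;

	t.name = cfg->is_vport ? "vport" : "pport";
	return t;
}
```

Because both port types go through the same function, a vport cannot end up with a segment limit that differs from its pport, which is the mismatch the commit message describes.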

18 Feb, 2020 1 commit

scsi: lpfc: add RDF registration and Link Integrity FPIN logging · df3fe766

Committed by James Smart on Feb 10, 2020

This patch modifies lpfc to register for Link Integrity events via the use
of an RDF ELS and to perform Link Integrity FPIN logging.

Specifically, the driver was modified to:

 - Format and issue the RDF ELS immediately following SCR registration.
   This registers the ability of the driver to receive FPIN ELS.

 - Adds decoding of the FPIN els into the received descriptors, with
   logging of the Link Integrity event information. After decoding, the ELS
   is delivered to the scsi fc transport to be delivered to any user-space
   applications.

 - To aid in logging, simple helpers were added to create enum to name
   string lookup functions that utilize the initialization helpers from the
   fc_els.h header.

 - Note: base header definitions for the ELS's don't populate the
   descriptor payloads. As such, lpfc creates its own version of the
   structures, using the base definitions (mostly headers) and additionally
   declaring the descriptors that will complete the population of the ELS.

Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

df3fe766

22 Dec, 2019 1 commit

scsi: lpfc: Fix Fabric hostname registration if system hostname changes · e3ba04c9

Committed by James Smart on Dec 18, 2019

There are reports of multiple ports on the same system displaying different
hostnames in fabric FDMI displays.

Currently, the driver registers the hostname at initialization and obtains
the hostname via init_utsname()->nodename queried at the time the FC link
comes up. Unfortunately, if the machine hostname is updated after
initialization, such as via DHCP or admin command, the value registered
initially will be incorrect.

Fix by having the driver save the hostname that was registered with FDMI.
The driver then runs a heartbeat action that will check the hostname. If
the name changes, reregister the FDMI data.

The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA.

Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e3ba04c9
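
The fix in e3ba04c9 amounts to remembering what was registered and comparing it on a periodic check. A minimal sketch follows, assuming the current hostname is passed in by the caller (in the kernel it would come from init_utsname()->nodename); fdmi_state, fdmi_register() and fdmi_heartbeat() are hypothetical names.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define HOSTNAME_MAX 65   /* 64 characters plus the terminating NUL */

/* Hypothetical record of what was last registered with the fabric. */
struct fdmi_state {
	char registered[HOSTNAME_MAX];
	unsigned int registrations;   /* how many times we registered */
};

/* Register the hostname and remember exactly what was sent. */
static void fdmi_register(struct fdmi_state *st, const char *hostname)
{
	snprintf(st->registered, sizeof(st->registered), "%s", hostname);
	st->registrations++;
}

/*
 * Periodic check: if the system hostname no longer matches what was
 * registered, register again. Returns true when a re-registration ran.
 */
static bool fdmi_heartbeat(struct fdmi_state *st, const char *current)
{
	if (strncmp(st->registered, current,
		    sizeof(st->registered) - 1) == 0)
		return false;

	fdmi_register(st, current);
	return true;
}
```

Comparing against the saved copy, rather than against a value sampled at link-up, is what lets every port on the system converge on the same name after a DHCP or admin change.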

13 Nov, 2019 1 commit

scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list() · d480e578

Committed by James Smart on Nov 11, 2019

Compilation can fail due to having an inline function reference where the
function body is not present.

Fix by removing the inline tag.

Fixes: 93a4d6f4 ("scsi: lpfc: Add registration for CPU Offline/Online events")

Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d480e578

06 Nov, 2019 1 commit

scsi: lpfc: Add registration for CPU Offline/Online events · 93a4d6f4

Committed by James Smart on Nov 04, 2019

The recent affinitization didn't address cpu offlining/onlining. If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector to hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector, is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

93a4d6f4
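
The offline/online handling described in 93a4d6f4 can be modeled as a two-state switch on each eq. The sketch below is a simplified user-space model with hypothetical names; completions are simulated with counters, and real interrupt rearming is reduced to a mode change.

```c
#include <stdbool.h>

enum eq_mode { EQ_MODE_INTR, EQ_MODE_POLL };

/* Hypothetical reduced event queue: one per interrupt vector. */
struct eq {
	int owner_cpu;          /* CPU that owns the vector */
	int sharing_cpus;       /* other CPUs that share this eq/vector */
	enum eq_mode mode;
	unsigned int pending;   /* completions not yet processed */
	unsigned int polled;    /* completions reaped by polling */
};

/* Offline callback: only the owner of a shared vector triggers polling. */
static void eq_cpu_offline(struct eq *eq, int cpu)
{
	if (cpu == eq->owner_cpu && eq->sharing_cpus > 0)
		eq->mode = EQ_MODE_POLL;
}

/* Online callback: the returning owner takes the eq out of polled mode. */
static void eq_cpu_online(struct eq *eq, int cpu)
{
	if (cpu == eq->owner_cpu && eq->mode == EQ_MODE_POLL)
		eq->mode = EQ_MODE_INTR;   /* interrupts drive the eq again */
}

/*
 * Submission path on a sharing CPU: after queuing new io, check the eq
 * state and poll for completions if no interrupt will arrive.
 */
static void eq_submit_io(struct eq *eq)
{
	eq->pending++;   /* simulation: the io completes immediately */

	if (eq->mode == EQ_MODE_POLL) {
		eq->polled += eq->pending;
		eq->pending = 0;
	}
}
```

In interrupt mode the pending completions are left for the interrupt handler, which this model does not simulate.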

01 Oct, 2019 1 commit

scsi: lpfc: Fix NVMe ABTS in response to receiving an ABTS · 51f8e43e

Committed by James Smart on Sep 21, 2019

When the port, running as a nvme target, receives an ABTS, it submits
commands to the adapter to Abort i/o outstanding in the adapter. The Abort
command formatting routine left a command field set to zero, which
instructs the adapter to generate an ABTS on the wire as part of cleaning
up the I/O. This is common operation for an initiator, but not for a
target.

Fix the driver to check whether an ABTS had been received for the I/O, and
if so, change the Abort command formatting so that the ABTS generation is
disabled (IA=1). No need to ABTS it when the other side already has.

Also refactored the code such that there is a single routine being used for
nvme or nvmet ABORT requests, and IA is an argument.

Link: https://lore.kernel.org/r/20190922035906.10977-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

51f8e43e
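
The refactoring in 51f8e43e, a single abort-formatting routine with IA as an argument, can be sketched as below. The structures are hypothetical reductions of the real command and I/O context; only the IA decision from the commit message is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reduced Abort command. */
struct abort_cmd {
	uint16_t xri;   /* exchange being aborted */
	uint8_t ia;     /* 1 = do not generate an ABTS on the wire */
};

/* Hypothetical reduced I/O context. */
struct io_ctx {
	uint16_t xri;
	bool abts_received;   /* the remote side already sent an ABTS */
};

/* Single formatting routine shared by host and target abort paths. */
static struct abort_cmd prep_abort(uint16_t xri, bool ia)
{
	struct abort_cmd cmd = { .xri = xri, .ia = ia ? 1 : 0 };

	return cmd;
}

/* Target side: suppress the ABTS when the other side already sent one. */
static struct abort_cmd target_abort_io(const struct io_ctx *io)
{
	return prep_abort(io->xri, io->abts_received);
}
```

An initiator-side caller in this model would pass ia as false, so the adapter still issues the ABTS as part of cleaning up the I/O.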

30 Aug, 2019 1 commit

scsi: lpfc: Remove bg debugfs buffers · 9db6c14c

Committed by James Smart on Aug 27, 2019

Capturing and downloading dif command data and dif data was done a dozen
years ago and is no longer being used. It also creates a potential security
hole.

Remove the debugfs buffer for dif debugging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
CC: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9db6c14c
