1. 25 Aug 2021, 4 commits
  2. 19 Jul 2021, 1 commit
  3. 10 Jun 2021, 1 commit
  4. 22 May 2021, 2 commits
  5. 05 Mar 2021, 2 commits
  6. 08 Jan 2021, 2 commits
  7. 17 Nov 2020, 3 commits
  8. 27 Oct 2020, 2 commits
    • scsi: lpfc: Add FDMI Vendor MIB support · 8aaa7bcf
      Authored by James Smart
      Create a new attribute, lpfc_enable_mi, which is enabled by default (a minimal default-on module-parameter sketch follows this entry).
      
      Add command definition bits for the SLI-4 parameters that indicate whether the adapter supports MIB information and which revision of MIB data it provides. Using that adapter information, register vendor-specific MIB support with FDMI. The registration is performed on every link up.
      
      During FDMI registration, a couple of errors were encountered when reverting to FDMI rev1; the code needed to exit once the revert was under way. These are fixed.
      
      Link: https://lore.kernel.org/r/20201020202719.54726-8-james.smart@broadcom.com
      Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      8aaa7bcf
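      The attribute above is essentially a default-on tunable. A minimal sketch of a default-enabled module parameter, using a hypothetical demo_enable_mi name (the real lpfc_enable_mi is wired through the driver's own attribute macros):

      #include <linux/module.h>

      /* Illustrative only: a default-on switch analogous to lpfc_enable_mi. */
      static unsigned int demo_enable_mi = 1;      /* enabled by default */
      module_param(demo_enable_mi, uint, 0444);
      MODULE_PARM_DESC(demo_enable_mi,
                       "Enable vendor MIB (FDMI) support: 1 = on (default), 0 = off");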
    • scsi: lpfc: Fix scheduling call while in softirq context in lpfc_unreg_rpi · e7dab164
      Authored by James Smart
      The following call trace was seen during HBA reset testing:
      
      BUG: scheduling while atomic: swapper/2/0/0x10000100
      ...
      Call Trace:
      dump_stack+0x19/0x1b
      __schedule_bug+0x64/0x72
      __schedule+0x782/0x840
      __cond_resched+0x26/0x30
      _cond_resched+0x3a/0x50
      mempool_alloc+0xa0/0x170
      lpfc_unreg_rpi+0x151/0x630 [lpfc]
      lpfc_sli_abts_recover_port+0x171/0x190 [lpfc]
      lpfc_sli4_abts_err_handler+0xb2/0x1f0 [lpfc]
      lpfc_sli4_io_xri_aborted+0x256/0x300 [lpfc]
      lpfc_sli4_sp_handle_abort_xri_wcqe.isra.51+0xa3/0x190 [lpfc]
      lpfc_sli4_fp_handle_cqe+0x89/0x4d0 [lpfc]
      __lpfc_sli4_process_cq+0xdb/0x2e0 [lpfc]
      __lpfc_sli4_hba_process_cq+0x41/0x100 [lpfc]
      lpfc_cq_poll_hdler+0x1a/0x30 [lpfc]
      irq_poll_softirq+0xc7/0x100
      __do_softirq+0xf5/0x280
      call_softirq+0x1c/0x30
      do_softirq+0x65/0xa0
      irq_exit+0x105/0x110
      do_IRQ+0x56/0xf0
      common_interrupt+0x16a/0x16a
      
      The conversion to blk_io_poll for better interrupt latency in the normal case introduced this code path, which executes when I/O aborts or logouts are seen and which attempts to allocate memory for a mailbox command. The allocation uses GFP_KERNEL and may therefore sleep, which is not allowed in softirq context.
      
      Fix by creating a work element that performs the event handling for the remote port. The mailbox commands and other items are then performed in the work element rather than in the IRQ path, which is a much better approach because the "irq" routine no longer stalls while performing all of this deep handling code. (A minimal sketch of this defer-to-process-context pattern follows this entry.)
      
      Ensure that allocation failures are handled and send LOGO on failure.
      
      Additionally, enlarge the mailbox memory pool to reduce the possibility of
      additional allocation in this path.
      
      Link: https://lore.kernel.org/r/20201020202719.54726-3-james.smart@broadcom.com
      Fixes: 317aeb83 ("scsi: lpfc: Add blk_io_poll support for latency improvment")
      Cc: <stable@vger.kernel.org> # v5.9+
      Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      e7dab164
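      The fix defers the sleeping allocation out of softirq context. Below is a minimal sketch of that defer-to-a-work-element pattern; the demo_* structures and functions are hypothetical, not the lpfc code, and only INIT_WORK()/queue_work()/mempool_alloc() are standard kernel APIs.

      #include <linux/kernel.h>
      #include <linux/workqueue.h>
      #include <linux/mempool.h>

      /* Hypothetical per-port context; the real driver uses its own structures. */
      struct demo_port {
              struct work_struct recovery_work;
              mempool_t *mbox_pool;
      };

      /* Runs in process context, so a GFP_KERNEL allocation may safely sleep. */
      static void demo_recovery_work(struct work_struct *work)
      {
              struct demo_port *port = container_of(work, struct demo_port,
                                                    recovery_work);
              void *mbox = mempool_alloc(port->mbox_pool, GFP_KERNEL);

              if (!mbox) {
                      /* Allocation failed: fall back, e.g. send a LOGO. */
                      return;
              }
              /* ... build and issue the mailbox command here ... */
              mempool_free(mbox, port->mbox_pool);
      }

      /* Called from the softirq completion path: only schedule, never sleep. */
      static void demo_abort_seen(struct demo_port *port)
      {
              queue_work(system_wq, &port->recovery_work);
      }

      /* Setup (once per port): INIT_WORK(&port->recovery_work, demo_recovery_work); */

      In this pattern, the abort/logout handler seen in the trace would call something like demo_abort_seen() instead of allocating the mailbox directly in softirq context.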
  9. 03 Jul 2020, 2 commits
    • scsi: lpfc: Add an internal trace log buffer · 372c187b
      Authored by Dick Kennedy
      The current logging methods typically end up requesting a reproduction with a different logging level set in order to figure out what happened. This was largely by design, to avoid cluttering the kernel log with messages that were usually not interesting and that could themselves cause other issues.
      
      In looking at how to build a better system, it was observed that the cases where more data was wanted were usually the cases where another message, typically at KERN_ERR level, had already been logged, and that the additional logging then enabled tended to cover the same areas, most of which fell into the discovery state machine. (A minimal ring-buffer sketch of the resulting scheme follows this entry.)
      
      Based on this summary, the following design has been put in place: The
      driver will maintain an internal log (256 elements of 256 bytes).  The
      "additional logging" messages that are usually enabled in a reproduction
      will be changed to now log all the time to the internal log.  A new logging
      level is defined - LOG_TRACE_EVENT.  When this level is set (it is not by
      default) and a message marked as KERN_ERR is logged, all the messages in
      the internal log will be dumped to the kernel log before the KERN_ERR
      message is logged.
      
      There is a timestamp on each message added to the internal log. However,
      this timestamp is not converted to wall time when logged. The value of the
      timestamp is solely to give a crude time reference for the messages.
      
      Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      372c187b
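      A minimal sketch of the ring-buffer scheme described above (fixed-size entries, dumped oldest-first when an error-level message is logged). The demo_* names are hypothetical and locking is omitted; this is not the lpfc implementation.

      #include <linux/kernel.h>
      #include <linux/ktime.h>
      #include <linux/string.h>

      #define DEMO_TRC_ENTRIES 256                /* power of two simplifies wrap-around */
      #define DEMO_TRC_MSG_LEN 256

      struct demo_trc_entry {
              u64 ts_ns;                          /* crude time reference only */
              char msg[DEMO_TRC_MSG_LEN];
      };

      static struct demo_trc_entry demo_trc[DEMO_TRC_ENTRIES];
      static unsigned int demo_trc_next;          /* protected by a lock in real code */

      /* Always-on: record a message in the internal log instead of the kernel log. */
      static void demo_trc_log(const char *msg)
      {
              struct demo_trc_entry *e = &demo_trc[demo_trc_next++ % DEMO_TRC_ENTRIES];

              e->ts_ns = ktime_get_ns();
              strscpy(e->msg, msg, sizeof(e->msg));
      }

      /* Called before a KERN_ERR message when the trace-dump level is enabled. */
      static void demo_trc_dump(void)
      {
              unsigned int i;

              for (i = 0; i < DEMO_TRC_ENTRIES; i++) {
                      struct demo_trc_entry *e =
                              &demo_trc[(demo_trc_next + i) % DEMO_TRC_ENTRIES];

                      if (e->msg[0])
                              pr_err("trc %llu: %s\n",
                                     (unsigned long long)e->ts_ns, e->msg);
              }
      }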
    • scsi: lpfc: Add blk_io_poll support for latency improvment · 317aeb83
      Authored by Dick Kennedy
      Although the existing implementation performs very well under high I/O load, tests under light load, especially with only a few hardware queues, showed latency a little higher than it could be because completions were scheduled through a workqueue, where other tasks in the system can delay handling.
      
      Change the lower level to use irq_poll by default, which uses a softirq for I/O completion. This gives better latency because the variance in when the CQ is processed is reduced compared with the workqueue interface. However, since high load is better served by not running in softirq context when the CPU is busy, workqueues are still used under high I/O load. (A minimal irq_poll sketch follows this entry.)
      
      Link: https://lore.kernel.org/r/20200630215001.70793-13-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      317aeb83
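      A minimal sketch of the irq_poll pattern described above, using hypothetical demo_* names: the hard-IRQ handler only schedules the poller, and CQ processing runs in the irq_poll softirq against a budget.

      #include <linux/irq_poll.h>
      #include <linux/interrupt.h>
      #include <linux/kernel.h>

      #define DEMO_POLL_WEIGHT 64

      struct demo_cq {
              struct irq_poll iop;
              /* ... completion-queue state ... */
      };

      /* Hypothetical CQE walk; returns how many completions were handled. */
      static int demo_process_cq(struct demo_cq *cq, int budget)
      {
              return 0;       /* stub for the sketch */
      }

      /* Runs in softirq context via irq_poll. */
      static int demo_cq_poll(struct irq_poll *iop, int budget)
      {
              struct demo_cq *cq = container_of(iop, struct demo_cq, iop);
              int done = demo_process_cq(cq, budget);

              if (done < budget)
                      irq_poll_complete(iop);     /* no more work: stop polling */
              return done;
      }

      /* Hard-IRQ handler: defer all CQ processing to the irq_poll softirq. */
      static irqreturn_t demo_isr(int irq, void *data)
      {
              struct demo_cq *cq = data;

              irq_poll_sched(&cq->iop);
              return IRQ_HANDLED;
      }

      /* Setup: irq_poll_init(&cq->iop, DEMO_POLL_WEIGHT, demo_cq_poll); */

      Under heavy load the driver can still hand the same CQ walk to a workqueue, as the commit describes; the sketch covers only the softirq path.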
  10. 10 May 2020, 1 commit
  11. 08 May 2020, 1 commit
  12. 30 Mar 2020, 3 commits
  13. 27 Mar 2020, 1 commit
  14. 18 Feb 2020, 1 commit
    • scsi: lpfc: add RDF registration and Link Integrity FPIN logging · df3fe766
      Authored by James Smart
      This patch modifies lpfc to register for Link Integrity events via an RDF ELS and to perform Link Integrity FPIN logging.
      
      Specifically, the driver is modified to:
      
       - Format and issue the RDF ELS immediately following SCR registration.
         This registers the driver's ability to receive FPIN ELSes.
      
       - Decode the FPIN ELS into its received descriptors and log the Link
         Integrity event information. After decoding, the ELS is delivered to
         the SCSI FC transport for delivery to any user-space applications.
      
       - Add simple helpers, to aid in logging, that create enum-to-name string
         lookup functions using the initialization helpers from the fc_els.h
         header (a minimal lookup-table sketch follows this entry).
      
       - Note: the base header definitions for the ELSes don't populate the
         descriptor payloads. As such, lpfc creates its own version of the
         structures, using the base definitions (mostly headers) and additionally
         declaring the descriptors that complete the population of the ELS.
      
      Link: https://lore.kernel.org/r/20200210173155.547-3-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      df3fe766
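      A minimal sketch of an enum-to-name lookup of the kind described in the third bullet; the values shown are illustrative and the demo_* helpers are hypothetical, not the fc_els.h-based helpers the driver actually generates.

      #include <linux/kernel.h>
      #include <linux/types.h>

      struct demo_nameval {
              u32 value;
              const char *name;
      };

      /* Illustrative Link Integrity event descriptions. */
      static const struct demo_nameval demo_li_names[] = {
              { 0x1, "Link Failure" },
              { 0x2, "Loss of Synchronization" },
              { 0x3, "Loss of Signal" },
      };

      static const char *demo_nameval_lookup(const struct demo_nameval *tbl,
                                             size_t n, u32 value)
      {
              size_t i;

              for (i = 0; i < n; i++)
                      if (tbl[i].value == value)
                              return tbl[i].name;
              return "unknown";
      }

      /* Usage: demo_nameval_lookup(demo_li_names, ARRAY_SIZE(demo_li_names), evt); */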
  15. 11 Feb 2020, 3 commits
  16. 22 Dec 2019, 1 commit
  17. 06 Nov 2019, 2 commits
    • scsi: lpfc: Change default IRQ model on AMD architectures · dcaa2136
      Authored by James Smart
      The current driver attempts to allocate an interrupt vector per CPU using the system's managed IRQ allocator (flag PCI_IRQ_AFFINITY). The IRQ allocator either provides a per-CPU vector or returns fewer vectors; when fewer are returned, they are spread evenly across the NUMA nodes of the system. When run on an AMD architecture, if an interrupt is delivered to a CPU that is not in the same NUMA node as the adapter generating the interrupt, there are extreme costs and overheads in performance. Thus, whether 1:1 vector allocation is used or the vectors are "balanced" across the other NUMA nodes, performance can be hit significantly.
      
      A much more performant model is to allocate interrupts only on the CPUs in the NUMA node where the adapter resides. I/O completion is still performed by the CPU where the I/O was generated. Unfortunately, there is no flag to request that the managed IRQ subsystem allocate vectors only for the CPUs in the same NUMA node as the adapter.
      
      On AMD architectures, revert the IRQ allocation to the normal (non-managed) style and then use irq_set_affinity_hint() to set the CPU affinity and disable user-space rebalancing. (A minimal affinity-hint sketch follows this entry.)
      
      Tie the support into CPU offline/online handling. If the CPU being offlined owns a vector, the vector is re-affinitized to one of the other CPUs in the same NUMA node. If there are no more CPUs in that NUMA node, all affinity is removed from the vector and the system determines where it is serviced. Similarly, when the CPU that owned a vector comes back online, the vector is re-affinitized to that CPU.
      
      Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      dcaa2136
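      A minimal sketch of the non-managed allocation plus affinity-hint approach described above, with a hypothetical demo_setup_vectors() helper and trimmed error handling.

      #include <linux/pci.h>
      #include <linux/interrupt.h>
      #include <linux/cpumask.h>

      static int demo_setup_vectors(struct pci_dev *pdev, int want)
      {
              int node = dev_to_node(&pdev->dev);
              int nvec, i;

              /* No PCI_IRQ_AFFINITY: the driver manages affinity itself. */
              nvec = pci_alloc_irq_vectors(pdev, 1, want, PCI_IRQ_MSIX);
              if (nvec < 0)
                      return nvec;

              for (i = 0; i < nvec; i++) {
                      /* Prefer CPUs in the adapter's NUMA node. */
                      int cpu = cpumask_local_spread(i, node);

                      /* Sets the affinity and publishes a hint for balancers. */
                      irq_set_affinity_hint(pci_irq_vector(pdev, i),
                                            cpumask_of(cpu));
              }
              return nvec;
      }

      The offline/online handling the commit describes would then re-run the hint for an affected vector, pointing it at another CPU in the same node, or clearing it with a NULL mask when the node has no online CPUs left.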
    • scsi: lpfc: Add registration for CPU Offline/Online events · 93a4d6f4
      Authored by James Smart
      The recent affinitization did not address CPU offlining/onlining. If an interrupt vector is shared and the low-order CPU owning the vector is offlined, then, because interrupts are managed, the vector is taken offline as well. This causes the other CPUs sharing the vector to hang, as they can no longer receive I/O completions.
      
      Correct this by registering callbacks with the system for CPU offline/online events (a minimal hotplug-callback sketch follows this entry). When a CPU is taken offline, the EQ tied to its interrupt vector is found. If the CPU is the "owner" of the vector and the EQ/vector is shared with other CPUs, the EQ is placed into polled mode. Additionally, code paths that perform I/O submission on the "sharing" CPUs check the EQ state and poll for completions after submitting new I/O to a WQ that uses that EQ.
      
      Similarly, when a CPU that owns an offlined vector comes back online, the EQ is taken out of polled mode and re-armed to start driving interrupts for the EQ.
      
      Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      93a4d6f4
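      A minimal sketch of registering CPU offline/online callbacks that toggle an EQ between interrupt-driven and polled mode; the demo_* state is hypothetical, while cpuhp_setup_state() is the standard registration API.

      #include <linux/cpuhotplug.h>
      #include <linux/atomic.h>
      #include <linux/threads.h>

      /* Hypothetical per-EQ state; 'polled' mirrors the mode described above. */
      struct demo_eq {
              atomic_t polled;
      };

      static struct demo_eq demo_eqs[NR_CPUS];    /* illustrative sizing only */

      static int demo_cpu_offline(unsigned int cpu)
      {
              /* If this CPU owns a shared EQ/vector, switch that EQ to polling. */
              atomic_set(&demo_eqs[cpu].polled, 1);
              return 0;
      }

      static int demo_cpu_online(unsigned int cpu)
      {
              /* Owner is back: leave polled mode; interrupts drive the EQ again. */
              atomic_set(&demo_eqs[cpu].polled, 0);
              return 0;
      }

      static int demo_register_hotplug(void)
      {
              return cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "scsi/demo:online",
                                       demo_cpu_online, demo_cpu_offline);
      }

      Submission paths on the sharing CPUs would then check the polled flag after queuing new work and, if set, poll the EQ for completions, as the commit describes.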
  18. 25 Oct 2019, 3 commits
  19. 01 Oct 2019, 1 commit
  20. 30 Aug 2019, 1 commit
  21. 20 Aug 2019, 3 commits
    • scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair · c00f62e6
      Authored by James Smart
      Currently each hardware queue, typically allocated per CPU, consists of a WQ/CQ pair per protocol, meaning that if both SCSI and NVMe are supported, two WQ/CQ pairs exist for the hardware queue. Separate queues are unnecessary: the current implementation wastes memory backing the second set of queues, and using twice the SLI-4 WQ/CQ resources means fewer hardware queues can be supported, so there may not always be enough to provide a pair per CPU. With only one pair needed per hardware queue, more CPUs can get their own WQ/CQ.
      
      Rework the implementation so that a single WQ/CQ pair is used by both protocols.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      c00f62e6
    • scsi: lpfc: Add NVMe sequence level error recovery support · 0d8af096
      Authored by James Smart
      FC-NVMe-2 added support for sequence-level error recovery (SLER) in the FC-NVMe protocol. This allows errors and lost frames to be detected and the data to be retransmitted immediately, avoiding the exchange termination that would otherwise escalate into NVMe-over-FC connection and association failures. A significant RAS improvement.
      
      The driver is modified to indicate support for SLER in the NVMe PRLI it issues and to check for support in the PRLI response. When both sides support it, the driver sets a bit in the WQE to enable the recovery behavior on the exchange. The adapter takes care of all detection and retransmission.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      0d8af096
    • scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware. · d79c9e9d
      Authored by James Smart
      Typical SLI-4 hardware supports registering up to two 4KB pages per XRI to contain the exchange's scatter/gather list (SGL). This caps the number of SGL elements the SGL can hold, and there is no extension mechanism to grow the list beyond the two pages.
      
      The G7 hardware adds an SGE type that allows the SGL to be vectored to a different scatter/gather list segment, and that segment can itself contain an SGE pointing to another segment, and so on. The initial segment must still be pre-registered for the XRI, but it can be much smaller (256 bytes) since the list can now be grown dynamically. This much smaller allocation handles the SG list for most normal I/O, and the dynamic aspect allows it to support many MBs if needed.
      
      The implementation creates a pool of "segments", initially sized to hold the small initial segment for each XRI. If an I/O requires additional segments, they are allocated from the pool; if the pool runs out, it is grown to match the new demand. After the I/O completes, the additional segments are returned to the pool for use by other I/Os. Once allocated, the additional segments are not released, on the assumption that "if needed once, it will be needed again". Pools are kept per hardware queue, which is typically 1:1 with a CPU but may be shared by multiple CPUs. (A minimal grow-on-demand pool sketch follows this entry.)
      
      The switch to the smaller initial allocation significantly reduces the memory footprint of the driver, which now grows only if large I/Os are issued. Given the several thousand XRIs on an adapter, the 8KB-to-256B reduction can save 32MB or more.
      
      It has been observed with per-CPU resource pools that a resource allocated on CPU A may be put back on CPU B. While the get routines are distributed evenly, only a limited subset of CPUs may be handling the put routines, which can strain the lpfc_put_cmd_rsp_buf_per_cpu routine because all the resources are being put back on that limited subset of CPUs.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      d79c9e9d
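      A minimal sketch of a grow-on-demand segment pool of the kind described above, with hypothetical demo_* names; the real driver's per-hardware-queue pools, SGE chaining, and growth policy are more involved.

      #include <linux/list.h>
      #include <linux/slab.h>
      #include <linux/spinlock.h>
      #include <linux/types.h>

      #define DEMO_SGL_SEG_SIZE 256           /* small initial segment, per the text */

      struct demo_sgl_seg {
              struct list_head list;
              u8 buf[DEMO_SGL_SEG_SIZE];
      };

      struct demo_seg_pool {
              spinlock_t lock;
              struct list_head free_list;     /* reusable, already-grown segments */
      };

      /* Get a segment; grow the pool if it is empty. Segments are returned to
       * the pool on I/O completion and never freed back to the allocator
       * ("if needed once, it will be needed again").
       */
      static struct demo_sgl_seg *demo_seg_get(struct demo_seg_pool *pool)
      {
              struct demo_sgl_seg *seg;
              unsigned long flags;

              spin_lock_irqsave(&pool->lock, flags);
              seg = list_first_entry_or_null(&pool->free_list,
                                             struct demo_sgl_seg, list);
              if (seg)
                      list_del(&seg->list);
              spin_unlock_irqrestore(&pool->lock, flags);
              if (seg)
                      return seg;

              /* Pool exhausted: grow by one (GFP_ATOMIC: may run in the I/O path). */
              return kzalloc(sizeof(*seg), GFP_ATOMIC);
      }

      static void demo_seg_put(struct demo_seg_pool *pool, struct demo_sgl_seg *seg)
      {
              unsigned long flags;

              spin_lock_irqsave(&pool->lock, flags);
              list_add(&seg->list, &pool->free_list);
              spin_unlock_irqrestore(&pool->lock, flags);
      }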