1. 20 Aug 2019, 9 commits
    • scsi: lpfc: Migrate to %px and %pf in kernel print calls · 32350664
      James Smart committed
      In order to see real addresses, replace %p with %px for kernel addresses
      and with %pf for functions.
      
      While converting, standardize on "x%px" throughout (not %px or 0x%px).
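      As a small, hypothetical illustration of the resulting convention (the
      function and its arguments are made up, not lpfc code): "x%px" prints the
      raw, unhashed kernel address, while %pf resolves a function pointer to
      its symbolic name on kernels of this era (later kernels use %ps/%pS).

      #include <linux/kernel.h>

      /* Illustrative only: shows the "x%px" and %pf specifier usage. */
      static void example_debug_print(void *ctx, void (*handler)(void *))
      {
              pr_info("ctx x%px handler %pf\n", ctx, handler);
      }
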
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix coverity warnings · d9f492a1
      James Smart committed
      Running Coverity produced the following errors:
      
       - coding style (indentation)
      
       - memset size mismatch errors
         note: cases where the mismatch is intentional are now commented
      
      Fix the errors.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix hang when downloading fw on port enabled for nvme · 84f2ddf8
      James Smart committed
      As part of firmware download, the adapter is reset. On the adapter the
      reset causes the function to stop and all outstanding io is terminated
      (without responses). The reset path then starts teardown of the adapter,
      starting with deregistration of the remote ports with the nvme-fc
      transport. The local port is then deregistered and the driver waits for
      local port deregistration. This never finishes.
      
      The remote port deregistrations terminate the nvme controllers, causing
      them to send aborts for all the outstanding io. The aborts are serviced
      in the driver, but stall due to its state. The nvme layer then waits to
      reclaim its outstanding io before continuing.  The io must be returned
      before the reset on the controller is deemed complete and the controller
      delete performed.  The remote port deregistration won't complete until
      all the controllers are terminated. And the local port deregistration
      won't complete until all controllers and remote ports are terminated.
      Thus things hang.
      
      The issue is that the reset which stopped the adapter also stopped all
      the responses that would drive i/o completions, and the aborts that would
      otherwise drive i/o completions were stopped as well. When resetting the
      adapter like this, the driver needs to generate the completions itself as
      part of the adapter reset, so that I/Os complete (in error) and no aborts
      are queued.
      
      Fix by adding flush routines whenever the adapter port has been reset or
      discovered in error. The flush routines will generate the completions for
      the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
      and flushed as well.
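      A hedged sketch of the flush idea, with illustrative names rather than
      the real lpfc symbols: after a reset the hardware will never return
      completions, so the driver walks its outstanding-io list and completes
      each request in error itself.

      #include <linux/errno.h>
      #include <linux/list.h>
      #include <linux/spinlock.h>

      struct example_io {
              struct list_head list;
              int status;
              void (*done)(struct example_io *io);
      };

      struct example_port {
              spinlock_t io_lock;
              struct list_head outstanding;
      };

      static void example_flush_outstanding_io(struct example_port *port)
      {
              struct example_io *io, *tmp;
              unsigned long flags;
              LIST_HEAD(done);

              /* Detach the whole list under lock, then complete off-lock. */
              spin_lock_irqsave(&port->io_lock, flags);
              list_splice_init(&port->outstanding, &done);
              spin_unlock_irqrestore(&port->io_lock, flags);

              list_for_each_entry_safe(io, tmp, &done, list) {
                      list_del(&io->list);
                      io->status = -EIO;  /* driver-generated error completion */
                      io->done(io);
              }
      }
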
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix crash due to port reset racing vs adapter error handling · 8c24a4f6
      James Smart committed
      If the adapter encounters a condition which causes the adapter to fail
      (driver must detect the failure) simultaneously with a request to the
      driver to reset the adapter (such as a host_reset), the reset path races
      with the asynchronously-detected adapter failure path.  In the failing
      situation, one path has started to tear down the adapter data structures
      (io_wq's) while the other path has initiated a repeat of the teardown and
      is in the lpfc_sli_flush_xxx_rings path attempting to access the
      just-freed data structures.
      
      Fix by the following:
      
       - In cases where an adapter failure is detected, rather than explicitly
         calling offline_eratt() to start the teardown, change the adapter
         state and let the work subsequently posted to the slowpath thread
         invoke the adapter recovery.  In essence, this means all requests to
         reset are serialized on the slowpath thread.
      
       - Clean up the routine that restarts the adapter. If there is a failure
         from brdreset, don't immediately error and leave things in a partial
         state. Instead, ensure the adapter state is set and finish the teardown
         of structures before returning.
      
       - If in the scsi host reset handler and the board fails to reset and
         restart (which can be due to parallel reset/recovery paths), instead of
         hard failing and explicitly calling offline_eratt() (which gets into the
         redundant path), just fail out and let the asynchronous path resolve the
         adapter state.
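      A sketch of the serialization pattern described in the first item, using
      illustrative names rather than lpfc's: the detection path only records
      state and posts work, and the single worker is the only context that
      performs teardown, so racing reset requests cannot free structures under
      each other.

      #include <linux/spinlock.h>
      #include <linux/workqueue.h>

      struct example_hba {
              spinlock_t lock;
              bool error_pending;
              struct work_struct recovery_work;   /* the slowpath context */
      };

      static void example_adapter_error_detected(struct example_hba *hba)
      {
              unsigned long flags;

              spin_lock_irqsave(&hba->lock, flags);
              hba->error_pending = true;  /* change state; no teardown here */
              spin_unlock_irqrestore(&hba->lock, flags);

              schedule_work(&hba->recovery_work); /* recovery runs serialized */
      }
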
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix loss of remote port after devloss due to lack of RPIs · b95b2119
      James Smart committed
      In tests with remote ports constantly logging out/logging in, coupled
      with occasional local link bounces, if a remote port is disconnected for
      longer than devloss_tmo and then subsequently reconnected, eventually the
      test will fail to log in with the remote port and remote port
      connectivity is lost.
      
      When devloss_tmo expires, the driver does not free the node struct until
      the port or npiv instance is deleted. The node is left allocated but its
      state is set to UNUSED. If the node was in the process of logging in when
      the local link drop occurred - meaning the RPI was allocated for the node
      in order to send the ELS, but not yet registered, which happens after a
      successful login - the node is moved to the NPR state and, if devloss
      expires, to the UNUSED state.  If the remote port comes back, the node
      associated with it is restarted, and this path happens to allocate a new
      RPI and overwrite the prior RPI value. In the cases where the port was
      logged in and logs out, the path did release the RPI but did not set the
      node rpi value.  In the cases where the remote port never finished
      logging in, the path never released the rpi. In this latter case, when
      the node is subsequently restored, the new rpi allocation overwrites the
      rpi that was never released, and that rpi is leaked.  Eventually the port
      runs out of RPI resources to log into new remote ports.
      
      Fix by following changes:
      
       - When an rpi is released, do so under locks and ensure the node rpi value
         is set to a non-allocated value (LPFC_RPI_ALLOC_ERROR).  Note:
         refactored to a small service routine to avoid indentation issues.
      
       - When re-enabling a node, check the rpi value to determine if a new
         allocation is necessary. If already set, use the prior rpi.
      
      Logging was also enhanced to help with future debugging.
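      A sketch of the release/reuse pattern, with hypothetical helpers standing
      in for the real RPI bitmask operations; EXAMPLE_RPI_ALLOC_ERROR mirrors
      the role LPFC_RPI_ALLOC_ERROR plays as the "no RPI assigned" marker.

      #include <linux/errno.h>
      #include <linux/spinlock.h>
      #include <linux/types.h>

      #define EXAMPLE_RPI_ALLOC_ERROR 0xFFFF  /* stand-in marker value */

      struct example_node {
              spinlock_t lock;
              u16 rpi;
      };

      void example_free_rpi(u16 rpi); /* hypothetical bitmask release */
      u16 example_alloc_rpi(void);    /* hypothetical bitmask allocate */

      static void example_release_rpi(struct example_node *ndlp)
      {
              unsigned long flags;

              spin_lock_irqsave(&ndlp->lock, flags);
              if (ndlp->rpi != EXAMPLE_RPI_ALLOC_ERROR) {
                      example_free_rpi(ndlp->rpi);
                      /* mark non-allocated so a later enable won't leak it */
                      ndlp->rpi = EXAMPLE_RPI_ALLOC_ERROR;
              }
              spin_unlock_irqrestore(&ndlp->lock, flags);
      }

      static int example_enable_node(struct example_node *ndlp)
      {
              /* Reuse a still-assigned RPI; allocate only when truly unset. */
              if (ndlp->rpi == EXAMPLE_RPI_ALLOC_ERROR)
                      ndlp->rpi = example_alloc_rpi();
              return ndlp->rpi == EXAMPLE_RPI_ALLOC_ERROR ? -ENOSPC : 0;
      }
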
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix irq raising in lpfc_sli_hba_down · 4b0a42be
      James Smart committed
      The adapter reset path (lpfc_sli_hba_down) takes/releases a nested lock
      with irq save/restore. But the path already runs under hbalock, which was
      taken with irqs disabled, so the extra irq handling is unnecessary.
      
      Convert to simple lock/unlock.
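      A before/after sketch with illustrative lock names: inside a region
      already covered by hbalock, taken with spin_lock_irqsave() and therefore
      with interrupts already disabled, a nested lock needs no irq save/restore
      of its own.

      #include <linux/spinlock.h>

      struct example_hba {
              spinlock_t hbalock;
              spinlock_t ring_lock;
      };

      static void example_hba_down(struct example_hba *phba)
      {
              unsigned long iflag;

              spin_lock_irqsave(&phba->hbalock, iflag);   /* irqs now off */

              spin_lock(&phba->ring_lock);    /* plain lock suffices here */
              /* ... cancel outstanding ring i/o ... */
              spin_unlock(&phba->ring_lock);

              spin_unlock_irqrestore(&phba->hbalock, iflag);
      }
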
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix leak of ELS completions on adapter reset · 29601228
      James Smart committed
      If the adapter is reset while there are outstanding ELS's, subsequent
      reinitialization of the adapter will fail, as it has not recovered all of
      the io contexts associated with the ELS's.
      
      If an ELS timed out or otherwise failed, and an abort of the ELS was
      attempted (which changes the ELS completion context), then in cases where
      the driver generated completions for the outstanding IO (as the adapter,
      being reset, would not), the driver released only the ELS context and
      failed to release the abort context.  When the adapter went to reinit, as
      it had not received back all of the contexts, it failed to reinit.
      
      Fix by having the ELS completion handler identify the driver-generated
      completion status and release the abort context.
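      A sketch with hypothetical names of what such a handler does: on a
      driver-generated (local) status, the paired abort context is released
      along with the ELS context, so no io context is leaked across the reset.

      #define EXAMPLE_LOCAL_REJECT 1  /* assumed driver-generated status */

      struct example_ctx;
      void example_free_ctx(struct example_ctx *ctx); /* hypothetical */

      struct example_els {
              int status;
              struct example_ctx *els_ctx;
              struct example_ctx *abort_ctx;  /* set if an abort was issued */
      };

      static void example_els_cmpl(struct example_els *els)
      {
              if (els->status == EXAMPLE_LOCAL_REJECT && els->abort_ctx) {
                      example_free_ctx(els->abort_ctx);   /* previously leaked */
                      els->abort_ctx = NULL;
              }
              example_free_ctx(els->els_ctx);
      }
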
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix PLOGI failure with high remoteport count · 4f1a2fef
      James Smart committed
      When connected to a high number of remote ports, the driver encounters
      PLOGI errors.  The errors are due to adapter-detected failures indicating
      illegal field values.
      
      It turns out the driver was prematurely clearing an RPI bit in the
      bitmask before waiting for the UNREG_RPI mailbox completion. This allowed
      the RPI to be reused before it was actually available.
      
      Fix by clearing RPI bitmask only after UNREG_RPI mailbox completion.
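      A sketch of the corrected ordering with hypothetical names: the RPI bit
      is returned to the free bitmask only from the UNREG_RPI mailbox
      completion handler, so the RPI cannot be handed out again while the
      unregistration is still in flight on the adapter.

      #include <linux/bitops.h>
      #include <linux/types.h>

      #define EXAMPLE_MAX_RPI 4096    /* assumed bitmask size */

      struct example_hba {
              unsigned long rpi_in_use[BITS_TO_LONGS(EXAMPLE_MAX_RPI)];
      };

      struct example_mbox {
              u16 rpi;
              int status;
      };

      /* Runs only when the adapter has acknowledged the UNREG_RPI. */
      static void example_unreg_rpi_cmpl(struct example_hba *hba,
                                         struct example_mbox *mbox)
      {
              if (mbox->status == 0)
                      clear_bit(mbox->rpi, hba->rpi_in_use);  /* reusable now */
      }
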
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: remove redundant code · ee9a256c
      Fuqian Huang committed
      Remove the redundant initialization code.
      Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
      Reviewed-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 21 Jun 2019, 1 commit
  3. 19 Jun 2019, 4 commits
    • scsi: lpfc: Make some symbols static · d7b761b0
      YueHaibing committed
      Fix sparse warnings:
      
      drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' was not declared. Should it be static?
      drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be static?
      drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static?
      drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static?
      drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was not declared. Should it be static?
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: James Smart <james.smart@broadcom.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors · 657add4e
      James Smart committed
      While fixing the resources per socket, it was realized the driver was not
      using all of its hardware queues (up to 1 per cpu) if there were fewer
      interrupt vectors than cpus. The driver was only using the hardware queue
      assigned to the cpu that owned the vector.
      
      Rework the affinity map check to use the additional hardware queue
      elements that had been allocated.  If the cpu count exceeds the hardware
      queue count, share queues, choosing what to share with in order of
      preference: hyperthread peer, core peer, socket peer, or finally a
      similar cpu in a different socket.
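      A sketch of that preference order, with hypothetical topology helpers
      (example_same_core/example_same_socket are stand-ins): a cpu without its
      own vector borrows the hardware queue of the closest cpu that has one.

      #include <linux/types.h>

      bool example_same_core(int a, int b);   /* hypothetical: HT siblings */
      bool example_same_socket(int a, int b); /* hypothetical: same socket */

      static u16 example_pick_hdwq(int cpu, const u16 *cpu_to_hdwq,
                                   const bool *cpu_has_vector, int ncpus)
      {
              int pass, other;

              if (cpu_has_vector[cpu])
                      return cpu_to_hdwq[cpu];    /* 1:1 mapping still wins */

              /* pass 0: hyperthread peer, 1: socket peer, 2: any cpu */
              for (pass = 0; pass < 3; pass++) {
                      for (other = 0; other < ncpus; other++) {
                              if (!cpu_has_vector[other])
                                      continue;
                              if (pass == 0 && !example_same_core(cpu, other))
                                      continue;
                              if (pass == 1 && !example_same_socket(cpu, other))
                                      continue;
                              return cpu_to_hdwq[other];
                      }
              }
              return 0;   /* last resort: share queue 0 */
      }
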
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix memory leak in abnormal exit path from lpfc_eq_create · 04d210c9
      James Smart committed
      EQ create leaks mailbox memory if it encounters an error.
      
      Rework the error path to free the memory.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Separate CQ processing for nvmet_fc upcalls · d74a89aa
      James Smart committed
      Currently the driver is notified of new command frame receipt by CQEs. As
      part of the CQE processing, the driver upcalls the nvmet_fc transport to
      deliver the command. nvmet_fc, as part of receiving the command, builds a
      context for it, where one of the first steps is to allocate memory for
      the io.
      
      When running tests that do large ios (1MB), it was found on some systems
      that the total number of outstanding I/O's, at 1MB each, completely
      consumed the system's memory, so additional ios were getting blocked in
      the memory allocator.  Given that this blocked the lpfc thread processing
      CQEs, lots of other received commands were then held up; and since CQEs
      are serially processed, the aggregate delays for an IO waiting behind the
      others became cumulative - enough so that the initiator hit timeouts for
      the ios.
      
      The basic fix is to avoid the direct upcall and instead schedule a work
      item for each io as it is received. This allows the cq processing to
      complete very quickly, and each io can then run or block on its own.
      However, this general solution hurts latency when there are few ios.  As
      such, the fix was implemented so that the driver watches how many CQEs it
      has processed sequentially in one run. As long as the count is below a
      threshold, the direct nvmet_fc upcall will be made. Only when the count
      is exceeded will it revert to work scheduling.
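      A sketch of the hybrid dispatch with illustrative names and an assumed
      threshold value: early commands in a CQE run are delivered inline for
      latency, and past the threshold they are deferred to a work item so a
      blocking allocation can no longer stall CQ processing.

      #include <linux/workqueue.h>

      #define EXAMPLE_DIRECT_UPCALL_LIMIT 64  /* assumed threshold */

      struct example_cmd {
              struct work_struct work;    /* queued past the threshold */
      };

      extern struct workqueue_struct *example_wq;     /* hypothetical */
      void example_upcall(struct example_cmd *cmd);   /* hypothetical */

      static void example_dispatch_cmd(struct example_cmd *cmd,
                                       int cqes_this_run)
      {
              if (cqes_this_run <= EXAMPLE_DIRECT_UPCALL_LIMIT)
                      example_upcall(cmd);                /* inline, fast */
              else
                      queue_work(example_wq, &cmd->work); /* may block later */
      }
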
      
      Given that debugging this showed a surprisingly long delay in cq
      processing, the io timer stats were updated to better reflect the
      processing at the different points.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  4. 14 May 2019, 1 commit
  5. 04 Apr 2019, 4 commits
  6. 21 Mar 2019, 2 commits
  7. 20 Mar 2019, 9 commits
  8. 07 Mar 2019, 1 commit
  9. 14 Feb 2019, 1 commit
  10. 06 Feb 2019, 8 commits
    • scsi: lpfc: Update 12.2.0.0 file copyrights to 2019 · 0d041215
      James Smart committed
      For files modified as part of the 12.2.0.0 patches, update the copyright
      to 2019.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Rework locking on SCSI io completion · c2017260
      James Smart committed
      The scsi host lock is taken on every io completion to check whether the
      abort handler is waiting on the completion. This is an expensive lock to
      take on all completions when an abort is rarely in progress.
      
      Replace the scsi host lock with a command-specific lock and synchronize
      the completion and abort paths with it. Ensure all flag changes and
      nulling of context pointers are done under the lock.  When adding the
      lock to the task management abort path, it was found to be missing other
      synchronization locks; that synchronization was added to match the
      normal paths.
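      A sketch of the locking change (illustrative structure, not the real
      lpfc command layout): completion and abort both take the lock embedded
      in the command instead of the shared scsi host lock, so unrelated
      completions no longer contend with each other.

      #include <linux/spinlock.h>

      struct example_cmd {
              spinlock_t lock;    /* stands in for shost->host_lock here */
              void *context;
              bool abort_in_progress;
      };

      void example_notify_aborter(struct example_cmd *cmd);  /* hypothetical */

      static void example_io_done(struct example_cmd *cmd)
      {
              unsigned long flags;

              spin_lock_irqsave(&cmd->lock, flags);
              cmd->context = NULL;    /* pointer nulled under the cmd lock */
              if (cmd->abort_in_progress)
                      example_notify_aborter(cmd);
              spin_unlock_irqrestore(&cmd->lock, flags);
      }
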
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing · 32517fc0
      James Smart committed
      When driving high iop counts, auto_imax coalescing kicks in and drives
      performance down to extremely low iops levels.
      
      There are two issues:
      
       1) auto_imax is enabled by default. The auto algorithm, when iops gets
          high, divides the iops by the hdwq count and uses that value to
          calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
          they have load or not. The EQ_delay is only manipulated every 5s (a
          long time). Thus there were large 5s swings of no interrupt delay
          followed by large/maximum delay, before repeating.
      
       2) When processing a CQ, the driver got mixed up on the rate at which
          to ring the doorbell to keep the chip apprised of eqe or cqe
          consumption, as well as how long to sit in the thread and
          process queue entries. Currently, the driver capped its work at
          64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
          loads, additional overhead was taken to exit and re-enter the
          interrupt handler. Worse, if in the large/maximum coalescing
          windows, it could be a while before getting back to servicing.
      
      The issues are corrected by the following:
      
       - A change in defaults. Auto_imax is turned OFF and fcp_imax is set
         to 0. Thus all interrupts are immediate.
      
       - Cleanup of field names and their meanings. Existing names were
         non-intuitive or used for duplicate things.
      
       - Added max_proc_limit field, to control the length of time the
         handlers would service completions.
      
       - Reworked EQ handling:
          Added a common routine that walks the eq, applying the notify
            interval and max processing limits; a sketch of this common walk
            appears after this list. Use queue_claimed to claim ownership of
            the queue while processing. Always rearm the queue whenever the
            common routine is called.
          Rework queue element processing, namely to eliminate hba_index vs
            host_index. Only one index is necessary. The queue entry can be
            marked invalid and the host_index updated immediately after eqe
            processing.
          After rework, xx_release routines are now DB write functions. Renamed
            the routines as such.
          Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
          Replaced the 2 individual loops that walk an eq with a call to the
            common routine.
          Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
          Added per-cpu counters to detect interrupt rates and scale
            interrupt coalescing values.
      
       - Reworked CQ handling:
          Added common routine that walks cq, applying notify interval and max
            processing limits. Use queue_claimed to claim ownership of the queue
            while processing. Always rearm the queue whenever the common routine
            is called.
          Rework queue element processing, namely to eliminate hba_index vs
            host_index. Only one index is necessary. The queue entry can be
            marked invalid and the host_index updated immediately after cqe
            processing.
          After rework, xx_release routines are now DB write functions.  Renamed
            the routines as such.
          Replaced the 3 individual loops that walk a cq with a call to the
            common routine.
          Redefined lpfc_sli4_sp_handle_mcqe() to the common handler definition
            with a queue reference. Added an increment for mbox completion to
            the handler.
      
       - Added a new module/sysfs attribute, lpfc_cq_max_proc_limit, to allow
         dynamic changing of the CQ max_proc_limit value being used.
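      A sketch of the common walk loop referenced in the list above, with
      hypothetical names and fields: consume valid entries up to
      max_proc_limit, write the doorbell every notify_interval entries to keep
      the chip apprised of consumption, and always rearm on exit.

      #include <linux/types.h>

      struct example_queue {
              u32 host_index;     /* the single index; no hba_index needed */
              u32 entry_count;
              u32 notify_interval;
              u32 max_proc_limit;
      };

      /* All three helpers are hypothetical stand-ins. */
      void *example_next_valid_entry(struct example_queue *q);
      void example_handle_entry(struct example_queue *q, void *entry);
      void example_write_db(struct example_queue *q, u32 count, bool arm);

      static u32 example_process_queue(struct example_queue *q)
      {
              u32 processed = 0, pending = 0;
              void *entry;

              while (processed < q->max_proc_limit &&
                     (entry = example_next_valid_entry(q)) != NULL) {
                      example_handle_entry(q, entry);
                      /* mark consumed; advance the single host index */
                      q->host_index = (q->host_index + 1) % q->entry_count;
                      processed++;
                      if (++pending >= q->notify_interval) {
                              example_write_db(q, pending, false);
                              pending = 0;
                      }
              }
              example_write_db(q, pending, true); /* final write rearms */
              return processed;
      }
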
      
      Although this leaves an EQ configured for immediate interrupts, that
      interrupt will only occur if a CQ bound to it is in an armed state and
      has cqe's to process.  By staying in the cq processing routine longer,
      high loads will avoid generating more interrupts, as CQs will only rearm
      as the processing thread exits. The immediate interrupt is also
      beneficial to idle or lightly loaded CQ's as they get serviced
      immediately without being penalized by sharing an EQ with a more loaded
      CQ.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: cleanup: convert eq_delay to usdelay · cb733e35
      James Smart committed
      Review of the eq coalescing logic showed the code was a bit fragmented.
      In some places it would save/set via an interrupt max value, while in
      others it would do so via a usdelay. There were also two places changing
      eq delay: one that issued mailbox commands, and another that changed it
      via register writes if supported.
      
      Clean this up by:
      
       - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
         that it is always told of a us delay to impose. The routine then chooses
         the best way to set that - via register or via mbx.
      
       - Rather than two value types stored in eq->q_mode (usdelay if change via
         register, imax if change via mbox) - q_mode always contains usdelay.
         Before any value change, old vs new value is compared and only if
         different is a change done.
      
       - Revised the dmult calculation. dmult is not set based on overall imax
         divided by hardware queues - instead imax applies to a single cpu and
         the value will be replicated to all cpus.
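      A sketch of the standardized entry point with hypothetical names: the
      caller always passes microseconds, unchanged values are skipped, and the
      routine then picks a register write or a mailbox command based on
      adapter capability.

      #include <linux/types.h>

      struct example_eq {
              u32 q_mode;     /* now always holds the usdelay value */
      };

      struct example_hba {
              bool has_eq_delay_reg;
      };

      /* Both setters are hypothetical stand-ins. */
      void example_write_eq_delay_reg(struct example_hba *hba,
                                      struct example_eq *eq, u32 usdelay);
      void example_set_eq_delay_mbx(struct example_hba *hba,
                                    struct example_eq *eq, u32 usdelay);

      static void example_modify_eq_delay(struct example_hba *hba,
                                          struct example_eq *eq, u32 usdelay)
      {
              if (eq->q_mode == usdelay)
                      return;             /* only push real changes */
              eq->q_mode = usdelay;

              if (hba->has_eq_delay_reg)
                      example_write_eq_delay_reg(hba, eq, usdelay);
              else
                      example_set_eq_delay_mbx(hba, eq, usdelay);
      }
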
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues · 6a828b0f
      James Smart committed
      So far, MSIX vector allocation assumed it would be 1:1 with hardware
      queues. However, there are several reasons why fewer MSIX vectors may be
      allocated than hardware queues, such as the platform being out of
      vectors or the adapter limit being less than the cpu count.
      
      This patch reworks the MSIX/EQ relationships with the per-cpu hardware
      queues so they can function independently. MSIX vectors will be equitably
      split between cpu sockets/cores, and then the per-cpu hardware queues
      will be mapped to the vectors most efficient for them.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Allow override of hardware queue selection policies · 45aa312e
      James Smart committed
      The default behavior is to use the information from the upper IO stacks
      to select the hardware queue to use for IO submission, which typically
      has good cpu affinity.
      
      However, the driver, when used on some variants of the upstream kernel,
      has found the queuing information to be suboptimal for FCP, or IO
      completion to be locked onto particular cpus.
      
      For command submission situations, the lpfc_fcp_io_sched module parameter
      can be set to specify a hardware queue selection policy that overrides the
      os stack information.
      
      For IO completion situations, rather than queuing cq processing based on
      the cpu servicing the interrupting event, schedule the cq processing on
      the cpu associated with the hardware queue's cq.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Adapt partitioned XRI lists to efficient sharing · c490850a
      James Smart committed
      The XRI get/put lists were partitioned per hardware queue. However, the
      adapter rarely had sufficient resources to give a large number of XRIs
      per queue. As such, it became common for a cpu to encounter a lack of
      XRI resources and request that the upper io stack retry by returning a
      BUSY condition. This occurred even though other cpus were idle and not
      using their resources.
      
      Create as efficient a scheme as possible to move resources to the cpus
      that need them. Each cpu maintains a small private pool which it
      allocates from for io, and there is a watermark the cpu attempts to keep
      in the private pool.  The private pool, when empty, pulls from the cpu's
      global pool. When the cpu's global pool is empty, it will pull from other
      cpus' global pools. As there are many cpu global pools (1 per cpu or per
      hardware queue) and as each cpu selects which cpu to pull from at
      different rates and at different times, this creates a randomizing effect
      that minimizes the number of cpus that will contend with each other when
      they steal XRIs from another cpu's global pool.
      
      On io completion, a cpu will push the XRI back onto its private pool.  A
      watermark level is maintained for the private pool such that, when it is
      exceeded, XRIs are moved to the cpu's global pool so that other cpus may
      allocate them.
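      A sketch of the multi-level allocation with hypothetical pool
      primitives: the private pool is tried first, then the cpu's global pool,
      then another cpu's global pool; frees land in the private pool and spill
      to the global pool past the watermark.

      struct example_xri;
      struct example_pool;

      /* All helpers below are hypothetical stand-ins. */
      struct example_xri *example_pop(struct example_pool *pool);
      void example_push(struct example_pool *pool, struct example_xri *xri);
      unsigned int example_count(struct example_pool *pool);
      struct example_xri *example_steal(int cpu); /* scan other globals */
      void example_spill_to_global(int cpu);      /* pvt -> global move */
      struct example_pool *example_pvt_pool(int cpu);
      struct example_pool *example_glob_pool(int cpu);

      #define EXAMPLE_PVT_WATERMARK 16    /* assumed per-cpu watermark */

      static struct example_xri *example_get_xri(int cpu)
      {
              struct example_xri *xri;

              xri = example_pop(example_pvt_pool(cpu));   /* fast path */
              if (!xri)
                      xri = example_pop(example_glob_pool(cpu));
              if (!xri)
                      xri = example_steal(cpu);   /* randomized victim */
              return xri;
      }

      static void example_put_xri(int cpu, struct example_xri *xri)
      {
              example_push(example_pvt_pool(cpu), xri);
              if (example_count(example_pvt_pool(cpu)) > EXAMPLE_PVT_WATERMARK)
                      example_spill_to_global(cpu);   /* share the excess */
      }
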
      
      On NVME, as heartbeat commands are critical to get placed on the wire, a
      single expedite pool is maintained. When a heartbeat is to be sent, it
      will allocate an XRI from the expedite pool rather than the normal cpu
      private/global pools. On any io completion, if a reduction in the
      expedite pool is seen, it will be replenished before the XRI is placed
      on the cpu private pool.
      
      Statistics are added to aid in understanding the XRI levels on each cpu
      and their behavior.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting · 1fbf9742
      James Smart committed
      SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
      hardware. This should be indicating the hardware queue to use, not the ring
      number.
      
      Replace ring number with the hardware queue that should be used.
      
      Note: SCSI avoided this issue as it utilized an older lpfc_issue_iocb
      routine that properly adapts.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>