Commits · 1f02beff224e6176c1a0aacced7fb5127b240996 · openeuler / Kernel

16 Apr, 2021 · 17 commits

scsi: pm80xx: Remove global lock from outbound queue processing · 1f02beff

Committed by Viswas G on Apr 15, 2021

Introduce a spin lock for the outbound queue. With this, the driver need not
acquire the HBA global lock for outbound queue processing.

Link: https://lore.kernel.org/r/20210415103352.3580-9-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Reset PI and CI memory during re-initialization · b431472b

Committed by Viswas G on Apr 15, 2021

Producer index (PI) outbound queue and consumer index (CI) for outbound queue
are in DMA memory. During resume(), the stale PI and CI values will lead to
unexpected behavior. These values should be reset to 0 during driver
reinitialization.
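
The reset described above can be pictured with a minimal user-space sketch. The struct and function names are hypothetical and are not taken from the pm80xx driver; the only point is that both index words are cleared before the queues are used again.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the index words that live in DMA memory. */
struct queue_indices {
	uint32_t producer_index; /* PI */
	uint32_t consumer_index; /* CI */
};

/* Clear both indices so no stale value survives a suspend/resume cycle. */
static void reset_queue_indices(struct queue_indices *idx)
{
	memset(idx, 0, sizeof(*idx));
}
```

Calling reset_queue_indices() on a structure that still holds pre-suspend values leaves both fields at 0.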

Link: https://lore.kernel.org/r/20210415103352.3580-8-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Completing pending I/O after fatal error · 4f5deeb4

Committed by Ruksar Devadi on Apr 15, 2021

When the controller runs into a fatal error, I/Os get stuck with no
response. A handler event is defined to complete the pending I/Os (SAS task
and internal task) and also perform the cleanup for the drives.

Link: https://lore.kernel.org/r/20210415103352.3580-7-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track iop1 count · b0c306e6

Committed by Vishakha Channapattan on Apr 15, 2021

A new sysfs variable 'ctl_iop1_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If the ticks
change between subsequent reads, the controller is not dead.

Using the 'ctl_iop1_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_iop1_count
    0x00000069
    0x0000006b
    0x0000006d
    0x00000072
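
As a rough illustration of the liveness check, the sketch below compares two readings of the same counter taken at different times. The helper name ticks_changed() is hypothetical; it only assumes the attribute prints a hexadecimal value such as the ones shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Return true when two successive readings of a tick counter differ.
 * Each reading is a hexadecimal string in the form printed by the
 * sysfs attribute, for example "0x00000069".
 */
static bool ticks_changed(const char *earlier, const char *later)
{
	unsigned long a = strtoul(earlier, NULL, 16);
	unsigned long b = strtoul(later, NULL, 16);

	return a != b;
}
```

A monitoring tool could read the attribute twice for one host, some interval apart, and treat an unchanged value as a hint that the controller may have stopped.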

Link: https://lore.kernel.org/r/20210415103352.3580-6-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track iop0 count · 0602624a

Committed by Vishakha Channapattan on Apr 15, 2021

A new sysfs variable 'ctl_iop0_count' is being introduced that tells if
the controller is alive by indicating controller ticks. If the ticks
change between subsequent reads, the controller is not dead.

Using the 'ctl_iop0_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_iop0_count
    0x000000a3
    0x000001db
    0x000001e4
    0x000001e7

Link: https://lore.kernel.org/r/20210415103352.3580-5-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to track RAAE count · dd49ded8

Committed by Vishakha Channapattan on Apr 15, 2021

A new sysfs variable 'ctl_raae_count' is being introduced that tells if the
controller is alive by indicating controller ticks. If the RAAE count ticks
change between subsequent reads, the controller is not dead.

Using the 'ctl_raae_count' sysfs variable we can see ticks incrementing:

    linux-9saw:~# cat /sys/class/scsi_host/host*/ctl_raae_count
    0x00002245
    0x00002253
    0x0000225e

Link: https://lore.kernel.org/r/20210415103352.3580-4-Viswas.G@microchip.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to check controller hmi error · a4c55e16

Committed by Vishakha Channapattan on Apr 15, 2021

A new sysfs variable 'ctl_hmi_error' is being introduced to give the error
details if the MPI initialization fails.

Using the 'ctl_hmi_error' sysfs variable we can check the error details:

    linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_hmi_error
    0x00000000
    0x00000000
    0x00000000

Link: https://lore.kernel.org/r/20210415103352.3580-3-Viswas.G@microchip.com
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: pm80xx: Add sysfs attribute to check MPI state · 4ddbea1b

Committed by Vishakha Channapattan on Apr 15, 2021

A new sysfs variable 'ctl_mpi_state' is being introduced to check the state
of MPI.

Using the 'ctl_mpi_state' sysfs variable we can check the MPI state:

    linux-2dq0:~# cat /sys/class/scsi_host/host*/ctl_mpi_state
    MPI is successfully initialized

Link: https://lore.kernel.org/r/20210415103352.3580-2-Viswas.G@microchip.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vishakha Channapattan <vishakhavc@google.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Ruksar Devadi <Ruksar.devadi@microchip.com>
Signed-off-by: Ashokkumar N <Ashokkumar.N@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Lift Request Queue tasklet & timer from qdio · b3f0a1ee

Committed by Julian Wiedmann on Apr 14, 2021

The qdio layer currently provides its own infrastructure to scan for
Request Queue completions & to report them to the device driver. This
comes with several drawbacks - having an async tasklet & timer construct in
qdio introduces additional lifetime complexity, and makes it harder to
integrate them with the rest of the device driver. The timeouts are also
currently hard-coded, and can't be tweaked without affecting other qdio
drivers (i.e. qeth).

But due to recent enhancements to the qdio layer, zfcp can actually take
full control of the Request Queue completion processing. It merely needs to
opt-out from the qdio layer mechanisms by setting the scan_threshold to 0,
and then use qdio_inspect_queue() to scan for completions.

So re-implement the tasklet & timer mechanism in zfcp, while initially
copying the scan conditions from qdio's handle_outbound() and
qdio_outbound_tasklet(). One minor behavioural change is that
zfcp_qdio_send() will unconditionally reduce the timeout to 1 HZ, rather
than leaving it at 10 HZ if it was last armed by the tasklet. This just
makes things more consistent. Also note that we can drop a lot of the
accumulated cruft in qdio_outbound_tasklet(), as zfcp doesn't even use PCI
interrupt requests any longer.

This also slightly touches the Response Queue processing, as
qdio_get_next_buffers() will no longer implicitly scan for Request Queue
completions. So complete the migration to qdio_inspect_queue() here as well
and make the tasklet_schedule() visible.

Link: https://lore.kernel.org/r/018d3ddd029f8d6ac00cf4184880288c637c4fd1.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Move the position of put_device() · be46e39a

Committed by Qinglang Miao on Apr 14, 2021

Place the put_device() call after device_unregister() in both
zfcp_unit_remove() and zfcp_sysfs_port_remove_store() to make it more
natural. put_device() ought to be the last time we touch the object in both
functions.

Add comments after put_device() to make code clearer.

Link: https://lore.kernel.org/r/0a568c7733ba0f1dde28b0c663b90270d44dd540.1618417667.git.bblock@linux.ibm.com
Suggested-by: Steffen Maier <maier@linux.ibm.com>
Suggested-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Clean up sysfs code for SFP diagnostics · 20540a56

Committed by Julian Wiedmann on Apr 14, 2021

The error path from zfcp_adapter_enqueue() no longer attempts to remove the
diagnostics attributes if they haven't been created yet.

So remove the manual 'sysfs_established' guard for this case, and use
device_add_groups() to add all adapter-related sysfs attributes in one go.

Link: https://lore.kernel.org/r/37a97537f675d643006271f37723c346189b6eec.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Fix sysfs roll-back on error in zfcp_adapter_enqueue() · ab1fa880

Committed by Julian Wiedmann on Apr 14, 2021

When zfcp_adapter_enqueue() fails to create the zfcp_sysfs_adapter_attrs
group, it calls zfcp_adapter_unregister() to tear down the adapter state
again. This then unconditionally attempts to remove the
zfcp_sysfs_adapter_attrs group, resulting in a "group not found" WARN from
sysfs code.

Avoid this by copying most of zfcp_adapter_unregister() into the error
path, allowing for more fine-granular roll-back. Then skip the sysfs
tear-down steps if we haven't progressed this far in the initialization.

Link: https://lore.kernel.org/r/790922cc3af075795fff9a4b787e6bda19bdb3be.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Fix indentation coding style issue · 8824db89

Committed by Yevhen Viktorov on Apr 14, 2021

Code indentation should use tabs where possible.

Link: https://lore.kernel.org/r/e8a15a2f3d64e2e76a214647cfd4fe23d370b165.1618417667.git.bblock@linux.ibm.com
Signed-off-by: Yevhen Viktorov <yevhen.viktorov@virginmedia.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zfcp: Remove unneeded INIT_LIST_HEAD() for FSF requests · 91cf21ec

Committed by Julian Wiedmann on Apr 14, 2021

INIT_LIST_HEAD() is only needed for actual list heads, while req->list is
used as a list entry.

Note that when the error path in zfcp_fsf_req_send() removes the request
from the adapter's list of pending requests, it actually looks up the
request from the zfcp_reqlist - rather than just calling list_del(). So
there's no risk of us calling list_del() on a request that hasn't been
added to any list yet.
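
The distinction between a list head and a list entry can be seen in a small user-space sketch. It reimplements just enough of the kernel's circular list for illustration; struct request is a hypothetical stand-in for an FSF request and is not zfcp code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* Only a list head needs this: an empty list points at itself. */
static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Adding an entry overwrites both of its pointers, initialized or not. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->next = head;
	entry->prev = head->prev;
	head->prev->next = entry;
	head->prev = entry;
}

/* Hypothetical request type, standing in for an FSF request. */
struct request {
	int id;
	struct list_head list; /* list entry, never a head */
};
```

Because list_add_tail() writes both pointers of the entry, initializing req.list beforehand has no effect on the resulting list.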

Link: https://lore.kernel.org/r/254dc0ae28dccc43ab0b1079ef2c8dcb5fe1d2e4.1618417667.git.bblock@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Reserve extra IRQ vectors · f02d4086

Committed by Roman Bolshakov on Apr 12, 2021

Commit a6dcfe08 ("scsi: qla2xxx: Limit interrupt vectors to number of
CPUs") lowers the number of allocated MSI-X vectors to the number of CPUs.

That breaks vector allocation assumptions in qla83xx_iospace_config(),
qla24xx_enable_msix() and qla2x00_iospace_config(). Each of these functions
computes the maximum number of qpairs as:

  ha->max_qpairs = ha->msix_count - 1 (MB interrupt) - 1 (default
                   response queue) - 1 (ATIO, in dual or pure target mode)

max_qpairs is set to zero in case of two CPUs and initiator mode. The
number is then used to allocate ha->queue_pair_map inside
qla2x00_alloc_queues(). No allocation happens and ha->queue_pair_map is
left NULL but the driver thinks there are queue pairs available.

qla2xxx_queuecommand() tries to find a qpair in the map and crashes:

  if (ha->mqenable) {
          uint32_t tag;
          uint16_t hwq;
          struct qla_qpair *qpair = NULL;

          tag = blk_mq_unique_tag(cmd->request);
          hwq = blk_mq_unique_tag_to_hwq(tag);
          qpair = ha->queue_pair_map[hwq]; /* <- HERE */

          if (qpair)
                  return qla2xxx_mqueuecommand(host, cmd, qpair);
  }

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 0 PID: 72 Comm: kworker/u4:3 Tainted: G        W         5.10.0-rc1+ #25
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
  Workqueue: scsi_wq_7 fc_scsi_scan_rport [scsi_transport_fc]
  RIP: 0010:qla2xxx_queuecommand+0x16b/0x3f0 [qla2xxx]
  Call Trace:
   scsi_queue_rq+0x58c/0xa60
   blk_mq_dispatch_rq_list+0x2b7/0x6f0
   ? __sbitmap_get_word+0x2a/0x80
   __blk_mq_sched_dispatch_requests+0xb8/0x170
   blk_mq_sched_dispatch_requests+0x2b/0x50
   __blk_mq_run_hw_queue+0x49/0xb0
   __blk_mq_delay_run_hw_queue+0xfb/0x150
   blk_mq_sched_insert_request+0xbe/0x110
   blk_execute_rq+0x45/0x70
   __scsi_execute+0x10e/0x250
   scsi_probe_and_add_lun+0x228/0xda0
   __scsi_scan_target+0xf4/0x620
   ? __pm_runtime_resume+0x4f/0x70
   scsi_scan_target+0x100/0x110
   fc_scsi_scan_rport+0xa1/0xb0 [scsi_transport_fc]
   process_one_work+0x1ea/0x3b0
   worker_thread+0x28/0x3b0
   ? process_one_work+0x3b0/0x3b0
   kthread+0x112/0x130
   ? kthread_park+0x80/0x80
   ret_from_fork+0x22/0x30

The driver should allocate enough vectors to provide every CPU its own HW
queue and still handle reserved (MB, RSP, ATIO) interrupts.

The change fixes the crash on dual core VM and prevents unbalanced QP
allocation where nr_hw_queues is two less than the number of CPUs.
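
The arithmetic behind the crash and the fix can be worked through in a short sketch. All names below are hypothetical; the cap on the vector count is modelled from the description in this message, not copied from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Vectors that never serve a queue pair: mailbox and default response queue. */
#define BASE_VECTORS 2

/* Queue pairs left over after the reserved vectors are subtracted. */
static int max_qpairs(int msix_count, bool target_mode)
{
	int reserved = BASE_VECTORS + (target_mode ? 1 : 0); /* + ATIO */

	return msix_count - reserved;
}

/* Vector budget when the count is capped at the number of CPUs. */
static int vectors_capped_at_cpus(int hw_vectors, int cpus)
{
	return hw_vectors < cpus ? hw_vectors : cpus;
}

/* Vector budget when the reserved vectors are added on top of the CPUs. */
static int vectors_with_reserve(int hw_vectors, int cpus, bool target_mode)
{
	int wanted = cpus + BASE_VECTORS + (target_mode ? 1 : 0);

	return hw_vectors < wanted ? hw_vectors : wanted;
}
```

With two CPUs in initiator mode, the CPU-capped budget gives 2 - 2 = 0 queue pairs, which is the case that leaves ha->queue_pair_map unallocated. Reserving the extra vectors gives 4 - 2 = 2, one queue pair per CPU.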

Link: https://lore.kernel.org/r/20210412165740.39318-1-r.bolshakov@yadro.com
Fixes: a6dcfe08 ("scsi: qla2xxx: Limit interrupt vectors to number of CPUs")
Cc: Daniel Wagner <daniel.wagner@suse.com>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Quinn Tran <qutran@marvell.com>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org # 5.11+
Reported-by: Aleksandr Volkov <a.y.volkov@yadro.com>
Reported-by: Aleksandr Miloserdov <a.miloserdov@yadro.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Fix device pointer variable reference static checker issue · 5cad5a50

Committed by Don Brace on Apr 15, 2021

Dan Carpenter found a possible NULL pointer dereference issue in function
pqi_sas_port_add_rphy():

   drivers/scsi/smartpqi/smartpqi_sas_transport.c:97
   pqi_sas_port_add_rphy() warn: variable dereferenced before
   check 'pqi_sas_port->device' (see line 95)

Correct issue by moving reference of pqi_sas_port->device after the check
for the device pointer being non-NULL.
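
The pattern the static checker asks for is sketched below with simplified, hypothetical types. The -ENODEV return value is an assumption for illustration, not necessarily what the driver returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct device_info {
	int phy_id;
};

struct sas_port {
	struct device_info *device;
};

/* Read a field through port->device only after the NULL check. */
static int port_phy_id(const struct sas_port *port, int *phy_id)
{
	if (!port->device)
		return -ENODEV;

	*phy_id = port->device->phy_id; /* safe: device is non-NULL here */
	return 0;
}
```

Placing the dereference before the check would make the check pointless, which is what the checker warning points out.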

Link: https://www.mail-archive.com/kbuild@lists.01.org/msg06329.html
Link: https://lore.kernel.org/r/161850493026.7302.10032784239320437353.stgit@brunhilda
Fixes: ec504b23 ("scsi: smartpqi: Add phy ID support for the physical drives")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: smartpqi: Fix blocks_per_row static checker issue · 667298ce

Committed by Don Brace on Apr 15, 2021

Dan Carpenter found a possible divide by 0 issue in the smartpqi driver in
functions pci_get_aio_common_raid_map_values() and pqi_calc_aio_r5_or_r6().
The variable rmd->blocks_per_row is used as a divisor and could be 0; the
code divided by it without checking it for 0 first.

Correct these possible divide by 0 conditions by ensuring that
rmd->blocks_per_row is not zero before usage. The check for non-0 was too
late to prevent a divide by 0 condition. Add in a comment to explain why
the check for non-zero is necessary. If the member is 0, return
PQI_RAID_BYPASS_INELIGIBLE before any division is performed.
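
A minimal sketch of the guard, assuming a simplified row calculation. The function name and the two return codes are hypothetical placeholders rather than the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical return codes; the driver's own values may differ. */
#define BYPASS_ELIGIBLE   0
#define BYPASS_INELIGIBLE 1

/* Compute the row that holds first_block, refusing a zero divisor. */
static int first_row_for_block(uint64_t first_block, uint32_t blocks_per_row,
			       uint64_t *first_row)
{
	/* A zero blocks_per_row would fault in the division below. */
	if (blocks_per_row == 0)
		return BYPASS_INELIGIBLE;

	*first_row = first_block / blocks_per_row;
	return BYPASS_ELIGIBLE;
}
```

The check has to sit above the first division; testing the divisor afterwards, as the old code did, cannot prevent the fault.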

Link: https://lore.kernel.org/linux-scsi/YG%2F5kWHHAr7w5dU5@mwanda/
Link: https://lore.kernel.org/r/161850492435.7302.392780350442938047.stgit@brunhilda
Fixes: 6702d2c4 ("scsi: smartpqi: Add support for RAID5 and RAID6 writes")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

13 Apr, 2021 · 23 commits

scsi: ibmvfc: Fix invalid state machine BUG_ON() · 15cfef86

Committed by Brian King on Apr 12, 2021

This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going
through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to
IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ,
which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the
host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end
up changing the host action to IBMVFC_HOST_ACTION_INIT.  If we then change
the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON().

Make a couple of changes to avoid this. Leave the host action to be
IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop
the host lock and reset or reenable the CRQ. Also harden the host state
machine to ensure we cannot leave the reset / reenable state until we've
finished processing the reset or reenable.

Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com
Fixes: 73ee5d86 ("[SCSI] ibmvfc: Fix soft lockup on resume")
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
[tyreld: added fixes tag]
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mkp: fix comment checkpatch warnings]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Copyright updates for 12.8.0.9 patches · cf270817

Committed by James Smart on Apr 11, 2021

Update copyrights to 2021 for files modified in the 12.8.0.9 patch set.

Link: https://lore.kernel.org/r/20210412013127.2387-17-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Update lpfc version to 12.8.0.9 · 3ebd25b0

Committed by James Smart on Apr 11, 2021

Update lpfc version to 12.8.0.9

Link: https://lore.kernel.org/r/20210412013127.2387-16-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Eliminate use of LPFC_DRIVER_NAME in lpfc_attr.c · 5b1f5089

Committed by James Smart on Apr 11, 2021

During code inspection, several cases of creating dynamic attribute names
in log messages using a define were found. This is unnecessary.

Place the native symbol name in the log messages.

Link: https://lore.kernel.org/r/20210412013127.2387-15-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Standardize discovery object logging format · f1156125

Committed by James Smart on Apr 11, 2021

Code inspection showed lpfc was using three different pointer formats when
logging discovery object pointers.

Standardize the pointer format to x%px.

Note: %px use is limited to discovery objects in order to aid core
analysis.

Link: https://lore.kernel.org/r/20210412013127.2387-14-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix various trivial errors in comments and log messages · 3bfab8a0

Committed by James Smart on Apr 11, 2021

Clean up minor issues spotted by tools and code review:

 - Spelling Errors

 - Spurious characters and errors in function headers

 - nvme_info wqerr and err fields source data reversed

 - Extraneous new line in log message 0466

 - Spacing error in log message 0109

 - Messages 0140 and 0141 have portname and nodename reversed

 - Incorrect function labelling in comment

Link: https://lore.kernel.org/r/20210412013127.2387-13-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logic · b62232ba

Committed by James Smart on Apr 11, 2021

SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3
does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have
code to issue the command. The command will always fail.

Remove the code for the mailbox command and leave only the resulting
"failure path" logic.

Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix lpfc_hdw_queue attribute being ignored · d3de0d11

Committed by James Smart on Apr 11, 2021

The lpfc_hdw_queue attribute is to set the number of hardware queues to be
created on the adapter. Normally, the value is set to a default, which
allows the hw queue count to be sized dynamically based on adapter
capabilities, CPU/platform architecture, or CPU type. Currently, when
lpfc_hdw_queue is set to a specific value, it has no effect and the dynamic
sizing occurs.

The routine checking whether parameters are default or not ignores the
lpfc_hdw_queue setting and invokes the dynamic logic.

Fix the routine to additionally check the lpfc_hdw_queue attribute value
before using dynamic scaling. Additionally, SLI-3 supports only a small
number of queues with dedicated functions, thus it needs to be exempted
from the variable scaling and set to the expected values.

Link: https://lore.kernel.org/r/20210412013127.2387-11-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix missing FDMI registrations after Mgmt Svc login · a314dec3

Committed by James Smart on Apr 11, 2021

FDMI registration needs to be performed after every login with the FC Mgmt
service. The flag the driver is using to track registration is cleared on
link up, but never on Mgmt service logout/re-login.

Fix by clearing the flag whenever a new login is completed with the FC Mgmt
service.

While perusing the flag use, logging was performed as if FDMI registration
occurred on vports. However, it is limited to the physical port only.
Revise the logging to reflect physical port based.

Link: https://lore.kernel.org/r/20210412013127.2387-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix silent memory allocation failure in lpfc_sli4_bsg_link_diag_test() · a1a553e3

Committed by James Smart on Apr 11, 2021

In the unlikely case of a failure to allocate an LPFC_MBOXQ_t structure, no
return status is set, thus the routine never logs an error and returns
success to the caller.

Fix by setting a return code on failure.
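
The bug class can be reproduced in a few lines. The sketch below uses a hypothetical routine with an injectable allocator so the failure path can be exercised; -ENOMEM is assumed as the error code for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);

/* Allocator that always fails, used to exercise the error path. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Hypothetical routine following the "rc defaults to 0, goto out" style. */
static int run_diag_test(alloc_fn alloc)
{
	int rc = 0;
	void *mbox = alloc(64);

	if (!mbox) {
		rc = -ENOMEM; /* without this, the caller would see 0 */
		goto out;
	}

	/* ... the real work with mbox would go here ... */
	free(mbox);
out:
	return rc;
}
```

Passing failing_alloc makes the routine report the failure instead of returning 0, which is the behaviour the fix restores.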

Link: https://lore.kernel.org/r/20210412013127.2387-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix use-after-free on unused nodes after port swap · 724f6b43

Committed by James Smart on Apr 11, 2021

During target port swap, the swap logic ignores the DROPPED flag in the
nodes. As a node then moves into the UNUSED state, the reference count will
be dropped. If a node is later reused and moved out of the UNUSED state, an
access can result in a use-after-free assert.

Fix by having the port swap logic propagate the DROPPED flag when switching
nodes. This prevents the reference from being dropped.

Link: https://lore.kernel.org/r/20210412013127.2387-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix error handling for mailboxes completed in MBX_POLL mode · 304ee432

Committed by James Smart on Apr 11, 2021

In SLI-4, when performing a mailbox command with MBX_POLL, the driver uses
the BMBX register to send the command rather than the MQ. A flag is set
indicating the BMBX register is active and saves the mailbox job struct
(mboxq) in the mbox_active element of the adapter. The routine then waits
for completion or timeout. The mailbox job struct is not freed by the
routine. In cases of timeout, the adapter will be reset. The
lpfc_sli_mbox_sys_flush() routine will clean up the mbox in preparation for
the reset. It clears the BMBX active flag and marks the job structure as
MBX_NOT_FINISHED. But, it never frees the mboxq job structure. Expectation
in both normal completion and timeout cases is that the issuer of the mbx
command will free the structure.  Unfortunately, not all calling paths are
freeing the memory in cases of error.

All calling paths were looked at and updated, if missing, to free the mboxq
memory regardless of completion status.

Link: https://lore.kernel.org/r/20210412013127.2387-7-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix lack of device removal on port swaps with PRLIs · 4e76d4a9

Committed by James Smart on Apr 11, 2021

During target port-swap testing with link flips, the initiator could
encounter PRLI errors. If the target node disappears permanently, the ndlp
is found stuck in UNUSED state with ref count of 1. The rmmod of the driver
will hang waiting for this node to be freed.

While handling a link error in the PRLI completion path, the code intends to
skip triggering the discovery state machine. However, this causes the final
reference release path to be skipped as well, leaving the node stuck with a
ref count of 1.

Fix by ensuring the code path triggers the device removal event on the node
state machine.

Link: https://lore.kernel.org/r/20210412013127.2387-6-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix NMI crash during rmmod due to circular hbalock dependency · a789241e

Committed by James Smart on Apr 11, 2021

Remove hbalock dependency for lpfc_abts_els_sgl_list and
lpfc_abts_nvmet_ctx_list. The lists are adequately synchronized with the
sgl_list_lock and abts_nvmet_buf_list_lock.

Link: https://lore.kernel.org/r/20210412013127.2387-5-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix reference counting errors in lpfc_cmpl_els_rsp() · f866eb06

Committed by James Smart on Apr 11, 2021

Call traces are being seen that result from a nodelist structure ref
counting error. They are typically seen after transmission of an LS_RJT ELS
response.

Aged code in lpfc_cmpl_els_rsp() calls lpfc_nlp_not_used() which, if the
ndlp reference count is exactly 1, will decrement the reference count.
Previously lpfc_nlp_put() was within lpfc_els_free_iocb(), and the 'put'
within the free would only be invoked if cmdiocb->context1 was not NULL.
Since the nodelist structure reference count is decremented when exiting
lpfc_cmpl_els_rsp() the lpfc_nlp_not_used() calls are no longer required.
Calling them is causing the reference count issue.

Fix by removing the lpfc_nlp_not_used() calls.

Link: https://lore.kernel.org/r/20210412013127.2387-4-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix crash when a REG_RPI mailbox fails triggering a LOGO response · fffd18ec

Committed by James Smart on Apr 11, 2021

Fix a crash caused by a double put on the node when the driver completed an
ACC for an unsolicited abort on the same node. The second put was executed
by lpfc_nlp_not_used() and is wrong because the completion routine executes
the nlp_put when the iocbq was released. Additionally, the driver is
issuing a LOGO then immediately calls lpfc_nlp_set_state to put the node
into NPR. This call does nothing.

Remove the lpfc_nlp_not_used call and additional set_state in the
completion routine. Remove the lpfc_nlp_set_state post issue_logo, as it
isn't necessary.

Link: https://lore.kernel.org/r/20210412013127.2387-3-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotag · 078c68b8

Committed by James Smart on Apr 11, 2021

Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in
lpfc_els_free_iocb().

A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the
changes was to convert from building/sending an abort within the routine to
using a common routine. The reworked routine passes, without modification,
the pring ptr to the new common routine. The older routine had logic to
check SLI-3 vs SLI-4 and adapt the pring ptr if necessary, as callers were
passing SLI-3 pointers even when on an SLI-4 adapter. The new routine
is missing this check and adaptation, so the SLI-3 ring pointers are being
used in SLI-4 paths.

Fix by cleaning up the calling routines. In review, there is no need to
pass the ring ptr argument to abort_iocb at all. The routine can look at
the adapter type itself and reference the proper ring.
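
The design choice above, letting the abort routine pick the ring from the adapter type instead of trusting a caller-supplied pointer, can be sketched as follows. All names here (`struct toy_adapter`, `toy_pick_ring()`, `toy_abort()`) are hypothetical and only model the idea; they are not the lpfc definitions.

```c
/* Hypothetical adapter revisions for this model. */
enum toy_sli_rev { TOY_SLI_REV3 = 3, TOY_SLI_REV4 = 4 };

struct toy_ring {
	const char *name;
};

struct toy_adapter {
	enum toy_sli_rev rev;
	struct toy_ring sli3_ring;   /* ring used on SLI-3 style adapters */
	struct toy_ring sli4_ring;   /* ring used on SLI-4 style adapters */
};

/* Select the ring from the adapter type; callers pass no ring pointer. */
static struct toy_ring *toy_pick_ring(struct toy_adapter *a)
{
	return a->rev == TOY_SLI_REV4 ? &a->sli4_ring : &a->sli3_ring;
}

/* Abort entry point: returns the name of the ring it would operate on. */
static const char *toy_abort(struct toy_adapter *a)
{
	return toy_pick_ring(a)->name;
}
```

Because the selection happens in one place, a caller can no longer hand an SLI-3 ring to an SLI-4 code path by mistake.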

Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com
Fixes: db7531d2 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers")
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: isci: Remove unnecessary struct declaration · 8350e196

Committed by Wan Jiabing on Apr 06, 2021

struct sci_phy_proto was already defined on line 142. The declaration here
is unnecessary. Remove it.

Link: https://lore.kernel.org/r/20210406105913.676746-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'vtarget' · cf17ff26

Committed by Jiapeng Chong on Apr 12, 2021

Fix the following gcc warning:

drivers/message/fusion/mptsas.c:783:14: warning: variable ‘vtarget’ set
but not used [-Wunused-but-set-variable].
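
For readers unfamiliar with this warning class, here is a minimal sketch of the usual fix. The names `toy_read_status()` and `toy_poll()` are hypothetical and unrelated to the mptsas code. The problematic form is shown only in a comment so that the block compiles cleanly.

```c
static int toy_calls;   /* counts calls, to show the call itself is kept */

static int toy_read_status(void)
{
	toy_calls++;
	return 0;
}

/*
 * Before (triggers -Wunused-but-set-variable):
 *
 *	int status;
 *	status = toy_read_status();
 *
 * After: the variable is dropped, the call and its side effects remain.
 */
static void toy_poll(void)
{
	toy_read_status();
}
```

The call is kept because only the stored result was unused; whether a call can also be dropped depends on its side effects.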

Link: https://lore.kernel.org/r/1618207146-96542-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'status' · c436b41a

Committed by Jiapeng Chong on Apr 08, 2021

Fix the following gcc warning:

drivers/message/fusion/mptbase.c:3087:9: warning: variable ‘status’ set
but not used [-Wunused-but-set-variable].

Link: https://lore.kernel.org/r/1617872780-126448-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'port' · 30264737

Committed by Zhen Lei on Apr 08, 2021

Fixes the following W=1 kernel build warning:

drivers/message/fusion/mptctl.c: In function ‘mptctl_gettargetinfo’:
drivers/message/fusion/mptctl.c:1372:7: warning: variable ‘port’ set but not used [-Wunused-but-set-variable]

Link: https://lore.kernel.org/r/20210408061851.3089-3-thunder.leizhen@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: message: fusion: Remove unused local variable 'time_count' · 039cf381

Committed by Zhen Lei on Apr 08, 2021

Fixes the following W=1 kernel build warning:

drivers/message/fusion/mptctl.c: In function ‘mptctl_do_taskmgmt’:
drivers/message/fusion/mptctl.c:324:17: warning: variable ‘time_count’ set but not used [-Wunused-but-set-variable]

Link: https://lore.kernel.org/r/20210408061851.3089-2-thunder.leizhen@huawei.com
Fixes: 7d757f18 ("[SCSI] mptfusion: Updated SCSI IO IOCTL error handling.")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla4xxx: Remove unneeded if-null-free check · eb5a3e3b

Committed by Qiheng Lin on Apr 09, 2021

Eliminate the following coccicheck warning:

drivers/scsi/qla4xxx/ql4_os.c:4175:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4196:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:4215:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6400:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6402:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6555:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:6557:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7838:2-7: WARNING:
 NULL check before some freeing functions is not needed.
drivers/scsi/qla4xxx/ql4_os.c:7840:2-7: WARNING:
 NULL check before some freeing functions is not needed.
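
The pattern behind this warning can be shown with a userspace analogue. The commit message does not say which freeing functions are involved in ql4_os.c, so this sketch uses standard C `free()`, which is defined to do nothing when passed NULL (kernel helpers such as `kfree()` behave the same way). `struct toy_ctx` and `toy_ctx_release()` are hypothetical names.

```c
#include <stdlib.h>

struct toy_ctx {
	char *buf;
};

/*
 * Before:
 *
 *	if (c->buf)
 *		free(c->buf);
 *
 * After: the check is redundant because free(NULL) is a no-op.
 */
static void toy_ctx_release(struct toy_ctx *c)
{
	free(c->buf);
	c->buf = NULL;   /* avoid leaving a dangling pointer behind */
}
```

Calling `toy_ctx_release()` is safe both when `buf` was never allocated and when it holds a live allocation.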

Link: https://lore.kernel.org/r/20210409120345.6447-1-linqiheng@huawei.com
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
