提交 · 4e87eb2f46ea547d12a276b2e696ab934d16cfb6 · openeuler / Kernel

19 12月, 2018 1 次提交

scsi: flip the default on use_clustering · 2a3d4eb8

由 Christoph Hellwig 提交于 12月 13, 2018

Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page.  Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

2a3d4eb8

13 12月, 2018 2 次提交

scsi: hisi_sas: Make sg_tablesize consistent value · 6db831f4

由 Xiang Chen 提交于 12月 06, 2018

Sht->sg_tablesize is set in the driver, and it will be assigned to
shost->sg_tablesize in SCSI mid-layer. So it is not necessary to assign
shost->sg_table one more time in the driver.

In addition to the change, change each scsi_host_template.sg_tablesize
to HISI_SAS_SGE_PAGE_CNT instead of SG_ALL.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

6db831f4

scsi: hisi_sas: Fix warnings detected by sparse · 735bcc77

由 John Garry 提交于 12月 06, 2018

This patchset fixes some warnings detected by the sparse tool, like these:
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: warning: incorrect type in assignment (different base types)
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed
drivers/scsi/hisi_sas/hisi_sas_main.c:1469:52: got restricted __le16 [usertype] <noident>
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: warning: incorrect type in assignment (different base types)
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: expected unsigned short [unsigned] [assigned] [usertype] tag_of_task_to_be_managed
drivers/scsi/hisi_sas/hisi_sas_main.c:1723:52: got restricted __le16 [usertype] <noident>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

735bcc77

16 11月, 2018 5 次提交

scsi: hisi_sas: change the time of SAS SSP connection · 15bc43f3

由 Xiang Chen 提交于 11月 09, 2018

Currently the time of SAS SSP connection is 1ms, which means the link
connection will fail if no IO response after this period.

For some disks handling large IO (such as 512k), 1ms is not enough, so
change it to 5ms.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

15bc43f3

scsi: hisi_sas: Add support for interrupt coalescing for v3 hw · 37359798

由 Xiang Chen 提交于 11月 09, 2018

If INT_COAL_EN is enabled, configure time and count of interrupt
coalescing.  Then if CQ collects count of CQ entries in time, it will
report the interrupt. Or if CQ doesn't collect enough CQ entries in time,
it will report the interrupt at timeout.

As all the registers are not supported to be changed dynamically, we need
to config those register between disable and enable PHYs.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

37359798

scsi: hisi_sas: Add support for interrupt converge for v3 hw · 488cf558

由 Xiang Chen 提交于 11月 09, 2018

If CQ_INT_CONVERGE_EN is enabled, the interrupts of all the 16 CQ queues
will be reported by CQ0.

So we need to change the process of CQ tasklet for this situation.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

488cf558

scsi: hisi_sas: Create separate host attributes per HBA · c3566f9a

由 Xiang Chen 提交于 11月 09, 2018

Currently all the three HBA (v1/v2/v3 HW) share the same host attributes.

To support each HBA having separate attributes in future, create per-HBA
attributes.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

c3566f9a

scsi: hisi_sas: use dma_set_mask_and_coherent · e4db40e7

由 Christoph Hellwig 提交于 10月 18, 2018

The driver currently uses pci_set_dma_mask despite otherwise using the
generic DMA API.  Switch it over to the better generic DMA API.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

e4db40e7

16 10月, 2018 4 次提交

scsi: hisi_sas: Update v3 hw AIP_LIMIT and CFG_AGING_TIME register values · 3bccfba8

由 Xiang Chen 提交于 9月 24, 2018

Update registers as follows:
- Default value of AIP timer is 1ms, and it is easy for some expanders to
  cause IO error. Change the value to max value 65ms to avoid IO error for
  those expanders.

- A CQ completion will be reported by HW when 4 CQs have occurred or the
  aging timer expires, whichever happens first. Sor serial IO scenario, it
  will still wait 8us for every IO before it is reported. So in the
  situation, the performance is poor. So to improve it, change the limit
  time to the least value.
  For other scenario, it does little affect to the performance.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

3bccfba8

scsi: hisi_sas: Use block layer tag instead for IPTT · 784b46b7

由 Xiang Chen 提交于 9月 24, 2018

Currently we use the IPTT defined in LLDD to identify IOs. Actually for
IOs which are from the block layer, they have tags to identify them. So
for those IOs, use tag of the block layer directly, and for IOs which is
not from the block layer (such as internal IOs from libsas/LLDD), reserve
96 IPTTs for them.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

784b46b7

scsi: hisi_sas: unmask interrupts ent72 and ent74 · 6ecf5ba1

由 Xiang Chen 提交于 9月 24, 2018

The interrupts of ent72 and ent74 are not processed by PCIe AER handling,
so we need to unmask the interrupts and process them first in the driver.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

6ecf5ba1

scsi: hisi_sas: Free slot later in slot_complete_vx_hw() · 3e178f3e

由 Xiang Chen 提交于 9月 24, 2018

If an SSP/SMP IO times out, it may be actually in reality be
simultaneously processing completion of the slot in
slot_complete_vx_hw().

Then if the slot is freed in slot_complete_vx_hw() (this IPTT is freed
and it may be re-used by other slot), and we may abort the wrong slot in
hisi_sas_abort_task().

So to solve the issue, free the slot after the check of
SAS_TASK_STATE_ABORTED in slot_complete_vx_hw().
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

3e178f3e

20 7月, 2018 6 次提交

scsi: hisi_sas: Add SATA FIS check for v3 hw · f4e34f2a

由 Xiang Chen 提交于 7月 18, 2018

Add a check ERR bit of status to decide whether there is something wrong
with initial register-D2H FIS. If error exist, PHY link reset the channel
to restart OOB.

Directly call work HISI_PHYE_LINK_RESET replacing disable_phy_vx_hw() and
enable_phy_vx_hw().
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

f4e34f2a

scsi: hisi_sas: add memory barrier in task delivery function · 1c09b663

由 Xiaofei Tan 提交于 7月 18, 2018

In task start delivery function, we need to add a memory barrier to prevent
re-ordering of reading memory by hardware. Because the slot data is set in
task prepare function and it could be running in another CPU.

This patch adds an memory barrier after s->ready is read in the task start
delivery function, and uses WRITE_ONCE() in the places where s->ready is
set to ensure that the compiler does not re-order.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

1c09b663

scsi: hisi_sas: Implement handlers of PCIe FLR for v3 hw · e5ea4801

由 Xiaofei Tan 提交于 7月 18, 2018

This patch implements handlers of PCIe FLR for v3 hw, reset_prepare() and
reset_done().

User can issue FLR through sysfs interface, as v3 hw support PCIe FLR.
Then if we don't implement these two handlers, our SAS controller will not
work after executing FLR.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

e5ea4801

scsi: hisi_sas: relocate some common code for v3 hw · e8ce775e

由 Xiaofei Tan 提交于 7月 18, 2018

Much code of PM suspend function also exists in soft reset function. This
is not concise. So, this patch relocates the common code of these two
functions to a separate function.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

e8ce775e

scsi: hisi_sas: Fix the failure of recovering PHY from STP link timeout · 25908cac

由 Xiaofei Tan 提交于 7月 18, 2018

There is an issue that link reset can't recover PHY when STP link timeout.
Because current process of enabling PHY for v3 hw will wait last
transmission done. The time of one transmission depends IO size, disk model
and so on. Normally, it should be shorter than 50ms. But the last
transmission could be never done for some abnormal scenarios, such as STP
link timeout.

This patch is to fix the issue. Check PHY status after starting process of
enabling PHY for 50ms. If the PHY is still active, we disable it forcibly
by PHY reset. Of course, we need to clear the PHY reset bit when enable
PHY.

Besides, the function disable_phy_v3_hw() should not be suitable to call in
interrupts for hilink bug for this 50ms delay. Then, we do link reset for
hilink bug directly. The change is that we don't clear the invalid dword
count register. This is better. Because we should not clear such error
count while not saved.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

25908cac

scsi: hisi_sas: tidy channel interrupt handler for v3 hw · d9d51e0c

由 Xiaofei Tan 提交于 7月 18, 2018

The ISR of channel interrupt of v3 hw is a little long and messy. This
patch tidies it by relocating CHL_INT1 and CHL_INT2 handling to new
function separately.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

d9d51e0c

20 6月, 2018 4 次提交

scsi: hisi_sas: Update a couple of register settings for v3 hw · 7931cd91

由 John Garry 提交于 5月 31, 2018

Update CFG_1US_TIMER_TRSH and CON_CFG_DRIVER settings.
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

7931cd91

scsi: hisi_sas: Add a flag to filter PHY events during reset · ed99e1d9

由 Xiaofei Tan 提交于 5月 31, 2018

During reset, we don't want PHY events reported to libsas for PHYs which
were previously attached prior to reset.

So check hisi_hba->flags for HISI_SAS_RESET_BIT to filter PHY events during
reset.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

ed99e1d9

scsi: hisi_sas: Adjust task reject period during host reset · 214e702d

由 Xiaofei Tan 提交于 5月 31, 2018

After soft_reset() for host reset, we should not be allowed to send
commands to the HW before the PHYs have come up and the port ids have been
refreshed.

Prior to this point, any commands cannot be successfully completed.

This exclusion is achieved by grabbing the host reset semaphore.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

214e702d

scsi: hisi_sas: Only process broadcast change in phy_bcast_v3_hw() · 1324ae1c

由 Xiaofei Tan 提交于 5月 31, 2018

There are many BROADCAST primitives generated by the host. We are only
interested in BROADCAST (CHANGE) primitives currently, so only process
this.

We have applied this processing for v2 hw before, and it is also needed for
v3 hw.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

1324ae1c

29 5月, 2018 6 次提交

scsi: hisi_sas: Mark PHY as in reset for nexus reset · 3e1fb1b8

由 Xiang Chen 提交于 5月 21, 2018

When issuing a nexus reset for directly attached device, we want to ignore
the PHY down events so libsas will not deform and reform the port.

In the case that the attached SAS changes for the reset, libsas will deform
and form a port.

For scenario that the PHY does not come up after a timeout period, then
report the PHY down to libsas.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

3e1fb1b8

scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot · 78bd2b4f

由 Xiaofei Tan 提交于 5月 21, 2018

In future scenarios we will want to use the TMF struct for more task types
than SSP.

As such, we can add struct hisi_sas_tmf_task directly into struct
hisi_sas_slot, and this will mean we can remove the TMF parameters from the
task prep functions.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

78bd2b4f

scsi: hisi_sas: Try wait commands before before controller reset · a865ae14

由 Xiaofei Tan 提交于 5月 21, 2018

We may reset the controller in many scenarios, such as SCSI EH and HW
errors. There should be no IO which returns from target when SCSI EH is
active. But for other scenarios, there may be.  It is not necessary to make
such IOs fail.

This patch adds an function of trying to wait for any commands, or IO, to
complete before host reset. If no more CQ returned from host controller in
100ms, we assume no more IO can return, and then stop waiting. We wait 5s
at most.

The HW has a register CQE_SEND_CNT to indicate the total number of CQs that
has been reported to driver. We can use this register and it is reliable to
resd this register in such scenarios that require host reset.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

a865ae14

scsi: hisi_sas: Create a scsi_host_template per HW module · 235bfc7f

由 Xiang Chen 提交于 5月 21, 2018

When a SCSI host is registered, the SCSI mid-layer takes a reference to a
module in Scsi_host.hostt.module. In doing this, we are prevented from
removing the driver module for the host in dangerous scenario, like when a
disk is mounted.

Currently there is only one scsi_host_template (sht) for all HW versions,
and this is the main.c module. So this means that we can possibly remove
the HW module in this dangerous scenario, as SCSI mid-layer is only
referencing the main.c module.

To fix this, create a sht per module, referencing that same module to
create the Scsi host.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

235bfc7f

scsi: hisi_sas: Add LED feature for v3 hw · 428f1b34

由 Xiaofei Tan 提交于 5月 21, 2018

This patch implements LED feature of directly attached disk for v3 hw.

In fact, this hw has created an SGPIO component for LED feature, and we can
control LEDs just by internal registers.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

428f1b34

scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate() · 757db2da

由 John Garry 提交于 5月 21, 2018

There is much common code and functionality between the HW versions to set
the PHY linkrate.

As such, this patch factors out the common code into a generic function
hisi_sas_phy_set_linkrate().
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

757db2da

18 5月, 2018 4 次提交

scsi: hisi_sas: Use device lock to protect slot alloc/free · e85d93b2

由 Xiang Chen 提交于 5月 09, 2018

The IPTT of a slot is unique, and we currently use hisi_hba lock to
protect it.

Now slot is managed on hisi_sas_device.list, so use DQ lock to protect
for allocating and freeing the slot.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

e85d93b2

scsi: hisi_sas: Don't lock DQ for complete task sending · fa222db0

由 Xiang Chen 提交于 5月 09, 2018

Currently we lock the DQ to protect whole delivery process.  So this
stops us building slots for the same queue in parallel, and can affect
performance.

To optimise it, only lock the DQ during special periods, specifically
when allocating a slot from the DQ and when delivering a slot to the HW.

This approach is now safe, thanks to the previous patches to ensure that
we always deliver a slot to the HW once allocated.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

fa222db0

scsi: hisi_sas: make return type of prep functions void · a2b3820b

由 Xiang Chen 提交于 5月 09, 2018

Since the task prep functions now should not fail, adjust the return
types to void.

In addition, some checks in the task prep functions are relocated to the
main module; this is specifically the check for the number of elements
in an sg list exceeded the HW SGE limit.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

a2b3820b

scsi: hisi_sas: relocate smp sg map · 7eee4b92

由 Xiang Chen 提交于 5月 09, 2018

Currently we use DQ lock to protect delivery of DQ entry one by one.

To optimise to allow more than one slot to be built for a single DQ in
parallel, we need to remove the DQ lock when preparing slots, prior to
delivery.

To achieve this, we rearrange the slot build order to ensure that once
we allocate a slot for a task, we do cannot fail to deliver the task.

In this patch, we rearrange the slot building for SMP tasks to ensure
that sg mapping part (which can fail) happens before we allocate the
slot in the DQ.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

7eee4b92

08 5月, 2018 8 次提交

scsi: hisi_sas: workaround a v3 hw hilink bug · f70c1251

由 Xiaofei Tan 提交于 5月 02, 2018

There is an SoC bug of v3 hw development version. When hot- unplugging a
directly attached disk, the PHY down interrupt may not happen. It is
very easy to appear on some boards.

When this issue occurs, the controller will receive many invalid dword
frames, and the "alos" fields of register HILINK_ERR_DFX can indicate
that disk was unplugged.

As an workaround solution, this patch detects this issue in the channel
interrupt, and workaround it by following steps:

 - Disable the PHY
 - Clear error code and interrupt
 - Enable the PHY

Then the HW will reissue PHY down interrupt.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

f70c1251

scsi: hisi_sas: add readl poll timeout helper wrappers · 9b8addf3

由 John Garry 提交于 5月 02, 2018

It is common to use readl poll timeout helpers in the driver, so create
custom wrappers.
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

9b8addf3

scsi: hisi_sas: remove redundant handling to event95 for v3 · bf081d5d

由 Xiaofei Tan 提交于 5月 02, 2018

Event95 is used for DFX purpose. The relevant bit for this interrupt in
the ENT_INT_SRC_MSK3 register has been disabled, so remove the
processing.
Signed-off-by: NXiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

bf081d5d

scsi: hisi_sas: config ATA de-reset as an constrained command for v3 hw · 94135327

由 Xiang Chen 提交于 5月 02, 2018

As a unconstrained command, a command can be sent to SATA disk even if
SATA disk status is BUSY, ERR or DRQ.

If an ATA reset assert is successful but ATA reset de-assert fails, then
it will retry the reset de-assert. If reset de- assert retry is
successful, we think it is okay to probe the device but actually it
still has Err status.

Apparently we need to retry the ATA reset assertion and de- assertion
instead for this mentioned scenario.

As such, we config ATA reset assert as a constrained command, if ATA
reset de-assert fails, then ATA reset de-assert retry will also
fail. Then we will retry the proper process of ATA reset assert and
de-assert again.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

94135327

scsi: hisi_sas: update PHY linkrate after a controller reset · c2c1d9de

由 Xiang Chen 提交于 5月 02, 2018

After the controller is reset, we currently may not honour the PHY max
linkrate set via sysfs, in that after a reset we always revert to max
linkrate of 12Gbps, ignoring the value set via sysfs.

This patch modifies to policy to set the programmed PHY linkrate,
honouring the max linkrate programmed via sysfs.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

c2c1d9de

scsi: hisi_sas: check host frozen before calling "done" function · cd938e53

由 Xiang Chen 提交于 5月 02, 2018

When the host is frozen in SCSI EH state, at any point after the LLDD
sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free
the task; see sas_scsi_find_task().

This puts the LLDD in a difficult position, in that once it sets
SAS_TASK_STATE_DONE for the task state it should not reference the
sas_task again. But the LLDD needs will check the sas_task indirectly in
calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done()
(to check if the host is frozen state actually).

And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after
task->task_done() is called (as the sas_task is free'd at this point).

This situation would seem to be a problem made by libsas.

To work around, check in the LLDD whether the host is in frozen state to
ensure it is ok to call task->task_done() function. If in the frozen
state, we rely on SCSI EH and libsas to free the sas_task directly.

We do not do this for the following IO types:

 - SMP - they are managed in libsas directly, outside SCSI EH
 - Any internally originated IO, for similar reason
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

cd938e53

scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice · b81b6cce

由 Xiang Chen 提交于 5月 02, 2018

If the SCSI host enters EH, any pending IO will be processed by SCSI
EH. However it is possible that SCSI EH will try to abort the IO and
also at the same time the IO completes in the driver. In this situation
there is a small chance of freeing the sas_task twice.

Then if another IO re-uses freed sas_task before the second time of
free'ing sas_task, it is possible to free incorrect sas_task.

To avoid this situation, add some checks to increase reliability. The
sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually
protect the LLDD and libsas freeing the task.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

b81b6cce

scsi: hisi_sas: optimise the usage of DQ locking · 24cf4361

由 Xiang Chen 提交于 5月 02, 2018

In the DQ tasklet processing it is not necessary to take the DQ lock, as
there is no contention between adding slots to the CQ and removing slots
from the matching DQ.

In addition, since we run each DQ in a separate tasklet context, there
would be no possible contention between DQ processing running for the
same queue in parallel.

It is still necessary to take hisi_hba lock when free'ing slots.
Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: NJohn Garry <john.garry@huawei.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

24cf4361

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功