- 02 9月, 2021 15 次提交
-
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit e8a4d0da category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e8a4d0daaef6fc8f965ca0b8e9585aa9698a0f24 ------------------------------------------------------------------------ If an internal task abort timeout occurs, the controller has developed a fault, and needs to be reset to be recovered. When this occurs during error handling, the current policy is to allow error handling to continue, and the inevitable nexus ha reset will handle the required reset. However various steps of error handling need to taken before this happens. These also involve some level of HW interaction, which will also fail with various timeouts. Speed up this process by recording a HW fault bit for an internal abort timeout - when this is set, just automatically error any HW interaction, and essentially go straight to clear nexus ha (to reset the controller). Link: https://lore.kernel.org/r/1623058179-80434-6-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 63ece9eb category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=63ece9eb350312ee33327269480482dfac8555db ------------------------------------------------------------------------ If an internal task abort timeout occurs, the controller has developed a fault, and needs to be reset to be recovered. However if a timeout occurs during SCSI error handling, issuing a controller reset immediately may conflict with the error handling. To handle internal abort in these two scenarios, only queue the reset when not in an error handling function. In the case of a timeout during error handling, do nothing and rely on the inevitable ha nexus reset to reset the controller. Link: https://lore.kernel.org/r/1623058179-80434-5-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 2f12a499 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f12a499511f40c268d6dfa4bf7fbe2344d2e6d3 ------------------------------------------------------------------------ Include HZ in timer macros to make the code more concise. Link: https://lore.kernel.org/r/1623058179-80434-4-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 0f757339 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0f757339919d31533aeadbbfd62f2dd4a49e851f ------------------------------------------------------------------------ For a clear nexus reset operation, the I_T nexus resets are executed serially for each device. For devices attached through an expander, this may take 2s per device; so, in total, could take a long time. Reduce the total time by running the I_T nexus resets in parallel through async operations. Link: https://lore.kernel.org/r/1623058179-80434-3-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 366da0da category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=366da0da1f5fe4e7e702f5864a557e57f485431f ------------------------------------------------------------------------ If an OOB event is received but the phy still fails to come up, a link reset will be issued repeatedly at an interval of 20s until the phy comes up. Set a limit for link reset issue retries to avoid printing the timeout message endlessly. Link: https://lore.kernel.org/r/1623058179-80434-2-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit f4df167a category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f4df167ad5a2274c12680ba3e7d816d32d1fc375 ------------------------------------------------------------------------ Add (pseudo) SAS address for ATA software reset failure log to assist in debugging. Link: https://lore.kernel.org/r/1617709711-195853-7-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Jianqin Xie 提交于
mainline inclusion from mainline-master commit 2c74cb1f category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2c74cb1f9222ebfcc204c02018275ad167d25212 ------------------------------------------------------------------------ The debugfs snapshot should be executed before the reset occurs to ensure that the register contents are saved properly. As such, it is incorrect to queue the debugfs dump when running a reset as the reset will occur prior to the snapshot work item is handler. Therefore, directly snapshot registers in the reset work handler. Link: https://lore.kernel.org/r/1617709711-195853-5-git-send-email-john.garry@huawei.comSigned-off-by: NJianqin Xie <xiejianqin@hisilicon.com> Signed-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Xiang Chen 提交于
mainline inclusion from mainline-master commit f4676665 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f467666504bf0c7eae95b929d0c86f77ff9b4356 ------------------------------------------------------------------------ Function sas_unregister_ha() needs to be called to roll back if hisi_hba->hw->hw_init() fails in function hisi_sas_probe() or hisi_sas_v3_probe(). Make that change. Link: https://lore.kernel.org/r/1617709711-195853-4-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 1dbe61bf category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1dbe61bf7d760547d16ccf057572e641a653ad4a ------------------------------------------------------------------------ Add a config option to enable debugfs support by default. And if debugfs support is enabled by default, dump count default value is increased to 50 as generally users want something bigger than the current default in that situation. Link: https://lore.kernel.org/r/1611659068-131975-4-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-master commit 69bfa5fd category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=69bfa5fd7b448b2cd0cce6a301cf3fba8133ca0f ------------------------------------------------------------------------ Now that v2 and v3 hw expose their HW queues (and so shost.nr_hw_queues is set), remove the conditional checks in hisi_sas_task_prep(). This change would affect v1 HW performance (as it does not expose HW queues), but nobody uses it and support may be dropped soon. Link: https://lore.kernel.org/r/1611659068-131975-3-git-send-email-john.garry@huawei.comReviewed-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ahmed S. Darwish 提交于
mainline inclusion from mainline-master commit 872a90b5 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=872a90b5b46646c6d4cdc15a265a55b1adb25b49 ------------------------------------------------------------------------ libsas event notifiers required an extension where gfp_t flags must be explicitly passed. For bisectability, a temporary _gfp() variant of such functions were added. All call sites then got converted use the _gfp() variants and explicitly pass GFP context. Having no callers left, the original libsas notifiers were then modified to accept gfp_t flags by default. Switch back to the original libas API, while still passing GFP context. The libsas _gfp() variants will be removed afterwards. Link: https://lore.kernel.org/r/20210118100955.1761652-14-a.darwish@linutronix.deReviewed-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ahmed S. Darwish 提交于
mainline inclusion from mainline-master commit 26c7efc3 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=26c7efc3f95260fd90e6cb268b47a58cf27ffc64 ------------------------------------------------------------------------ Use the new libsas event notifiers API, which requires callers to explicitly pass the gfp_t memory allocation flags. Below are the context analysis for modified functions: => hisi_sas_bytes_dmaed(): Since it is invoked from both process and atomic contexts, let its callers pass the gfp_t flags: * hisi_sas_main.c: ------------------ hisi_sas_phyup_work(): workqueue context -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_controller_reset_done(): has an msleep() -> hisi_sas_rescan_topology() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) hisi_sas_debug_I_T_nexus_reset(): calls wait_for_completion_timeout() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_KERNEL) * hisi_sas_v1_hw.c: ------------------- int_abnormal_v1_hw(): irq handler -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) * hisi_sas_v[23]_hw.c: ---------------------- int_phy_updown_v[23]_hw(): irq handler -> phy_down_v[23]_hw() -> hisi_sas_phy_down() -> hisi_sas_bytes_dmaed(..., GFP_ATOMIC) => int_bcast_v1_hw() and phy_bcast_v3_hw(): Both are invoked exclusively from irq handlers. Pass GFP_ATOMIC. Link: https://lore.kernel.org/r/20210118100955.1761652-12-a.darwish@linutronix.deReviewed-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 John Garry 提交于
mainline inclusion from mainline-master commit 74a29219 category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=74a2921948ed8c0e7f079a98442ec3493168cc85 ------------------------------------------------------------------------ As a performance enhancement, make the completion queue interrupts managed. In addition, in commit bf0beec0 ("blk-mq: drain I/O when all CPUs in a hctx are offline"), CPU hotplug for MQ devices using managed interrupts is made safe. So expose HW queues to blk-mq to take advantage of this. Flag Scsi_host.host_tagset is also set to ensure that the HBA is not sent more commands than it can handle. However the driver still does not use request tag for IPTT as there are many HW bugs means that special rules apply for IPTT allocation. Link: https://lore.kernel.org/r/1606905417-183214-6-git-send-email-john.garry@huawei.comSigned-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Ahmed S. Darwish 提交于
mainline inclusion from mainline-master commit 18577cdc category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18577cdcaeeb7a1ca5c3adc4d92ed2ba75699625 ------------------------------------------------------------------------ hisi_sas_task_exec() uses preemptible() to see if it's safe to block. This does not work for CONFIG_PREEMPT_COUNT=n kernels in which preemptible() always returns 0. The problem is masked when enabling some of the common Kconfig.debug options (like CONFIG_DEBUG_ATOMIC_SLEEP), as they implicitly enable the preemption counter. In general, driver leaf functions should not make logic decisions based on the context they're called from. The caller should be the entity responsible for explicitly indicating context. Since hisi_sas_task_exec() already has a gfp_t flags parameter, use it as the explicit context marker. Link: https://lore.kernel.org/r/20201126132952.2287996-3-bigeasy@linutronix.de Fixes: 214e702d ("scsi: hisi_sas: Adjust task reject period during host reset") Fixes: 550c0d89 ("scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()") Cc: Xiaofei Tan <tanxiaofei@huawei.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Cc: John Garry <john.garry@huawei.com> Acked-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
由 Luo Jiaxing 提交于
mainline inclusion from mainline-master commit 623a4b6d category: bugfix bugzilla: 175270 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=623a4b6d5c2a7595f677fa17348dbca6b461f16a ------------------------------------------------------------------------ Relocate all the debugfs code for DFX to v3 hw since no other versions support it. Link: https://lore.kernel.org/r/1606207594-196362-4-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Reviewed-by: NOuyangdelong <ouyangdelong@huawei.com> Signed-off-by: NNifujia <nifujia1@hisilicon.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 13 4月, 2021 1 次提交
-
-
由 John Garry 提交于
stable inclusion from stable-5.10.26 commit 18c3c04e8e53ee6008375cec1fb006a19f991746 bugzilla: 51363 -------------------------------- [ Upstream commit 121181f3 ] LLDDs report events to libsas with .notify_port_event and .notify_phy_event callbacks. These callbacks are fixed and so there is no reason why the functions cannot be called directly, so do that. This neatens the code slightly, makes it more obvious, and reduces function pointer usage, which is generally a good thing. Downside is that there are 2x more symbol exports. [a.darwish@linutronix.de: Remove the now unused "sas_ha" local variables] Link: https://lore.kernel.org/r/20210118100955.1761652-3-a.darwish@linutronix.deReviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NAhmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NSasha Levin <sashal@kernel.org> Signed-off-by: NChen Jun <chenjun102@huawei.com> Acked-by: N Weilong Chen <chenweilong@huawei.com> Signed-off-by: NZheng Zengkai <zhengzengkai@huawei.com>
-
- 08 12月, 2020 1 次提交
-
-
由 Xiang Chen 提交于
For when managed interrupts are used (and shost->nr_hw_queues is set), a fixed queue - set per-device - is still used for internal I/Os. If all the CPUs mapped to that queue are offlined, then the completions for that queue are not serviced and any internal I/Os will time out. Fix by selecting a queue for internal I/Os from the queue mapped from the current CPU in this scenario. This is still not ideal as it does not deal with CPU hotplug for inflight internal I/Os, and needs proper support from [0]. [0] https://lore.kernel.org/linux-scsi/20200703130122.111448-1-hare@suse.de/T/#m7d77d049b18f33a24ef206af69ebb66d07440556 Link: https://lore.kernel.org/r/1607347855-59091-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 27 10月, 2020 1 次提交
-
-
由 John Garry 提交于
In commit 8d98416a ("scsi: hisi_sas: Switch v3 hw to MQ"), the dispatch function was changed to choose the delivery queue based on the request tag HW queue index. This heavily degrades performance for v2 hw, since the HW queues are not exposed there, and, as such, HW queue #0 is used for every command. Revert to previous behaviour for when nr_hw_queues is not set, that being to choose the HW queue based on target device index. Link: https://lore.kernel.org/r/1602750425-240341-1-git-send-email-john.garry@huawei.com Fixes: 8d98416a ("scsi: hisi_sas: Switch v3 hw to MQ") Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 07 10月, 2020 2 次提交
-
-
由 Xiang Chen 提交于
Currently the PHY state is set according to the state of the PHYs after reset. This is invalid as the PHYs are already re-initialized. Set PHY state according to the state before the reset instead of after. Link: https://lore.kernel.org/r/1601649038-25534-8-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
Currently sas_resume_ha() is called while resuming the controller to wait for all suspended PHYs to come up and all the libsas events to be completed. There is a scenario which will cause task hung: For direct attach with two disks connected with two PHYs, disable phy0 before suspending the disk on phy1 and the controller, then enable phy0 and resume the controller, and task hung occurs as follows: [ 591.901463] hisi_sas_v3_hw 0000:b4:02.0: resuming from operating state [D0] [ 593.113525] hisi_sas_v3_hw 0000:b4:02.0: neither _PS0 nor _PR0 is defined [ 593.120301] hisi_sas_v3_hw 0000:b4:02.0: waiting up to 25 seconds for 1 phy to resume [ 593.120836] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.134680] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy1 link_rate=10(sata) [ 593.134733] sas: phy-2:0 added to port-2:0, phy_mask:0x1 (5000000000000200) [ 593.148350] sas: DOING DISCOVERY on port 0, pid:948 [ 593.153227] hisi_sas_v3_hw 0000:b4:02.0: dev[3:5] found [ 593.159840] sas: Enter sas_scsi_recover_host busy: 0 failed: 0 [ 593.165663] sas: ata7: end_device-2:0: dev error handler [ 593.165730] sas: ata2: end_device-2:1: dev error handler [ 593.172532] hisi_sas_v3_hw 0000:b4:02.0: phydown: phy0 phy_state=0x2 [ 593.182570] hisi_sas_v3_hw 0000:b4:02.0: ignore flutter phy0 down [ 593.331277] hisi_sas_v3_hw 0000:b4:02.0: phyup: phy0 link_rate=10(sata) [ 593.498956] ata7.00: ATA-11: SAMSUNG MZ7LH960HAJR-00005, HXT7404Q, max UDMA/133 [ 593.506235] ata7.00: 1875385008 sectors, multi 16: LBA48 NCQ (depth 32) [ 593.514295] ata7.00: configured for UDMA/133 [ 593.518557] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1 [ 593.528613] sas: ata7: end_device-2:0: model:SAMSUNG MZ7LH960HAJR-00005 serial:S45NNA0M712225 [ 593.537520] device_link_add 316: dev=2:0:2:0 supplier:2 consumer:0 [ 593.543674] device_link_add 324 [ 593.546801] device_link_add 352 [ 593.549930] device_link_add 406 [ 593.553058] device_link_add 440: dev=2:0:2:0 supplier:2 consumer:0 [ 593.559208] device_link_add 444 [ 593.562335] device_link_add 455 [ 593.565517] scsi 2:0:2:0: Direct-Access ATA SAMSUNG MZ7LH960 404Q PQ: 0 ANSI: 5 [ 620.057464] phy-2:1: resume timeout [ 738.841445] INFO: task kworker/u256:0:8 blocked for more than 120 seconds. [ 738.848295] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.854361] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.862155] kworker/u256:0 D 0 8 2 0x00000028 [ 738.867626] Workqueue: 0000:b4:02.0_event_q sas_port_event_worker [ 738.873693] Call trace: [ 738.876133] __switch_to+0xf4/0x148 [ 738.879613] __schedule+0x270/0x5d8 [ 738.883091] schedule+0x78/0x110 [ 738.886307] schedule_timeout+0x1ac/0x280 [ 738.890299] wait_for_completion+0x94/0x138 [ 738.894472] flush_workqueue+0x114/0x438 [ 738.898377] sas_porte_bytes_dmaed+0x400/0x500 [ 738.902801] sas_port_event_worker+0x28/0x40 [ 738.907053] process_one_work+0x1e8/0x360 [ 738.911046] worker_thread+0x44/0x478 [ 738.914698] kthread+0x150/0x158 [ 738.917915] ret_from_fork+0x10/0x1c [ 738.921534] INFO: task kworker/u256:1:948 blocked for more than 120 seconds. [ 738.928550] Not tainted 5.8.0-rc1-76154-g0d52b59-dirty #744 [ 738.934614] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 738.942408] kworker/u256:1 D 0 948 2 0x00000028 [ 738.947873] Workqueue: 0000:b4:02.0_disco_q sas_discover_domain [ 738.953766] Call trace: [ 738.956203] __switch_to+0xf4/0x148 [ 738.959678] __schedule+0x270/0x5d8 [ 738.963152] schedule+0x78/0x110 [ 738.966368] rpm_resume+0xcc/0x550 [ 738.969757] __pm_runtime_resume+0x3c/0x88 [ 738.973836] rpm_get_suppliers+0x50/0x148 [ 738.977829] __pm_runtime_set_status+0x124/0x2f0 [ 738.982427] scsi_sysfs_add_sdev+0x1a0/0x2a8 [ 738.986679] scsi_probe_and_add_lun+0x888/0xab0 [ 738.991190] __scsi_scan_target+0xec/0x520 [ 738.995268] scsi_scan_target+0x11c/0x128 [ 738.999261] sas_rphy_add+0x15c/0x1e8 [ 739.002907] sas_probe_devices+0xe4/0x150 [ 739.006899] sas_discover_domain+0x33c/0x588 [ 739.011150] process_one_work+0x1e8/0x360 [ 739.015143] worker_thread+0x44/0x478 [ 739.018789] kthread+0x150/0x158 [ 739.022003] ret_from_fork+0x10/0x1c ... If an extra phy0 up happens during resume of the SAS controller, it will emit a new libsas event (event PORTE_BYTES_DMAED and event DISCE_DISCOVER_DOMAIN). We will call function scsi_sysfs_add_sdev() in event DISCE_DISCOVER_DOMAIN, which will call __pm_runtime_set_status() to resume supplier (host controller). For runtime PM core, if device is in the resuming state, the later resume request of the device will wait for previous resume request to complete synchronously. At that point in time the state of the controller is still resuming as it waits for all libsas events to be completed, while libsas event DISCE_DISCOVER_DOMAIN is blocked as the state of the controller is resuming which causes a deadlock. To avoid the issue, filter out new PHY up events while the controller is suspended. Link: https://lore.kernel.org/r/1601649038-25534-7-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 06 10月, 2020 1 次提交
-
-
由 John Garry 提交于
Now that the block layer provides a shared tag, we can switch the driver to expose all HW queues. Signed-off-by: NJohn Garry <john.garry@huawei.com> Tested-by: NDouglas Gilbert <dgilbert@interlog.com> Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 9月, 2020 5 次提交
-
-
由 Luo Jiaxing 提交于
Remove extra blank lines and add spaces around operators. Link: https://lore.kernel.org/r/1598958790-232272-9-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
Newline is missing from some printk() statements. Add them. Link: https://lore.kernel.org/r/1598958790-232272-8-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
Through the new debugfs interface the user can select fixed code patterns. Add two new interfaces fixed_code and fixed_code1. Link: https://lore.kernel.org/r/1598958790-232272-7-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
Add BIST support for phy FFE (Feed forward equalizer) setting. The user can configure FFE through the new debugfs interface. FFE is a parameter used for link layer control. It will affect the link quality between the SAS controller and the backplane. In the BIST test, the FFE interface is provided to assist board testers in optimizing link parameters. The modification of the FFE parameter will affect the test after BIST or the normal running of the board. The user should save the initial FFE values and restore them after BIST test is complete. Link: https://lore.kernel.org/r/1598958790-232272-6-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
hisi_sas_slot_task_free() attempts to dereference SSP task for non-ATA tasks. If the task is SMP, the code may reference the wrong structure although this may not cause any problems. To avoid this, only access to SSP task when slot->n_elem_dif is not 0 which indicates this is an SSP task. Link: https://lore.kernel.org/r/1598958790-232272-2-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 24 8月, 2020 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 20 5月, 2020 1 次提交
-
-
由 Luo Jiaxing 提交于
We found out that after phy up, the hardware reports another oob interrupt but did not follow a phy up interrupt: oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout We run link reset when wait phy up timeout, and it send a normal disk into reset processing. So we made some circumvention action in the code, so that this abnormal oob interrupt will not start the timer to wait for phy up. Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 21 1月, 2020 4 次提交
-
-
由 John Garry 提交于
In future we will want to use hisi_sas_cq.pci_irq_mask for non-pci interrupt masks, so rename to be more general. Link: https://lore.kernel.org/r/1579522957-4393-7-git-send-email-john.garry@huawei.comSigned-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
The trigger_dump file is only used to manually trigger the dump, and did not provide a read callback function for it, so its file permission setting to 600 is wrong,and should be changed to 200. Link: https://lore.kernel.org/r/1579522957-4393-5-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
After changing tasklet to workqueue or threaded irq, some critical resources are only used on threads (not in interrupt or bottom half of interrupt), so replace spin_lock_irqsave/spin_unlock_restore with spin_lock/spin_unlock to protect those critical resources. Link: https://lore.kernel.org/r/1579522957-4393-3-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
Currently IRQ_EFFECTIVE_AFF_MASK is enabled for ARM_GIC and ARM_GIC3, so it only allows a single target CPU in the affinity mask to process interrupts and also interrupt thread, and the performance of using threaded irq is almost the same as tasklet. But if the config is not enabled, the interrupt thread will be allowed all the CPUs in the affinity mask. At that situation it improves the performance (about 20%). Note: IRQ_EFFECTIVE_AFF_MASK is configured differently for different architecture chip, and it seems to be better to make it be configured easily. Link: https://lore.kernel.org/r/1579522957-4393-2-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 13 11月, 2019 2 次提交
-
-
由 John Garry 提交于
The !! operator on a bool is pointless, so remove an example in hisi_sas_rescan_topology(). Link: https://lore.kernel.org/r/1573551059-107873-5-git-send-email-john.garry@huawei.comSigned-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Xiang Chen 提交于
Need to check the structure sas_port before using it. Link: https://lore.kernel.org/r/1573551059-107873-2-git-send-email-john.garry@huawei.comSigned-off-by: NXiang Chen <chenxiang66@hisilicon.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 25 10月, 2019 6 次提交
-
-
由 Luo Jiaxing 提交于
The number of phy down reflects the quality of the link between SAS controller and disk. In order to allow the user to confirm the link quality of the system, we record the number of phy down for each phy. The user can check the current phy down count by reading the debugfs file corresponding to the specific phy, or clear the phy down count by writing 0 to the debugfs file. Link: https://lore.kernel.org/r/1571926105-74636-19-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
Although if the debugfs initialization fails, we will delete the debugfs folder of hisi_sas, but we did not consider the scenario where debugfs was successfully initialized, but the probe failed for other reasons. We found out that hisi_sas folder is still remain after the probe failed. When probe fail, we should delete debugfs folder to avoid the above issue. Link: https://lore.kernel.org/r/1571926105-74636-18-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
We use the module parameter debugfs_dump_count to manage the upper limit of the memory block for multiple dumps. Link: https://lore.kernel.org/r/1571926105-74636-17-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
We still only use dump index #0 however. Link: https://lore.kernel.org/r/1571926105-74636-16-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
We add multiple dumps for debugfs, but only allocate memory this time and only dump #0. Link: https://lore.kernel.org/r/1571926105-74636-15-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Luo Jiaxing 提交于
Create a file structure which was used to save the memory address for ITCT cache at debugfs. This structure is bound to the corresponding debugfs file, it can help callback function of debugfs file to get what it needs. Link: https://lore.kernel.org/r/1571926105-74636-14-git-send-email-john.garry@huawei.comSigned-off-by: NLuo Jiaxing <luojiaxing@huawei.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-