- 17 12月, 2020 1 次提交
-
-
由 Martin K. Petersen 提交于
This reverts commit 1a0e1943. Commit b3c6a599 ("block: Fix a lockdep complaint triggered by request queue flushing") has been reverted and commit fb01a293 has been introduced in its place. Consequently, it is now safe to reinstate the megaraid_sas tagset changes that led to boot problems in 5.10. Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 09 12月, 2020 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit 103fbf8e. It turns out that it causes long boot-time latencies (to the point of timeouts and failed boots). The cause is the increase in request queues, and a fix for that is queued up for 5.11, but we're reverting this commit that triggered the problem for now. Reported-and-tested-by: NJohn Garry <john.garry@huawei.com> Reported-and-tested-by: NJulia Lawall <julia.lawall@inria.fr> Reported-by: NQian Cai <cai@redhat.com> Acked-by: NJens Axboe <axboe@kernel.dk> Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/linux-scsi/fe3dff7dae4494e5a88caffbb4d877bbf472dceb.camel@redhat.com/ Link: https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2012081813310.2680@hadrien/ Link: https://lore.kernel.org/linux-block/20201203012638.543321-1-ming.lei@redhat.com/Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 10月, 2020 1 次提交
-
-
由 Kashyap Desai 提交于
Fusion adapters can steer completions to individual queues, and we now have support for shared host-wide tags. So we can enable multiqueue support for fusion adapters. Once driver enable shared host-wide tags, cpu hotplug feature is also supported as it was enabled using below patchsets - commit bf0beec0 ("blk-mq: drain I/O when all CPUs in a hctx are offline") Currently driver has provision to disable host-wide tags using "host_tagset_enable" module parameter. Once we do not have any major performance regression using host-wide tags, we will drop the hand-crafted interrupt affinity settings. Performance is also meeting the expecatation - (used both none and mq-deadline scheduler) 24 Drive SSD on Aero with/without this patch can get 3.1M IOPs 3 VDs consist of 8 SAS SSD on Aero with/without this patch can get 3.1M IOPs. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NHannes Reinecke <hare@suse.com> Signed-off-by: NJohn Garry <john.garry@huawei.com> Tested-by: NDouglas Gilbert <dgilbert@interlog.com> Acked-by: NMartin K. Petersen <martin.petersen@oracle.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
- 03 9月, 2020 1 次提交
-
-
由 Tomas Henzl 提交于
disable_irq() might sleep. Replace it with disable_irq_nosync() which is sufficient as irq_poll_scheduled protects against concurrently running complete_cmd_fusion() from megasas_irqpoll() and megasas_isr_fusion(). Link: https://lore.kernel.org/r/20200827165332.8432-1-thenzl@redhat.com Fixes: a6ffd5bf scsi: megaraid_sas: Call disable_irq from process IRQ poll Signed-off-by: NTomas Henzl <thenzl@redhat.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 24 8月, 2020 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
-
- 16 7月, 2020 1 次提交
-
-
由 Chandrakanth Patil 提交于
As the ENABLE_IRQ_POLL macro is undefined, the check for ENABLE_IRQ_POLL macro in ISR will always be false. This leads to irq polling being non-functional. Remove ENABLE_IRQ_POLL check from ISR. Link: https://lore.kernel.org/r/20200715120153.20512-1-chandrakanth.patil@broadcom.com Fixes: a6ffd5bf ("scsi: megaraid_sas: Call disable_irq from process IRQ") Cc: <stable@vger.kernel.org> # v5.3+ Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 08 7月, 2020 2 次提交
-
-
由 Damien Le Moal 提交于
Move function declarations to megaraid_sas.h to avoid warnings such as: warning: no previous prototype for ‘xxx' No functional changes. Link: https://lore.kernel.org/r/20200706123346.451827-1-damien.lemoal@wdc.comSigned-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Damien Le Moal 提交于
Fix kernel documentation comments to avoid various warnings when compiling with W=1. No functional changes. Link: https://lore.kernel.org/r/20200706123345.451783-1-damien.lemoal@wdc.comSigned-off-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 12 5月, 2020 2 次提交
-
-
由 Sumit Saxena 提交于
When TM command times out, driver invokes the controller reset. Post reset, driver re-fires pended TM commands which leads to firmware crash. Post controller reset, return pended TM commands back to OS. Link: https://lore.kernel.org/r/20200508085242.23406-1-chandrakanth.patil@broadcom.com Cc: stable@vger.kernel.org Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Sumit Saxena 提交于
As blk_queue_virt_boundary() API in slave_configure ensures that no IOs will come with holes/gaps. Hence, code logic to detect the holes/gaps in IO buffer is not required. Link: https://lore.kernel.org/r/20200508083838.22778-3-chandrakanth.patil@broadcom.comSigned-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 25 4月, 2020 1 次提交
-
-
由 Jason Yan 提交于
Fix the following coccicheck warning: drivers/scsi/megaraid/megaraid_sas_fusion.c:4242:6-16: WARNING: Assignment of 0/1 to bool variable drivers/scsi/megaraid/megaraid_sas_fusion.c:4786:1-29: WARNING: Assignment of 0/1 to bool variable drivers/scsi/megaraid/megaraid_sas_fusion.c:4791:1-29: WARNING: Assignment of 0/1 to bool variable drivers/scsi/megaraid/megaraid_sas_fusion.c:4716:1-29: WARNING: Assignment of 0/1 to bool variable drivers/scsi/megaraid/megaraid_sas_fusion.c:4721:1-29: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/20200421034111.28353-1-yanaijie@huawei.comAcked-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 15 4月, 2020 1 次提交
-
-
由 Jason Yan 提交于
Fix the following sparse warning: drivers/scsi/megaraid/megaraid_sas_fusion.c:180:1: warning: symbol 'megasas_enable_intr_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:202:1: warning: symbol 'megasas_disable_intr_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:4233:6: warning: symbol 'megasas_refire_mgmt_cmd' was not declared. Should it be static? Link: https://lore.kernel.org/r/20200407092827.18074-4-yanaijie@huawei.comReported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NJason Yan <yanaijie@huawei.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 13 2月, 2020 1 次提交
-
-
由 Tomas Henzl 提交于
Add a flag to DMA memory allocation to silence a warning. This driver allocates DMA memory for IO frames. This allocation may exceed MAX_ORDER pages for few megaraid_sas controllers (controllers with very high queue depth). Consequently, the driver has logic to keep reducing the controller queue depth until the DMA memory allocation succeeds. On impacted megaraid_sas controllers there would be multiple DMA allocation failures until driver settled on an allocation that fit. These failed DMA allocation requests caused stack traces in system logs. These were not harmful and this patch silences those warnings/stack traces. [mkp: clarified commit desc] Link: https://lore.kernel.org/r/20200204152413.7107-1-thenzl@redhat.comSigned-off-by: NTomas Henzl <thenzl@redhat.com> Acked-by: NSumit Saxena <sumit.saxena@broadcom.com> Reviewed-by: NLee Duncan <lduncan@suse.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 16 1月, 2020 5 次提交
-
-
由 Anand Lodnoor 提交于
Remove usage of device_busy counter from driver. Instead of device_busy counter now driver uses 'nr_active' counter of request_queue to get the number of inflight request for a LUN. Link: https://lore.kernel.org/r/1579000882-20246-11-git-send-email-anand.lodnoor@broadcom.com Link : https://patchwork.kernel.org/patch/11249297/Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NAnand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Anand Lodnoor 提交于
IOCTLs causing firmware fault may end up in failed controller resets and finally killing the adapter. This patch fixes this problem as stated below: In OCR sequence, driver will attempt refiring pended IOCTLs upto two times. If first two attempts fail, then in third attempt driver will return pended IOCTLs with EBUSY status to application. These changes are done to ensure if any of pended IOCTLs is causing firmware fault and resulting into OCR failure, then in last attempt of OCR driver will refrain firing it to firmware and saving adapter from being killed due to faulty IOCTL. Link: https://lore.kernel.org/r/1579000882-20246-10-git-send-email-anand.lodnoor@broadcom.comSigned-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NAnand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Anand Lodnoor 提交于
Driver initiates OCR if a DCMD command times out. But there is a deadlock if the driver attempts to invoke another OCR before the mutex lock (reset_mutex) is released from the previous session of OCR. This patch takes care of the above scenario using new flag MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.comSigned-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NAnand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Anand Lodnoor 提交于
After issuing a adapter reset, driver blindly used to set adprecovery flag to OPERATIONAL state. Add a check to see if the FW is operational before setting the flag and marking reset adapter successful. Link: https://lore.kernel.org/r/1579000882-20246-7-git-send-email-anand.lodnoor@broadcom.comSigned-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NAnand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Anand Lodnoor 提交于
At the time of firmware initialization, if JBOD map or RAID map is not available, driver can function without these features in a limited functionality mode. Link: https://lore.kernel.org/r/1579000882-20246-6-git-send-email-anand.lodnoor@broadcom.comSigned-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NAnand Lodnoor <anand.lodnoor@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 08 8月, 2019 2 次提交
-
-
由 Qian Cai 提交于
The commit de516379 ("scsi: megaraid_sas: changes to function prototypes") introduced a comilation warning due to it changed the function prototype of read_fw_status_reg() to take an instance pointer instead, but forgot to remove an unused variable. drivers/scsi/megaraid/megaraid_sas_fusion.c: In function 'megasas_fusion_update_can_queue': drivers/scsi/megaraid/megaraid_sas_fusion.c:326:39: warning: variable 'reg_set' set but not used [-Wunused-but-set-variable] struct megasas_register_set __iomem *reg_set; ^~~~~~~ Fixes: de516379 ("scsi: megaraid_sas: changes to function prototypes") Signed-off-by: NQian Cai <cai@lca.pw> Acked-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 YueHaibing 提交于
Fix sparse warnings: drivers/scsi/megaraid/megaraid_sas_fusion.c:3369:1: warning: symbol 'complete_cmd_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3535:6: warning: symbol 'megasas_sync_irqs' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3554:1: warning: symbol 'megasas_complete_cmd_dpc_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3573:13: warning: symbol 'megasas_isr_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3604:1: warning: symbol 'build_mpt_mfi_pass_thru' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3661:40: warning: symbol 'build_mpt_cmd' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3688:1: warning: symbol 'megasas_issue_dcmd_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3881:5: warning: symbol 'megasas_wait_for_outstanding_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:4005:6: warning: symbol 'megasas_refire_mgmt_cmd' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:4525:25: warning: symbol 'megasas_get_peer_instance' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:4825:7: warning: symbol 'megasas_fusion_crash_dump' was not declared. Should it be static? Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Acked-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 24 7月, 2019 1 次提交
-
-
由 YueHaibing 提交于
Fix sparse warnings: drivers/scsi/megaraid/megaraid_sas_fusion.c:541:1: warning: symbol 'megasas_alloc_cmdlist_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:580:1: warning: symbol 'megasas_alloc_request_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:661:1: warning: symbol 'megasas_alloc_reply_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:738:1: warning: symbol 'megasas_alloc_rdpq_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:920:1: warning: symbol 'megasas_alloc_cmds_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:1740:1: warning: symbol 'megasas_init_adapter_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:1966:1: warning: symbol 'map_cmd_status' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:2379:1: warning: symbol 'megasas_set_pd_lba' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:2718:1: warning: symbol 'megasas_build_ldio_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3215:1: warning: symbol 'megasas_build_io_fusion' was not declared. Should it be static? drivers/scsi/megaraid/megaraid_sas_fusion.c:3328:6: warning: symbol 'megasas_prepare_secondRaid1_IO' was not declared. Should it be static? Reported-by: NHulk Robot <hulkci@huawei.com> Signed-off-by: NYueHaibing <yuehaibing@huawei.com> Acked-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 27 6月, 2019 13 次提交
-
-
由 Chandrakanth Patil 提交于
For Aero adapters, driver provides three different performance modes controlled through module parameter named 'perf_mode'. Below are those performance modes: 0: Balanced - Additional high IOPS reply queues will be enabled along with low latency queues. Interrupt coalescing will be enabled only for these high IOPS reply queues. 1: IOPS - No additional high IOPS queues are enabled. Interrupt coalescing will be enabled on all reply queues. 2: Latency - No additional high IOPS queues are enabled. Interrupt coalescing will be disabled on all reply queues. This is a legacy behavior similar to Ventura & Invader Series. Default performance mode settings: - Performance mode set to 'Balanced', if Aero controller is working in 16GT/s PCIe speed. - Performance mode will be set to 'Latency' mode for all other cases. Through module parameter 'perf_mode', user can override default performance mode to desired one. Captured some performance numbers with these performance modes. 4k Random Read IO performance numbers on 24 SAS SSD drives for above three performance modes. Performance data is from Intel Skylake and HGST SS300 (drive model SDLL1DLR400GCCA1). IOPS: ----------------------------------------------------------------------- |perf_mode | qd = 1 | qd = 64 | note | |-------------|--------|---------|------------------------------------- |balanced | 259K | 3061k | Provides max performance numbers | | | | | both on lower QD workload & | | | | | also on higher QD workload | |-------------|--------|---------|------------------------------------- |iops | 220K | 3100k | Provides max performance numbers | | | | | only on higher QD workload. | |-------------|--------|---------|------------------------------------- |latency | 246k | 2226k | Provides good performance numbers | | | | | only on lower QD worklaod. | ----------------------------------------------------------------------- Average Latency: ----------------------------------------------------- |perf_mode | qd = 1 | qd = 64 | |-------------|--------------|----------------------| |balanced | 92.05 usec | 501.12 usec | |-------------|--------------|----------------------| |iops | 108.40 usec | 498.10 usec | |-------------|--------------|----------------------| |latency | 97.10 usec | 689.26 usec | ----------------------------------------------------- Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
The driver will use round-robin method for IO submission in batches within the high IOPS queues when the number of in-flight ios on the target device is larger than 8. Otherwise the driver will use low latency reply queues. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Driver should enable interrupt coalescing (during driver load and after Controller Reset) for High IOPS queues by masking appropriate bits in IOC INIT frame. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Aero controllers support balanced performance mode through the ability to configure queues with different properties. Reply queues with interrupt coalescing enabled are called "high iops reply queues" and reply queues with interrupt coalescing disabled are called "low latency reply queues". The driver configures a combination of high iops and low latency reply queues if: - HBA is an AERO controller; - MSI-X vectors supported by the HBA is 128; - Total CPU count in the system more than high iops queue count; - Driver is loaded with default max_msix_vectors module parameter; and - System booted in non-kdump mode. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Added driver support to allow passthrough MPI toolbox type MFI commands to firmware based on firmware capability. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
For RAID5/RAID6 volumes configured behind Aero, driver will be doing 64bit division operations on behalf of firmware as controller's ARM CPU is very slow in this division. Later, driver calculates Q-ARM, P-ARM and Log-ARM and passes those values to firmware by writing these values to RAID_CONTEXT. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
RAID1 PCI bandwidth limit algorithm is not applicable to Aero as it's PCIe Gen4 adapter. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Issue: This issue is applicable to scenario when JBOD sequence map is unavailable (memory allocation for JBOD sequence map failed) to driver but feature is supported by firmware. If the driver sends a JBOD IO by not adding 255 (MAX_PHYSICAL_DEVICES - 1) to device ID when underlying firmware supports JBOD sequence map, it will lead to the IO failure. Fix: For JBOD IOs, driver will not use the RAID map to fetch the devhandle if JBOD sequence map is unavailable. Driver will set Devhandle to 0xffff and Target ID to 'device ID + 255 (MAX_PHYSICAL_DEVICES - 1)'. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Firmware does not expect FastPath IO sent through Region Lock Bypass queue. Though firmware never exposes such settings when fastpath IO can be sent to RL bypass queue but it's safer to remove dead code which directs fastpath IO to RL Bypass queue. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Issue: Under certain conditions, controller goes in FAULT state after IOC INIT fired to firmware. Such Fault can be recovered through controller reset. Fix: In driver probe context, if firmware fault is observed post IOC INIT, driver would do controller reset followed by retry logic for IOC INIT command. Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
On PowerPC architecture, calling disable_irq_nosync from IRQ context is not providing the required effect. In current megaraid_sas driver, disable_irq_nosync is being called from IRQ context before enabling IRQ poll. But due to the issue seen on PPC, after IRQ poll disable and legacy ISR is enabled, we are not seeing our ISR getting called. Fix: Call disable_irq from IRQ poll thread context instead of IRQ context. Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Chandrakanth Patil 提交于
Aero adapters provides Atomic Request Descriptor as an alternative method for posting an entry onto a request queue. The posting of an Atomic Request Descriptor is an atomic operation, providing a safe mechanism for multiple processors on the host to post requests without synchronization. This Atomic Request Descriptor format is identical to first 32 bits of Default Request Descriptor and uses only 32 bits. If Aero adapters support Atomic descriptor, driver should use it for posting IOs and DCMDs to firmware. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NChandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 21 6月, 2019 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct MR_PD_CFG_SEQ_NUM_SYNC { ... struct MR_PD_CFG_SEQ seq[1]; } __packed; Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. So, replace the following form: sizeof(struct MR_PD_CFG_SEQ_NUM_SYNC) + (sizeof(struct MR_PD_CFG_SEQ) * (MAX_PHYSICAL_DEVICES - 1)) with: struct_size(pd_sync, seq, MAX_PHYSICAL_DEVICES - 1) This code was detected with the help of Coccinelle. Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 19 6月, 2019 5 次提交
-
-
由 Shivasharan S 提交于
Add a print to dump the interrupt status in system log for debugging. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Shivasharan S 提交于
When driver detects a firmware fault during load, dump additional information on fault code and subcode that will help in debugging. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Shivasharan S 提交于
This patch enhances the existing debug prints in reset and task management path. These debug prints in adapter reset path helps with debugging issues related to IO timeouts that are seen frequently in the field. Add additional debug prints to dump the pending command frames before initiating an adapter reset. Also, print FastPath IOs that are outstanding. Signed-off-by: NSumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Shivasharan S 提交于
Driver will use "reply descriptor post queues" in round robin fashion when the combined MSI-X mode is not enabled. With this IO completions are distributed and load balanced across all the available reply descriptor post queues equally. This is enabled only if combined MSI-X mode is not enabled in firmware. This improves performance and also fixes soft lockups. When load balancing is enabled, IRQ affinity from driver needs to be disabled. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Shivasharan S 提交于
Issue Description: We have seen cpu lock up issues from field if system has a large (more than 96) logical cpu count. SAS3.0 controller (Invader series) supports max 96 MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors. This may be a generic issue (if PCI device support completion on multiple reply queues). Let me explain it w.r.t megaraid_sas supported h/w just to simplify the problem and possible changes to handle such issues. MegaRAID controller supports multiple reply queues in completion path. Driver creates MSI-X vectors for controller as "minimum of (FW supported Reply queues, Logical CPUs)". If submitter is not interrupted via completion on same CPU, there is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO timeout, system sluggish etc. Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU (e.g. CPU B) is busy with processing the corresponding IO's reply descriptors from reply descriptor queue upon receiving the interrupts from HBA. If CPU A is continuously pumping the IOs then always CPU B (which is executing the ISR) will see the valid reply descriptors in the reply descriptor queue and it will be continuously processing those reply descriptor in a loop without quitting the ISR handler. megaraid_sas driver will exit ISR handler if it finds unused reply descriptor in the reply descriptor queue. Since CPU A will be continuously sending the IOs, CPU B may always see a valid reply descriptor (posted by HBA Firmware after processing the IO) in the reply descriptor queue. In worst case, driver will not quit from this loop in the ISR handler. Eventually, CPU lockup will be detected by watchdog. Above mentioned behavior is not common if "rq_affinity" set to 2 or affinity_hint is honored by irqbalancer as "exact". If rq_affinity is set to 2, submitter will be always interrupted via completion on same CPU. If irqbalancer is using "exact" policy, interrupt will be delivered to submitter CPU. Problem statement: If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not 1:1, we still have exposure of issue explained above and for that we don't have any solution. Exposure of soft/hard lockup is seen if CPU count is more than MSI-X supported by device. If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if CPU counts to MSI-X vector count ratio is something like X:1, where X > 1) then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU hard/soft lockups. There won't be any one to one mapping between CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue) is shared with group/set of CPUs and there is a possibility of having a loop in the IO path within that CPU group and may observe lockups. For example: Consider a system having two NUMA nodes and each node having four logical CPUs and also consider that number of MSI-X vectors enabled on the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1. e.g. MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1. numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 --> MSI-X 0 node 0 size: 65536 MB node 0 free: 63176 MB node 1 cpus: 4 5 6 7 --> MSI-X 1 node 1 size: 65536 MB node 1 free: 63176 MB Assume that user started an application which uses all the CPUs of NUMA node 0 for issuing the IOs. Only one CPU from affinity list (it can be any cpu since this behavior depends upon irqbalance) CPU0 will receive the interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission percentage will be decreasing and ISR processing percentage will be increasing as it is more busy with processing the interrupts. Gradually IO submission percentage on CPU 0 will be zero and it's ISR processing percentage will be 100% as IO loop has already formed within the NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with submitting the heavy IOs and only CPU 0 is busy in the ISR path as it always find the valid reply descriptor in the reply descriptor queue. Eventually, we will observe the hard lockup here. Chances of occurring of hard/soft lockups are directly proportional to value of X. If value of X is high, then chances of observing CPU lockups is high. Solution: Use IRQ poll interface defined in "irq_poll.c". megaraid_sas driver will execute ISR routine in softirq context and it will always quit the loop based on budget provided in IRQ poll interface. Driver will switch to IRQ poll only when more than a threshold number of reply descriptors are handled in one ISR. Currently threshold is set as 1/4th of HBA queue depth. In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is X:1 (where X > 1)), IRQ poll interface will avoid CPU hard lockups due to voluntary exit from the reply queue processing based on budget. Note - Only one MSI-X vector is busy doing processing. Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation. Signed-off-by: NKashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: NShivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-