- 25 July 2020, 1 commit
-
-
Committed by Lee Jones
Fixes the following W=1 kernel build warning(s):

drivers/scsi/lpfc/lpfc_init.c: In function ‘lpfc_dbg_print’:
drivers/scsi/lpfc/lpfc_init.c:14212:6: warning: function ‘lpfc_dbg_print’ might be a candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
14212 |   sizeof(phba->dbg_log[idx].log), fmt, args);
      |   ^~~~~~

Link: https://lore.kernel.org/r/20200723122446.1329773-16-lee.jones@linaro.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
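For reference, a minimal user-space sketch of what the warning asks for: marking a vsnprintf-style wrapper with a printf format attribute so the compiler can type-check callers' format strings. The struct and function names below are illustrative, not the lpfc code.

```c
#include <stdarg.h>
#include <stdio.h>

struct dbg_log {
	char buf[256];
};

/* format(printf, 2, 3) tells the compiler that argument 2 is a printf-style
 * format string consumed by the variadic arguments starting at 3, which is
 * exactly what -Wsuggest-attribute=format hints at. */
static void __attribute__((format(printf, 2, 3)))
dbg_print(struct dbg_log *log, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(log->buf, sizeof(log->buf), fmt, args);
	va_end(args);
}

int main(void)
{
	struct dbg_log log;

	dbg_print(&log, "eq %d bound to cpu %d", 3, 7);
	puts(log.buf);
	return 0;
}
```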
-
- 16 July 2020, 1 commit
-
-
Committed by Anton Blanchard
On a big box the lpfc driver emits a few thousand "Set Affinity" lines to the console. Reduce the priority of these from KERN_ERR to KERN_INFO, and also fix a few printks that had no log level.

Link: https://lore.kernel.org/r/20200713083908.1104927-1-anton@ozlabs.org
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 08 July 2020, 2 commits
-
-
Committed by Dick Kennedy

The expression start_idx - dbg_cnt is evaluated using unsigned int arithmetic (since these variables are unsigned ints) and hence can never be less than zero, so the less-than comparison is never true. Rewrite the expression to check for start_idx being less than dbg_cnt. Once that logic was corrected, the temp_idx calculation was found to be wrong as well, so fix it too.

Link: https://lore.kernel.org/r/20200706204246.130416-1-jsmart2021@gmail.com
Fixes: 372c187b ("scsi: lpfc: Add an internal trace log buffer")
CC: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Addresses-Coverity: ("Unsigned compared against 0")
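As an aside, a small stand-alone C illustration of the bug class described above (variable names and the ring size are illustrative, not the lpfc code): an unsigned subtraction wraps instead of going negative, so the guard has to compare the operands directly.

```c
#include <stdio.h>

#define DBG_LOG_SZ 256	/* assumed ring size, for illustration only */

int main(void)
{
	unsigned int start_idx = 3, dbg_cnt = 10, temp_idx;

	/* Broken form: the subtraction is unsigned, so instead of -7 it
	 * wraps to a huge value and any "< 0" test on it is never true. */
	printf("start_idx - dbg_cnt = %u\n", start_idx - dbg_cnt);

	/* Fixed form: compare the operands, then wrap the index explicitly. */
	if (start_idx < dbg_cnt)
		temp_idx = DBG_LOG_SZ - (dbg_cnt - start_idx);
	else
		temp_idx = start_idx - dbg_cnt;

	printf("temp_idx = %u\n", temp_idx);
	return 0;
}
```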
-
Committed by Dick Kennedy

On certain platforms it is possible that pci_alloc_irq_vectors() will affinitize irq vectors to multiple (possibly all) CPUs. The driver currently assumes exclusivity, with vectors doled out to different CPUs, and assigns primary ownership of each vector to the first CPU in the mask. The code doesn't check whether the CPU already owns a vector and unconditionally overwrites the CPU-to-vector mapping. This confuses the relationships between eq's and cq's, and gets worse when CPUs start to go offline. The net results are skipped interrupts, leading to mailbox timeouts, and oopses in the CPU offlining flows.

Fix this by changing the primary vector assignment: assign the eq to a CPU only if that CPU in the mask does not already have an assignment, and once primary ownership is assigned, break from the loop. For CPUs that were set previously but are not the primary owner, the lpfc_cpu_affinity_check() routine will balance the CPU-to-eq assignment.

Link: https://lore.kernel.org/r/20200706204230.130363-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 03 July 2020, 3 commits
-
-
Committed by Dick Kennedy

The current logging methods typically end up requesting a reproduction with a different logging level set in order to figure out what happened. This was largely by design, to avoid cluttering the kernel log with messages that were usually uninteresting and that could themselves cause other issues. When looking to build a better system, it was observed that the cases where more data was wanted usually coincided with another message, typically at KERN_ERR level, being logged, and that the additional logging that would then be enabled covered the same area; most of these areas fell within the discovery state machine.

Based on this, the following design has been put in place: the driver will maintain an internal log (256 elements of 256 bytes). The "additional logging" messages that are usually enabled in a reproduction are changed to always log, but to the internal log. A new logging level is defined, LOG_TRACE_EVENT. When this level is set (it is not by default) and a message marked KERN_ERR is logged, all messages in the internal log are dumped to the kernel log before the KERN_ERR message itself. Each message added to the internal log carries a timestamp; it is not converted to wall time when logged and serves only as a crude time reference for ordering the messages.

Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
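A minimal user-space sketch of the scheme described, under the stated assumption of a 256 x 256-byte ring; all names here are illustrative rather than the lpfc structures:

```c
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

#define LOG_ENTRIES  256
#define LOG_ENTRY_SZ 256

static struct {
	char msg[LOG_ENTRIES][LOG_ENTRY_SZ];
	unsigned long ts[LOG_ENTRIES];	/* crude time reference, not wall time */
	unsigned int idx;		/* next slot; wraps at LOG_ENTRIES */
} dbg;

static void trace_log(const char *fmt, ...)
{
	unsigned int slot = dbg.idx++ % LOG_ENTRIES;
	va_list args;

	dbg.ts[slot] = (unsigned long)clock();
	va_start(args, fmt);
	vsnprintf(dbg.msg[slot], LOG_ENTRY_SZ, fmt, args);
	va_end(args);
}

/* Called only when an error-level message is emitted: dump the buffered
 * history (oldest first), then the error itself. */
static void err_log(const char *err_msg)
{
	for (unsigned int i = 0; i < LOG_ENTRIES; i++) {
		unsigned int slot = (dbg.idx + i) % LOG_ENTRIES;

		if (dbg.msg[slot][0])
			fprintf(stderr, "[%lu] %s\n", dbg.ts[slot], dbg.msg[slot]);
	}
	fprintf(stderr, "ERROR: %s\n", err_msg);
}

int main(void)
{
	trace_log("PLOGI sent to DID x%x", 0x1b2400);
	trace_log("PLOGI cmpl status x%x", 0x3);
	err_log("discovery failed");
	return 0;
}
```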
-
Committed by Dick Kennedy

Although the existing implementation performs very well at high I/O load, tests at light load, especially with only a few hardware queues, showed latency a little higher than it could be because of the workqueue scheduling: other tasks in the system can delay the handling. Change the lower level to use irq_poll by default, which processes I/O completions in softirq context. This gives better latency because the variance in when the cq is processed is reduced compared with the workqueue interface. However, since high load is better served by not running in softirq when the CPU is loaded, workqueues are still used under high I/O load.

Link: https://lore.kernel.org/r/20200630215001.70793-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
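A kernel-flavored sketch of the irq_poll pattern referred to above, using the generic <linux/irq_poll.h> API; the queue structure and the process_completions() helper are hypothetical stand-ins, not lpfc code:

```c
#include <linux/interrupt.h>
#include <linux/irq_poll.h>
#include <linux/kernel.h>

struct my_cq {
	struct irq_poll iop;
	/* ... completion-queue state ... */
};

/* Hypothetical helper: drain up to 'budget' completions, return how many. */
static int process_completions(struct my_cq *cq, int budget)
{
	return 0;
}

/* Runs in softirq context via irq_poll. */
static int my_cq_poll(struct irq_poll *iop, int budget)
{
	struct my_cq *cq = container_of(iop, struct my_cq, iop);
	int done = process_completions(cq, budget);

	if (done < budget)
		irq_poll_complete(iop);	/* no more work for now */
	return done;
}

static irqreturn_t my_isr(int irq, void *data)
{
	struct my_cq *cq = data;

	irq_poll_sched(&cq->iop);	/* defer completion handling to softirq */
	return IRQ_HANDLED;
}

/* Setup: irq_poll_init(&cq->iop, 64, my_cq_poll);
 * Under sustained heavy load the driver can still hand the work to a
 * workqueue instead, as the commit message describes. */
```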
-
Committed by Dick Kennedy

When using DUMP on SLI3 to read VPD and Port status data (config region 23), the adapter overruns the kmalloc'd buffer, causing havoc for other consumers of the allocation pools. Rework the loops that process the dump data and validate/size memory lengths before performing the bcopy.

Link: https://lore.kernel.org/r/20200630215001.70793-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 10 May 2020, 3 commits
-
-
Committed by James Smart
The last step of commonization is to remove the 'T' suffix from state and flag field definitions. This is minor, but removes the mental association that it solely applies to nvmet use.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by James Smart

To support FC-NVME-2 (actually FC-NVME (rev 1) with Amendment 1), both the nvme (host) and nvmet (controller/target) sides will need to be able to receive LS requests. Currently, this support exists on the nvmet side only. To prepare for both sides supporting LS receive, rename lpfc_nvmet_rcv_ctx to lpfc_async_xchg_ctx and commonize the definition.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
Committed by James Smart

A lot of files in lpfc include nvme headers, building up dependency relationships that force a file to be updated for header changes even when nothing else in it changes. It would be better to localize the nvme headers. There is also no need for separate nvme (initiator) and nvmet (target) header files.

Refactor the inclusion of nvme headers so that all nvme items are included via lpfc_nvme.h. Merge lpfc_nvmet.h into lpfc_nvme.h so that a single header is used by both the nvme and nvmet sides. This prepares for structure sharing between the two roles, and preps for adding shared function prototypes for upcoming shared routines.

Signed-off-by: Paul Ely <paul.ely@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 08 May 2020, 1 commit
-
-
Committed by Dick Kennedy

By default, the driver attempts to allocate a hdwq per logical cpu in order to provide good cpu affinity. Some systems have extremely high cpu counts, and this can significantly raise memory consumption. Testing on (non-AMD) x86 platforms found that a hdwq can be shared by a physical cpu and its HT sibling with little performance degradation. By sharing, the hdwq count can be halved, significantly reducing the memory overhead. Change the default behavior of the driver on non-AMD x86 platforms to share a hdwq between a cpu and its HT sibling.

Link: https://lore.kernel.org/r/20200501214310.91713-6-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
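A kernel-flavored sketch of one way such sibling sharing can be decided, using topology_sibling_cpumask(); the mapping array and function are illustrative assumptions, not the lpfc implementation:

```c
#include <linux/cpumask.h>
#include <linux/topology.h>

/* cpu_to_hdwq[] is a hypothetical per-CPU mapping, initialized to -1. */
static int pick_hdwq(int cpu, int *cpu_to_hdwq, int *next_hdwq)
{
	int sibling;

	/* If an SMT sibling already owns a hardware queue, reuse it. */
	for_each_cpu(sibling, topology_sibling_cpumask(cpu)) {
		if (sibling != cpu && cpu_to_hdwq[sibling] >= 0)
			return cpu_to_hdwq[sibling];
	}

	/* Otherwise hand out the next hardware queue. */
	return (*next_hdwq)++;
}
```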
-
- 30 March 2020, 1 commit
-
-
Committed by James Smart

The per-cpu io statistics were capped by a hard-coded limit of 128. That value was effectively treated as a maximum number of CPUs, but it is neither the actual CPU count nor the range of actual CPU numbers, both of which can be larger. This made the stats misleading, and on large CPU count systems simply wrong. Fix the stats so that every CPU can have a stats struct. Fix the looping so that it iterates by hdwq, finds the CPUs that used that hdwq, sums their stats, and then displays them.

Link: https://lore.kernel.org/r/20200322181304.37655-9-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 27 March 2020, 2 commits
-
-
Committed by James Smart

The SCSI layer sends the driver IOs with more s/g segments than the driver can handle. This results in "Too many sg segments from dma_map_sg. Config 64, seg_cnt 219" error messages from the lpfc_scsi_prep_dma_buf_s3() routine. This was due to the driver using individual templates for pport and vport, host reset enabled or not, nvme vs scsi, and so on; in the end, a vport could end up with a combination that didn't match its pport. Rather than enumerating more templates and more discretionary assignments, revert to a base template that is copied into a template specific to the pport/vport. Then, based on role, attributes and sli type, modify the fields that differ for that port. Also add a log message to lpfc_create_port to validate the values.

Link: https://lore.kernel.org/r/20200322181304.37655-5-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

The following lockdep error was reported when unloading the lpfc driver:

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
...
Call Trace:
dump_stack+0x96/0xe0
register_lock_class+0x8b8/0x8c0
? lockdep_hardirqs_on+0x190/0x280
? is_dynamic_key+0x150/0x150
? wait_for_completion_interruptible+0x2a0/0x2a0
? wake_up_q+0xd0/0xd0
__lock_acquire+0xda/0x21a0
? register_lock_class+0x8c0/0x8c0
? synchronize_rcu_expedited+0x500/0x500
? __call_rcu+0x850/0x850
lock_acquire+0xf3/0x1f0
? del_timer_sync+0x5/0xb0
del_timer_sync+0x3c/0xb0
? del_timer_sync+0x5/0xb0
lpfc_pci_remove_one.cold.102+0x8b7/0x935 [lpfc]
...

Unloading the driver resulted in a call to del_timer_sync for the cpuhp_poll_timer. However, the call to set up the timer had never been made, so the timer structures used by lockdep checking were not initialized. Unconditionally call setup_timer for the cpuhp_poll_timer during driver initialization; calls to start the timer remain "as needed".

Link: https://lore.kernel.org/r/20200322181304.37655-3-jsmart2021@gmail.com
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
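A kernel-flavored sketch of the fix pattern, shown here with the current timer_setup() API (names are illustrative, not the lpfc code): initialize the timer unconditionally at init time so a del_timer_sync() on the teardown path always operates on a valid, lockdep-annotated timer.

```c
#include <linux/timer.h>

static struct timer_list cpuhp_poll_timer;

static void poll_timeout(struct timer_list *t)
{
	/* ... periodic polling work ... */
}

static void drv_init(void)
{
	/* Unconditional: safe even if the timer is never armed later. */
	timer_setup(&cpuhp_poll_timer, poll_timeout, 0);
}

static void drv_remove(void)
{
	/* Now always valid, whether or not the timer was ever started. */
	del_timer_sync(&cpuhp_poll_timer);
}
```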
-
- 11 February 2020, 4 commits
-
-
Committed by James Smart
Update copyrights to 2020 for files modified in the 12.6.0.4 patch set.

Link: https://lore.kernel.org/r/20200128002312.16346-13-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart
The current code does some odd +1 over maximum xri count checks and requires that lun_queue_count can't be bigger than the maximum xri count divided by 8. These items are bogus. Clean the code up to cap lun_queue_count to the maximum xri count.

Link: https://lore.kernel.org/r/20200128002312.16346-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

The following error is seen from the compiler:

drivers/scsi/lpfc/lpfc_init.c: In function ‘lpfc_cpuhp_get_eq’:
drivers/scsi/lpfc/lpfc_init.c:12660:1: error: the frame size of 1032 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

The issue is due to allocating a cpumask on the stack. Fix by converting to a dynamic allocation of the cpu mask.

Link: https://lore.kernel.org/r/20200128002312.16346-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
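A kernel-flavored sketch of the pattern described: replace an on-stack struct cpumask (which can be large when NR_CPUS is big) with a heap-allocated cpumask_var_t. The function is an illustrative stand-in, not the actual lpfc_cpuhp_get_eq().

```c
#include <linux/cpumask.h>
#include <linux/errno.h>
#include <linux/slab.h>

static int find_idle_cpu(const struct cpumask *candidates)
{
	cpumask_var_t tmp;
	int cpu;

	/* Dynamically allocated when CONFIG_CPUMASK_OFFSTACK=y, so it no
	 * longer contributes NR_CPUS/8 bytes to the stack frame. */
	if (!zalloc_cpumask_var(&tmp, GFP_KERNEL))
		return -ENOMEM;

	cpumask_and(tmp, candidates, cpu_online_mask);
	cpu = cpumask_first(tmp);

	free_cpumask_var(tmp);
	return cpu < nr_cpu_ids ? cpu : -ENODEV;
}
```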
-
Committed by James Smart

When performing reset testing, the eq's list of related hwqs was getting corrupted. In cases where there is not a 1:1 eq-to-hwq mapping, the eq is shared, and the eq maintains a list of the hwqs utilizing it, for use in cpu offlining and polling. During the reset, the hwqs are torn down so they can be recreated. The recreation was getting confused by seeing a non-null eq assignment, and the eq list became corrupt. Correct by clearing the hdwq eq assignment when the hwq is cleaned up.

Link: https://lore.kernel.org/r/20200128002312.16346-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 22 December 2019, 3 commits
-
-
Committed by James Smart
When unattaching, the driver did not unmap the DPP bar. This caused the next load of the driver, which attempts to enable wc, to not work correctly and wc to be disabled due to an address mapping overlap. Fix by unmapping on unattach.

Link: https://lore.kernel.org/r/20191218235808.31922-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

The order of the flags/checks for adapters where FC-AL is supported erroneously excluded lpe35000 adapter models. It was also noted that the G7 flags for Loop and Persistent topology are incorrect; they should follow the same rules as G6. Rework the logic to enable LPe35000 FC-AL support, and collapse the G7 support logic to the same rules as G6.

Link: https://lore.kernel.org/r/20191218235808.31922-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

There are reports of multiple ports on the same system displaying different hostnames in fabric FDMI displays. Currently, the driver registers the hostname at initialization and obtains it via init_utsname()->nodename, queried at the time the FC link comes up. Unfortunately, if the machine hostname is updated after initialization, such as via DHCP or an admin command, the value registered initially will be incorrect. Fix by having the driver save the hostname that was registered with FDMI. The driver then runs a heartbeat action that checks the hostname; if the name changes, it reregisters the FDMI data. The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA.

Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 20 December 2019, 1 commit
-
-
Committed by Colin Ian King

There are spelling mistakes of "asynchronous" in an lpfc_printf_log message and in comments. Fix these.

Link: https://lore.kernel.org/r/20191218084301.627555-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 22 November 2019, 1 commit
-
-
Committed by James Smart

Currently the lpfc driver sizes its cpu_map array based on num_possible_cpus(). However, that value can be smaller than the highest cpu id bit that is set. As such, if a thread runs on a cpu with a larger cpu id, or for_each_possible_cpu() is used, the driver could index off the end of the array and return garbage or trigger a GPF. The driver maintains its own internal copy of the "num_possible" cpu value and sizes arrays by it. Fix by setting the driver's value to the last cpu id bit set in the possible mask, plus 1. Thus cpu_map will be sized to allow access by any possible cpu id.

Link: https://lore.kernel.org/r/20191121175556.18953-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
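A kernel-flavored sketch of the sizing rule described, over a hypothetical per-CPU mapping structure; not the actual lpfc code:

```c
#include <linux/cpumask.h>
#include <linux/slab.h>
#include <linux/types.h>

struct cpu_map_entry {		/* illustrative only */
	u16 hdwq;
	u16 eq;
};

static struct cpu_map_entry *alloc_cpu_map(unsigned int *out_entries)
{
	/* Size by the highest possible CPU id + 1, not num_possible_cpus(),
	 * so sparse CPU numbering cannot index past the end of the array. */
	unsigned int entries = cpumask_last(cpu_possible_mask) + 1;

	*out_entries = entries;
	return kcalloc(entries, sizeof(struct cpu_map_entry), GFP_KERNEL);
}
```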
-
- 13 November 2019, 1 commit
-
-
Committed by James Smart

Currently, cpu_map[cpu#]->hdwq is left equal to LPFC_VECTOR_MAP_EMPTY for not-present CPUs. If a CPU is dynamically hot-added, it is possible we may crash due to not assigning an allocated hdwq. Correct by assigning a hdwq at initialization for all not-present CPUs.

Fixes: dcaa2136 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191111230401.12958-5-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
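One way the idea can be pictured, as a kernel-flavored sketch over a hypothetical mapping array (not the lpfc data structures): make sure every possible CPU, present or not, ends up with a valid hdwq at init time.

```c
#include <linux/cpumask.h>
#include <linux/types.h>

#define MAP_EMPTY 0xffff	/* illustrative "unassigned" marker */

struct cpu_map_entry {		/* illustrative only */
	u16 hdwq;
};

static void assign_default_hdwq(struct cpu_map_entry *map, unsigned int num_hdwq)
{
	unsigned int cpu;

	/* Cover not-present CPUs too, so a later hot-add never finds an
	 * unassigned (MAP_EMPTY) queue. */
	for_each_possible_cpu(cpu) {
		if (map[cpu].hdwq == MAP_EMPTY)
			map[cpu].hdwq = cpu % num_hdwq;
	}
}
```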
-
- 09 November 2019, 2 commits
-
-
Committed by Bart Van Assche
Fix the following kernel warning:

cpumask_of_node(-1): (unsigned)node >= nr_node_ids(1)

Fixes: dcaa2136 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191108225947.1395-1-jsmart2021@gmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
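The commit text does not spell out the change itself, but the usual guard for this class of warning looks like the kernel-flavored sketch below: never hand NUMA_NO_NODE (-1) to cpumask_of_node(). This is a generic illustration and not necessarily the fix that was applied here.

```c
#include <linux/cpumask.h>
#include <linux/device.h>
#include <linux/numa.h>
#include <linux/topology.h>

/* Fall back to all online CPUs when the device has no NUMA affinity. */
static const struct cpumask *adapter_cpumask(struct device *dev)
{
	int node = dev_to_node(dev);

	return node == NUMA_NO_NODE ? cpu_online_mask : cpumask_of_node(node);
}
```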
-
Committed by Bart Van Assche

Fix the following lockdep warning:

============================================
WARNING: possible recursive locking detected
5.4.0-rc6-dbg+ #2 Not tainted
--------------------------------------------
systemd-udevd/130 is trying to acquire lock:
ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: irq_calc_affinity_vectors+0x63/0x90

but task is already holding lock:
ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(cpu_hotplug_lock.rw_sem);
lock(cpu_hotplug_lock.rw_sem);

*** DEADLOCK ***

May be due to missing lock nesting notation

2 locks held by systemd-udevd/130:
#0: ffff8880d53fe210 (&dev->mutex){....}, at: __device_driver_lock+0x4a/0x70
#1: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

stack backtrace:
CPU: 1 PID: 130 Comm: systemd-udevd Not tainted 5.4.0-rc6-dbg+ #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0xa5/0xe6
__lock_acquire.cold+0xf7/0x23a
lock_acquire+0x106/0x240
cpus_read_lock+0x41/0xe0
irq_calc_affinity_vectors+0x63/0x90
__pci_enable_msix_range+0x10a/0x950
pci_alloc_irq_vectors_affinity+0x144/0x210
lpfc_sli4_enable_intr+0x4b2/0xd50 [lpfc]
lpfc_pci_probe_one+0x1411/0x22b0 [lpfc]
local_pci_probe+0x7c/0xc0
pci_device_probe+0x25d/0x390
really_probe+0x170/0x510
driver_probe_device+0x127/0x190
device_driver_attach+0x98/0xa0
__driver_attach+0xb6/0x1a0
bus_for_each_dev+0x100/0x150
driver_attach+0x31/0x40
bus_add_driver+0x246/0x300
driver_register+0xe0/0x170
__pci_register_driver+0xde/0xf0
lpfc_init+0x134/0x1000 [lpfc]
do_one_initcall+0xda/0x47e
do_init_module+0x10a/0x3b0
load_module+0x4318/0x47c0
__do_sys_finit_module+0x134/0x1d0
__x64_sys_finit_module+0x47/0x50
do_syscall_64+0x6f/0x2e0
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: dcaa2136 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191107052158.25788-4-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 06 November 2019, 2 commits
-
-
Committed by James Smart

The current driver attempts to allocate an interrupt vector per cpu using the system's managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ allocator will either provide the per-cpu vector or return fewer vectors, which are then evenly spread between the numa nodes on the system. When run on an AMD architecture, if interrupts occur on a cpu that is not in the same numa node as the adapter generating the interrupt, there are extreme costs and overheads in performance. Thus, if 1:1 vector allocation is used, or the "balanced" vectors land in the other numa nodes, performance can be hit significantly.

A much more performant model is to allocate interrupts only on the cpus that are in the numa node where the adapter resides. I/O completion is still performed by the cpu where the I/O was generated. Unfortunately, there is no flag to request that the managed IRQ subsystem allocate vectors only for the CPUs in the same numa node as the adapter. On AMD architecture, revert the irq allocation to the normal style (non-managed) and then use irq_set_affinity_hint() to set the cpu affinity and disable user-space rebalancing.

Tie the support into CPU offline/online. If the cpu being offlined owns a vector, the vector is re-affinitized to one of the other CPUs on the same numa node. If there are no more CPUs in the numa node, the vector has all affinity removed and the system determines where it is serviced. Similarly, when the cpu that owned a vector comes back online, the vector is re-affinitized to that cpu.

Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
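A kernel-flavored sketch of the non-managed affinity scheme described (function and variable names are illustrative; this is not the lpfc implementation): allocate plain MSI-X vectors, then pin each one to a CPU in the adapter's NUMA node with irq_set_affinity_hint().

```c
#include <linux/cpumask.h>
#include <linux/interrupt.h>
#include <linux/numa.h>
#include <linux/pci.h>
#include <linux/topology.h>

static int setup_node_local_vectors(struct pci_dev *pdev, int want)
{
	int node = dev_to_node(&pdev->dev);
	const struct cpumask *node_mask =
		node == NUMA_NO_NODE ? cpu_online_mask : cpumask_of_node(node);
	int nvec, i, cpu;

	/* No PCI_IRQ_AFFINITY: the driver manages affinity itself. */
	nvec = pci_alloc_irq_vectors(pdev, 1, want, PCI_IRQ_MSIX);
	if (nvec < 0)
		return nvec;

	cpu = cpumask_first(node_mask);
	for (i = 0; i < nvec; i++) {
		/* Pin the vector to one node-local CPU; the hint also
		 * discourages user-space irqbalance from moving it. */
		irq_set_affinity_hint(pci_irq_vector(pdev, i), cpumask_of(cpu));

		cpu = cpumask_next(cpu, node_mask);
		if (cpu >= nr_cpu_ids)
			cpu = cpumask_first(node_mask);	/* wrap within the node */
	}
	return nvec;
}
```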
-
Committed by James Smart

The recent affinitization didn't address cpu offlining/onlining. If an interrupt vector is shared and the low-order cpu owning the vector is offlined, then, because interrupts are managed, the vector is taken offline too. This causes the other CPUs sharing the vector to hang, as they can't get io completions.

Correct by registering callbacks with the system for offline/online events. When a cpu is taken offline, its eq, which is tied to an interrupt vector, is found. If the cpu is the "owner" of the vector and the eq/vector is shared by other CPUs, the eq is placed into a polled mode. Additionally, code paths that perform io submission on the "sharing CPUs" will check the eq state and poll for completion after submitting new io to a wq that uses the eq. Similarly, when a cpu comes back online and owns an offlined vector, the eq is taken out of polled mode and rearmed to start driving interrupts for the eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
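A kernel-flavored sketch of how such offline/online callbacks are typically registered with the CPU hotplug state machine; the callback bodies and names are illustrative assumptions, not the lpfc code.

```c
#include <linux/cpuhotplug.h>

static enum cpuhp_state drv_online_state;

static int drv_cpu_online(unsigned int cpu, struct hlist_node *node)
{
	/* If 'cpu' owns an eq that was polled while it was offline,
	 * take the eq out of polled mode and rearm its interrupt. */
	return 0;
}

static int drv_cpu_offline(unsigned int cpu, struct hlist_node *node)
{
	/* If 'cpu' owns a shared eq/vector, switch the eq to polled mode
	 * so the sharing CPUs can still reap completions. */
	return 0;
}

static int drv_register_hotplug(struct hlist_node *instance)
{
	int ret;

	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "scsi/example:online",
				      drv_cpu_online, drv_cpu_offline);
	if (ret < 0)
		return ret;

	drv_online_state = ret;
	/* Attach this adapter instance to the dynamically allocated state. */
	return cpuhp_state_add_instance_nocalls(drv_online_state, instance);
}
```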
-
- 29 October 2019, 2 commits
-
-
Committed by Saurav Girepunje

mempool_destroy() already handles a NULL pointer, so remove the redundant NULL check before calling it.

Link: https://lore.kernel.org/r/20191026194712.GA22249@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart
Convert MAGIC_NUMER_xxx to MAGIC_NUMBER_xxx.

Link: https://lore.kernel.org/r/20191025184342.6623-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 25 October 2019, 6 commits
-
-
Committed by James Smart

In the past the LPe32000 models, whose main support target is 32G, did not support FC-AL operation, since FC-AL is not supported by the FC standards past 8G. This patch adds private-loop FC-AL support for the LPE32000 adapters when the link is 8G or below. To avoid conditions where the link rate may change, which would cause non-connectivity to the AL device, FC-AL mode must become a persistent setting and the link must be kept at a speed supporting FC-AL.

The patch:
- Adds a pls attribute indicating whether the adapter properly supports FC-AL.
- Adds support for the adapter to indicate that the topology should be fixed, and the topology types to be configured.
- Adds a pt attribute to report the persistent topology if present.

Link: https://lore.kernel.org/r/20191018211832.7917-15-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart
Add decode support for adapter Async Events which report FA-WWN configuration errors.

Link: https://lore.kernel.org/r/20191018211832.7917-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

The existing "auto eq delay" mechanism was sometimes skipping over an EQ, not ramping the coalescing down fast enough under light load, and in other cases never kicked in because cpu sharing by multiple vectors didn't quite add up right.

Tweak the interrupt mechanism such that:
- A flag is added to the EQ to force checking of coalescing values when the EQ is serviced in the interrupt handler. The flag is set by any CQ bound to the EQ whenever the number of CQ elements processed in a single scan meets or exceeds the hardware queue notify level, i.e. a significant number of completions is happening.
- In the heartbeat work item that checks coalescing:
  - The structure that counted the number of EQs that interrupted on a single cpu is replaced with one that looks at the EQ to see whether it currently has a coalescing value (and thus should be re-evaluated) or was marked by the new flag indicating heavy completions.
  - When a cpu, which may be servicing multiple vectors, had at least one EQ that should be checked, a new coalescing delay is calculated based on the number of interrupts that occurred on that cpu.
  - The new coalescing value is then applied to the EQs that had interrupted on that cpu.

Link: https://lore.kernel.org/r/20191018211832.7917-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

IOPS performance was lower with write operations. The perf tool shows lock contention in dma_pool_alloc and dma_pool_free related to the txrdy_payload_pool. These allocations are for DMA buffers for XFER_RDYs, which are not actually needed for the FCP_TRECEIVE command, as the command contents are used by the adapter to generate the IU. Remove the allocations and the associated buffer pool. Rather than leaving NULLs in the buffer pointer locations, set the command and sgl to indicate the skipped SGLE indexes.

Link: https://lore.kernel.org/r/20191018211832.7917-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart
When the adapter FW is administratively set to RO mode, a FW update triggered by the driver's sysfs attribute will fail. Currently, the driver's logging mechanism does not properly parse the adapter return codes and print a meaningful message. This oversight prevents quick diagnosis in the field. Parse the adapter return codes for Write_Object and write an appropriate message to the system console.

[mkp: typo]

Link: https://lore.kernel.org/r/20191018211832.7917-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart
Currently, lpfc_nvmet_mrq is always scaled back to min(lpfc_nvmet_mrq, lpfc_irq_chann). There's no reason to reduce it to the number of interrupt vectors. Rather, it should be scaled down based on the number of hardware queues for the system (if lower than the max of 16). Change the scaling to use the hardware queue count rather than the interrupt vector count.

Link: https://lore.kernel.org/r/20191018211832.7917-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 18 October 2019, 1 commit
-
-
Committed by Hannes Reinecke
The BUILD_NVME define never got defined anywhere, causing NVMe commands to be treated as SCSI commands when freeing the buffers. This was causing a stuck discovery and a horrible crash in lpfc_set_rrq_active() later on.

Link: https://lore.kernel.org/r/20191017150019.75769-1-hare@suse.de
Fixes: c00f62e6 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
- 01 October 2019, 3 commits
-
-
Committed by James Smart

This patch updates ACQE handling to:
- handle an EEPROM failure error reported by the adapter;
- ensure that all data for any ACQE, recognized or not, is logged;
- reduce the data logged in the default (unrecognized) case, given that all data is now logged unconditionally.

Link: https://lore.kernel.org/r/20190922035906.10977-18-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

When target-side fault injections are made, the driver isn't reconnecting to the remote port. The driver logs "2753" error messages which state: "PLOGI failure DID:1B2400 Status:x3/xf0240008". The failure status indicates an Illegal Field error, which points to the Temporary RPI field being used for the ELS. This error typically means the driver used an RPI that was already registered (it shouldn't be registered when used in this context).

Study found that if the driver was in discovery attempts and encountered an error, it wouldn't flag the temporary rpi as being in error. Yet the rpi was released for reallocation in these error paths, and another ELS could allocate it. In the failure situation, a retry was done on an ELS that had encountered an error, and as the rpi wasn't marked in error, the ELS reused the rpi it originally allocated. But that rpi had since been allocated by a different ELS, issued after the original error and before the retry attempt; that other ELS had succeeded and the RPI was registered.

Fix by marking the rpi state for the node as being in error, i.e. as needing reallocation, upon an error in ELS processing. Error-state marking is always done prior to release back to the internal rpi free list, which the driver wasn't doing in these cases before. Also enhance some of the logging to help with the next round of troubleshooting.

Link: https://lore.kernel.org/r/20190922035906.10977-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Committed by James Smart

The nvme-fc transport may call to abort an io on controller reset. If the driver is out of resources to issue an abort command, it just gives up and does nothing. The transport expects the lldd to always be able to terminate an io it has issued; at that point, the controller hangs waiting for aborted ios to be returned. Note: this is flagged by "6136" and "6176" error messages.

The root issue was that the driver mis-sized the number of command resources it allocated for the adapter. Convert the driver to allocate command resources based on the number of xris supported by the FC port: one resource for the original command and one resource for the abort request.

Link: https://lore.kernel.org/r/20190922035906.10977-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-