- 01 12月, 2016 11 次提交
-
-
由 Matthew R. Ochs 提交于
The send_tmf() routine includes some copy/paste cruft that can be removed as well as the setting of an AFU command-specific while holding the tmf_slock. While not a bug, it is out of place and should be shifted down alongside the other command initialization statements for clarity. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
The original design of the cxlflash driver required AFU commands to convey state information across multiple threads. The IOASA "host use" byte was used to track if a command was done, errored, or timed out. A per-command spin lock was used to serialize access to this byte. As this is no longer required with the introduction of completions and various refactoring over time, the spin lock, state tracking, and associated code can be removed. To support the simplification, the wait_resp() routine is refactored to return a success or failure. Additionally, as the simplification to the AFU internal command routine, explicit assignments of AFU command fields to zero are removed as the memory is zeroed upon allocation. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
With the removal of the static private command pool, the ability to 'complete' outstanding commands was lost. While not an issue for the commands originating outside the driver, internal AFU commands are synchronous and therefore have a timeout associated with them. To avoid a stale memory access, the tear down sequence needs to ensure that there are not any active commands before proceeding. As these internal AFU commands are rare events, the simplest way to accomplish this is detecting the activity and waiting for it to timeout. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Clean up and remove the remaining private command pool infrastructure that is no longer required. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Instead of using a private pool of AFU commands, use cmd_size to prime the private pool of SCSI commands such that they are allocated with a size large enough to contain an aligned AFU command. Use scsi_cmd_priv() to derive the aligned/zeroed private command on queuecommand and TMF paths. Remove cmd_checkout() as it is no longer required. The remaining AFU private command infrastructure will be removed in a cleanup commit. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
As staging for the removal of the AFU command pool, remove the reliance upon the pool for the internal AFU sync command. Instead of obtaining an AFU command from the pool, dynamically allocate memory with the appropriate alignment requirements. Since the AFU sync service is only executed from the process environment, blocking is acceptable. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
The cxlflash driver originally required a per-command 4K buffer that hosted data passed to the AFU. When the routines that initiate AFU and internal SCSI commands were refactored to use scsi_execute(), the need for this buffer became obsolete. As it is no longer necessary, the buffer is removed. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
During test, a command room violation interrupt is occasionally seen for the master context when the CXL flash devices are stressed. After studying the code, there could be gaps in the way command room value is being cached in cxlflash. When the cached command room is zero the thread attempting to send becomes burdened with updating the cached value with the actual value from the AFU. Today, this is handled with an atomic set operation of the raw value read. Following the atomic update, the thread proceeds to send. This behavior is incorrect on two counts: - The update fails to take into account the current thread and its consumption of one of the hardware commands. - The update does not take into account other threads also atomically updating. Per design, a worker thread updates the cached value when a send thread times out. By not protecting the update with a lock, the cached value can be incorrectly clobbered. To correct these issues, the update of the cached command room has been simplified and also protected using a spin lock which is held until the MMIO is complete. This ensures the command room is properly consumed by the same thread. Update of cached value also takes into account the current thread consuming a hardware command. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
Currently, the context reset routine waits for command room to be available before sending the reset request. Per review of the SISLite specification and clarifications from the CXL Flash AFU designers, this wait is unnecessary. The reset request can be sent anytime regardless of command room, so long as only a single reset request is active at any one point in time. This commit simplifies the reset routine by removing the wait for command room. Additionally it adds a debug trace to help pinpoint hardware errors when a context reset does not complete. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
During test, the following crash was observed: [34538.981505] Faulting instruction address: 0xd000000007c9c870 cpu 0x9: Vector: 300 (Data Access) at [c0000007f1e8f590] pc: d000000007c9c870: cxlflash_restore_luntable+0x70/0x1d0 [cxlflash] lr: d000000007c9c84c: cxlflash_restore_luntable+0x4c/0x1d0 [cxlflash] sp: c0000007f1e8f810 msr: 9000000100009033 dar: c00000171d637438 dsisr: 40000000 current = 0xc0000007f1e43f90 paca = 0xc000000007b25100 softe: 0 irq_happened: 0x01 pid = 493, comm = eehd enter ? for help [c0000007f1e8f8a0] d000000007c940b0 init_afu+0xd60/0x1200 [cxlflash] [c0000007f1e8f9a0] d000000007c945a8 cxlflash_pci_slot_reset+0x58/0xe0 [cxlflash] [c0000007f1e8fa20] d00000000715f790 cxl_pci_slot_reset+0x230/0x340 [cxl] [c0000007f1e8fae0] c000000000040dd4 eeh_report_reset+0x144/0x180 [c0000007f1e8fb20] c00000000003f708 eeh_pe_dev_traverse+0x98/0x170 [c0000007f1e8fbb0] c000000000041618 eeh_handle_normal_event+0x328/0x410 [c0000007f1e8fc30] c000000000041db8 eeh_handle_event+0x178/0x330 [c0000007f1e8fce0] c000000000042118 eeh_event_handler+0x1a8/0x1b0 [c0000007f1e8fd80] c00000000011420c kthread+0xec/0x100 [c0000007f1e8fe30] c00000000000a47c ret_from_kernel_thread+0x5c/0xe0 When superpipe mode is disabled for a LUN, the references for the local lun are deleted but the LUN is still identified as being present in the LUN table. This mismatched state can result in the above crash when the LUN table is restored during an error recovery operation. To fix this issue, the local LUN information structure is updated to reflect the LUN is no longer in the LUN table once all references to the LUN are gone. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
The following Oops is encountered when blk_mq is enabled with the cxlflash driver: [ 2960.817172] Oops: Kernel access of bad area, sig: 11 [#5] [ 2960.817309] NIP __blk_mq_run_hw_queue+0x278/0x4c0 [ 2960.817313] LR __blk_mq_run_hw_queue+0x2bc/0x4c0 [ 2960.817314] Call Trace: [ 2960.817320] __blk_mq_run_hw_queue+0x2bc/0x4c0 (unreliable) [ 2960.817324] blk_mq_run_hw_queue+0xd8/0x100 [ 2960.817329] blk_mq_insert_requests+0x14c/0x1f0 [ 2960.817333] blk_mq_flush_plug_list+0x150/0x190 [ 2960.817338] blk_flush_plug_list+0x11c/0x2b0 [ 2960.817344] blk_finish_plug+0x58/0x80 [ 2960.817348] __do_page_cache_readahead+0x1c0/0x2e0 [ 2960.817352] force_page_cache_readahead+0x68/0xd0 [ 2960.817356] generic_file_read_iter+0x43c/0x6a0 [ 2960.817359] blkdev_read_iter+0x68/0xa0 [ 2960.817361] __vfs_read+0x11c/0x180 [ 2960.817364] vfs_read+0xa4/0x1c0 [ 2960.817366] SyS_read+0x6c/0x110 [ 2960.817369] system_call+0x38/0xb4 The SCSI blk_mq stack assumes that sg_tablesize is always a non-zero value with scsi_mq_setup_tags() allocating tags using sg_tablesize. The cxlflash driver currently uses SG_NONE (0) for the sg_tablesize as the devices it supports are not capable of scatter gather. This mismatch of values results in the Oops above. To resolve this issue, sg_tablesize for cxlflash can simply be set to 1, a value which satisfies the constraints in cxlflash and the lack of support of SG_NONE in SCSI blk_mq. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 15 9月, 2016 4 次提交
-
-
由 Matthew R. Ochs 提交于
Commit 888baf06 ("scsi: cxlflash: Add kref to context") introduced a kref to the context. In particular, the detach routine was updated to use the kref services for managing the removal and destruction of a context. As part of this change, the tracking mechanism internal to the detach handler was refactored. This introduced a bug that can cause the tracking state to be lost. This can lead to a situation where exclusive access to a context is prematurely [and unknowingly] relinquished for the executing thread. To remedy, only update the tracking state when the kref operation indicates the context was removed. Fixes: 888baf06 ("scsi: cxlflash: Add kref to context") Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Commit 964497b3 ("cxlflash: Remove dual port online dependency") logically removed the ability for the WWPN setup routine afu_set_wwpn() to return a non-success value. This routine can safely be made a void to simplify the code as there is no longer a need to report a failure. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
When an EEH occurs during device initialization, the port timeout logic can cause excessive delays as MMIO reads will fail. Depending on where they are experienced, these delays can lead to a prolonged reset, causing an unnecessary triggering of other timeout logic in the SCSI stack or user applications. To expedite recovery, the port timeout logic is updated to decay the timeout at a much faster rate when in the presence of a likely EEH frozen event. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
The EEH reset handler is ignorant to the current state of the driver when processing a frozen event and initiating a device reset. This can be an issue if an EEH event occurs while a user or stack initiated reset is executing. More specifically, if an EEH occurs while the SCSI host reset handler is active, the reset initiated by the EEH thread will likely collide with the host reset thread. This can leave the device in an inconsistent state, or worse, cause a system crash. As a remedy, the EEH handler is updated to evaluate the device state and take appropriate action (proceed, wait, or disconnect host). The host reset handler is also updated to handle situations where an EEH occurred during a host reset. In such situations, the host reset handler will delay reporting back a success to give the EEH reset an opportunity to complete. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 09 9月, 2016 2 次提交
-
-
由 Uma Krishnan 提交于
Commit 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash cards") was recently introduced to notify the AFU when a system is going down. Due to the position of the cxlflash driver in the device stack, cxlflash devices are _always_ removed during a reboot/shutdown. This can lead to a crash if the cxlflash shutdown hook is invoked _after_ the shutdown hook for the owning virtual PHB. Furthermore, the current implementation of shutdown/remove hooks for cxlflash are not tolerant to being invoked when the device is not enabled. This can also lead to a crash in situations where the remove hook is invoked after the device has been removed via the vPHBs shutdown hook. An example of this scenario would be an EEH reset failure while a reboot/shutdown is in progress. To solve both problems, the shutdown hook for cxlflash is updated to simply remove the device. This path already includes the AFU notification and thus this solution will continue to perform the original intent. At the same time, the remove hook is updated to protect against being called when the device is not enabled. Fixes: 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash cards") Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
When a port link is established, the AFU sends a 'link up' interrupt. After the link is up, corresponding initialization steps are performed on the card. Following that, when the card is ready for I/O, the AFU sends 'login succeeded' interrupt. Today, cxlflash invokes scsi_scan_host() upon receipt of both interrupts. SCSI commands sent to the port prior to the 'login succeeded' interrupt will fail with 'port not available' error. This is not desirable. Moreover, when async_scan is active for the host, subsequent scan calls are terminated with error. Due to this, the scsi_scan_host() call performed after 'login succeeded' interrupt could portentially return error and the devices may not be scanned properly. To avoid this problem, scsi_scan_host() should be called only after the 'login succeeded' interrupt. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 24 8月, 2016 2 次提交
-
-
由 Matthew R. Ochs 提交于
The adapter file descriptor was previously cached within the kernel for a given context in order to support performing a close on behalf of an application. This is no longer needed as applications are now required to perform a close on the adapter file descriptor. Inspired-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Caching the adapter file descriptor and performing a close on behalf of an application is a poor design. This is due to the fact that once a file descriptor in installed, it is free to be altered without the knowledge of the cxlflash driver. This can lead to inconsistencies between the application and kernel. Furthermore, the nature of the former design is more exploitable and thus should be abandoned. To support applications performing a close on the adapter file that is associated with a context, a new flag is introduced to the user API to indicate to applications that they are responsible for the close following the cleanup (detach) of a context. The documentation is also updated to reflect this change in behavior. Inspired-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 19 8月, 2016 3 次提交
-
-
由 Matthew R. Ochs 提交于
Currently, context user references are tracked via the list of LUNs that have attached to the context. While convenient, this is not intuitive without a deep study of the code and is inconsistent with the existing reference tracking patterns within the kernel. This design choice can lead to future bug injection. To improve code comprehension and better protect against future bugs, add explicit reference counting to contexts and migrate the context removal code to the kref release handler. Inspired-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
The context removal routine requires access to the owning adapter structure to reset the context within the AFU as part of the tear down sequence. In order to support kref adoption, the owning adapter must be accessible from the release handler. As the kref framework only provides the kref reference as the sole parameter, another means is needed to derive the owning adapter. As a remedy, the owning adapter reference is saved off within the context during initialization. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Context information structures are protected by a mutex that is held when accessing/manipulating the context. When the code that manages these structures was authored, a decision was made to include taking the mutex as part of the allocation/initialization sequence and also handle the scenario where the mutex was already held when freeing the context. While not a problem outright, this design decision has been deemed as too flexible and the code should be made more rigid to avoid future bugs. In addition, further review of the code yields that the existing mutex manipulations in both of these context management paths are superfluous. This commit removes the obtaining of the context mutex in the context initialization routine and assumes the mutex is not held in the context free path. Inspired-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 27 7月, 2016 1 次提交
-
-
由 Uma Krishnan 提交于
If an EEH or some other hard error occurs while the adapter instance was being initialized, on the subsequent shutdown of the device, the system could crash with: [c000000f1da03b60] c0000000005eccfc pci_device_shutdown+0x6c/0x100 [c000000f1da03ba0] c0000000006d67d4 device_shutdown+0x1b4/0x2c0 [c000000f1da03c40] c0000000000ea30c kernel_restart_prepare+0x5c/0x80 [c000000f1da03c70] c0000000000ea48c kernel_restart+0x2c/0xc0 [c000000f1da03ce0] c0000000000ea970 SyS_reboot+0x1c0/0x2d0 [c000000f1da03e30] c000000000009204 system_call+0x38/0xb4 This crash is due to the AFU not being mapped when the shutdown notification routine is called and is a regression that was inserted recently with Commit 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash cards"). As a fix, shutdown notification should only occur when the AFU is mapped. Fixes: 704c4b0d ("cxlflash: Shutdown notify support for CXL Flash cards") Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 19 7月, 2016 1 次提交
-
-
由 Andrew Donnellan 提交于
Remove the CXL_KERNEL_API and CXL_EEH Kconfig options, as they were only needed to coordinate the merging of the cxlflash driver. Also remove the stub implementation of cxl_perst_reloads_same_image() in cxlflash which is only used if CXL_EEH isn't defined (i.e. never). Suggested-by: NIan Munsie <imunsie@au1.ibm.com> Signed-off-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: NIan Munsie <imunsie@au1.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 13 7月, 2016 3 次提交
-
-
由 Uma Krishnan 提交于
Some CXL Flash cards need notification of device shutdown in order to flush pending I/Os. A PCI notification hook for shutdown has been added where the driver notifies the card and returns. When the device is removed in the PCI remove path, notification code will wait for shutdown processing to complete. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
Device dependent flags are needed to support functions that are specific to a particular device. One such case is - some CXL Flash cards need to be notified of device shutdown. For other CXL devices, this feature does not prove to be useful yet. Such distinct features need to be identified in the driver to bypass or invoke specific functionality. In this patch, a member 'flags' has been added to device dependent values. These flags will be used and expanded in the future to support various device specific functions. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Manoj N. Kumar 提交于
While running 'sg_reset -H' in a loop with a user-space application active, hit the following exception: cpu 0x2: Vector: 300 (Data Access) pc: : afu_attach+0x50/0x240 [cxlflash] lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash] pid = 20365, comm = run_block_fvt Linux version 4.5.0-491-26f710d+ cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash] cxlflash_ioctl+0x5a8/0x6f0 [cxlflash] scsi_ioctl+0x3b0/0x4c0 sd_ioctl+0x110/0x190 blkdev_ioctl+0x28c/0xc20 block_ioctl+0xa4/0xd0 do_vfs_ioctl+0xd8/0x8c0 SyS_ioctl+0xd4/0xf0 system_call+0x38/0xb4 The problem here is that the problem space area is unmapped while the application issues the DK_CXLFLASH_RECOVER_AFU ioctl. This is the order I observe: proc1 proc2 1) sg_reset 2) ioctl(DK_CXLFLASH_RECOVER_AFU) 3) sg_reset again causing a PSA unmap 4) continues RECOVER_AFU processing The resolution to this problem is to have the reset handler drain all outstanding user space initiated ioctls before proceeding. It is safe to drain after the state has been changed to STATE_RESET. Also since drain_ioctls() was static, it had to be moved up a bit to be before cxlflash_eh_host_reset_handler(). Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 06 5月, 2016 1 次提交
-
-
由 Manoj N. Kumar 提交于
When a cxlflash adapter goes into EEH recovery and multiple processes (each having established its own context) are active, the EEH recovery can hang if the processes attempt to recover in parallel. The symptom logged after a couple of minutes is: INFO: task eehd:48 blocked for more than 120 seconds. Not tainted 4.5.0-491-26f710d+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. eehd 0 48 2 Call Trace: __switch_to+0x2f0/0x410 __schedule+0x300/0x980 schedule+0x48/0xc0 rwsem_down_write_failed+0x294/0x410 down_write+0x88/0xb0 cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash] cxl_vphb_error_detected+0x88/0x110 [cxl] cxl_pci_error_detected+0xb0/0x1d0 [cxl] eeh_report_error+0xbc/0x130 eeh_pe_dev_traverse+0x94/0x160 eeh_handle_normal_event+0x17c/0x450 eeh_handle_event+0x184/0x370 eeh_event_handler+0x1c8/0x1d0 kthread+0x110/0x130 ret_from_kernel_thread+0x5c/0xa4 INFO: task blockio:33215 blocked for more than 120 seconds. Not tainted 4.5.0-491-26f710d+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. blockio 0 33215 33213 Call Trace: 0x1 (unreliable) __switch_to+0x2f0/0x410 __schedule+0x300/0x980 schedule+0x48/0xc0 rwsem_down_read_failed+0x124/0x1d0 down_read+0x68/0x80 cxlflash_ioctl+0x70/0x6f0 [cxlflash] scsi_ioctl+0x3b0/0x4c0 sg_ioctl+0x960/0x1010 do_vfs_ioctl+0xd8/0x8c0 SyS_ioctl+0xd4/0xf0 system_call+0x38/0xb4 INFO: task eehd:48 blocked for more than 120 seconds. The hang is because of a 3 way dead-lock: Process A holds the recovery mutex, and waits for eehd to complete. Process B holds the semaphore and waits for the recovery mutex. eehd waits for semaphore. The fix is to have Process B above release the semaphore before attempting to acquire the recovery mutex. This will allow eehd to proceed to completion. Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Reviewed-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 29 3月, 2016 2 次提交
-
-
由 Manoj N. Kumar 提交于
While profiling the cxlflash_queuecommand() path under a heavy load it was found that number of retries to find cmd_room was fairly high. There are two problems with the current back-off: a) It starts with a udelay of 0 b) It backs-off linearly Tried several approaches (a higher multiple 10*n, 100*n, as well as n^2, 2^n) and found that the exponential back-off(2^n) approach had the least overall cost. Cost as being defined as overall time spent waiting. The fix is to change the linear back-off to an exponential back-off. This solution also takes care of the problem with the initial delay (starts with 1 usec). Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Manoj N. Kumar 提交于
While running 'sg_reset -H' back to back the following exception was seen: [ 735.115695] Faulting instruction address: 0xd0000000098c0864 cpu 0x0: Vector: 300 (Data Access) at [c000000ffffafa80] pc: d0000000098c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash] lr: c00000000013aed0: handle_irq_event_percpu+0xa0/0x310 sp: c000000ffffafd00 msr: 9000000000009033 dar: 2010000 dsisr: 40000000 current = 0xc000000001510880 paca = 0xc00000000fb80000 softe: 0 irq_happened: 0x01 pid = 0, comm = swapper/0 Linux version 4.5.0-491-26f710d+ enter ? for help [c000000ffffafe10] c00000000013aed0 handle_irq_event_percpu+0xa0/0x310 [c000000ffffafed0] c00000000013b1a8 handle_irq_event+0x68/0xc0 [c000000ffffaff00] c0000000001404ec handle_fasteoi_irq+0xec/0x2a0 [c000000ffffaff30] c00000000013a084 generic_handle_irq+0x54/0x80 [c000000ffffaff60] c000000000011130 __do_irq+0x80/0x1d0 [c000000ffffaff90] c000000000024d40 call_do_irq+0x14/0x24 [c000000001573a20] c000000000011318 do_IRQ+0x98/0x140 [c000000001573a70] c000000000002594 hardware_interrupt_common+0x114/0x180 This exception is being hit because the async_err interrupt path performs an MMIO to read the interrupt status register. The MMIO region in this case is not available. Commit 6ded8b3c ("cxlflash: Unmap problem state area before detaching master context") re-ordered the sequence in which term_mc() and stop_afu() are called. This introduces a window for interrupts to come in with the problem space area unmapped, that did not exist previously. The fix is to separate the disabling of all AFU interrupts to a distinct function, term_intr() so that it is the first thing that is done in the tear down process. To keep the initialization process symmetric, separate the AFU interrupt setup also to a distinct function: init_intr(). Fixes: 6ded8b3c ("cxlflash: Unmap problem state area before detaching master context") Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NJohannes Thumshirn <jthumshirn@suse.de> Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 09 3月, 2016 8 次提交
-
-
由 Frederic Barrat 提交于
To read the adapter VPD, drivers can't rely on pci config APIs, as it wouldn't work on powerVM. cxl introduced a new kernel API especially for this, so start using it. Co-authored-by: NChristophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: NFrederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: NChristophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Manoj N. Kumar 提交于
With the current value of cmd_per_lun at 16, the throughput over a single adapter is limited to around 150kIOPS. Increase the value of cmd_per_lun to 256 to improve throughput. With this change a single adapter is able to attain close to the maximum throughput (380kIOPS). Also change the number of RRQ entries that can be queued. Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Manoj N. Kumar 提交于
When switching to the internal LUN defined on the IBM CXL flash adapter, there is an unnecessary scan occurring on the second port. This scan leads to the following extra lines in the log: Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: cxlflash_queuecommand: (scp=c0000000fc1f0f00) 11/1/0/0 cdb=(A0000000-00000000-10000000-00000000) Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0 By definition, both of the internal LUNs are on the first port/channel. When the lun_mode is switched to internal LUN the same value for host->max_channel is retained. This causes an unnecessary scan over the second port/channel. This fix alters the host->max_channel to 0 (1 port), if internal LUNs are configured and switches it back to 1 (2 ports) while going back to external LUNs. Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
In order to support cxlflash in the PowerVM environment, underlying hypervisor APIs have imposed a kernel API ordering change. For the superpipe access to LUN, user applications need a context. The cxlflash module creates this context by making a sequence of cxl calls. In the current code, a context is initialized via cxl_dev_context_init() followed by cxl_process_element(), a function that obtains the process element id. Finally, cxl_start_work() is called to attach the process element. In the PowerVM environment, a process element id cannot be obtained from the hypervisor until the process element is attached. The cxlflash module is unable to create contexts without a valid process element id. To fix this problem, cxl_start_work() is called before obtaining the process element id. Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
The cxlflash_disk_attach() routine currently uses a cascading error gate strategy for its error cleanup path. While this strategy is commonly used to handle cleanup scenarios, it is too restrictive when function callouts need to be restructured. Problems range from inserting error path bugs in previously 'good' code to the cleanup path imposing design changes to how the normal path is structured. A less restrictive approach is needed to support ordering changes that come about when operating in different environments. To overcome this restriction, the error cleanup path is modified to have a single entrypoint and use conditional logic to cleanup where necessary. Entities that require multiple cleanup steps must be carefully vetted to ensure their APIs support state. In cases where they do not (none as of this commit) additional local variables can be used to maintain state on their behalf. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Matthew R. Ochs 提交于
Presently, context information structures are allocated and initialized in the same routine, create_context(). This imposes an ordering restriction such that all pieces of information needed to initialize a context must be known before the context is even allocated. This design point is not flexible when the order of context creation needs to be modified. Specifically, this can lead to problems when members of the context information structure are a part of an ordering dependency (i.e. - the 'work' structure embedded within the context). To remedy, the allocation is left as-is, inside of the existing create_context() routine and the initialization is transitioned to a new void routine, init_context(). At the same time, in anticipation of these routines not being called in sequence, a state boolean is added to the context information structure to track when the context has been initilized. The context teardown routine, destroy_context(), is modified to support being called with a non-initialized context. Signed-off-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Uma Krishnan 提交于
When operating in the PowerVM environment, the cxlflash module can receive an error from the hypervisor indicating that there are existing mappings in the page table for the process MMIO space. This issue exists because term_afu() currently invokes term_mc() before stop_afu(), allowing for the master context to be detached first and the problem state area to be unmapped second. To resolve this issue, stop_afu() should be called before term_mc(). Signed-off-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Manoj N. Kumar 提交于
The calls to pci_request_regions(), pci_resource_start(), pci_set_dma_mask(), pci_set_master() and pci_save_state() are all unnecessary for the IBM CXL flash adapter since data buffers are not required to be mapped to the device's memory. The use of services such as pci_set_dma_mask() are problematic on hypervisor managed systems as the IBM CXL flash adapter is operating under a virtual PCI Host Bridge (virtual PHB) which does not support these services. cxlflash 0001:00:00.0: init_pci: Failed to set PCI DMA mask rc=-5 The resolution is to simplify init_pci(), to a point where it does the bare minimum (pci_enable_device). Similarly, remove the call the pci_release_regions() from cxlflash_remove(). Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
- 07 1月, 2016 2 次提交
-
-
由 Manoj Kumar 提交于
This drop enables a future card with a device id of 0x0600 to be recognized by the cxlflash driver. As per the design, the Accelerator Function Unit (AFU) for this new IBM CXL Flash Adapter retains the same host interface as the previous generation. For the early prototypes of the new card, the driver with this change behaves exactly as the driver prior to this behaved with the earlier generation card. Therefore, no card specific programming has been added. These card specific changes can be staged in later if needed. Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Manoj Kumar 提交于
If an async error interrupt is generated, and the error requires the FC link to be reset, it cannot be performed in the interrupt context. So a work element is scheduled to complete the link reset in a process context. If either an EEH event or an escalation occurs in between when the interrupt is generated and the scheduled work is started, the MMIO space may no longer be available. This will cause an oops in the worker thread. [ 606.806583] NIP kthread_data+0x28/0x40 [ 606.806633] LR wq_worker_sleeping+0x30/0x100 [ 606.806694] Call Trace: [ 606.806721] 0x50 (unreliable) [ 606.806796] wq_worker_sleeping+0x30/0x100 [ 606.806884] __schedule+0x69c/0x8a0 [ 606.806959] schedule+0x44/0xc0 [ 606.807034] do_exit+0x770/0xb90 [ 606.807109] die+0x300/0x460 [ 606.807185] bad_page_fault+0xd8/0x150 [ 606.807259] handle_page_fault+0x2c/0x30 [ 606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash] To prevent the problem space area from being unmapped, when there is pending work, a mapcount (using the kref mechanism) is held. The mapcount is released only when the work is completed. The last reference release is tied to the unmapping service. Signed-off-by: NManoj N. Kumar <manoj@linux.vnet.ibm.com> Acked-by: NMatthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: NUma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-