- 05 3月, 2014 1 次提交
-
-
由 Thomas Gleixner 提交于
commit 91150af3 (powerpc/eeh: Fix unbalanced enable for IRQ) is another brilliant example of trainwreck engineering. The patch "fixes" the issue of an unbalanced call to irq_enable() which causes a prominent warning by checking the disabled state of the interrupt line and call conditionally into the core code. This is wrong in two aspects: 1) The warning is there to tell users, that they need to fix their asymetric enable/disable patterns by finding the root cause and solving it there. It's definitely not meant to work around it by conditionally calling into the core code depending on the random state of the irq line. Asymetric irq_disable/enable calls are a clear sign of wrong usage of the interfaces which have to be cured at the root and not by somehow hacking around it. 2) The abuse of core internal data structure instead of using the proper interfaces for retrieving the information for the 'hack around' irq_desc is core internal and it's clear enough stated. Replace at least the irq_desc abuse with the proper functions and add a big fat comment why this is absurd and completely wrong. Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Gavin Shan <shangw@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: ppc <linuxppc-dev@lists.ozlabs.org> Link: http://lkml.kernel.org/r/20140223212736.562906212@linutronix.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>
-
- 11 2月, 2014 1 次提交
-
-
Commit f5c57710 ("powerpc/eeh: Use partial hotplug for EEH unaware drivers") introduces eeh_rmv_device, which may grab a reference to a driver, but not release it. That prevents a driver from being removed after it has gone through EEH recovery. This patch drops the reference if it was taken. Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Acked-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 16 1月, 2014 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Race conditions are theoretically possible between the PCI device addition and removal in the PPC64 PCI error recovery driver and the generic PCI bus rescan and device removal that can be triggered via sysfs. To avoid those race conditions make PPC64 PCI error recovery driver use global PCI rescan-remove locking around PCI device addition and removal. Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 15 1月, 2014 2 次提交
-
-
由 Gavin Shan 提交于
For one PCI error relevant OPAL event, we possibly have multiple EEH errors for that. For example, multiple frozen PEs detected on different PHBs. Unfortunately, we didn't cover the case. The patch enumarates the return value from eeh_ops::next_error() and change eeh_handle_special_event() and eeh_ops::next_error() to handle all existing EEH errors. As Ben pointed out, we needn't list_for_each_entry_safe() since we are not deleting any PHB from the hose_list and the EEH serialized lock should be held while purging EEH events. The patch covers those suggestions as well. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
When EEH error comes to one specific PCI device before its driver is loaded, we will apply hotplug to recover the error. During the plug time, the PCI device will be probed and its driver is loaded. Then we wrongly calls to the error handlers if the driver supports EEH explicitly. The patch intends to fix by introducing flag EEH_DEV_NO_HANDLER and set it before we remove the PCI device. In turn, we can avoid wrongly calls the error handlers of the PCI device after its driver loaded. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 05 12月, 2013 1 次提交
-
-
由 Brian King 提交于
In order to support concurrent adapter firmware download to SR-IOV adapters on pSeries, each VF will see an EEH event where the slot will remain in the unavailable state for the duration of the adapter firmware update, which can take as long as 5 minutes. Extend the EEH recovery timeout to account for this. Signed-off-by: NBrian King <brking@linux.vnet.ibm.com> Acked-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 24 7月, 2013 3 次提交
-
-
由 Gavin Shan 提交于
The patch fixes following issue: Unbalanced enable for IRQ 23 ------------[ cut here ]------------ WARNING: at kernel/irq/manage.c:437 : NIP [c00000000016de8c] .__enable_irq+0x11c/0x140 LR [c00000000016de88] .__enable_irq+0x118/0x140 Call Trace: [c000003ea1f23880] [c00000000016de88] .__enable_irq+0x118/0x140 (unreliable) [c000003ea1f23910] [c00000000016df08] .enable_irq+0x58/0xa0 [c000003ea1f239a0] [c0000000000388b4] .eeh_enable_irq+0xc4/0xe0 [c000003ea1f23a30] [c000000000038a28] .eeh_report_reset+0x78/0x130 [c000003ea1f23ac0] [c000000000037508] .eeh_pe_dev_traverse+0x98/0x170 [c000003ea1f23b60] [c0000000000391ac] .eeh_handle_normal_event+0x2fc/0x3d0 [c000003ea1f23bf0] [c000000000039538] .eeh_handle_event+0x2b8/0x2c0 [c000003ea1f23c90] [c000000000039600] .eeh_event_handler+0xc0/0x170 [c000003ea1f23d30] [c0000000000da9a0] .kthread+0xf0/0x100 [c000003ea1f23e30] [c00000000000a1dc] .ret_from_kernel_thread+0x5c/0x80 Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
When EEH error happens to one specific PE, some devices with drivers supporting EEH won't except hotplug on the device. However, there might have other deivces without driver, or with driver without EEH support. For the case, we need do partial hotplug in order to make sure that the PE becomes absolutely quite during reset. Otherise, the PE reset might fail and leads to failure of error recovery. The current code doesn't handle that 'mixed' case properly, it either uses the error callbacks to the drivers, or tries hotplug, but doesn't handle a PE (EEH domain) composed of a combination of the two. The patch intends to support so-called "partial" hotplug for EEH: Before we do reset, we stop and remove those PCI devices without EEH sensitive driver. The corresponding EEH devices are not detached from its PE, but with special flag. After the reset is done, those EEH devices with the special flag will be scanned one by one. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be kept around. However, we need to keep it if the hotplug an artifial one caused by EEH errors recovery. Since we remove EEH device through the PCI hook pcibios_release_device(), the flag "purge_pe" passed to various functions is meaningless. So the patch removes the meaningless flag and introduce new flag "EEH_PE_KEEP" to save the PE while doing hotplug during EEH error recovery. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 01 7月, 2013 1 次提交
-
-
由 Gavin Shan 提交于
We needn't the the whole backtrace other than one-line message in the error reporting interrupt handler. For errors triggered by access PCI config space or MMIO, we replace "WARN(1, ...)" with pr_err() and dump_stack(). The patch also adds more output messages to indicate what EEH core is doing. Besides, some printk() are replaced with pr_warning(). Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 6月, 2013 4 次提交
-
-
由 Gavin Shan 提交于
On PowerNV platform, the EEH event caused by interrupt won't have binding PE. The patch enables EEH core to handle the special event. To avoid the current logic we have, The eeh_handle_event() is renamed to eeh_handle_normal_event(), and the eeh_handle_special_event() is introduced. The function eeh_handle_event() dispatches to above two functions according to the input parameter. Besides, new backend "next_error" added to eeh_ops and it's expected to have following return values: 4 - Dead IOC 3 - Dead PHB 2 - Fenced PHB 1 - Frozen PE 0 - No error found Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
We're not expecting that one specific PE got frozen for over 5 times in last hour. Otherwise, the PE will be removed from the system upon newly coming EEH errors. The patch introduces time stamp to trace the first error on specific PE in last hour and function to update that accordingly. Besides, the time stamp is recovered during PE hotplug path as we did for frozen count. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
The patch moves the common part of EEH core into arch/powerpc/kernel directory so that we needn't PPC_PSERIES while compiling POWERNV platform: * Move the EEH common part into arch/powerpc/kernel * Move the functions for PCI hotplug from pSeries platform to arch/powerpc/kernel/pci-hotplug.c * Move CONFIG_EEH from arch/powerpc/platforms/pseries/Kconfig to arch/powerpc/platforms/Kconfig * Adjust makefile accordingly Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
Cleanup on EEH core to remove unnecessary whitespaces. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 18 9月, 2012 2 次提交
-
-
由 Gavin Shan 提交于
The EEH core is talking with the PCI device driver to determine the action (purely reset, or PCI device removal). During the period, the driver might be unloaded and in turn causes kernel crash as follows: EEH: Detected PCI bus error on PHB#4-PE#10000 EEH: This PCI device has failed 3 times in the last hour lpfc 0004:01:00.0: 0:2710 PCI channel disable preparing for reset Unable to handle kernel paging request for data at address 0x00000490 Faulting instruction address: 0xd00000000e682c90 cpu 0x1: Vector: 300 (Data Access) at [c000000fc75ffa20] pc: d00000000e682c90: .lpfc_io_error_detected+0x30/0x240 [lpfc] lr: d00000000e682c8c: .lpfc_io_error_detected+0x2c/0x240 [lpfc] sp: c000000fc75ffca0 msr: 8000000000009032 dar: 490 dsisr: 40000000 current = 0xc000000fc79b88b0 paca = 0xc00000000edb0380 softe: 0 irq_happened: 0x00 pid = 3386, comm = eehd enter ? for help [c000000fc75ffca0] c000000fc75ffd30 (unreliable) [c000000fc75ffd30] c00000000004fd3c .eeh_report_error+0x7c/0xf0 [c000000fc75ffdc0] c00000000004ee00 .eeh_pe_dev_traverse+0xa0/0x180 [c000000fc75ffe70] c00000000004ffd8 .eeh_handle_event+0x68/0x300 [c000000fc75fff00] c0000000000503a0 .eeh_event_handler+0x130/0x1a0 [c000000fc75fff90] c000000000020138 .kernel_thread+0x54/0x70 1:mon> The patch increases the reference of the corresponding driver modules while EEH core does the negotiation with PCI device driver so that the corresponding driver modules can't be unloaded during the period and we're safe to refer the callbacks. Cc: stable@vger.kernel.org Reported-by: NAlexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
Function eeh_rmv_from_parent_pe() could be called by the path of either normal PCI hotplug, or EEH recovery. For the former case, we need purge the corresponding PE on removal of the associated PE bus. The patch tries to cover that by passing more information to function pcibios_remove_pci_devices() so that we know if the corresponding PE needs to be purged or be marked as "invalid". Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 10 9月, 2012 2 次提交
-
-
由 Gavin Shan 提交于
The patch removes the eeh related statistics for eeh device since they have been maintained by the corresponding eeh PE. Also, the flags used to trace the state of eeh device and PE have been reworked for a little bit. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
The patch reworks the current implementation so that the eeh errors will be handled basing on PE instead of eeh device. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 3月, 2012 8 次提交
-
-
由 Gavin Shan 提交于
The original EEH implementation is heavily depending on struct pci_dn. We have to put EEH related information to pci_dn. Actually, we could split struct pci_dn so that the EEH sensitive information to form an individual struct, then EEH looks more independent. The patch replaces pci_dn with eeh_dev for EEH aux components like event and driver. Also, the eeh_event struct has been adjusted for a little bit since eeh_dev has linked the associated FDT (Flat Device Tree) node and PCI device. It's not necessary for eeh_event struct to trace FDT node and PCI device. We can just simply to trace eeh_dev in eeh_event. The patch also renames function pcid_name() to eeh_pcid_name(), which should be missed in the previous patch where the EEH aux components have been cleaned up. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
There're several EEH aux components and the patch does some cleanup for them so that they look more clean. * Duplicated comments have been removed from the header file. * Comments have been reorganized so that it looks more clean. * The leading comments of functions are adjusted for a little bit so that the result of "make pdfdocs" would be more unified. * Function calls "xxx ()" has been replaced by "xxx()". Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
In order to enable particular PCI device, which has been included in the parent PE. The involved PCI bridges should be enabled explicitly if there has. On pSeries platform, there're dedicated RTAS calls to fulfil the purpose. The patch implements the function of configuring PCI bridges through the dedicated RTAS calls. Besides, the function has been abstracted by struct eeh_ops::configure_bridge so that the EEH core components could support multiple platforms in future. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
On RTAS compliant pSeries platform, one dedicated RTAS call has been introduced to retrieve EEH temporary or permanent error log. The patch implements the function of retriving EEH error log through RTAS call. Besides, it has been abstracted by struct eeh_ops::get_log so that EEH core components could support multiple platforms in future. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
On pSeries platform, the PE state might be temporarily unavailable. In that case, the firmware will return the corresponding wait time. That means the kernel has to wait for appropriate time in order to get the PE state. The patch does the implementation for that. Besides, the function has been abstracted through struct eeh_ops::wait_state so that EEH core components could support multiple platforms in future. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
On pSeries platform, there're 2 dedicated RTAS calls introduced to retrieve the corresponding PE's state: ibm,read-slot-reset-state and ibm,read-slot-reset-state2. The patch implements the retrieval of PE's state according to the given PE address. Besides, the implementation has been abstracted by struct eeh_ops::get_state so that EEH core components could support multiple platforms in future. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
There're 4 EEH operations that are covered by the dedicated RTAS call <ibm,set-eeh-option>: enable or disable EEH, enable MMIO and enable DMA. At early stage of system boot, the EEH would be tried to enable on PCI device related device node. MMIO and DMA for particular PE should be enabled when doing recovery on EEH errors so that the PE could function properly again. The patch implements it and abstract that through struct eeh_ops::set_eeh. It would be help for EEH to support multiple platforms in future. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Gavin Shan 提交于
The EEH has been implemented on pSeries platform. The original code looks a little bit nasty. The patch does cleanup on the current EEH implementation so that it looks more clean. * Try adding prefix "eeh" for functions. * Some function names have been adjusted so that they looks shorter and meaningful. Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 06 5月, 2011 1 次提交
-
-
由 Richard A Lary 提交于
For adapters which have devices under a PCIe switch/bridge it is informative to display information for both the PCIe switch/bridge and the device on which the bus error was detected. rebased to powerpc-next Signed-off-by: NRichard A Lary <rlary@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 2月, 2010 1 次提交
-
-
由 Breno Leitao 提交于
During a EEH recover, the pci_dev structure can be null, mainly if an eeh event is detected during cpi config operation. In this case, the pci_dev will not be known (and will be null) the kernel will crash with the following message: Unable to handle kernel paging request for data at address 0x000000a0 Faulting instruction address: 0xc00000000006b8b4 Oops: Kernel access of bad area, sig: 11 [#1] NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0 LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0 Call Trace: [c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0 [c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70 The bug occurs because pci_name() tries to access a null pointer. This patch just guarantee that pci_name() is not called on Null pointers. Signed-off-by: NBreno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: NLinas Vepstas <linasvepstas@gmail.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 2月, 2010 1 次提交
-
-
由 Frans Pop 提交于
Signed-off-by: NFrans Pop <elendil@planet.nl> Cc: linuxppc-dev@ozlabs.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 30 10月, 2009 1 次提交
-
-
由 Michael Ellerman 提交于
Rather than open-coding our own check, use irq_has_action() to check if an irq has an action - ie. is "in use". irq_has_action() doesn't take the descriptor lock, but it shouldn't matter - we're just using it as an indicator that the irq is in use. disable_irq_nosync() will take the descriptor lock before doing anything also. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Acked-by: NGrant Likely <grant.likely@secretlab.ca> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 6月, 2009 1 次提交
-
-
由 Zhang, Yanmin 提交于
Based on PCI Express AER specs, a root port might receive multiple TLP errors while it could only save a correctable error source id and an uncorrectable error source id at the same time. In addition, some root port hardware might be unable to provide a correct source id, i.e., the source id, or the bus id part of the source id provided by root port might be equal to 0. The patchset implements the support in kernel by searching the device tree under the root port. Patch 1 changes parameter cb of function pci_walk_bus to return a value. When cb return non-zero, pci_walk_bus stops more searching on the device tree. Reviewed-by: NAndrew Patterson <andrew.patterson@hp.com> Signed-off-by: NZhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: NJesse Barnes <jbarnes@virtuousgeek.org>
-
- 15 4月, 2009 1 次提交
-
-
由 Mike Mason 提交于
While adding native EEH support to Emulex and Qlogic drivers, it was discovered that dev->error_state was set to pci_io_channel_normal too late in the recovery process. These drivers rely on error_state to determine if they can access the device in their slot_reset callback, thus error_state needs to be set to pci_io_channel_normal in eeh_report_reset(). Below is a detailed explanation (courtesy of Richard Lary) as to why this is necessary. Background: PCI MMIO or DMA accesses to a frozen slot generate additional EEH errors. If the number of additional EEH errors exceeds EEH_MAX_FAILS the adapter will be shutdown. To avoid triggering excessive EEH errors and an undesirable adapter shutdown, some drivers use the pci_channel_offline(dev) wrapper function to return a Boolean value based on the value of pci_dev->error_state to determine if PCI MMIO or DMA accesses are safe. If the wrapper returns TRUE, drivers must not make PCI MMIO or DMA access to their hardware. The pci_dev structure member error_state reflects one of three values, 1) pci_channel_io_normal, 2) pci_channel_io_frozen, 3) pci_channel_io_perm_failure. Function pci_channel_offline(dev) returns TRUE if error_state is pci_channel_io_frozen or pci_channel_io_perm_failure. The EEH driver sets pci_dev->error_state to pci_channel_io_frozen at the point where the PCI slot is frozen. Currently, the EEH driver restores dev->error_state to pci_channel_io_normal in eeh_report_resume() before calling the driver's resume callback. However, when the EEH driver calls the driver's slot_reset callback() from eeh_report_reset(), it incorrectly indicates the error state is still pci_channel_io_frozen. Waiting until eeh_report_resume() to restore dev->error_state to pci_channel_io_normal is too late for Emulex and QLogic FC drivers and any other drivers which are designed to use common code paths in these two cases: i) those called after the driver's slot_reset callback() and ii) those called after the PCI slot is frozen but before the driver's slot_reset callback is called. Case i) all driver paths executed to reinitialize the hardware after a reset and case ii) all code paths executed by driver kernel threads that run asynchronous to the main driver thread, such as interrupt handlers and worker threads to process driver work queues. Emulex and QLogic FC drivers are designed with common code paths which require that pci_channel_offline(dev) reflect the true state of the hardware. The state transitions that the hardware takes from Normal Operations to Slot Frozen to Reset to Normal Operations are documented in the Power Architecture™ Platform Requirements+ (PAPR+) in Table 75. PE State Control. PAPR defines the following 3 states: 0 -- Not reset, Not EEH stopped, MMIO load/store allowed, DMA allowed (Normal Operations) 1 -- Reset, Not EEH stopped, MMIO load/store disabled, DMA disabled 2 -- Not reset, EEH stopped, MMIO load/store disabled, DMA disabled (Slot Frozen) An EEH error places the slot in state 2 (Frozen) and the adapter driver is notified that an EEH error was detected. If the adapter driver returns PCI_ERS_RESULT_NEED_RESET, the EEH driver calls eeh_reset_device() to place the slot into state 1 (Reset) and eeh_reset_device completes by placing the slot into State 0 (Normal Operations). Upon return from eeh_reset_device(), the EEH driver calls eeh_report_reset, which then calls the adapter's slot_reset callback. At the time the adapter's slot_reset callback is called, the true state of the hardware is Normal Operations and should be accurately reflected by setting dev->error_state to pci_channel_io_normal. The current implementation of EEH driver does not do so and requires this change to correct this deficiency. Signed-off-by: NMike Mason <mmlnx@us.ibm.com> Acked-by: NLinas Vepstas <linasvepstas@gmail.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 11 2月, 2009 1 次提交
-
-
由 Mike Mason 提交于
The EEH code disables and enables interrupts during the device recovery process. This is unnecessary for MSI and MSI-X interrupts because they are effectively disabled by the DMA Stopped state when an EEH error occurs. The current code is also incorrect for MSI-X interrupts. It doesn't take into account that MSI-X interrupts are tracked in a different way than LSI/MSI interrupts. This patch ensures only LSI interrupts are disabled/enabled. Signed-off-by: NMike Mason <mmlnx@us.ibm.com> Acked-by: NLinas Vepstas <linasvepstas@gmail.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 8月, 2008 1 次提交
-
-
由 Tony Breeds 提交于
Currently print_device_node_tree() isn't called but it can be useful for debugging. Leave the function there but hide it behind '#if 0' to save it being rewritten. If you want to call it you're already editing this file anyway. ;P Signed-off-by: NTony Breeds <tony@bakeyournoodle.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 16 6月, 2008 1 次提交
-
-
由 Andrew Morton 提交于
Fix this: /usr/src/devel/arch/powerpc/platforms/pseries/eeh_driver.c: In function 'print_device_node_tree': /usr/src/devel/arch/powerpc/platforms/pseries/eeh_driver.c:55: warning: ISO C90 forbids mixed declarations and code also make that function look like it's part of Linux. Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 11 12月, 2007 1 次提交
-
-
由 Stephen Rothwell 提交于
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 03 12月, 2007 1 次提交
-
-
由 Linas Vepstas 提交于
Do not wait for the pci slot status before reporting an error to the device driver. Some systems may take many seconds to report the slot status, and this can confuse unsuspecting device drivers. Signed-off-by: NLinas Vepstas <linas@austin.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 08 11月, 2007 2 次提交
-
-
由 Linas Vepstas 提交于
Bugfix: if a driver controlling one part of a multi-function PCI card has asked for a reset, honor that request above all others. Signed-off-by: NLinas Vepstas <linas@austin.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Linas Vepstas 提交于
Clean up commentary, remove dead code. Signed-off-by Linas Vepstas <linas@austin.ibm.com> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 14 6月, 2007 1 次提交
-
-
由 Linas Vepstas 提交于
Twiddle the copyright notices. Per current guidelines, the use of the (C) or (c) in source code is deprecated. Signed-off-by: NLinas Vepstas <linas@austin.ibm.com> ---- arch/powerpc/platforms/pseries/eeh.c | 6 +++++- arch/powerpc/platforms/pseries/eeh_cache.c | 3 ++- arch/powerpc/platforms/pseries/eeh_driver.c | 6 +++--- 3 files changed, 10 insertions(+), 5 deletions(-) Signed-off-by: NPaul Mackerras <paulus@samba.org>
-