- 21 9月, 2018 1 次提交
-
-
由 Keith Busch 提交于
The PCI port driver saves the PCI state after initializing the device with the applicable service devices. This was, however, before the service drivers were even registered because PCI probe happens before the device_initcall initialized those service drivers. The config space state that the services set up were not being saved. The end result would cause PCI devices to not react to events that the drivers think they did if the PCI state ever needed to be restored. Fix this by changing the service drivers from using the init calls to having the portdrv driver calling the services directly. This will get the state saved as desired, while making the relationship between the port driver and the services under it more explicit in the code. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NSinan Kaya <okaya@kernel.org>
-
- 16 8月, 2018 1 次提交
-
-
由 Alexandru Gagniuc 提交于
If the platform requests Firmware-First error handling, firmware is responsible for reading and clearing AER status bits. If OSPM also clears them, we may miss errors. See ACPI v6.2, sec 18.3.2.5 and 18.4. This race is mostly of theoretical significance, as it is not easy to reasonably demonstrate it in testing. Signed-off-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: add similar guards to pci_cleanup_aer_uncorrect_error_status() and pci_aer_clear_fatal_status()] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 01 8月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
PCI_EXP_AER_FLAGS was defined twice (with identical definitions), once under #ifdef CONFIG_ACPI_APEI, and again at the top level. This looks like my merge error from these commits: fd3362cb ("PCI/AER: Squash aerdrv_core.c into aerdrv.c") 41cbc9eb ("PCI/AER: Squash ecrc.c into aerdrv.c") Remove the duplicate PCI_EXP_AER_FLAGS definition. Fixes: 41cbc9eb ("PCI/AER: Squash ecrc.c into aerdrv.c") Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NOza Pawandeep <poza@codeaurora.org>
-
- 21 7月, 2018 5 次提交
-
-
由 Oza Pawandeep 提交于
In case of correctable error, the Correctable Error Detected bit in the Device Status register is set. Clear it after handling the error. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Oza Pawandeep 提交于
Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL cases. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> [bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI core instead of exposing it everywhere] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Oza Pawandeep 提交于
aer_error_resume() clears all ERR_NONFATAL error status bits. This is exactly what pci_cleanup_aer_uncorrect_error_status(), so use that instead of duplicating the code. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Oza Pawandeep 提交于
pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset() methods when handling ERR_NONFATAL errors. Previously this cleared *all* the bits, including ERR_FATAL bits. Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL error status bits. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Bjorn Helgaas 提交于
During recovery from fatal errors, we previously called pci_cleanup_aer_uncorrect_error_status(), which cleared *all* uncorrectable error status bits (both ERR_FATAL and ERR_NONFATAL). Instead, call a new pci_aer_clear_fatal_status() that clears only the ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register). Based-on-patch-by: NOza Pawandeep <poza@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 20 7月, 2018 9 次提交
-
-
由 Sinan Kaya 提交于
Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset() and move the declaration from linux/pci.h to drivers/pci.h to be used internally in PCI directory only. Signed-off-by: NSinan Kaya <okaya@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Sinan Kaya 提交于
Commit 01fd61c0 ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") added a return value to the function to return if a device is accessible following a reset. Callers are not checking the value. Pass error code up high in the stack if device is not accessible. Fixes: 01fd61c0 ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") Signed-off-by: NSinan Kaya <okaya@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Alexandru Gagniuc 提交于
According to the documentation, "pcie_ports=native", linux should use native AER and DPC services. While that is true for the _OSC method parsing, this is not the only place that is checked. Should the HEST list PCIe ports as firmware-first, linux will not use native services. This happens because aer_acpi_firmware_first() doesn't take 'pcie_ports' into account. This is wrong. DPC uses the same logic when it decides whether to load or not, so fixing this also fixes DPC not loading. Signed-off-by: NAlexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: return "false" from bool function (from kbuild robot)] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Rajat Jain 提交于
Add sysfs attributes for rootport statistics (that are cumulative of all the ERR_* messages seen on this PCI hierarchy). Signed-off-by: NRajat Jain <rajatja@google.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Rajat Jain 提交于
Add sysfs attributes to provide total and breakdown of the AERs seen, into different type of correctable, fatal and nonfatal errors: /sys/bus/pci/devices/<dev>/aer_dev_correctable /sys/bus/pci/devices/<dev>/aer_dev_fatal /sys/bus/pci/devices/<dev>/aer_dev_nonfatal Signed-off-by: NRajat Jain <rajatja@google.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Rajat Jain 提交于
Define a structure to hold the AER statistics. There are 2 groups of statistics: dev_* counters that are to be collected for all AER capable devices and rootport_* counters that are collected for all (AER capable) rootports only. Allocate and free this structure when device is added or released (thus counters survive the lifetime of the device). Signed-off-by: NRajat Jain <rajatja@google.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Rajat Jain 提交于
Since pci_aer_init() and pci_no_aer() are used only internally, move their declarations to the PCI internal header file. Also, no one cares about return value of pci_aer_init(), so make it void. Signed-off-by: NRajat Jain <rajatja@google.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Tyler Baicar 提交于
lspci uses abbreviated naming for AER error strings. Adopt the same naming convention for the AER printing so they match. Signed-off-by: NTyler Baicar <tbaicar@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NOza Pawandeep <poza@codeaurora.org>
-
由 Keith Busch 提交于
Export some common AER functions and structures for other PCI core drivers to use. Since this is making the function externally visible inside the PCI core, prepend "aer_" to the function name. Signed-off-by: NKeith Busch <keith.busch@intel.com> [bhelgaas: move AER declarations from linux/aer.h to drivers/pci/pci.h] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NSinan Kaya <okaya@kernel.org> Reviewed-by: NOza Pawandeep <poza@codeaurora.org>
-
- 11 6月, 2018 7 次提交
-
-
由 Bjorn Helgaas 提交于
Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/ so they're next to other PCIe service drivers. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Most of the things in aerdrv.h are only used in aerdrv.c, so move them there. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Squash ecrc.c into aerdrv.c. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Squash aerdrv_acpi.c into aerdrv.c. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Squash aerdrv_errprint.c into aerdrv.c. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Squash aerdrv_core.c into aerdrv.c. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
由 Bjorn Helgaas 提交于
Reorder code to group probe/remove stuff together. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 08 6月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
Reorder code to remove forward declarations. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 06 6月, 2018 1 次提交
-
-
由 Keith Busch 提交于
The AER driver only needed the pcie_device to get to the port pci_dev. Save the pci_dev pointer directly in struct aer_rpc and remove the unnecessary indirection. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 18 5月, 2018 1 次提交
-
-
由 Oza Pawandeep 提交于
PCIe ERR_FATAL errors mean the Link is unreliable. Components on the Link may need to be reset to return to reliable operation (PCIe r4.0, sec 6.2.2). We previously handled these errors much differently depending on whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0, sec 6.2.10) or not. The AER driver has historically logged the error details, called driver-supplied pci_error_handlers callbacks, and reset the Link. This reset downstream devices, but did not remove them from the PCI subsystem, re-enumerate them, or call their driver .remove() or .probe() methods. DPC is different because the hardware automatically disables the Link when it detects ERR_FATAL, which resets downstream devices. There's no opportunity for pci_error_handlers callbacks before resetting the Link. The DPC driver removes affected devices (which calls their driver .remove() methods), brings the Link back up, and re-enumerates (which calls driver .probe() methods). Align AER ERR_FATAL handling with DPC by resetting the Link in software, skipping the driver pci_error_handlers callbacks, removing the devices from the PCI subsystem, and re-enumerating. The idea is that drivers and devices should see the same behavior for ERR_FATAL events, regardless of whether they're handled by AER or DPC. Here are the basic ERR_FATAL recovery steps, showing the previous AER behavior, the AER behavior after this patch, and the DPC behavior: AER AER DPC previous new behavior -------- --- -------- Log error yes yes yes (minimal) drv.error_detected() yes no no Reset Link yes yes yes drv.mmio_enabled() yes no no drv.slot_reset() yes no no drv.resume() yes no no Remove PCI devices no yes yes (calls drv.remove()) Re-enumerate no yes yes (calls drv.probe()) N.B. With DPC, the Link reset happens before the driver .remove() calls, while with AER, the reset happens *after* the .remove() calls. The goal is to eventually do the reset before .remove() for AER as well. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, squash doc patch into this, remove unused "result_data"] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 20 3月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
Remove pointless comments that tell us the file name, remove blank line comments, follow multi-line comment conventions. No functional change intended. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 23 2月, 2018 1 次提交
-
-
由 Frederick Lawler 提交于
Move pcieport_if.h from include/linux to drivers/pci/pcie/pcieport_if.h because the interfaces there are only used by the PCI core. Replace all uses of #include<linux/pcieport_if.h> with relative paths to the new file location, e.g., #include "../pcieport_if.h" Signed-off-by: NFrederick Lawler <fred@fredlawl.com> Signed-off-by: NBjorn Helgaas <helgaas@kernel.org>
-
- 29 1月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
Add SPDX GPL-2.0 to all PCI files that referred to the kernel default "COPYING" file, which specifies GPL version 2. Remove the boilerplate language referring to the GPL and "COPYING", relying on the assertion in b2441318 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") that the SPDX identifier may be used instead of the full boilerplate text. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 19 1月, 2018 1 次提交
-
-
由 Frederick Lawler 提交于
Add PCI-specific dev_printk() wrappers and use them to simplify the code slightly. No functional change intended. Signed-off-by: NFrederick Lawler <fred@fredlawl.com> [bhelgaas: squash into one patch] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 01 8月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Move the error handler methods to struct pcie_port_service_driver and avoid the detour through the mostly unused pci_error_handlers structure. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 13 12月, 2016 3 次提交
-
-
由 Bjorn Helgaas 提交于
Add a log message when we enable AER on a Root Port and the hierarchy below it. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Bjorn Helgaas 提交于
All other AER-related log messages use the PCI device, e.g., "pci 0000:00:1c.0", not the PCIe service device, e.g., "aer 0000:00:1c.0:pcie02". Change the probe error messages to match the rest and include a little context. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Bjorn Helgaas 提交于
Remove the unused DRIVER_VERSION, DRIVER_AUTHOR, and DRIVER_DESC macros. The author information is already included in a comment above. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 30 9月, 2016 1 次提交
-
-
由 Cao jin 提交于
0516c8bc ("PCI: PCIe portdrv: Simplily probe callback of service drivers") removed the "id" argument of aer_probe() but neglected to remove the kernel-doc comment. Update the comment. [bhelgaas: changelog] Signed-off-by: NCao jin <caoj.fnst@cn.fujitsu.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 28 9月, 2016 1 次提交
-
-
由 Keith Busch 提交于
Save the position of the error reporting capability so it doesn't need to be rediscovered during error handling. Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: Lukas Wunner <lukas@wunner.de>
-
- 15 9月, 2016 1 次提交
-
-
由 Bjorn Helgaas 提交于
Per the PCI Firmware spec, r3.0, sec 4.5.1, on ACPI systems, the OS must not use AER unless _OSC is present and _OSC grants AER control to the OS. The aerdriver.forceload kernel parameter was a way to enable Linux AER support on ACPI systems that lack _OSC or fail to grant control the the OS. Enabling Linux AER support when the firmware doesn't want us to is a recipe for problems, e.g., the firmware might be handling AER itself. Remove the aerdriver.forceload kernel parameter and related supporting code. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 25 8月, 2016 1 次提交
-
-
由 Paul Gortmaker 提交于
This code is not being built as a module by anyone: obj-$(CONFIG_PCIEAER) += aerdriver.o aerdriver-objs := aerdrv_errprint.o aerdrv_core.o aerdrv.o drivers/pci/pcie/aer/Kconfig:config PCIEAER drivers/pci/pcie/aer/Kconfig: bool "Root Port Advanced Error Reporting support" Remove uses of MODULE_DESCRIPTION(), MODULE_AUTHOR(), MODULE_LICENSE(), etc., so that when reading the driver there is no doubt it is builtin-only. The information is preserved in comments at the top of the file. Note that for non-modular code, module_init() translates to device_initcall(). [bhelgaas: changelog] Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: Tom Long Nguyen <tom.l.nguyen@intel.com>
-
- 26 1月, 2016 1 次提交
-
-
A Root Port's AER structure (rpc) contains a queue of events. aer_irq() enqueues AER status information and schedules aer_isr() to dequeue and process it. When we remove a device, aer_remove() waits for the queue to be empty, then frees the rpc struct. But aer_isr() references the rpc struct after dequeueing and possibly emptying the queue, which can cause a use-after-free error as in the following scenario with two threads, aer_isr() on the left and a concurrent aer_remove() on the right: Thread A Thread B -------- -------- aer_irq(): rpc->prod_idx++ aer_remove(): wait_event(rpc->prod_idx == rpc->cons_idx) # now blocked until queue becomes empty aer_isr(): # ... rpc->cons_idx++ # unblocked because queue is now empty ... kfree(rpc) mutex_unlock(&rpc->rpc_mutex) To prevent this problem, use flush_work() to wait until the last scheduled instance of aer_isr() has completed before freeing the rpc struct in aer_remove(). I reproduced this use-after-free by flashing a device FPGA and re-enumerating the bus to find the new device. With SLUB debug, this crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in GPR25: pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000 Unable to handle kernel paging request for data at address 0x27ef9e3e Workqueue: events aer_isr GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0 NIP [602f5328] pci_walk_bus+0xd4/0x104 [bhelgaas: changelog, stable tag] Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org
-