1. 21 Sep, 2018 1 commit
    • PCI: portdrv: Initialize service drivers directly · c29de841
      Keith Busch committed
      The PCI port driver saves the PCI state after initializing the device with
      the applicable service devices.  This was, however, before the service
      drivers were even registered because PCI probe happens before the
      device_initcall initialized those service drivers.  The config space state
      that the services set up was not being saved.  The end result would cause
      PCI devices to not react to events that the drivers think they did if the
      PCI state ever needed to be restored.
      
      Fix this by changing the service drivers from using the init calls to
      having the portdrv driver calling the services directly.  This will get the
      state saved as desired, while making the relationship between the port
      driver and the services under it more explicit in the code.
      Signed-off-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Sinan Kaya <okaya@kernel.org>
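The ordering problem described in this commit can be sketched with a small userspace model. Everything below (the `port_model` struct, the `CFG_*` bits, and the `probe_old`/`probe_new` helpers) is hypothetical and only illustrates why saving state before the services run loses their settings; it is not the portdrv code.

```c
#include <assert.h>

/* Hypothetical model of a port: live config space plus the saved snapshot. */
struct port_model {
    unsigned int config;        /* live config space state */
    unsigned int saved_config;  /* snapshot taken by the port driver */
};

/* Illustrative bits only, not real register fields. */
#define CFG_AER_ENABLED 0x1u
#define CFG_HP_ENABLED  0x2u

/* Each service programs its own config space bit when it initializes. */
static void aer_service_init(struct port_model *p)     { p->config |= CFG_AER_ENABLED; }
static void hotplug_service_init(struct port_model *p) { p->config |= CFG_HP_ENABLED; }

static void save_state(struct port_model *p)    { p->saved_config = p->config; }
static void restore_state(struct port_model *p) { p->config = p->saved_config; }

/* Old ordering: the snapshot is taken before the services have run. */
static void probe_old(struct port_model *p)
{
    save_state(p);
    aer_service_init(p);
    hotplug_service_init(p);
}

/* New ordering: the port driver calls the services directly, then saves. */
static void probe_new(struct port_model *p)
{
    aer_service_init(p);
    hotplug_service_init(p);
    save_state(p);
}
```

In this model, calling `restore_state()` after `probe_old()` wipes both service bits, while after `probe_new()` both bits survive the restore.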
  2. 16 Aug, 2018 1 commit
  3. 01 Aug, 2018 1 commit
  4. 21 Jul, 2018 5 commits
  5. 20 Jul, 2018 9 commits
  6. 11 Jun, 2018 7 commits
  7. 08 Jun, 2018 1 commit
  8. 06 Jun, 2018 1 commit
  9. 18 May, 2018 1 commit
    • PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices · 7e9084b3
      Oza Pawandeep committed
      PCIe ERR_FATAL errors mean the Link is unreliable.  Components on the Link
      may need to be reset to return to reliable operation (PCIe r4.0, sec
      6.2.2).  We previously handled these errors much differently depending on
      whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0,
      sec 6.2.10) or not.
      
      The AER driver has historically logged the error details, called
      driver-supplied pci_error_handlers callbacks, and reset the Link.  This
      reset downstream devices, but did not remove them from the PCI subsystem,
      re-enumerate them, or call their driver .remove() or .probe() methods.
      
      DPC is different because the hardware automatically disables the Link when
      it detects ERR_FATAL, which resets downstream devices.  There's no
      opportunity for pci_error_handlers callbacks before resetting the Link.
      The DPC driver removes affected devices (which calls their driver .remove()
      methods), brings the Link back up, and re-enumerates (which calls driver
      .probe() methods).
      
      Align AER ERR_FATAL handling with DPC by resetting the Link in software,
      skipping the driver pci_error_handlers callbacks, removing the devices from
      the PCI subsystem, and re-enumerating.  The idea is that drivers and
      devices should see the same behavior for ERR_FATAL events, regardless of
      whether they're handled by AER or DPC.
      
      Here are the basic ERR_FATAL recovery steps, showing the previous AER
      behavior, the AER behavior after this patch, and the DPC behavior:
      
                                AER        AER      DPC
                                previous   new      behavior
                                --------   ---      --------
        Log error               yes        yes      yes (minimal)
        drv.error_detected()    yes        no       no
        Reset Link              yes        yes      yes
        drv.mmio_enabled()      yes        no       no
        drv.slot_reset()        yes        no       no
        drv.resume()            yes        no       no
        Remove PCI devices      no         yes      yes
          (calls drv.remove())
        Re-enumerate            no         yes      yes
          (calls drv.probe())
      
      N.B. With DPC, the Link reset happens before the driver .remove() calls,
      while with AER, the reset happens *after* the .remove() calls.  The goal is
      to eventually do the reset before .remove() for AER as well.
      Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
      [bhelgaas: changelog, squash doc patch into this, remove unused
      "result_data"]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
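The recovery-step table in this commit can also be written as data, which makes the stated goal (AER and DPC behaving the same for ERR_FATAL) checkable. This is a sketch only: the enum, struct, and `same_behavior()` helper are hypothetical names, and the array records which steps run, not the order in which they run (the commit notes that the reset-versus-remove order still differs between AER and DPC).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three columns of the table in the commit message. */
enum handler { AER_PREVIOUS, AER_NEW, DPC, HANDLER_COUNT };

struct recovery_step {
    const char *name;
    bool done[HANDLER_COUNT];   /* indexed by enum handler */
};

/* One row per step; values copied from the commit message table. */
static const struct recovery_step err_fatal_steps[] = {
    { "log error",            { true,  true,  true  } },
    { "drv.error_detected()", { true,  false, false } },
    { "reset link",           { true,  true,  true  } },
    { "drv.mmio_enabled()",   { true,  false, false } },
    { "drv.slot_reset()",     { true,  false, false } },
    { "drv.resume()",         { true,  false, false } },
    { "remove PCI devices",   { false, true,  true  } },
    { "re-enumerate",         { false, true,  true  } },
};

#define NUM_STEPS (sizeof(err_fatal_steps) / sizeof(err_fatal_steps[0]))

/* Returns true when two handlers perform exactly the same set of steps. */
static bool same_behavior(enum handler a, enum handler b)
{
    for (size_t i = 0; i < NUM_STEPS; i++)
        if (err_fatal_steps[i].done[a] != err_fatal_steps[i].done[b])
            return false;
    return true;
}
```

With this table, `same_behavior(AER_NEW, DPC)` holds, while `same_behavior(AER_PREVIOUS, AER_NEW)` does not.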
  10. 20 Mar, 2018 1 commit
  11. 23 Feb, 2018 1 commit
  12. 29 Jan, 2018 1 commit
  13. 19 Jan, 2018 1 commit
  14. 01 Aug, 2017 1 commit
  15. 13 Dec, 2016 3 commits
  16. 30 Sep, 2016 1 commit
  17. 28 Sep, 2016 1 commit
  18. 15 Sep, 2016 1 commit
    • PCI/AER: Remove aerdriver.forceload kernel parameter · 7ece1417
      Bjorn Helgaas committed
      Per the PCI Firmware spec, r3.0, sec 4.5.1, on ACPI systems, the OS must
      not use AER unless _OSC is present and _OSC grants AER control to the OS.
      The aerdriver.forceload kernel parameter was a way to enable Linux AER
      support on ACPI systems that lack _OSC or fail to grant control to the OS.
      
      Enabling Linux AER support when the firmware doesn't want us to is a recipe
      for problems, e.g., the firmware might be handling AER itself.
      
      Remove the aerdriver.forceload kernel parameter and related supporting
      code.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
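The rule this commit enforces can be stated as a small predicate. The sketch below is a hypothetical model, not the kernel's _OSC negotiation code: `acpi_platform_model` and both function names are invented here to contrast the behavior with and without the removed override.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the firmware reported on an ACPI system. */
struct acpi_platform_model {
    bool has_osc;          /* _OSC method is present */
    bool osc_grants_aer;   /* _OSC granted AER control to the OS */
};

/* After the commit: the OS uses AER only if _OSC exists and grants control. */
static bool os_may_use_aer(const struct acpi_platform_model *fw)
{
    return fw->has_osc && fw->osc_grants_aer;
}

/* Before the commit: a forceload flag could override the firmware's answer. */
static bool os_may_use_aer_with_forceload(const struct acpi_platform_model *fw,
                                          bool forceload)
{
    return forceload || os_may_use_aer(fw);
}
```

The second function shows the case the commit message calls a recipe for problems: with `forceload` set, the OS would take over AER even when the firmware might be handling it itself.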
  19. 25 Aug, 2016 1 commit
    • PCI/AER: Make explicitly non-modular · 8756336c
      Paul Gortmaker committed
      This code is not being built as a module by anyone:
      
        obj-$(CONFIG_PCIEAER) += aerdriver.o
        aerdriver-objs := aerdrv_errprint.o aerdrv_core.o aerdrv.o
      
        drivers/pci/pcie/aer/Kconfig:config PCIEAER
        drivers/pci/pcie/aer/Kconfig:  bool "Root Port Advanced Error Reporting support"
      
      Remove uses of MODULE_DESCRIPTION(), MODULE_AUTHOR(), MODULE_LICENSE(),
      etc., so that when reading the driver there is no doubt it is builtin-only.
      The information is preserved in comments at the top of the file.
      
      Note that for non-modular code, module_init() translates to
      device_initcall().
      
      [bhelgaas: changelog]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      CC: Tom Long Nguyen <tom.l.nguyen@intel.com>
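The note that module_init() translates to device_initcall() for built-in code can be illustrated with a userspace stand-in. This is a simplified sketch under stated assumptions: the real kernel macros route through `__initcall()` and place the function pointer in an initcall section that is run at boot, whereas the `device_initcall()` defined here just stores it in a static variable. `aer_model_init` and the `initcall_entry_` prefix are hypothetical names.

```c
#include <assert.h>

typedef int (*initcall_t)(void);

/*
 * Userspace stand-in for device_initcall(): record the function pointer in a
 * static variable named after the function. The kernel instead emits the
 * pointer into an initcall section that is walked during boot.
 */
#define device_initcall(fn) static initcall_t initcall_entry_##fn = fn

/* For non-modular (built-in) code, module_init() is just device_initcall(). */
#ifndef MODULE
#define module_init(fn) device_initcall(fn)
#endif

/* Hypothetical init routine standing in for a driver's init function. */
static int aer_model_init(void)
{
    return 0;
}

module_init(aer_model_init);
```

After preprocessing, the last line becomes a plain `device_initcall()` registration, which is why a built-in driver can call `device_initcall()` directly and drop the module macros.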
  20. 26 Jan, 2016 1 commit
    • PCI/AER: Flush workqueue on device remove to avoid use-after-free · 4ae2182b
      Sebastian Andrzej Siewior committed
      A Root Port's AER structure (rpc) contains a queue of events.  aer_irq()
      enqueues AER status information and schedules aer_isr() to dequeue and
      process it.  When we remove a device, aer_remove() waits for the queue to
      be empty, then frees the rpc struct.
      
      But aer_isr() references the rpc struct after dequeueing and possibly
      emptying the queue, which can cause a use-after-free error as in the
      following scenario with two threads, aer_isr() on the left and a
      concurrent aer_remove() on the right:
      
        Thread A                      Thread B
        --------                      --------
        aer_irq():
          rpc->prod_idx++
                                      aer_remove():
                                        wait_event(rpc->prod_idx == rpc->cons_idx)
                                        # now blocked until queue becomes empty
        aer_isr():                      # ...
          rpc->cons_idx++               # unblocked because queue is now empty
          ...                           kfree(rpc)
          mutex_unlock(&rpc->rpc_mutex)
      
      To prevent this problem, use flush_work() to wait until the last scheduled
      instance of aer_isr() has completed before freeing the rpc struct in
      aer_remove().
      
      I reproduced this use-after-free by flashing a device FPGA and
      re-enumerating the bus to find the new device.  With SLUB debug, this
      crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in
      GPR25:
      
        pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000
        Unable to handle kernel paging request for data at address 0x27ef9e3e
        Workqueue: events aer_isr
        GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0
        NIP [602f5328] pci_walk_bus+0xd4/0x104
      
      [bhelgaas: changelog, stable tag]
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      CC: stable@vger.kernel.org
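The fix in this commit (wait for the last scheduled handler to finish before freeing) can be modeled in a deterministic, single-threaded sketch. This is not the AER driver code and involves no real workqueue: `rpc_model`, `irq_model()`, `isr_model()`, `flush_work_model()`, and `remove_model()` are hypothetical stand-ins, with a `work_pending` flag playing the role of a scheduled work item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the Root Port's AER structure. */
struct rpc_model {
    unsigned int prod_idx;   /* advanced by the interrupt handler */
    unsigned int cons_idx;   /* advanced by the deferred handler */
    bool work_pending;       /* models a scheduled aer_isr() work item */
    unsigned int handled;    /* number of events processed */
};

/* Models aer_irq(): enqueue one event and schedule the deferred handler. */
static void irq_model(struct rpc_model *rpc)
{
    rpc->prod_idx++;
    rpc->work_pending = true;
}

/* Models aer_isr(): drain the queue, then touch rpc once more afterwards. */
static void isr_model(struct rpc_model *rpc)
{
    while (rpc->cons_idx != rpc->prod_idx) {
        rpc->cons_idx++;
        rpc->handled++;
    }
    /* This access happens after the queue is already empty. */
    rpc->work_pending = false;
}

/* Models flush_work(): run any pending work to completion before returning. */
static void flush_work_model(struct rpc_model *rpc)
{
    if (rpc->work_pending)
        isr_model(rpc);
}

/* Models the fixed aer_remove(): flush first, free only afterwards. */
static unsigned int remove_model(struct rpc_model *rpc)
{
    unsigned int handled;

    flush_work_model(rpc);
    handled = rpc->handled;
    free(rpc);
    return handled;
}
```

The point of the model is the ordering in `remove_model()`: waiting only for `prod_idx == cons_idx` would allow the free to happen between the last `cons_idx++` and the final access in `isr_model()`, whereas flushing guarantees the handler has fully returned.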