1. 02 Sep, 2020 8 commits
  2. 21 Nov, 2019 4 commits
  3. 16 Aug, 2018 1 commit
  4. 01 Aug, 2018 1 commit
  5. 21 Jul, 2018 5 commits
  6. 20 Jul, 2018 9 commits
  7. 11 Jun, 2018 7 commits
  8. 08 Jun, 2018 1 commit
  9. 06 Jun, 2018 1 commit
  10. 18 May, 2018 1 commit
    • PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices · 7e9084b3
      Oza Pawandeep authored
      PCIe ERR_FATAL errors mean the Link is unreliable.  Components on the Link
      may need to be reset to return to reliable operation (PCIe r4.0, sec
      6.2.2).  We previously handled these errors much differently depending on
      whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0,
      sec 6.2.10) or not.
      
      The AER driver has historically logged the error details, called
      driver-supplied pci_error_handlers callbacks, and reset the Link.  This
      reset downstream devices, but did not remove them from the PCI subsystem,
      re-enumerate them, or call their driver .remove() or .probe() methods.
      
      DPC is different because the hardware automatically disables the Link when
      it detects ERR_FATAL, which resets downstream devices.  There's no
      opportunity for pci_error_handlers callbacks before resetting the Link.
      The DPC driver removes affected devices (which calls their driver .remove()
      methods), brings the Link back up, and re-enumerates (which calls driver
      .probe() methods).
      
      Align AER ERR_FATAL handling with DPC by resetting the Link in software,
      skipping the driver pci_error_handlers callbacks, removing the devices from
      the PCI subsystem, and re-enumerating.  The idea is that drivers and
      devices should see the same behavior for ERR_FATAL events, regardless of
      whether they're handled by AER or DPC.
      
      Here are the basic ERR_FATAL recovery steps, showing the previous AER
      behavior, the AER behavior after this patch, and the DPC behavior:
      
                                AER        AER      DPC
                                previous   new      behavior
                                --------   ---      --------
        Log error               yes        yes      yes (minimal)
        drv.error_detected()    yes        no       no
        Reset Link              yes        yes      yes
        drv.mmio_enabled()      yes        no       no
        drv.slot_reset()        yes        no       no
        drv.resume()            yes        no       no
        Remove PCI devices      no         yes      yes
          (calls drv.remove())
        Re-enumerate            no         yes      yes
          (calls drv.probe())
      
      N.B. With DPC, the Link reset happens before the driver .remove() calls,
      while with AER, the reset happens *after* the .remove() calls.  The goal is
      to eventually do the reset before .remove() for AER as well.
      
      Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
      [bhelgaas: changelog, squash doc patch into this, remove unused
      "result_data"]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Keith Busch <keith.busch@intel.com>
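
      The step ordering in the table and the N.B. note can be sketched as a
      small user-space model.  This is only an illustration, not kernel code:
      the enum, the function names, and the step labels are hypothetical, and
      placing re-enumeration after the Link reset in the new AER path is an
      assumption inferred from the note that the reset follows .remove().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical labels for the three ERR_FATAL handling paths in the table. */
enum fatal_path { AER_PREVIOUS, AER_NEW, DPC };

/* Append one step name to a comma-separated trace, truncating safely. */
static void trace_step(char *buf, size_t len, const char *step)
{
	size_t used = strlen(buf);

	if (used >= len - 1)
		return;
	snprintf(buf + used, len - used, "%s%s", used ? "," : "", step);
}

/* Fill buf with the order of recovery steps modeled for the given path. */
void fatal_recovery_trace(enum fatal_path path, char *buf, size_t len)
{
	if (len == 0)
		return;
	buf[0] = '\0';

	trace_step(buf, len, "log");	/* every path logs the error */

	switch (path) {
	case AER_PREVIOUS:
		/* Driver callbacks around a Link reset; devices stay bound. */
		trace_step(buf, len, "error_detected");
		trace_step(buf, len, "reset_link");
		trace_step(buf, len, "mmio_enabled");
		trace_step(buf, len, "slot_reset");
		trace_step(buf, len, "resume");
		break;
	case AER_NEW:
		/* No callbacks; .remove() runs before the software reset. */
		trace_step(buf, len, "remove");
		trace_step(buf, len, "reset_link");
		trace_step(buf, len, "probe");
		break;
	case DPC:
		/* Hardware has already disabled the Link before .remove(). */
		trace_step(buf, len, "reset_link");
		trace_step(buf, len, "remove");
		trace_step(buf, len, "probe");
		break;
	}
}
```

      Calling fatal_recovery_trace() with AER_NEW and with DPC yields the same
      set of steps and differs only in where "reset_link" falls relative to
      "remove", which is the remaining gap the N.B. note describes.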
  11. 20 Mar, 2018 1 commit
  12. 23 Feb, 2018 1 commit