1. 02 Sep, 2020 23 commits
  2. 21 Nov, 2019 5 commits
  3. 06 Nov, 2019 1 commit
  4. 26 May, 2019 2 commits
  5. 06 Apr, 2019 1 commit
    • R
      PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove() · 9546c366
      Committed by Rafael J. Wysocki
      [ Upstream commit 95c80bc6952b6a5badc7b702d23e5bf14d251e7c ]
      
      Dongdong reported a deadlock triggered by a hotplug event during a sysfs
      "remove" operation:
      
        pciehp 0000:00:0c.0:pcie004: Slot(0-1): Link Up
        # echo 1 > 0000:00:0c.0/remove
      
        PME and hotplug share an MSI/MSI-X vector.  The sysfs "remove" side is:
      
          remove_store
             pci_stop_and_remove_bus_device_locked
               pci_lock_rescan_remove
               pci_stop_and_remove_bus_device
                 ...
                 pcie_pme_remove
                   pcie_pme_suspend
                     synchronize_irq        # wait for hotplug IRQ handler
               pci_unlock_rescan_remove
      
        The hotplug side is:
      
          pciehp_ist
             pciehp_handle_presence_or_link_change
               pciehp_configure_device
                 pci_lock_rescan_remove     # wait for pci_unlock_rescan_remove()
      
        INFO: task bash:10913 blocked for more than 120 seconds.
      
        # ps -ax |grep D
         PID TTY      STAT   TIME COMMAND
        10913 ttyAMA0  Ds+    0:00 -bash
        14022 ?        D      0:00 [irq/745-pciehp]
      
        # cat /proc/14022/stack
        __switch_to+0x94/0xd8
        pci_lock_rescan_remove+0x20/0x28
        pciehp_configure_device+0x30/0x140
        pciehp_handle_presence_or_link_change+0x324/0x458
        pciehp_ist+0x1dc/0x1e0
      
        # cat /proc/10913/stack
        __switch_to+0x94/0xd8
        synchronize_irq+0x8c/0xc0
        pcie_pme_suspend+0xa4/0x118
        pcie_pme_remove+0x20/0x40
        pcie_port_remove_service+0x3c/0x58
        ...
        pcie_port_device_remove+0x2c/0x48
        pcie_portdrv_remove+0x68/0x78
        pci_device_remove+0x48/0x120
        ...
        pci_stop_bus_device+0x84/0xc0
        pci_stop_and_remove_bus_device_locked+0x24/0x40
        remove_store+0xa4/0xb8
        dev_attr_store+0x44/0x60
        sysfs_kf_write+0x58/0x80
      
      It is incorrect to call pcie_pme_suspend() from pcie_pme_remove() for two
      reasons.
      
      First, pcie_pme_suspend() calls synchronize_irq(), which will wait for the
      native hotplug interrupt handler as well as for the PME one, because they
      share one IRQ (as per the spec).  That may deadlock if hotplug is signaled
      while pcie_pme_remove() is running and the latter calls
      pci_lock_rescan_remove() before the former.
      
      Second, if pcie_pme_suspend() figures out that wakeup needs to be enabled
      for the port, it will return without disabling the interrupt as expected by
      pcie_pme_remove() which was overlooked by commit c7b5a4e6 ("PCI / PM:
      Fix native PME handling during system suspend/resume").
      
      To fix that, rework pcie_pme_remove() to disable the PME interrupt, clear
      its status and prevent the PME worker function from re-enabling it before
      calling free_irq() on it, which should be sufficient.
      
      Fixes: c7b5a4e6 ("PCI / PM: Fix native PME handling during system suspend/resume")
      Link: https://lore.kernel.org/linux-pci/c7697e7c-e1af-13e4-8491-0a3996e6ab5d@huawei.com
      Reported-by: Dongdong Liu <liudongdong3@huawei.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      [bhelgaas: add URL and deadlock details from Dongdong]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
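The remove sequence described in the commit message above (disable the PME interrupt, clear its status, stop the worker from re-enabling it, then free the IRQ) can be modeled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct pme_model`, `pme_model_work()` and `pme_model_remove()` are hypothetical names invented for illustration, plain booleans stand in for the hardware bits and for `free_irq()`, and no locking is shown.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the PME service state; not kernel code. */
struct pme_model {
	bool irq_enabled;   /* models the PME interrupt enable bit */
	bool status_set;    /* models a pending PME status */
	bool noirq;         /* when true, the worker must not re-enable the IRQ */
	bool irq_requested; /* stands in for request_irq()/free_irq() */
};

/* Worker: handles a PME event, then re-enables the interrupt unless blocked. */
static void pme_model_work(struct pme_model *m)
{
	m->status_set = false;
	if (!m->noirq)
		m->irq_enabled = true;
}

/*
 * Remove path as described in the commit text. Note that nothing here
 * waits for other handlers sharing the IRQ (no synchronize_irq() analogue),
 * which is the step that deadlocked against the hotplug handler.
 */
static void pme_model_remove(struct pme_model *m)
{
	m->irq_enabled = false;   /* disable the PME interrupt */
	m->status_set = false;    /* clear its status */
	m->noirq = true;          /* keep the worker from re-enabling it */
	m->irq_requested = false; /* models free_irq() */
}
```

In this model, a worker that runs after `pme_model_remove()` leaves the interrupt disabled, which is the property the commit message relies on.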
  6. 24 Mar, 2019 1 commit
  7. 14 Nov, 2018 1 commit
    • L
      PCI/ASPM: Fix link_state teardown on device removal · 1e37e70d
      Committed by Lukas Wunner
      commit aeae4f3e upstream.
      
      Upon removal of the last device on a bus, the link_state of the bridge
      leading to that bus is sought to be torn down by having pci_stop_dev()
      call pcie_aspm_exit_link_state().
      
      When ASPM was originally introduced by commit 7d715a6c ("PCI: add
      PCI Express ASPM support"), it determined whether the device being
      removed is the last one by calling list_empty() on the bridge's
      subordinate devices list.  That didn't work because the device is only
      removed from the list slightly later in pci_destroy_dev().
      
      Commit 3419c75e ("PCI: properly clean up ASPM link state on device
      remove") attempted to fix it by calling list_is_last(), but that's not
      correct either because it checks whether the device is at the *end* of
      the list, not whether it's the last one *left* in the list.  If the user
      removes the device which happens to be at the end of the list via sysfs
      but other devices are preceding the device in the list, the link_state
      is torn down prematurely.
      
      The real fix is to move the invocation of pcie_aspm_exit_link_state() to
      pci_destroy_dev() and reinstate the call to list_empty().  Remove a
      duplicate check for dev->bus->self because pcie_aspm_exit_link_state()
      already contains an identical check.
      
      Fixes: 7d715a6c ("PCI: add PCI Express ASPM support")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Shaohua Li <shaohua.li@intel.com>
      Cc: stable@vger.kernel.org # v2.6.26
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
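The difference between the two list checks discussed in the commit message above can be seen in a small userspace sketch. The helpers below are minimal reimplementations written for this example with the same semantics as the kernel's `list_empty()`, `list_is_last()` and `list_del_init()`; they are not the kernel's `<linux/list.h>`.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink an entry and leave it pointing at itself. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e;
	e->prev = e;
}

/* True only when no entries remain on the list. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* True when e sits at the tail, regardless of how many entries precede it. */
static bool list_is_last(const struct list_head *e, const struct list_head *h)
{
	return e->next == h;
}
```

With two entries `a` and `b` added in that order, `list_is_last(&b, &head)` is true although `a` is still on the list, which is the premature-teardown case. Calling `list_empty(&head)` after the entry has actually been unlinked gives the intended answer.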
  8. 16 Aug, 2018 1 commit
  9. 07 Aug, 2018 1 commit
  10. 01 Aug, 2018 3 commits
    • B
      PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definition · 944d5859
      Committed by Bjorn Helgaas
      PCI_EXP_AER_FLAGS was defined twice (with identical definitions), once
      under #ifdef CONFIG_ACPI_APEI, and again at the top level.  This looks like
      my merge error from these commits:
      
        fd3362cb ("PCI/AER: Squash aerdrv_core.c into aerdrv.c")
        41cbc9eb ("PCI/AER: Squash ecrc.c into aerdrv.c")
      
      Remove the duplicate PCI_EXP_AER_FLAGS definition.
      
      Fixes: 41cbc9eb ("PCI/AER: Squash ecrc.c into aerdrv.c")
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Oza Pawandeep <poza@codeaurora.org>
    • L
      PCI: pciehp: Clear spurious events earlier on resume · 79037824
      Committed by Lukas Wunner
      Thunderbolt hotplug ports that were occupied before system sleep resume
      with their downstream link in "off" state.  Only after the Thunderbolt
      controller has reestablished the PCIe tunnels does the link go up.
      As a result, a spurious Presence Detect Changed and/or Data Link Layer
      State Changed event occurs.
      
      The events are not immediately acted upon because tunnel reestablishment
      happens in the ->resume_noirq phase, when interrupts are still disabled.
      Also, notification of events may initially be disabled in the Slot
      Control register when coming out of system sleep and is reenabled in the
      ->resume_noirq phase through:
      
        pci_pm_resume_noirq()
          pci_pm_default_resume_early()
            pci_restore_state()
              pci_restore_pcie_state()
      
      It is not guaranteed that the events are acted upon at all:  PCIe r4.0,
      sec 6.7.3.4 says that "a port may optionally send an MSI when there are
      hot-plug events that occur while interrupt generation is disabled, and
      interrupt generation is subsequently enabled."  Note the "optionally".
      
      If an MSI is sent, pciehp will gratuitously turn the slot off and back
      on once the ->resume_early phase has commenced.
      
      If an MSI is not sent, the extant, unacknowledged events in the Slot
      Status register will prevent future notification of presence or link
      changes.
      
      Commit 13c65840 ("PCI: pciehp: Clear Presence Detect and Data Link
      Layer Status Changed on resume") fixed the latter by clearing the events
      in the ->resume phase.  Move this to the ->resume_noirq phase to also
      fix the gratuitous disable/enablement of the slot.
      
      The commit further restored the Slot Control register in the ->resume
      phase, but that's dispensable because as shown above it's already been
      done in the ->resume_noirq phase.
      
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
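Clearing the two event bits named in the commit message above can be illustrated with a small register model. The sketch assumes that the event bits in the Slot Status register are write-1-to-clear, as the PCIe specification defines them. The two bit masks match the `PCI_EXP_SLTSTA_*` definitions in the kernel's `pci_regs.h`; `rw1c_write()` and `clear_spurious_events()` are hypothetical helpers for this example, and the model ignores the register's read-only bits because it never writes them.

```c
#include <stdint.h>

#define PCI_EXP_SLTSTA_PDC   0x0008 /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC 0x0100 /* Data Link Layer State Changed */

/*
 * Model of a write to a write-1-to-clear register: every bit set in val
 * is cleared in the register, every other bit keeps its value.
 */
static uint16_t rw1c_write(uint16_t reg, uint16_t val)
{
	return (uint16_t)(reg & ~val);
}

/* Acknowledge only the presence and link change events. */
static uint16_t clear_spurious_events(uint16_t slot_status)
{
	return rw1c_write(slot_status,
			  PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC);
}
```

Bits that are not written, such as an unrelated pending event, stay set and can still be handled later.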
    • L
      PCI: portdrv: Deduplicate PM callback iterator · 6ccb127b
      Committed by Lukas Wunner
      Replace suspend_iter() and resume_iter() with a single function pm_iter()
      to allow addition of port service callbacks for further power management
      phases without having to add another iterator each time.
      
      No functional change intended.
      
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
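One way to fold several per-phase iterators into a single one is to pass the offset of the wanted callback within the driver structure. The sketch below is a userspace illustration of that idea, not the code from the commit: `struct port_service`, `struct port_service_driver` and this `pm_iter()` signature are assumptions made for the example.

```c
#include <stddef.h>

struct port_service;
typedef int (*pm_callback_t)(struct port_service *);

/* Hypothetical driver with one callback slot per power management phase. */
struct port_service_driver {
	pm_callback_t suspend;
	pm_callback_t resume_noirq;
	pm_callback_t resume;
};

struct port_service {
	const struct port_service_driver *driver;
	int calls; /* bookkeeping for the example only */
};

/*
 * Single iterator body: look up the callback stored at `offset` inside
 * the driver structure and invoke it if it is set.
 */
static int pm_iter(struct port_service *svc, size_t offset)
{
	pm_callback_t cb;

	if (!svc->driver)
		return 0;

	cb = *(const pm_callback_t *)((const char *)svc->driver + offset);
	return cb ? cb(svc) : 0;
}
```

A caller selects the phase with `offsetof`, for example `pm_iter(&svc, offsetof(struct port_service_driver, resume_noirq))`, so supporting a further phase only needs a new member in the driver structure rather than another iterator.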
  11. 27 Jul, 2018 1 commit
    • T
      PCI/AER: Work around use-after-free in pcie_do_fatal_recovery() · bd91b56c
      Committed by Thomas Tai
      When a fatal error is received by a non-bridge device, the device is
      removed, and pci_stop_and_remove_bus_device() deallocates the device
      structure.  The freed device structure is used by subsequent code to send
      uevents and print messages.
      
      Hold a reference on the device until we're finished using it.  This is not
      an ideal fix because pcie_do_fatal_recovery() should not use the device at
      all after removing it, but that's too big a project for right now.
      
      Fixes: 7e9084b3 ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices")
      Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
      [bhelgaas: changelog, reduce get/put coverage]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
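The "hold a reference until we're finished" pattern from the commit message above can be sketched with a simple reference count. Everything below is a hypothetical userspace model: `struct demo_dev` and the `demo_*` functions are invented for illustration. In the kernel, the corresponding helpers for PCI devices are `pci_dev_get()` and `pci_dev_put()`; the commit message does not spell out which calls the fix uses.

```c
#include <stdlib.h>

/* Hypothetical refcounted device; stands in for struct pci_dev. */
struct demo_dev {
	int refcount;
	int id;
};

static int demo_freed; /* counts frees so the lifetime is observable */

static struct demo_dev *demo_dev_alloc(int id)
{
	struct demo_dev *d = malloc(sizeof(*d));

	if (d) {
		d->refcount = 1; /* reference owned by the "bus" */
		d->id = id;
	}
	return d;
}

static struct demo_dev *demo_dev_get(struct demo_dev *d)
{
	d->refcount++;
	return d;
}

static void demo_dev_put(struct demo_dev *d)
{
	if (--d->refcount == 0) {
		free(d);
		demo_freed++;
	}
}

/* Stand-in for device removal: drops the bus's reference. */
static void demo_remove(struct demo_dev *d)
{
	demo_dev_put(d);
}

/* Recovery path that still needs the device after removing it. */
static int demo_fatal_recovery(struct demo_dev *d)
{
	int id;

	demo_dev_get(d);  /* hold a reference across the removal */
	demo_remove(d);   /* without the get above, d would be freed here */
	id = d->id;       /* safe: our reference keeps d alive */
	demo_dev_put(d);  /* last reference dropped, d is freed now */
	return id;
}
```

The extra reference only delays the free until the last use, which matches the commit's description of this as a workaround: the cleaner design would not touch the device after removal at all.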