- 01 8月, 2018 3 次提交
-
-
由 Lukas Wunner 提交于
Ensure accessibility of a hotplug port's config space when accessed via sysfs by resuming its parent to D0. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>
-
由 Lukas Wunner 提交于
If a hotplug port is able to send an interrupt, one would naively assume that it is accessible at that moment. After all, if it wouldn't be accessible, i.e. if its parent is in D3hot and the link to the hotplug port is thus down, how should an interrupt come through? It turns out that assumption is wrong at least for Thunderbolt: Even though its parents are in D3hot, a Thunderbolt hotplug port is able to signal interrupts. Because the port's config space is inaccessible and resuming the parents may sleep, the hard IRQ handler has to defer runtime resuming the parents and reading the Slot Status register to the IRQ thread. If the hotplug port uses a level-triggered INTx interrupt, it needs to be masked until the IRQ thread has cleared the signaled events. For simplicity, this commit also masks edge-triggered MSI/MSI-X interrupts. Note that if the interrupt is shared (which can only happen for INTx), other devices are starved from receiving interrupts until the IRQ thread is scheduled, has runtime resumed the hotplug port's parents and has read and cleared the Slot Status register. That delay is dominated by the 10 ms D3hot->D0 transition time of each parent port. The worst case is a Thunderbolt downstream port at the end of a daisy chain: There may be up to six Thunderbolt controllers in-between it and the root port, each comprising an upstream and downstream port, plus its own upstream port. That's 13 x 10 = 130 ms. Possible mitigations are polling the interrupt while it's disabled or reducing the d3_delay of Thunderbolt ports if possible. Open code masking of the interrupt instead of requesting it with the IRQF_ONESHOT flag to minimize the period during which it is masked. (IRQF_ONESHOT unmasks the IRQ only after the IRQ thread has finished.) PCIe r4.0 sec 6.7.3.4 states that "If wake generation is required by the associated form factor specification, a hotplug capable Downstream Port must support generation of a wakeup event (using the PME mechanism) on hotplug events that occur when the system is in a sleep state or the Port is in device state D1, D2, or D3Hot." This would seem to imply that PME needs to be enabled on the hotplug port when it is runtime suspended. pci_enable_wake() currently doesn't enable PME on bridges, it may be necessary to add an exemption for hotplug bridges there. On "Light Ridge" Thunderbolt controllers, the PME_Status bit is not set when an interrupt occurs while the hotplug port is in D3hot, even if PME is enabled. (I've tested this on a Mac and we hardcode the OSC_PCI_EXPRESS_PME_CONTROL bit to 0 on Macs in negotiate_os_control(), modifying it to 1 didn't change the behavior.) (Side note: Section 6.7.3.4 also states that "PME and Hot-Plug Event interrupts (when both are implemented) always share the same MSI or MSI-X vector". That would only seem to apply to Root Ports, however the section never mentions Root Ports, only Downstream Ports. This is explained in the definition of "Downstream Port" in the "Terms and Acronyms" section of the PCIe Base Spec: "The Ports on a Switch that are not the Upstream Port are Downstream Ports. All Ports on a Root Complex are Downstream Ports.") Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>
-
由 Lukas Wunner 提交于
Thunderbolt hotplug ports that were occupied before system sleep resume with their downstream link in "off" state. Only after the Thunderbolt controller has reestablished the PCIe tunnels does the link go up. As a result, a spurious Presence Detect Changed and/or Data Link Layer State Changed event occurs. The events are not immediately acted upon because tunnel reestablishment happens in the ->resume_noirq phase, when interrupts are still disabled. Also, notification of events may initially be disabled in the Slot Control register when coming out of system sleep and is reenabled in the ->resume_noirq phase through: pci_pm_resume_noirq() pci_pm_default_resume_early() pci_restore_state() pci_restore_pcie_state() It is not guaranteed that the events are acted upon at all: PCIe r4.0, sec 6.7.3.4 says that "a port may optionally send an MSI when there are hot-plug events that occur while interrupt generation is disabled, and interrupt generation is subsequently enabled." Note the "optionally". If an MSI is sent, pciehp will gratuitously turn the slot off and back on once the ->resume_early phase has commenced. If an MSI is not sent, the extant, unacknowledged events in the Slot Status register will prevent future notification of presence or link changes. Commit 13c65840 ("PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume") fixed the latter by clearing the events in the ->resume phase. Move this to the ->resume_noirq phase to also fix the gratuitous disable/enablement of the slot. The commit further restored the Slot Control register in the ->resume phase, but that's dispensable because as shown above it's already been done in the ->resume_noirq phase. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
-
- 31 7月, 2018 1 次提交
-
-
由 Lukas Wunner 提交于
The ->reset_slot callback introduced by commits: 2e35afae ("PCI: pciehp: Add reset_slot() method") and 06a8d89a ("PCI: pciehp: Disable link notification across slot reset") disables notification of Presence Detect Changed and Data Link Layer State Changed events for the duration of a secondary bus reset. However a bus reset not only triggers these events, but may also clear the Presence Detect State bit in the Slot Status register and the Data Link Layer Link Active bit in the Link Status register momentarily. According to Sinan Kaya: "I know for a fact that bus reset clears the Data Link Layer Active bit as soon as link goes down. It gets set again following link up. Presence detect depends on the HW implementation. QDT root ports don't change presence detect for instance since nobody actually removed the card. If an implementation supports in-band presence detect, the answer is yes. As soon as the link goes down, presence detect bit will get cleared until recovery." https://lkml.kernel.org/r/42e72f83-3b24-f7ef-e5bc-290fae99259a@codeaurora.org In-band presence detect is also covered in Table 4-15 in PCIe r4.0, sec 4.2.6. pciehp should therefore ensure that any parts of the driver that access those bits do not run concurrently to a bus reset. The only precaution the commits took to that effect was to halt interrupt polling. They made no effort to drain the slot workqueue, cancel an outstanding Attention Button work, or block slot enable/disable requests via sysfs and in the ->probe hook. Now that pciehp is converted to enable/disable the slot exclusively from the IRQ thread, the only places accessing the two above-mentioned bits are the IRQ thread and the ->probe hook. Add locking to serialize them with a bus reset. This obviates the need to halt interrupt polling. Do not add locking to the ->get_adapter_status sysfs callback to afford users unfettered access to that bit. Use an rw_semaphore in lieu of a regular mutex to allow parallel execution of the non-reset code paths accessing the critical bits, i.e. the IRQ thread and the ->probe hook. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Rajat Jain <rajatja@google.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Sinan Kaya <okaya@kernel.org>
-
- 24 7月, 2018 12 次提交
-
-
由 Lukas Wunner 提交于
Per PCIe r4.0, sec 6.7.3.4, a "port may optionally send an MSI when there are hot-plug events that occur while interrupt generation is disabled, and interrupt generation is subsequently enabled." On probe, we currently clear all event bits in the Slot Status register with the notable exception of the Presence Detect Changed bit. Thereby we seek to receive an interrupt for an already occupied slot once event notification is enabled. But because the interrupt is optional, users may have to specify the pciehp_force parameter on the command line, which is inconvenient. Moreover, now that pciehp's event handling has become resilient to missed events, a Presence Detect Changed interrupt for a slot which is powered on is interpreted as removal of the card. If the slot has already been brought up by the BIOS, receiving such an interrupt on probe causes the slot to be powered off and immediately back on, which is likewise undesirable. Avoid both issues by making the behavior of pciehp_force the default and clearing the Presence Detect Changed bit on probe. Note that the stated purpose of pciehp_force per the MODULE_PARM_DESC ("Force pciehp, even if OSHP is missing") seems nonsensical because the OSHP control method is only relevant for SHCP slots according to the PCI Firmware specification r3.0, sec 4.8. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
-
由 Lukas Wunner 提交于
A hotplug port's Slot Status register does not count how often each type of event occurred, it only records the fact *that* an event has occurred. Previously pciehp queued a work item for each event. But if it missed an event, e.g. removal of a card in-between two back-to-back insertions, it queued up the wrong work item or no work item at all. Commit fad214b0 ("PCI: pciehp: Process all hotplug events before looking for new ones") sought to improve the situation by shrinking the window during which events may be missed. But Stefan Roese reports unbalanced Card present and Link Up events, suggesting that we're still missing events if they occur very rapidly. Bjorn Helgaas responds that he considers pciehp's event handling "baroque" and calls for its simplification and rationalization: https://lkml.kernel.org/r/20180202192045.GA53759@bhelgaas-glaptop.roam.corp.google.com It gets worse once a hotplug port is runtime suspended: The port can signal an interrupt while it and its parents are in D3hot, i.e. while it is inaccessible. By the time we've runtime resumed all parents to D0 and read the port's Slot Status register, we may have missed an arbitrary number of events. Event handling therefore needs to be reworked to become resilient to missed events. Assume that a Presence Detect Changed event has occurred. Consider the following truth table: - Slot is in OFF_STATE and is currently empty. => Do nothing. (The event is trailing a Link Down or we've missed an insertion and subsequent removal.) - Slot is in OFF_STATE and is currently occupied. => Turn the slot on. - Slot is in ON_STATE and is currently empty. => Turn the slot off. - Slot is in ON_STATE and is currently occupied. => Turn the slot off, (Be cautious and assume the card in then back on. the slot isn't the same as before.) This leads to the following simple algorithm: 1 If the slot is in ON_STATE, turn it off unconditionally. 2 If the slot is currently occupied, turn it on. Because those actions are now carried out synchronously, rather than by scheduled work items, pciehp reacts to the *current* situation and missed events no longer matter. Data Link Layer State Changed events can be handled identically to Presence Detect Changed events. Note that in the above truth table, a Link Up trailing a Card present event didn't have to be accounted for: It is filtered out by pciehp_check_link_status(). As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says: "Once the Power Indicator begins blinking, a 5-second abort interval exists during which a second depression of the Attention Button cancels the operation." In other words, the user can only expect the system to react to a button press after it starts blinking. Missed button presses that occur in-between are irrelevant. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Stefan Roese <sr@denx.de> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
-
由 Lukas Wunner 提交于
When a device is hotplugged, Presence Detect and Link Up events often do not occur simultaneously, but with a lag of a few milliseconds. Only the first event received is relevant, the other one can be disregarded. Moreover, Stefan Roese reports that on certain platforms, Link State and Presence Detect may flap for up to 100 ms before stabilizing, suggesting that such events should be disregarded for at least this long: https://lkml.kernel.org/r/20180130084121.18653-1-sr@denx.de On slot enablement, pciehp_check_link_status() waits for 100 ms per PCIe r4.0, sec 6.7.3.3, then probes the hotplugged device's vendor register for up to 1 second. If this succeeds, the link is definitely up, so ignore any Presence Detect or Link State events that occurred up to this point. pciehp_check_link_status() then checks the Link Training bit in the Link Status register. This is the final opportunity to detect inaccessibility of the device and abort slot enablement. Any link or presence change that occurs afterwards will cause the slot to be disabled again immediately after attempting to enable it. The astute reviewer may appreciate that achieving this behavior would be more complicated had pciehp not just been converted to enable/disable the slot exclusively from the IRQ thread: When the slot is enabled via sysfs, each link or presence flap would otherwise cause the IRQ thread to run and it would have to sense that those events are belonging to a concurrent slot enablement operation and disregard them. It would be much more difficult than this mere 3 line change. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Stefan Roese <sr@denx.de>
-
由 Lukas Wunner 提交于
Previously slot enablement and disablement could happen concurrently. But now it's under the exclusive control of the IRQ thread, rendering the locking obsolete. Drop it. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Lukas Wunner 提交于
Besides the IRQ thread, there are several other places in the driver which enable or disable the slot: - pciehp_probe() enables the slot if it's occupied and the pciehp_force module parameter is used. - pciehp_resume() enables or disables the slot after system sleep. - pciehp_queue_pushbutton_work() enables or disables the slot after the 5 second delay following an Attention Button press. - pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or disable the slot on sysfs write. This requires locking and complicates pciehp's state machine. A simplification can be achieved by enabling and disabling the slot exclusively from the IRQ thread. Amend the functions listed above to request slot enable/disablement from the IRQ thread by either synthesizing a Presence Detect Changed event or, in the case of a disable user request (via sysfs or an Attention Button press), submitting a newly introduced force disable request. The latter is needed because the slot shall be forced off despite being occupied. For this force disable request, avoid colliding with Slot Status register bits by using a bit number greater than 16. For synchronous execution of requests (on sysfs write), wait for the request to finish and retrieve the result. There can only ever be one sysfs write in flight due to the locking in kernfs_fop_write(), hence there is no risk of returning the result of a different sysfs request to user space. The POWERON_STATE and POWEROFF_STATE is now no longer entered by the above-listed functions, but solely by the IRQ thread when it begins a power transition. Afterwards, it moves to STATIC_STATE. The same applies to canceling the Attention Button work, it likewise becomes an IRQ thread only operation. An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is never observed by the IRQ thread itself, only by functions called in a different context, such as pciehp_sysfs_enable_slot(). So remove handling of these states from pciehp_handle_button_press() and pciehp_handle_link_change() which are exclusively called from the IRQ thread. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Lukas Wunner 提交于
handle_button_press_event() currently determines whether the slot has been turned on or off by looking at the Power Controller Control bit in the Slot Control register. This assumes that an attention button implies presence of a power controller even though that's not mandated by the spec. Moreover the Power Controller Control bit is unreliable when a power fault occurs (PCIe r4.0, sec 6.7.1.8). This issue has existed since the driver was introduced in 2004. Fix by replacing STATIC_STATE with ON_STATE and OFF_STATE and tracking whether the slot has been turned on or off. This is also a required ingredient to make pciehp resilient to missed events, which is the object of an upcoming commit. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Lukas Wunner 提交于
Previously the slot workqueue was used to handle events and enable or disable the slot. That's no longer the case as those tasks are done synchronously in the IRQ thread. The slot workqueue is thus merely used to handle a button press after the 5 second delay and only one such work item may be in flight at any given time. A separate workqueue isn't necessary for this simple task, so use the system workqueue instead. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Lukas Wunner 提交于
Up until now, pciehp's IRQ handler schedules a work item for each event, which in turn schedules a work item to enable or disable the slot. This double indirection was necessary because sleeping wasn't allowed in the IRQ handler. However it is now that pciehp has been converted to threaded IRQ handling and polling, so handle events synchronously in pciehp_ist() and remove the work item infrastructure (with the exception of work items to handle a button press after the 5 second delay). For link or presence change events, move the register read to determine the current link or presence state behind acquisition of the slot lock to prevent it from becoming stale while the lock is contended. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Lukas Wunner 提交于
We've just converted pciehp to threaded IRQ handling, but still cannot sleep in pciehp_ist() because the function is also called in poll mode, which runs in softirq context (from a timer). Convert poll mode to a kthread so that pciehp_ist() always runs in task context. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
由 Lukas Wunner 提交于
pciehp's IRQ handler queues up a work item for each event signaled by the hardware. A more modern alternative is to let a long running kthread service the events. The IRQ handler's sole job is then to check whether the IRQ originated from the device in question, acknowledge its receipt to the hardware to quiesce the interrupt and wake up the kthread. One benefit is reduced latency to handle the IRQ, which is a necessity for realtime environments. Another benefit is that we can make pciehp simpler and more robust by handling events synchronously in process context, rather than asynchronously by queueing up work items. pciehp's usage of work items is a historic artifact, it predates the introduction of threaded IRQ handlers by two years. (The former was introduced in 2007 with commit 5d386e1a ("pciehp: Event handling rework"), the latter in 2009 with commit 3aa551c9 ("genirq: add threaded interrupt handler support").) Convert pciehp to threaded IRQ handling by retrieving the pending events in pciehp_isr(), saving them for later consumption by the thread handler pciehp_ist() and clearing them in the Slot Status register. By clearing the Slot Status (and thereby acknowledging the events) in pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which would have the unpleasant side effect of starving devices sharing the IRQ until pciehp_ist() has finished. pciehp_isr() does not count how many times each event occurred, but merely records the fact *that* an event occurred. If the same event occurs a second time before pciehp_ist() is woken, that second event will not be recorded separately, which is problematic according to commit fad214b0 ("PCI: pciehp: Process all hotplug events before looking for new ones") because we may miss removal of a card in-between two back-to-back insertions. We're about to make pciehp_ist() resilient to missed events. The present commit regresses the driver's behavior temporarily in order to separate the changes into reviewable chunks. This doesn't affect regular slow-motion hotplug, only plug-unplug-plug operations that happen in a timespan shorter than wakeup of the IRQ thread. Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
-
由 Lukas Wunner 提交于
Commit b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device") iterates over the devices on a hotplug port's subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem. It is thus possible for a user to cause a crash by concurrently manipulating the device list, e.g. by disabling slot power via sysfs on a different CPU or by initiating a remove/rescan via sysfs. This can't be fixed by acquiring pci_bus_sem because it may sleep. The simplest fix is to avoid the list iteration altogether and just check the ignore_hotplug flag on the port itself. This works because pci_ignore_hotplug() sets the flag both on the device as well as on its parent bridge. We do lose the ability to print the name of the device blocking hotplug in the debug message, but that's probably bearable. Fixes: b440bde7 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device") Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
-
由 Lukas Wunner 提交于
When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the hotplug_slot struct is deregistered and thus freed before freeing the IRQ. The IRQ handler and the work items it schedules print the slot name referenced from the freed structure in various informational and debug log messages, each time resulting in a quadruple dereference of freed pointers (hotplug_slot -> pci_slot -> kobject -> name). At best the slot name is logged as "(null)", at worst kernel memory is exposed in logs or the driver crashes: pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present An attacker may provoke the bug by unplugging multiple devices on a Thunderbolt daisy chain at once. Unplugging can also be simulated by powering down slots via sysfs. The bug is particularly easy to trigger in poll mode. It has been present since the driver's introduction in 2004: https://git.kernel.org/tglx/history/c/c16b4b14d980 Fix by rearranging teardown such that the IRQ is freed first. Run the work items queued by the IRQ handler to completion before freeing the hotplug_slot struct by draining the work queue from the ->release_slot callback which is invoked by pci_hp_deregister(). Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v2.6.4
-
- 24 5月, 2018 1 次提交
-
-
由 Mika Westerberg 提交于
After a suspend/resume cycle the Presence Detect or Data Link Layer Status Changed bits might be set. If we don't clear them those events will not fire anymore and nothing happens for instance when a device is now hot-unplugged. Fix this by clearing those bits in a newly introduced function pcie_reenable_notification(). This should be fine because immediately after, we check if the adapter is still present by reading directly from the status register. Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org
-
- 18 5月, 2018 1 次提交
-
-
由 Oza Pawandeep 提交于
Clients such as hotplug and Downstream Port Containment (DPC) both need to wait until a link becomes active or inactive. Add a generic pcie_wait_link_active() interface and use it instead of duplicating the code. Signed-off-by: NOza Pawandeep <poza@codeaurora.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 08 5月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
Several PCIe hotplug controllers have errata that mean they do not set the Command Completed bit unless writes to the Slot Command register change "Control" bits. Command Completed is never set for writes that only change software notification "Enable" bits. This results in timeouts like this: pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago) When this erratum is present, avoid these timeouts by marking commands "completed" immediately unless they change the "Control" bits. Here's the text of the Intel erratum CF118. We assume this applies to all Intel parts: CF118 PCIe Slot Status Register Command Completed bit not always updated on any configuration write to the Slot Control Register Problem: For PCIe root ports (devices 0 - 10) supporting hot-plug, the Slot Status Register (offset AAh) Command Completed (bit[4]) status is updated under the following condition: IOH will set Command Completed bit after delivering the new commands written in the Slot Controller register (offset A8h) to VPP. The IOH detects new commands written in Slot Control register by checking the change of value for Power Controller Control (bit[10]), Power Indicator Control (bits[9:8]), Attention Indicator Control (bits[7:6]), or Electromechanical Interlock Control (bit[11]) fields. Any other configuration writes to the Slot Control register without changing the values of these fields will not cause Command Completed bit to be set. The PCIe Base Specification Revision 2.0 or later describes the “Slot Control Register” in section 7.8.10, as follows (Reference section 7.8.10, Slot Control Register, Offset 18h). In hot-plug capable Downstream Ports, a write to the Slot Control register must cause a hot-plug command to be generated (see Section 6.7.3.2 for details on hot-plug commands). A write to the Slot Control register in a Downstream Port that is not hotplug capable must not cause a hot-plug command to be executed. The PCIe Spec intended that every write to the Slot Control Register is a command and expected a command complete status to abstract the VPP implementation specific nuances from the OS software. IOH PCIe Slot Control Register implementation is not fully conforming to the PCIe Specification in this respect. Implication: Software checking on the Command Completed status after writing to the Slot Control register may time out. Workaround: Software can read the Slot Control register and compare the existing and new values to determine if it should check the Command Completed status after writing to the Slot Control register. Per Sinan, the Qualcomm QDF2400 controller also does not set the Command Completed bit unless writes to the Slot Command register change "Control" bits. Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html Link: https://lkml.kernel.org/r/8770820b-85a0-172b-7230-3a44524e6c9f@molgen.mpg.de Reported-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de> # Lenovo X60 Tested-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de> # Lenovo X60 Signed-off-by: Sinan Kaya <okaya@codeaurora.org> # Qcom quirk Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
- 29 1月, 2018 1 次提交
-
-
由 Bjorn Helgaas 提交于
Add SPDX GPL-2.0+ to all PCI files that specified the GPL and allowed either GPL version 2 or any later version. Remove the boilerplate GPL version 2 or later language, relying on the assertion in b2441318 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") that the SPDX identifier may be used instead of the full boilerplate text. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 1月, 2018 1 次提交
-
-
由 Lukas Wunner 提交于
Certain Thunderbolt 1 controllers claim to support Command Completed events (value of 0b in the No Command Completed Support field of the Slot Capabilities register) but in reality they neither set the Command Completed bit in the Slot Status register nor signal a Command Completed interrupt: 8086:1513 CV82524 [Light Ridge 4C 2010] 8086:151a DSL2310 [Eagle Ridge 2C 2011] 8086:151b CVL2510 [Light Peak 2C 2010] 8086:1547 DSL3510 [Cactus Ridge 4C 2012] 8086:1548 DSL3310 [Cactus Ridge 2C 2012] 8086:1549 DSL2210 [Port Ridge 1C 2011] All known newer chips (Redwood Ridge and onwards) set No Command Completed Support, indicating that they do not support Command Completed events. The user-visible impact is that after unplugging such a device, 2 seconds elapse until pciehp is unbound. That's because on ->remove, pcie_write_cmd() is called via pcie_disable_notification() and every call to pcie_write_cmd() takes 2 seconds (1 second for each invocation of pcie_wait_cmd()): [ 337.942727] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x1038 (issued 21176 msec ago) [ 340.014735] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x0000 (issued 2072 msec ago) That by itself has always been unpleasant, but the situation has become worse with commit cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown"): Now pciehp is unbound on ->shutdown. Because Thunderbolt controllers typically have 4 hotplug ports, every reboot and shutdown is now delayed by 8 seconds, plus another 2 seconds for every attached Thunderbolt 1 device. Thunderbolt hotplug slots are not physical slots that one inserts cards into, but rather logical hotplug slots implemented in silicon. Devices appear beyond those logical slots once a PCI tunnel is established on top of the Thunderbolt Converged I/O switch. One would expect commands written to the Slot Control register to be executed immediately by the silicon, so for simplicity we always assume NoCompl+ for Thunderbolt ports. Fixes: cc27b735 ("PCI/portdrv: Turn off PCIe services during shutdown") Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org # v4.12+ Cc: Sinan Kaya <okaya@codeaurora.org> Cc: Yehezkel Bernat <yehezkel.bernat@intel.com> Cc: Michael Jamet <michael.jamet@intel.com> Cc: Andreas Noever <andreas.noever@gmail.com>
-
- 17 1月, 2018 1 次提交
-
-
由 Markus Elfring 提交于
Per ebfdc409 ("checkpatch: attempt to find unnecessary 'out of memory' messages"), when a memory allocation fails, the memory subsystem emits generic "out of memory" messages (see slab_out_of_memory() for some of this logging). Therefore, additional error messages in the caller don't add much value. Remove messages that merely report "out of memory". This preserves some messages that report additional information, e.g., allocation failures that mean we drop hotplug events. This issue was detected by using the Coccinelle software. Signed-off-by: NMarkus Elfring <elfring@users.sourceforge.net> [bhelgaas: changelog, squash patches, make similar changes to acpiphp, cpqphp, ibmphp, keep warning when dropping hotplug event] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 07 11月, 2017 3 次提交
-
-
由 Mika Westerberg 提交于
It is possible that the hotplug event has already happened before the driver is attached to a PCIe hotplug downstream port. If we just clear the status we never get the hotplug interrupt and thus the event will be missed. To make sure that does not happen, we leave Presence Detect Changed bit untouched during initialization. Then once the event is unmasked we get an interrupt and handle the hotplug event properly. Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Mika Westerberg 提交于
A surprise link down may retrain very quickly causing the same slot generate a link up event before handling the link down event completes. Since the link is active, the power off work queued from the first link down will cause a second down event when power is disabled. However, the link up event sets the slot state to POWERON_STATE before the event to handle this is enqueued, making the second down event believe it needs to do something. This creates constant link up and down event cycle. To prevent this it is better to handle each event at the time in order it occurred, so change the driver to use ordered workqueue instead. A normal device hotplug triggers two events (presense detect and link up) that are already handled properly in the driver but we currently log an error if we find an existing device in the slot. Since this is not an error change the log level to be debug instead to avoid scaring users. This is based on the original work by Ashok Raj. Link: https://patchwork.kernel.org/patch/9469023Suggested-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NMika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Kees Cook 提交于
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. This fixes what appears to be a bug in passing the wrong pointer to the timer handler (address of ctrl pointer instead of ctrl pointer). Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de>
-
- 16 8月, 2017 1 次提交
-
-
由 Keith Busch 提交于
When a power fault occurs, the power controller sets Power Fault Detected in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT event to handle it. It also clears Power Fault Detected, but since nothing has yet changed to correct the power fault, the power controller will likely set it again immediately, which may cause an infinite loop when pcie_isr() rechecks Slot Status. Fix that by masking off Power Fault Detected from new events if the driver hasn't seen the power fault clear from the previous handling attempt. Fixes: fad214b0 ("PCI: pciehp: Process all hotplug events before looking for new ones") Signed-off-by: NKeith Busch <keith.busch@intel.com> [bhelgaas: changelog, pull test out and add comment] Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: stable@vger.kernel.org # 4.9+
-
- 08 12月, 2016 1 次提交
-
-
由 Ashok Raj 提交于
If Slot Status indicates changes in both Data Link Layer Status and Presence Detect, prioritize the Link status change. When both events are observed, pciehp currently relies on the Slot Status Presence Detect State (PDS) to agree with the Link Status Data Link Layer Active status. The Presence Detect State, however, may be set to 1 through out-of-band presence detect even if the link is down, which creates conflicting events. Since the Link Status accurately reflects the reachability of the downstream bus, the Link Status event should take precedence over a Presence Detect event. Skip checking the PDC status if we handled a link event in the same handler. Signed-off-by: NAshok Raj <ashok.raj@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NKeith Busch <keith.busch@intel.com>
-
- 23 9月, 2016 1 次提交
-
-
由 Keith Busch 提交于
PCIe hotplug supports optional Attention and Power Indicators, which are used internally by pciehp. Users can't control the Power Indicator, but they can control the Attention Indicator by writing to a sysfs "attention" file. The Slot Control register has two bits for each indicator, and the PCIe spec defines the encodings for each as (Reserved/On/Blinking/Off). For sysfs "attention" writes, pciehp_set_attention_status() maps into these encodings, so the only useful write values are 0 (Off), 1 (On), and 2 (Blinking). However, some platforms use all four bits for platform-specific indicators, and they need to allow direct user control of them while preventing pciehp from using them at all. Add a "hotplug_user_indicators" flag to the pci_dev structure. When set, pciehp does not use either the Attention Indicator or the Power Indicator, and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values are written directly to the Attention Indicator Control and Power Indicator Control fields. [bhelgaas: changelog, rename flag and accessors to s/attention/indicator/] Signed-off-by: NKeith Busch <keith.busch@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 15 9月, 2016 5 次提交
-
-
由 Bjorn Helgaas 提交于
Print slot name consistently as "Slot(%s)". I don't know whether that's ideal, but we can at least do it the same way all the time. No functional change intended. Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
由 Bjorn Helgaas 提交于
In pcie_isr(), we return early if no status bits other than PCI_EXP_SLTSTA_CC are set. This was introduced by dbd79aed ("pciehp: fix NULL dereference in interrupt handler"), but it is no longer necessary because all the subsequent pcie_isr() code is already predicated on a status bit being set. Remove the unnecessary test for ~PCI_EXP_SLTSTA_CC. No functional change intended. Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
由 Mayurkumar Patel 提交于
Previously we read Slot Status to learn about hotplug events, then cleared the events, then re-read Slot Status to find out what happened. But Slot Status might have changed before the second read. Capture the Slot Status once before clearing the events. Also capture the Link Status if we had a link status change. [bhelgaas: changelog, split to separate patch] Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NMayurkumar Patel <mayurkumar.patel@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
由 Mayurkumar Patel 提交于
Previously we accumulated hotplug events, then processed them, essentially like this: events = 0 do { status = read(Slot Status) status &= EVENT_MASK # only look at events events |= status # accumulate events write(Slot Status, events) # clear events } while (status) process events The problem is that as soon as we clear events in Slot Status, the hardware may send notifications for new events, and we lose information about the first events. For example, we might see two Presence Detect Changed events, but lose the fact that the slot was temporarily empty: read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear # slot empty write PCI_EXP_SLTSTA_PDC # clear PDC event read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set # slot occupied The current code does not process a removal; it only processes the insertion, which fails because we didn't remove the original device. To avoid this problem, read Slot Status once and process all the events before reading it again, like this: do { read events clear events process events } while (events) [bhelgaas: changelog, add external loop around pciehp_isr()] Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NMayurkumar Patel <mayurkumar.patel@intel.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
由 Bjorn Helgaas 提交于
After 1469d17d ("PCI: pciehp: Handle invalid data when reading from non-existent devices"), we returned IRQ_HANDLED when we failed to read interrupt status from the bridge. I think it's better to return IRQ_NONE, as we do in other cases where there's no interrupt pending. This will facilitate refactoring the loop in pcie_isr(): we'll be able to call the ISR in a loop as long as it returns IRQ_HANDLED. Return IRQ_NONE if we couldn't read interrupt status. Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
- 13 9月, 2016 1 次提交
-
-
由 Bjorn Helgaas 提交于
Rename "detected" and "intr_loc" to "status" and "events" for clarity. "status" is the value we read from the Slot Status register; "events" is the set of hot-plug events we need to process. No functional change intended. Tested-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
-
- 21 6月, 2016 1 次提交
-
-
由 Lukas Wunner 提交于
If a hotplug port is suspended to D3cold, its slot status register cannot be read. If that hotplug port happens to share its IRQ with other devices, whenever an interrupt occurs for one of these devices, pciehp logs a "no response from device" message and tries to read the PCI_EXP_SLTSTA register, even though we know that will fail. Ignore interrupts while we're in D3cold. [bhelgaas: changelog] Signed-off-by: NLukas Wunner <lukas@wunner.de> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 11 8月, 2015 2 次提交
-
-
由 Bjorn Helgaas 提交于
We queued interrupt events for the MRL being opened or closed, but the code in interrupt_event_handler() that handles these events ignored them. Stop enabling MRL interrupts and remove the ignored events. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
由 Jarod Wilson 提交于
It's platform-dependent, but an MMIO read to a non-existent PCI device generally returns data with all bits set. This happens when the host bridge or Root Complex times out waiting for a response from the device and fabricates return data to complete the CPU's read. One example, reported in the bugzilla below, involved this hierarchy: pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port pci 0000:07:00.0: BCM57762 NIC Unplugging the Thunderbolt switch and the NIC below it resulted in this: pciehp 0000:03:03.0: Surprise Removal tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff pciehp 0000:06:00.0: unloading service driver pciehp pciehp 0000:06:00.0: pcie_isr: intr_loc 11f pciehp 0000:06:00.0: Switch interrupt received pciehp 0000:06:00.0: Latch open on Slot pciehp 0000:06:00.0: Attention button interrupt received pciehp 0000:06:00.0: Button pressed on Slot pciehp 0000:06:00.0: Presence/Notify input change pciehp 0000:06:00.0: Card present on Slot pciehp 0000:06:00.0: Power fault interrupt received pciehp 0000:06:00.0: Data Link Layer State change pciehp 0000:06:00.0: Link Up event The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0 and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver remove methods. Since the NIC was already gone, tg3 received 0xffffffff when it tried to read from the device. The resulting timeout is a tg3 issue and not of interest here. Similarly, since the 06:00.0 Thunderbolt switch was already gone, pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp thought that was valid status showing that many events had happened: the latch had been opened, the attention button had been pressed, a card was now present, and the link was now up. These are all wrong, of course, but pciehp went on to try to power up and enumerate devices below the non-existent bridge: pciehp 0000:06:00.0: PCI slot - powering on due to button press pciehp 0000:06:00.0: Surprise Insertion pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff [bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()] Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841Suggested-by: NBjorn Helgaas <bhelgaas@google.com> Signed-off-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 16 7月, 2015 1 次提交
-
-
由 Yijing Wang 提交于
Move first slot status read into while to simplify code. Signed-off-by: NYijing Wang <wangyijing@huawei.com> Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
-
- 19 6月, 2015 1 次提交
-
-
由 Bjorn Helgaas 提交于
The pciehp_handle_*() functions (pciehp_handle_attention_button(), etc.) only contain a line or two of useful code, so it's clumsy to put them in separate functions. All they so is add an event to a work queue, and it's clearer to see that directly in the ISR. Inline them directly into pcie_isr(). No functional change. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NRajat Jain <rajatja@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-
- 18 6月, 2015 1 次提交
-
-
由 Bjorn Helgaas 提交于
The pciehp debug logging is overly verbose and often redundant. Almost all of the information printed by dbg_ctrl() is also printed by the normal PCI core enumeration code and by pcie_init(). Remove the redundant debug info. When claiming a pciehp bridge, we print the slot characteristics, e.g., Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+ Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information, and print it all in the same order as lspci does. No functional change except the message text changes. Signed-off-by: NBjorn Helgaas <bhelgaas@google.com> Reviewed-by: NRajat Jain <rajatja@google.com> Acked-by: NYinghai Lu <yinghai@kernel.org>
-