Commits · b440bde74f043c8ec31081cb59c9a53ade954701 · openanolis / cloud-kernel

11 Sep, 2014 1 commit

PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device · b440bde7

Authored by Bjorn Helgaas on Sep 10, 2014

Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.

Some drivers expect to remain bound to a device even while they power it
off and back on again. This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed. But some drivers accept that risk.

Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.
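
As a rough illustration of the idea (not the kernel code itself), the mechanism can be modeled as a per-device flag that the driver sets and the hotplug path consults before removing the device. The `fake_` names and the struct below are hypothetical stand-ins for `struct pci_dev` and the real helpers:

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev; only the flag matters here. */
struct fake_pci_dev {
	const char *name;
	bool ignore_hotplug;	/* set by the driver, read by hotplug code */
};

/* Driver side: ask the core to ignore hotplug events for this device. */
static void fake_pci_ignore_hotplug(struct fake_pci_dev *dev)
{
	dev->ignore_hotplug = true;
}

/*
 * Hotplug side: decide whether a "device gone" event should remove the
 * device and unbind its driver.  Returns true if it should be removed.
 */
static bool fake_should_remove(const struct fake_pci_dev *dev)
{
	return !dev->ignore_hotplug;
}
```

A driver that powers its device off to D3cold would call the helper once before doing so; afterwards the resulting hot-remove event leaves the driver bound.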

The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU. They
power off the unused GPU, but they want to remain bound to it.

This is a reimplementation of f244d8b6 ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.

This fixes a problem where systems with dual GPUs using the radeon driver
become unusable, freezing every few seconds (see bugzillas below). The
failure can occur while suspending the device, as in bug 79701:

[drm] radeon: finishing device.
radeon 0000:01:00.0: Userspace still has active objects !
radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
...
WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
trying to unbind memory from uninitialized GART !

or while resuming it, as in bug 77261:

radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
radeon 0000:01:00.0: GPU lockup ...
radeon 0000:01:00.0: GPU pci config reset
pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
*ERROR* radeon: dpm resume failed
radeon 0000:01:00.0: Wait for MC idle timedout !

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org # v3.15+

08 Jul, 2014 1 commit

PCI: pciehp: Clear Data Link Layer State Changed during init · 0d25d35c

Authored by Myron Stowe on Jun 17, 2014

During PCIe hot-plug initialization - pciehp_probe() - data structures
related to slot capabilities are set up.  As part of this set up, ISRs are
put in place to handle slot events and all event bits are cleared out.

This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
Slot Status bit to the event bits that are cleared out during
initialization.

If the BIOS doesn't clear DLLSC before handoff to the OS, pciehp notices
that it's set and interprets it as a new Link Up event, which results in
spurious messages:

  pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
  pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add
  pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00

Prior to e48f1b67 ("PCI: pciehp: Use link change notifications for
hot-plug and removal"), pciehp ignored DLLSC.
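
The effect of the change can be sketched with a small model of the Slot Status register, whose event bits are write-1-to-clear. The bit values mirror the Slot Status definitions in `pci_regs.h`; the two helper functions are hypothetical and only illustrate why a stale DLLSC bit disappears once it is included in the init-time clear mask:

```c
#include <stdint.h>

/* Slot Status event bits (values as in the PCIe Slot Status register). */
#define PCI_EXP_SLTSTA_ABP	0x0001	/* Attention Button Pressed */
#define PCI_EXP_SLTSTA_PFD	0x0002	/* Power Fault Detected */
#define PCI_EXP_SLTSTA_MRLSC	0x0004	/* MRL Sensor Changed */
#define PCI_EXP_SLTSTA_PDC	0x0008	/* Presence Detect Changed */
#define PCI_EXP_SLTSTA_CC	0x0010	/* Command Completed */
#define PCI_EXP_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

/* Hypothetical model of a write-1-to-clear register write. */
static uint16_t rw1c_write(uint16_t status, uint16_t value)
{
	return (uint16_t)(status & ~value);
}

/* Clear all event bits at init, now including DLLSC. */
static uint16_t init_clear_events(uint16_t status)
{
	uint16_t events = PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
			  PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC |
			  PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC;

	return rw1c_write(status, events);
}
```

State bits that are not part of the mask (for example Presence Detect State, 0x0040) are left untouched by the clear.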

Reference:
  PCI-SIG.  PCI Express Base Specification Revision 4.0 Version 0.3
  (PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).

[bhelgaas: add e48f1b67 ref and stable tag]
Fixes: e48f1b67 ("PCI: pciehp: Use link change notifications for hot-plug and removal")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79611
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.15+

06 Jul, 2014 1 commit

PCI: pciehp: Remove struct controller.no_cmd_complete · 6c1a32e0

Authored by Rajat Jain on Jun 26, 2014

"no_cmd_complete" is only used once, and it duplicates read-only
information we already have in the cached Slot Capabilities value.

Remove the field and use the existing macro NO_CMD_CMPL() instead.
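
A minimal sketch of the idea, assuming a controller struct that caches the Slot Capabilities register (the `fake_controller` type is hypothetical; the bit value is the No Command Completed Support bit of Slot Capabilities). The macro derives the answer from the cached value instead of a separate field:

```c
#include <stdint.h>

/* Slot Capabilities: No Command Completed Support (bit 18). */
#define PCI_EXP_SLTCAP_NCCS	0x00040000

/* Hypothetical controller with a cached copy of Slot Capabilities. */
struct fake_controller {
	uint32_t slot_cap;
};

/* Derive "no command completed" from the cached register value. */
#define NO_CMD_CMPL(ctrl) (((ctrl)->slot_cap & PCI_EXP_SLTCAP_NCCS) != 0)
```

Because Slot Capabilities is read-only, the cached value cannot go stale, which is why a duplicate field adds nothing.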

[bhelgaas: changelog]
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

18 Jun, 2014 3 commits

PCI: pciehp: Remove assumptions about which commands cause completion events · 2cc56f30

Authored by Bjorn Helgaas on Jun 14, 2014

We use incorrect logic to decide whether a PCIe hotplug controller
generates command completion events.

5808639b ("pciehp: fix slow probing") assumed that the Slot Status
"Command Completed" bit was set only for commands affecting slot power,
indicators, or electromechanical interlock.  That assumption is false: per
sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot
Control register is a command, and (if command completed events are
supported) software must wait for a command to complete before issuing the
next command.

5808639b was to fix boot-time timeouts (see bugzilla below) on a Lenovo
Thinkpad R61 with an Intel hotplug controller.  The controller probably has
the Intel CF118 erratum, which means it doesn't report Command Completed
unless the Slot Control power, indicator, or interlock bits are changed.
This causes a timeout because pciehp always waits for Command Completed (if
supported), regardless of which bits are changed.

Remove the incorrect logic because the timeouts have been addressed
differently by these changes:

  PCI: pciehp: Wait for hotplug command completion lazily
  PCI: pciehp: Compute timeout from hotplug command start time

Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Compute timeout from hotplug command start time · 40b96083

Authored by Bjorn Helgaas on Jun 14, 2014

If we issue a hotplug command, go do something else, then come back and
wait for the command to complete, we don't have to wait the whole timeout
period, because some of it elapsed while we were doing something else.

Keep track of the time we issued the command, and wait only until the
timeout period from that point has elapsed.

For controllers with errata like Intel CF118, we previously timed out
before issuing the second hotplug command:

  At time T1 (during boot):
    - Write DLLSCE, ABPE, PDCE, etc. to Slot Control
  At time T2 (hotplug event):
    - Wait for command completion (CC) in Slot Status
    - Timeout at T2 + 1 second because CC is never set in Slot Status
    - Write PCC, PIC, etc. to Slot Control

With this change, we wait until T1 + 1 second instead of T2 + 1 second.
If the hotplug event is more than 1 second after the boot-time
initialization, we won't wait for the timeout at all.
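
The arithmetic can be sketched as follows. This is an illustrative helper, not the driver's code: the function name and the millisecond timestamps are assumptions, and a real implementation would use the kernel's own time primitives:

```c
#include <stdint.h>

/*
 * Return how long we still need to wait for a command that was issued at
 * cmd_started_ms, given the current time and the total timeout.  All
 * values are in milliseconds on the same monotonic clock.
 */
static uint64_t remaining_wait_ms(uint64_t cmd_started_ms, uint64_t now_ms,
				  uint64_t timeout_ms)
{
	uint64_t deadline = cmd_started_ms + timeout_ms;

	if (now_ms >= deadline)
		return 0;	/* timeout period already elapsed */
	return deadline - now_ms;
}
```

With a 1 second timeout, a command issued at T1 = 0 and a hotplug event at T2 = 5000 ms yields no wait at all, whereas the previous behavior would have waited until T2 + 1 second.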

We still emit a "Timeout on hotplug command" message if it timed out; we
should see this on the first hotplug event on every controller with this
erratum, as well as on real errors on controllers without the erratum.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>

PCI: pciehp: Wait for hotplug command completion lazily · 3461a068

Authored by Bjorn Helgaas on Jun 13, 2014

Previously we issued a hotplug command and waited for it to complete.  But
there's no need to wait until we're ready to issue the *next* command.  The
next command will probably be much later, so the first one may have already
completed and we may not have to actually wait at all.
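
The ordering change can be modeled with a toy controller, shown here only as a sketch. All `fake_` names are hypothetical, and the "wait" is reduced to a counter so the behavior is easy to observe; a real driver would sleep on an interrupt or poll Slot Status:

```c
#include <stdbool.h>
#include <stdint.h>

struct fake_ctrl {
	bool cmd_busy;		/* a command was issued and not yet completed */
	uint16_t slot_ctrl;	/* last value written to Slot Control */
	unsigned int waits;	/* how many times we actually had to wait */
};

/* Wait for the previous command only if one is still outstanding. */
static void fake_wait_cmd(struct fake_ctrl *ctrl)
{
	if (!ctrl->cmd_busy)
		return;
	ctrl->waits++;
	ctrl->cmd_busy = false;
}

/* Called when the (simulated) hardware reports Command Completed. */
static void fake_cmd_completed(struct fake_ctrl *ctrl)
{
	ctrl->cmd_busy = false;
}

/* Lazy variant: wait for the previous command, issue, return at once. */
static void fake_write_cmd(struct fake_ctrl *ctrl, uint16_t cmd, uint16_t mask)
{
	fake_wait_cmd(ctrl);
	ctrl->slot_ctrl = (uint16_t)((ctrl->slot_ctrl & ~mask) | (cmd & mask));
	ctrl->cmd_busy = true;	/* the next caller waits, if it has to */
}
```

If the completion arrives between two commands, the second `fake_write_cmd()` call proceeds without waiting; a wait happens only when commands are issued back to back.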

Because of hardware errata, some controllers generate command completion
events for some commands but not others.  In the case of Intel CF118 (see
spec update reference), the controller indicates command completion only
for Slot Control writes that change the value of the following bits:

  Power Controller Control
  Power Indicator Control
  Attention Indicator Control
  Electromechanical Interlock Control

Changes to other bits, e.g., the interrupt enable bits, do not cause the
Command Completed bit to be set.  Controllers from AMD and Nvidia are
reported to have similar errata.
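
As a sketch of the erratum as described above, the following predicate reports whether a Slot Control write would produce a Command Completed event on an affected controller. The bit masks are the Slot Control field definitions; the function name is hypothetical, and treating "changes the value" as a simple XOR of old and new register contents is a simplifying assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot Control fields (values as in the PCIe Slot Control register). */
#define PCI_EXP_SLTCTL_PDCE	0x0008	/* Presence Detect Changed Enable */
#define PCI_EXP_SLTCTL_AIC	0x00c0	/* Attention Indicator Control */
#define PCI_EXP_SLTCTL_PIC	0x0300	/* Power Indicator Control */
#define PCI_EXP_SLTCTL_PCC	0x0400	/* Power Controller Control */
#define PCI_EXP_SLTCTL_EIC	0x0800	/* Electromechanical Interlock Control */
#define PCI_EXP_SLTCTL_DLLSCE	0x1000	/* Data Link Layer State Changed Enable */

/* Bits whose change triggers Command Completed on affected hardware. */
#define CF118_CC_BITS	(PCI_EXP_SLTCTL_PCC | PCI_EXP_SLTCTL_PIC | \
			 PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_EIC)

static bool cf118_reports_completion(uint16_t old_ctl, uint16_t new_ctl)
{
	return ((old_ctl ^ new_ctl) & CF118_CC_BITS) != 0;
}
```

A write that only enables interrupts, such as setting PDCE and DLLSCE, returns false here, which corresponds to the case where software waits for a completion that never arrives.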

These errata cause timeouts when pcie_enable_notification() enables
interrupts.  Previously that timeout occurred at boot-time.  With this
change, the timeout occurs later, when we change the state of the slot
power, indicators, or interlock.  This speeds up boot but causes a timeout
at the first hotplug event on the slot.  Subsequent events don't time out
because only the first (boot-time) hotplug command updates Slot Control
without touching the power/indicator/interlock controls.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>

17 Jun, 2014 1 commit

PCI: pciehp: Make pcie_wait_cmd() self-contained · 4283c70e

Authored by Bjorn Helgaas on Jun 13, 2014

pcie_wait_cmd() waits for the controller to finish a hotplug command.  Move
the associated logic (to determine whether waiting is required and whether
we're using interrupts or polling) from pcie_write_cmd() to
pcie_wait_cmd().

No functional change.

Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>

11 Jun, 2014 2 commits

PCI: Merge multi-line quoted strings · 227f0647

Authored by Ryan Desfosses on Apr 18, 2014

Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: Whitespace cleanup · 3c78bc61

Authored by Ryan Desfosses on Apr 18, 2014

Fix various whitespace errors.

No functional change.

[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

25 Apr, 2014 1 commit

PCI: pciehp: Acknowledge spurious "cmd completed" event · 476a357f

Authored by Rajat Jain on Feb 20, 2014

In case of a spurious "cmd completed" event, pcie_write_cmd() does not
clear it, yet expects more "cmd completed" events to be generated.  This
does not happen because the previous (spurious) event has not been
acknowledged.  Fix that.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

20 Feb, 2014 1 commit

PCI: pciehp: Remove a non-existent card, regardless of "surprise" capability · 2b3940b6

Authored by Rajat Jain on Feb 18, 2014

In case a card is physically yanked out, it should immediately be removed,
regardless of the "surprise" capability bit. Thus:

  - Always handle the physical removal - regardless of the "surprise" bit.
  - Don't use "surprise" capability when making decisions about enabling
    presence detect notifications.
  - Reword the comments to indicate the intent.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

12 Feb, 2014 4 commits

PCI: pciehp: Add hotplug_lock to serialize hotplug events · 50b52fde

Authored by Rajat Jain on Feb 04, 2014

Today there is no protection around pciehp_enable_slot() and
pciehp_disable_slot() to ensure that they complete before another
hot-plug operation can be done on that particular slot.

This patch introduces the slot->hotplug_lock to ensure that any hotplug
operations (add / remove) complete before another hotplug event can begin
processing on that particular slot.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Disable link notification across slot reset · 06a8d89a

Authored by Rajat Jain on Feb 04, 2014

Disable the link notification (in addition to presence detect
notifications) across the slot reset since the reset could flap the link,
and we don't want to treat it as hot unplug followed by a hotplug.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Don't disable the link permanently during removal · b1811d24

Authored by Rajat Jain on Feb 04, 2014

We need future link up events for hot-add, thus don't disable the link
permanently during device removal. Also, remove the static functions that
are now left unused.

This reverts part of 2debd928 ("PCI: pciehp: Disable/enable link during
slot power off/on"). This was discussed at the URL below, where it was
revealed that it was done for a bug in a PCIe repeater chip on that
particular platform.

Link: https://lkml.kernel.org/r/CAErSpo72KZ-a2OSQLWoK71GCgwBt676XZdGt4tEYm-6UYnLmPw@mail.gmail.com
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Enable link state change notifications · 4f854f2a

Authored by Rajat Jain on Feb 04, 2014

Enable the Link state notifications unconditionally. Enable the
presence detection notification only if the attention button is absent.
This was discussed in this thread:
    https://lkml.kernel.org/r/529E5C0E.80903@gmail.com

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

11 Feb, 2014 2 commits

PCI: pciehp: Use link change notifications for hot-plug and removal · e48f1b67

Authored by Rajat Jain on Feb 04, 2014

A lot of systems do not have the fancy buttons and LEDs, and instead
want to rely only on the Link state change events to drive the hotplug
and removal state machinery.
(http://www.spinics.net/lists/hotplug/msg05802.html)

This patch adds support for that functionality. Here are the details
about the patch itself:

* Define and use interrupt events for linkup / linkdown.

* Make the pcie_isr() also look at link events, and direct control to
  corresponding (new) link state change handler function.

* Introduce the functions to handle link-up and link-down events and
  queue the add / removal work in the slot->wq to be processed by
  pciehp_power_thread()
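
The dispatch described in the list above might be sketched as below. This is a simplified model, not the patch itself: the enum and function name are hypothetical, and the only hardware facts assumed are the DLLSC bit in Slot Status and the Data Link Layer Link Active bit in Link Status:

```c
#include <stdint.h>

#define PCI_EXP_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */
#define PCI_EXP_LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */

/* Hypothetical event codes for the link state change handler. */
enum fake_link_event {
	FAKE_EV_NONE,
	FAKE_EV_LINK_UP,
	FAKE_EV_LINK_DOWN,
};

/*
 * Classify a slot interrupt: if the link state changed, the current
 * link-active bit tells us whether to queue an add or a removal.
 */
static enum fake_link_event fake_classify_link_event(uint16_t slot_status,
						      uint16_t link_status)
{
	if (!(slot_status & PCI_EXP_SLTSTA_DLLSC))
		return FAKE_EV_NONE;
	if (link_status & PCI_EXP_LNKSTA_DLLLA)
		return FAKE_EV_LINK_UP;
	return FAKE_EV_LINK_DOWN;
}
```

In the driver, the resulting event would then be turned into add or removal work queued on the slot's workqueue rather than handled directly in interrupt context.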

As a side note, this patch also fixes the bug
https://bugzilla.kernel.org/show_bug.cgi?id=65521 "pciehp ignores Data Link
Layer State Changed bit."

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Make check_link_active() non-static · 4703389f

Authored by Rajat Jain on Feb 04, 2014

check_link_active() functionality needs to be used by subsequent patches
(that introduce link state change based hotplug). Thus make the function
non-static, and rename it to pciehp_check_link_active() so as to be
consistent with other non-static functions.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

16 Dec, 2013 7 commits

PCI: pciehp: Move Attention & Power Indicator support tests to accessors · af9ab791

Authored by Bjorn Helgaas on Dec 15, 2013

Previously, the caller checked ATTN_LED() or PWR_LED() to see whether the
slot has indicators before setting the indicator state.  That clutters the
caller unnecessarily, so this moves the test inside the callees.  The test
may not even be necessary; per spec it should be harmless to try to turn on
a non-existent LED.  But checking first does avoid unnecessary hotplug
commands.
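
A rough sketch of the refactoring pattern, assuming a slot struct that caches Slot Capabilities (the `fake_` names and the command counter are hypothetical; the bit value is the Power Indicator Present bit of Slot Capabilities). The capability test lives inside the accessor, so callers no longer need to check before calling:

```c
#include <stdint.h>

/* Slot Capabilities: Power Indicator Present. */
#define PCI_EXP_SLTCAP_PIP	0x00000010

/* Hypothetical slot with cached capabilities and a command counter. */
struct fake_slot {
	uint32_t slot_cap;
	unsigned int cmds_issued;
};

#define PWR_LED(slot)	((slot)->slot_cap & PCI_EXP_SLTCAP_PIP)

/* Accessor: silently does nothing when the slot has no power LED. */
static void fake_power_led_on(struct fake_slot *slot)
{
	if (!PWR_LED(slot))
		return;		/* no indicator, skip the hotplug command */
	slot->cmds_issued++;	/* stands in for the Slot Control write */
}
```

Callers can then invoke the accessor unconditionally, and slots without the indicator avoid the extra hotplug command.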

No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

PCI: pciehp: Use symbolic constants for Slot Control fields · e7b4f0d7