Commits · 6db645f99cc5357ab5520982b85396487c113dc9 · openeuler / Kernel

01 Dec, 2020 1 commit

Authored by Mauro Carvalho Chehab on Oct 23, 2020

Update kernel-doc so the names in the doc match the prototypes.

Link: https://lore.kernel.org/r/f19caf7a68f8365c8b573a42b4ac89ec21925c73.1603469755.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

2f0cd59c

21 Nov, 2020 2 commits

PCI: Avoid duplicate IDs in driver dynamic IDs list · 3853f912

Authored by Zhenzhong Duan on Nov 17, 2020

When a device ID is written to /sys/bus/pci/drivers/.../new_id, we
previously only checked the driver's static ID table for duplicates.
Writing the same ID several times added it to the dynamic IDs list several
times.

This doesn't cause user-visible broken behavior, but remove_id_store() only
removes one of the duplicate IDs, so if we add an ID several times, we
would have to remove it the same number of times before it's completely
gone.

Fix it by calling pci_match_device(), which checks both dynamic and static
IDs to avoid inserting duplicate IDs in dynamic IDs list.

After the fix, attempts to add an ID more than once cause an error:

  # echo "1af4 1041" > /sys/bus/pci/drivers/vfio-pci/new_id
  # echo "1af4 1041" > /sys/bus/pci/drivers/vfio-pci/new_id
  bash: echo: write error: File exists
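
The duplicate check described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct demo_id`, `struct demo_driver`, `demo_match()` and `demo_new_id()` are hypothetical stand-ins invented for illustration, not the kernel's pci_match_device() or new_id_store(), and the fixed-size array replaces the kernel's linked list.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ID record (the real struct pci_device_id has more fields). */
struct demo_id {
	uint16_t vendor;
	uint16_t device;
};

#define DEMO_MAX_DYNIDS 8

/* Hypothetical driver model: a static ID table plus a list of dynamic IDs. */
struct demo_driver {
	const struct demo_id *static_ids;
	size_t n_static;
	struct demo_id dynids[DEMO_MAX_DYNIDS];
	size_t n_dyn;
};

static bool demo_id_equal(struct demo_id a, struct demo_id b)
{
	return a.vendor == b.vendor && a.device == b.device;
}

/* Return true if the ID is already known, checking dynamic and static IDs. */
static bool demo_match(const struct demo_driver *drv, struct demo_id id)
{
	for (size_t i = 0; i < drv->n_dyn; i++)
		if (demo_id_equal(drv->dynids[i], id))
			return true;
	for (size_t i = 0; i < drv->n_static; i++)
		if (demo_id_equal(drv->static_ids[i], id))
			return true;
	return false;
}

/* Model of a write to new_id: duplicates are rejected with -EEXIST. */
static int demo_new_id(struct demo_driver *drv, struct demo_id id)
{
	if (demo_match(drv, id))
		return -EEXIST;
	if (drv->n_dyn == DEMO_MAX_DYNIDS)
		return -ENOMEM;
	drv->dynids[drv->n_dyn++] = id;
	return 0;
}
```

In this model the second attempt to add the same ID returns -EEXIST, which is the errno that the shell reports as "File exists".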

Link: https://lore.kernel.org/r/20201117054409.3428-3-zhenzhong.duan@gmail.com
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI: Move pci_match_device() ahead of new_id_store() · 1f40704b

Authored by Zhenzhong Duan on Nov 17, 2020

Move pci_match_device() and its dependencies (pci_match_id() and
pci_device_id_any) ahead of new_id_store().

This is preparation work for calling pci_match_device() in new_id_store().
No functional changes.

[bhelgaas: update function comments]
Link: https://lore.kernel.org/r/20201117054409.3428-2-zhenzhong.duan@gmail.com
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


06 Oct, 2020 1 commit

dma-mapping: move dma-debug.h to kernel/dma/ · a1fd09e8

Authored by Christoph Hellwig on Sep 11, 2020

Most of dma-debug.h is not required by anything outside of kernel/dma.
Move the four declarations needed by dma-mapping.h or dma-ops providers
into dma-mapping.h and dma-map-ops.h, and move the remainder of the
file to kernel/dma/debug.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>


30 Sep, 2020 1 commit

PCI/PM: Remove unused pcibios_pm_ops · a5d02e90

Authored by Vaibhav Gupta on Jul 31, 2020

The "struct dev_pm_ops pcibios_pm_ops", declared in include/linux/pci.h and
defined in drivers/pci/pci-driver.c, provided arch-specific hooks when a
PCI device was doing a hibernate transition.

39421627 ("s390: remove broken hibernate / power management support")
removed the last use of pcibios_pm_ops, so remove it completely.

[bhelgaas: drop unused "error"]
Link: https://lore.kernel.org/r/20200730194416.1029509-1-vaibhavgupta40@gmail.com
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


08 Jul, 2020 1 commit

PCI: Restrict probe functions to housekeeping CPUs · 69a18b18

Authored by Alex Belits on Jun 25, 2020

pci_call_probe() prevents the nesting of work_on_cpu() for a scenario
where a VF device is probed from work_on_cpu() of the PF.

Change the cpumask used in pci_call_probe() from all online CPUs to only
housekeeping CPUs. This ensures that no additional latency is caused by
pinning jobs on isolated CPUs.

Signed-off-by: Alex Belits <abelits@marvell.com>
Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lkml.kernel.org/r/20200625223443.2684-3-nitesh@redhat.com


25 Apr, 2020 4 commits

PM: sleep: core: Rename dev_pm_smart_suspend_and_suspended() · fa2bfead

Authored by Rafael J. Wysocki on Apr 18, 2020

Because all callers of dev_pm_smart_suspend_and_suspended use it only
for checking whether or not to skip driver suspend callbacks for a
device, rename it to dev_pm_skip_suspend() in analogy with
dev_pm_skip_resume().

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>


PM: sleep: core: Rename dev_pm_may_skip_resume() · 76c70cb5

Authored by Rafael J. Wysocki on Apr 18, 2020

The name of dev_pm_may_skip_resume() may be easily confused with the
power.may_skip_resume flag which is not checked by that function, so
rename the former as dev_pm_skip_resume().

No functional impact.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>


PM: sleep: core: Rework the power.may_skip_resume handling · 0fe8a1be

Authored by Rafael J. Wysocki on Apr 18, 2020

Because the power.may_skip_resume device status bit is taken
into account in combination with the DPM_FLAG_LEAVE_SUSPENDED
driver flag, it can be set to 'true' for all devices in the
"suspend" phase of a suspend-resume cycle, so do that.

Then, neither the PM core nor the middle-layer (subsystem) code
handling it needs to set it to 'true' any more and it just has
to be cleared if there is a reason to avoid skipping the "noirq"
and "early" resume callbacks provided by the driver, so update
the code in question accordingly.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>


PM: sleep: core: Do not skip callbacks in the resume phase · 6e176bf8

Authored by Rafael J. Wysocki on Apr 18, 2020

The current code in device_resume_noirq() causes the entire early
resume and resume phases of device suspend to be skipped for
devices for which the noirq resume phase has been skipped (due
to the LEAVE_SUSPENDED flag being set) on the premise that those
devices should stay in runtime-suspend after system-wide resume.

However, that may not be correct in two situations.  First, the
middle layer (subsystem) noirq resume callback may be missing for
a given device, but its early resume callback may be present and it
may need to do something even if it decides to skip the driver
callback.  Second, if the device's wakeup settings were adjusted
in the suspend phase without resuming the device (that was in
runtime suspend at that time), they most likely need to be
adjusted again in the resume phase and so the driver callback
in that phase needs to be run.

For the above reason, modify the core to allow the middle layer
->resume_late callback to run even if its ->resume_noirq callback
is missing (and the core has skipped the driver-level callback
in that phase) and to allow all device callbacks to run in the
resume phase.  Also make the core set the PM-runtime status of
devices with SMART_SUSPEND set whose resume callbacks are not
skipped to "active" in the "noirq" resume phase and update the
affected subsystems (PCI and ACPI) accordingly.

After this change, middle-layer (subsystem) callbacks will always
be invoked in all phases of system suspend and resume and driver
callbacks will always run in the prepare, suspend, resume, and
complete phases for all devices.

For devices with SMART_SUSPEND set, driver callbacks will be
skipped in the late and noirq phases of system suspend if those
devices remain in runtime suspend in __device_suspend_late().
Driver callbacks will also be skipped for them during the
noirq and early phases of the "thaw" transition related to
hibernation in that case.

Setting LEAVE_SUSPENDED means that the driver allows its callbacks
to be skipped in the noirq and early phases of system resume, but
some additional conditions need to be met for that to happen (among
other things, the power.may_skip_resume flag needs to be set for the
device during system suspend for the driver callbacks to be skipped
during the subsequent resume transition).

For all devices with SMART_SUSPEND set whose driver callbacks are
invoked during system resume, the PM-runtime status will be set to
"active" (by the core).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>


21 Nov, 2019 10 commits

PCI/PM: Add missing link delays required by the PCIe spec · ad9001f2

Authored by Mika Westerberg on Nov 12, 2019

Currently Linux does not follow the PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that consists
of a PCIe switch and two PCIe endpoints:

  +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                  +-01.0-[04-36]-- DS hotplug port
                                  +-02.0-[37]----00.0 xHCI controller
                                  \-04.0-[38-6b]-- DS hotplug port

The root port (1b.0) and the PCIe switch downstream ports are all PCIe Gen3
so they support 8GT/s link speeds.

We wait for the PCIe hierarchy to enter D3cold (runtime):

  pcieport 0000:00:1b.0: power state changed by ACPI to D3cold

When it wakes up from D3cold, according to the PCIe 5.0 section 5.8 the
PCIe switch is put to reset and its power is re-applied. This means that we
must follow the rules in PCIe 5.0 section 6.6.1.

For the PCIe Gen3 ports we are dealing with here, the following applies:

  With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
  software must wait a minimum of 100 ms after Link training completes
  before sending a Configuration Request to the device immediately below
  that Port. Software can determine when Link training completes by polling
  the Data Link Layer Link Active bit or by setting up an associated
  interrupt (see Section 6.7.3.3).

Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):

  0000:00:1b.0: wait for 100 ms after DLLLA is set before access to 0000:01:00.0
  0000:02:00.0: wait for 100 ms after DLLLA is set before access to 0000:03:00.0
  0000:02:02.0: wait for 100 ms after DLLLA is set before access to 0000:37:00.0

I've instrumented the kernel with some additional logging so we can see the
actual delays performed:

  pcieport 0000:00:1b.0: power state changed by ACPI to D0
  pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
  pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
  pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
  pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms

For the switch upstream port (01:00.0, reachable through the 00:1b.0 root
port) we wait for 100 ms but do not take the DLLLA requirement into account.
We then wait 10 ms for the D3hot -> D0 transition of the root port and the
two downstream hotplug ports. This means that we deviate from what the spec
requires.

Performing the same check for system sleep (s2idle) transitions turns out
to be even worse: none of the mandatory delays are performed. If this were
S3 instead of s2idle, then according to PCI FW spec 3.2 section 4.6.8 there
is a specific _DSM that allows the OS to skip the delays, but this platform
does not provide the _DSM and does not go to S3 anyway, so no firmware is
involved that could already handle these delays.

On this particular platform these delays are not actually needed because
there is an additional delay as part of the ACPI power resource that is
used to turn on power to the hierarchy. However, that additional delay is
not required by any of the standards (PCIe, ACPI), so it is not present on
Intel Ice Lake, for example, where missing the mandatory delays causes
pciehp to start tearing down the stack too early (links are not yet
trained). Below is an example of what it looks like when this happens:

  pcieport 0000:83:04.0: pciehp: Slot(4): Card not present
  pcieport 0000:87:04.0: PME# disabled
  pcieport 0000:83:04.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:86:00
  pcieport 0000:86:00.0: Refused to change power state, currently in D3
  pcieport 0000:86:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x201ff)
  pcieport 0000:86:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
  ...

There is also one reported case (see the bugzilla link below) where the
missing delay causes xHCI on a Titan Ridge controller to fail to runtime
resume when a USB-C dock is plugged in. This does not involve pciehp;
instead, the PCI core fails to runtime resume the xHCI device:

  pcieport 0000:04:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
  pcieport 0000:04:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100406)
  xhci_hcd 0000:39:00.0: Refused to change power state, currently in D3
  xhci_hcd 0000:39:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
  xhci_hcd 0000:39:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
  ...

Add a new function pci_bridge_wait_for_secondary_bus() that is called on
PCI core resume and runtime resume paths accordingly if the bridge entered
D3cold (and thus went through reset).

This is the second attempt to add the missing delays. The previous solution in
c2bf1fc2 ("PCI: Add missing link delays required by the PCIe spec") was
reverted because of two issues it caused:

  1. One system became unresponsive after S3 resume due to the PME service
     spinning in pcie_pme_work_fn(). The root port in question reports that
     the xHCI sent PME but the xHCI device itself does not have PME status
     set. The PME status bit is never cleared in the root port, resulting
     in an indefinite loop in pcie_pme_work_fn().

  2. Slows down resume if the root/downstream port does not support Data
     Link Layer Active Reporting because pcie_wait_for_link_delay() waits
     1100 ms in that case.

This version should avoid the above issues because we restrict the delay to
happen only if the port went into D3cold.
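
The decision described in this commit message can be summarized with a small userspace model. It is a sketch only: `struct demo_port`, `enum demo_wait` and `demo_secondary_bus_wait()` are hypothetical names, not the kernel's pci_bridge_wait_for_secondary_bus(). The branch for ports limited to 5.0 GT/s or less (a plain 100 ms wait) is taken from PCIe 5.0 section 6.6.1 and is not quoted in the text above.

```c
#include <stdbool.h>

/* Hypothetical outcome of the delay decision. */
enum demo_wait {
	DEMO_NO_WAIT,              /* port did not go through D3cold */
	DEMO_WAIT_FIXED_100MS,     /* port supports at most 5.0 GT/s */
	DEMO_WAIT_LINK_THEN_100MS, /* wait for DLLLA, then 100 ms */
};

/* Hypothetical, simplified description of a downstream or root port. */
struct demo_port {
	bool was_d3cold;            /* bridge entered D3cold and so went through reset */
	unsigned int max_speed_mts; /* highest supported link speed in MT/s, e.g. 8000 */
};

/* Decide which delay applies before the first config access below the port. */
static enum demo_wait demo_secondary_bus_wait(const struct demo_port *port)
{
	/* The delay is restricted to ports that actually went into D3cold. */
	if (!port->was_d3cold)
		return DEMO_NO_WAIT;

	/* Faster than 5.0 GT/s: wait for link training to complete first. */
	if (port->max_speed_mts > 5000)
		return DEMO_WAIT_LINK_THEN_100MS;

	return DEMO_WAIT_FIXED_100MS;
}
```

For the Gen3 root port in the example topology (8 GT/s, woken from D3cold), this model selects the "wait for DLLLA, then 100 ms" path.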

Link: https://lore.kernel.org/linux-pci/SL2P216MB01878BBCD75F21D882AEEA2880C60@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203885
Link: https://lore.kernel.org/r/20191112091617.70282-3-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


PCI/PM: Move power state update away from pci_power_up() · 81cfa590

Authored by Rafael J. Wysocki on Nov 05, 2019

Move the invocation of pci_update_current_state() from pci_power_up() to
pci_pm_default_resume_early(), which is the only caller of that function.

Preparatory change, no functional impact.

Link: https://lore.kernel.org/r/37482337.udjOGdOKNb@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


PCI/PM: Remove unused pci_driver.suspend_late() hook · 1a1daf09

Authored by Bjorn Helgaas on Oct 31, 2019

The struct pci_driver.suspend_late() hook is one of the legacy PCI power
management callbacks, and there are no remaining users of it. Remove it.

Link: https://lore.kernel.org/r/20191101204558.210235-7-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Remove unused pci_driver.resume_early() hook · 89cdbc35

Authored by Bjorn Helgaas on Oct 31, 2019

The struct pci_driver.resume_early() hook is one of the legacy PCI power
management callbacks, and there are no remaining users of it. Remove it.

Link: https://lore.kernel.org/r/20191101204558.210235-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Use pci_WARN() to include device information · 12bcae44

Authored by Bjorn Helgaas on Oct 07, 2019

Add and use pci_WARN() wrappers so warnings include device information.

Link: https://lore.kernel.org/r/20191017212851.54237-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Use PCI dev_printk() wrappers for consistency · 6941a0c2

Authored by Bjorn Helgaas on Oct 07, 2019

Use the PCI dev_printk() wrappers for consistency with the rest of the PCI
core. No functional change intended.

Link: https://lore.kernel.org/r/20191017212851.54237-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Make power management op coding style consistent · 6da2f2cc

Authored by Bjorn Helgaas on Oct 14, 2019

Some of the power management ops use this style:

  struct device_driver *drv = dev->driver;
  if (drv && drv->pm && drv->pm->prepare)
    drv->pm->prepare(dev);

while others use this:

  const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
  if (pm && pm->runtime_resume)
    pm->runtime_resume(dev);

Convert the first style to the second so they're all consistent.  Remove
local "error" variables when unnecessary.  No functional change intended.
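
To make the second style concrete, here is a minimal userspace sketch. The struct definitions are cut-down stand-ins that only borrow the kernel's names, and `demo_runtime_resume()` is a hypothetical helper rather than a function from drivers/pci/pci-driver.c.

```c
#include <stddef.h>

struct device;

/* Stand-in for the kernel's struct dev_pm_ops, reduced to one callback. */
struct dev_pm_ops {
	int (*runtime_resume)(struct device *dev);
};

/* Stand-in for struct device_driver. */
struct device_driver {
	const struct dev_pm_ops *pm;
};

/* Stand-in for struct device; resumed counts callback invocations. */
struct device {
	struct device_driver *driver;
	int resumed;
};

/* Second style: fetch the ops pointer once, then test the callback pointer. */
static int demo_runtime_resume(struct device *dev)
{
	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;

	if (pm && pm->runtime_resume)
		return pm->runtime_resume(dev);

	return 0; /* no driver, no PM ops, or no callback: nothing to do */
}

/* Hypothetical driver callback used to exercise the helper. */
static int demo_cb(struct device *dev)
{
	dev->resumed++;
	return 0;
}
```

The single `pm` local removes the repeated `drv && drv->pm` checks and handles a missing driver, missing PM ops and a missing callback in the same way.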

Link: https://lore.kernel.org/r/20191014230016.240912-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Run resume fixups before disabling wakeup events · f7b32a86

Authored by Bjorn Helgaas on Oct 12, 2019

pci_pm_resume() and pci_pm_restore() call pci_pm_default_resume(), which
runs resume fixups before disabling wakeup events:

  static void pci_pm_default_resume(struct pci_dev *pci_dev)
  {
    pci_fixup_device(pci_fixup_resume, pci_dev);
    pci_enable_wake(pci_dev, PCI_D0, false);
  }

pci_pm_runtime_resume() does both of these, but in the opposite order:

  pci_enable_wake(pci_dev, PCI_D0, false);
  pci_fixup_device(pci_fixup_resume, pci_dev);

We should always use the same ordering unless there's a reason to do
otherwise.  Change pci_pm_runtime_resume() to call pci_pm_default_resume()
instead of open-coding this, so the fixups are always done before disabling
wakeup events.

pci_pm_default_resume() is called from pci_pm_runtime_resume(), which is
under #ifdef CONFIG_PM.  If SUSPEND and HIBERNATION are disabled, PM_SLEEP
is disabled also, so move pci_pm_default_resume() from #ifdef
CONFIG_PM_SLEEP to #ifdef CONFIG_PM.

Link: https://lore.kernel.org/r/20191014230016.240912-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Clear PCIe PME Status even for legacy power management · ec6a75ef

Authored by Bjorn Helgaas on Oct 10, 2019

Previously, pci_pm_resume_noirq() cleared the PME Status bit in the Root
Status register only if the device had no driver or the driver did not
implement legacy power management. It should clear PME Status regardless
of what sort of power management the driver supports, so do this before
checking for legacy power management.

This affects Root Ports and Root Complex Event Collectors, for which the
usual driver is the PCIe portdrv, which implements new power management, so
this change is just on principle, not to fix any actual defects.

Fixes: a39bd851 ("PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver")
Link: https://lore.kernel.org/r/20191014230016.240912-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


PCI/PM: Always return devices to D0 when thawing · f2c33cca

Authored by Dexuan Cui on Aug 14, 2019

pci_pm_thaw_noirq() is supposed to return the device to D0 and restore its
configuration registers, but previously it only did that for devices whose
drivers implemented the new power management ops.

Hibernation, e.g., via "echo disk > /sys/power/state", involves freezing
devices, creating a hibernation image, thawing devices, writing the image,
and powering off.  The fact that thawing did not return devices with legacy
power management to D0 caused errors, e.g., in this path:

  pci_pm_thaw_noirq
    if (pci_has_legacy_pm_support(pci_dev)) # true for Mellanox VF driver
      return pci_legacy_resume_early(dev)   # ... legacy PM skips the rest
    pci_set_power_state(pci_dev, PCI_D0)
    pci_restore_state(pci_dev)
  pci_pm_thaw
    if (pci_has_legacy_pm_support(pci_dev))
      pci_legacy_resume
        drv->resume
          mlx4_resume
            ...
              pci_enable_msix_range
                ...
                  if (dev->current_state != PCI_D0)  # <---
                    return -EINVAL;

which caused these warnings:

  mlx4_core a6d1:00:02.0: INTx is not supported in multi-function mode, aborting
  PM: dpm_run_callback(): pci_pm_thaw+0x0/0xd7 returns -95
  PM: Device a6d1:00:02.0 failed to thaw: error -95

Return devices to D0 and restore config registers for all devices, not just
those whose drivers support new power management.
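
The effect of the reordering can be illustrated with a toy userspace model. Everything here is an assumption made for illustration: `struct demo_thaw_dev`, `demo_thaw_noirq()` and `demo_enable_msix()` are hypothetical and only mirror the order of operations described above, not the real pci_pm_thaw_noirq().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified device state. */
struct demo_thaw_dev {
	bool legacy_pm;      /* driver uses the legacy PM callbacks */
	bool in_d0;          /* models current_state == PCI_D0 */
	bool state_restored; /* models a completed pci_restore_state() */
};

/* Fixed ordering: power up and restore state before the legacy check. */
static void demo_thaw_noirq(struct demo_thaw_dev *dev)
{
	dev->in_d0 = true;          /* stands in for pci_set_power_state(PCI_D0) */
	dev->state_restored = true; /* stands in for pci_restore_state() */

	if (dev->legacy_pm)
		return; /* legacy early resume would run here */

	/* a new-style driver's thaw_noirq callback would run here */
}

/* Models the power-state check on the MSI-X enable path. */
static int demo_enable_msix(const struct demo_thaw_dev *dev)
{
	if (!dev->in_d0)
		return -EINVAL;
	return 0;
}
```

With this ordering a legacy-PM device is already in D0 when its resume callback tries to enable MSI-X, so the -EINVAL branch from the trace above is not taken.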

[bhelgaas: also call pci_restore_state() before pci_legacy_resume_early(),
update comment, add stable tag, commit log]
Link: https://lore.kernel.org/r/KU1P153MB016637CAEAD346F0AA8E3801BFAD0@KU1P153MB0166.APCP153.PROD.OUTLOOK.COM
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org	# v4.13+


03 Jul, 2019 2 commits

PCI: PM: Simplify bus-level hibernation callbacks · a78ae45a

Authored by Rafael J. Wysocki on Jul 01, 2019

After a previous change causing all runtime-suspended PCI devices
to be resumed before creating a snapshot image of memory during
hibernation, it is not necessary to worry about the case in which
they might be left in runtime-suspend any more, so get rid of the
code related to that from bus-level PCI hibernation callbacks.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>


PM: ACPI/PCI: Resume all devices during hibernation · 501debd4

Authored by Rafael J. Wysocki on Jul 01, 2019

Both the PCI bus type and the ACPI PM domain avoid resuming
runtime-suspended devices with DPM_FLAG_SMART_SUSPEND set during
hibernation (before creating the snapshot image of system memory),
but that turns out to be a mistake. It leads to functional issues
and adds complexity that's hard to justify.

For this reason, resume all runtime-suspended PCI devices and all
devices in the ACPI PM domains before creating a snapshot image of
system memory during hibernation.

Fixes: 05087360 (ACPI / PM: Take SMART_SUSPEND driver flag into account)
Fixes: c4b65157 (PCI / PM: Take SMART_SUSPEND driver flag into account)
Link: https://lore.kernel.org/linux-acpi/917d4399-2e22-67b1-9d54-808561f9083f@uwyo.edu/T/#maf065fe6e4974f2a9d79f332ab99dfaba635f64c
Reported-by: Robert R. Howell <RHowell@uwyo.edu>
Tested-by: Robert R. Howell <RHowell@uwyo.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>


27 Jun, 2019 2 commits

PCI: PM/ACPI: Refresh all stale power state data in pci_pm_complete() · b51033e0

Authored by Rafael J. Wysocki on Jun 25, 2019

In pci_pm_complete() there are checks to decide whether or not to
resume devices that were left in runtime-suspend during the preceding
system-wide transition into a sleep state.  They involve checking the
current power state of the device and comparing it with the power
state of it set before the preceding system-wide transition, but the
platform component of the device's power state is not handled
correctly in there.

Namely, on platforms with ACPI, the device power state information
needs to be updated with care, so that the reference counters of
power resources used by the device (if any) are set to ensure that
the refreshed power state of it will be maintained going forward.

To that end, introduce a new ->refresh_state() platform PM callback
for PCI devices, for asking the platform to refresh the device power
state data and ensure that the corresponding power state will be
maintained going forward, make it invoke acpi_device_update_power()
(for devices with ACPI PM) on platforms with ACPI and make
pci_pm_complete() use it, through a new pci_refresh_power_state()
wrapper function.

Fixes: a0d2a959 (PCI: Avoid unnecessary resume after direct-complete)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


PCI: PM: Avoid skipping bus-level PM on platforms without ACPI · 471a739a

Authored by Rafael J. Wysocki on Jun 26, 2019

There are platforms that do not call pm_set_suspend_via_firmware(),
so pm_suspend_via_firmware() returns 'false' on them, but the power
states of PCI devices (PCIe ports in particular) are changed as a
result of powering down core platform components during system-wide
suspend.  Thus the pm_suspend_via_firmware() checks in
pci_pm_suspend_noirq() and pci_pm_resume_noirq() introduced by
commit 3e26c5fe ("PCI: PM: Skip devices in D0 for suspend-to-idle")
are not sufficient to determine that devices left in D0
during suspend will remain in D0 during resume and so the bus-level
power management can be skipped for them.

For this reason, introduce a new global suspend flag,
PM_SUSPEND_FLAG_NO_PLATFORM, set it for suspend-to-idle only
and replace the pm_suspend_via_firmware() checks mentioned above
with checks against this flag.

Fixes: 3e26c5fe ("PCI: PM: Skip devices in D0 for suspend-to-idle")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


17 Jun, 2019 1 commit

PCI: PM: Replace pci_dev_keep_suspended() with two functions · 0c7376ad

Authored by Rafael J. Wysocki on Jun 07, 2019

The code in pci_dev_keep_suspended() is relatively hard to follow due
to the negative checks in it and in its callers and the function has
a possible side-effect (disabling the PME) which doesn't really match
its role.

For this reason, move the PME disabling from pci_dev_keep_suspended()
to a separate function and change the semantics (and name) of the
rest of it, so that 'true' is returned when the device needs to be
resumed (and not the other way around).  Change the callers of
pci_dev_keep_suspended() accordingly.

While at it, make the code flow in pci_pm_poweroff() reflect the
pci_pm_suspend() more closely to avoid arbitrary differences between
them.

This is a cosmetic change with no intention to alter behavior.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>


14 Jun, 2019 2 commits

PCI: PM: Skip devices in D0 for suspend-to-idle · 3e26c5fe

Authored by Rafael J. Wysocki on Jun 13, 2019

Commit d491f2b7 ("PCI: PM: Avoid possible suspend-to-idle issue")
attempted to avoid a problem with devices whose drivers want them to
stay in D0 over suspend-to-idle and resume, but it did not go as far
as it should with that.

Namely, first of all, the power state of a PCI bridge with a
downstream device in D0 must be D0 (based on the PCI PM spec r1.2,
sec 6, table 6-1, if the bridge is not in D0, there can be no PCI
transactions on its secondary bus), but that is not actively enforced
during system-wide PM transitions, so use the skip_bus_pm flag
introduced by commit d491f2b7 for that.

Second, the configuration of devices left in D0 (whatever the reason)
during suspend-to-idle need not be changed and attempting to put them
into D0 again by force is pointless, so explicitly avoid doing that.

Fixes: d491f2b7 ("PCI: PM: Avoid possible suspend-to-idle issue")
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>


PCI: Always allow probing with driver_override · 2d2f4273

Authored by Alex Williamson on May 09, 2019

Commit 0e7df224 ("PCI: Add sysfs sriov_drivers_autoprobe to control
VF driver binding") introduced the sriov_drivers_autoprobe attribute
which allows users to prevent the kernel from automatically probing a
driver for new VFs as they are created. This allows VFs to be spawned
without automatically binding the new device to a host driver, such as
in cases where the user intends to use the device only with a meta
driver like vfio-pci. However, the current implementation prevents any
use of drivers_probe with the VF while sriov_drivers_autoprobe=0. This
blocks the now current general practice of setting driver_override
followed by using drivers_probe to bind a device to a specified driver.

The kernel never automatically sets a driver_override, so it seems
we can assume a driver_override reflects the intent of the user. Also,
probing a device using a driver_override match seems outside the scope
of the 'auto' part of sriov_drivers_autoprobe. Therefore, let's allow
driver_override matches regardless of sriov_drivers_autoprobe, which we
can do by simply testing if a driver_override is set for a device as a
'can probe' condition.
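
The resulting "can probe" condition can be sketched as a small userspace predicate. The struct and function names are hypothetical stand-ins for illustration; the real check operates on struct pci_dev and the PF's SR-IOV state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the fields that matter for the check. */
struct demo_pci_dev {
	bool is_virtfn;              /* device is an SR-IOV virtual function */
	bool pf_drivers_autoprobe;   /* sriov_drivers_autoprobe value of its PF */
	const char *driver_override; /* NULL unless the user set driver_override */
};

/*
 * A device may be probed if it is not a VF, if its PF allows automatic
 * probing, or if the user explicitly set a driver_override for it.
 */
static bool demo_can_probe(const struct demo_pci_dev *dev)
{
	return !dev->is_virtfn ||
	       dev->pf_drivers_autoprobe ||
	       dev->driver_override != NULL;
}
```

Under this model a VF with sriov_drivers_autoprobe=0 stays unbound by default, yet binding it through driver_override followed by drivers_probe still works.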

Fixes: 0e7df224 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding")
Link: https://lore.kernel.org/lkml/155742996741.21878.569845487290798703.stgit@gimli.home
Link: https://lore.kernel.org/linux-pci/155672991496.20698.4279330795743262888.stgit@gimli.home/T/#u
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


30 May, 2019 1 commit

PCI: Return error if cannot probe VF · 76002d8b

Authored by Alex Williamson on May 01, 2019

Commit 0e7df224 ("PCI: Add sysfs sriov_drivers_autoprobe to control
VF driver binding") allows the user to specify that drivers for VFs of
a PF should not be probed, but it actually causes pci_device_probe() to
return success back to the driver core in this case. Therefore by all
sysfs appearances the device is bound to a driver, the driver link from
the device exists as does the device link back from the driver, yet the
driver's probe function is never called on the device. We also fail to
do any sort of cleanup when we're prohibited from probing the device,
the IRQ setup remains in place and we even hold a device reference.

Instead, abort with errno before any setup or references are taken when
pci_device_can_probe() prevents us from trying to probe the device.

Link: https://lore.kernel.org/lkml/155672991496.20698.4279330795743262888.stgit@gimli.home
Fixes: 0e7df224 ("PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


27 May, 2019 1 commit

PCI: PM: Avoid possible suspend-to-idle issue · d491f2b7

Authored by Rafael J. Wysocki on May 17, 2019

If a PCI driver leaves the device handled by it in D0 and calls
pci_save_state() on the device in its ->suspend() or ->suspend_late()
callback, it can expect the device to stay in D0 over the whole
s2idle cycle. However, that may not be the case if there is a
spurious wakeup while the system is suspended, because in that case
pci_pm_suspend_noirq() will run again after pci_pm_resume_noirq()
which calls pci_restore_state(), via pci_pm_default_resume_early(),
so state_saved is cleared and the second iteration of
pci_pm_suspend_noirq() will invoke pci_prepare_to_sleep() which
may change the power state of the device.

To avoid that, add a new internal flag, skip_bus_pm, that will be set
by pci_pm_suspend_noirq() when it runs for the first time during the
given system suspend-resume cycle if the state of the device has
been saved already and the device is still in D0. Setting that flag
will cause the next iterations of pci_pm_suspend_noirq() to set
state_saved for pci_pm_resume_noirq(), so that it always restores the
device state from the originally saved data, and avoid calling
pci_prepare_to_sleep() for the device.
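
The flag logic described above can be followed in a simplified user-space model. Everything here is a hypothetical stand-in (struct s2idle_dev, the model_* functions, the call counter); it only mirrors the decision order in the text and is not the kernel implementation.

```c
#include <stdbool.h>

enum model_power { MODEL_D0, MODEL_D3HOT };

/* Hypothetical stand-in for the per-device PM state discussed above. */
struct s2idle_dev {
	bool state_saved;            /* driver already saved config state */
	bool skip_bus_pm;            /* the new internal flag */
	enum model_power current_state;
	int prepare_to_sleep_calls;  /* counts modeled power-state changes */
};

/* Models one iteration of the noirq suspend phase. */
void model_suspend_noirq(struct s2idle_dev *dev)
{
	if (dev->skip_bus_pm) {
		/* Later iteration: keep using the originally saved data. */
		dev->state_saved = true;
	} else if (dev->state_saved) {
		/* First iteration, driver saved state and left device in D0. */
		if (dev->current_state == MODEL_D0)
			dev->skip_bus_pm = true;
	} else {
		/* Bus-level handling: save state and change power state. */
		dev->state_saved = true;
		dev->current_state = MODEL_D3HOT;
		dev->prepare_to_sleep_calls++;
	}
}

/* Models the noirq resume phase: restoring state clears state_saved. */
void model_resume_noirq(struct s2idle_dev *dev)
{
	dev->state_saved = false;
	dev->current_state = MODEL_D0;
}
```

Running suspend, resume, suspend on a device that starts with state_saved set and in D0 models a spurious wakeup: the second suspend iteration takes the skip_bus_pm branch, so the device stays in D0 and the power-state counter stays at zero.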

Fixes: 33e4f80e ("ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

d491f2b7

09 Apr, 2019 1 commit

treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively · d75f773c

Committed by Sakari Ailus on Mar 25, 2019

%pF and %pf are functionally equivalent to %pS and %ps conversion
specifiers. The former are deprecated, therefore switch the current users
to use the preferred variant.

The changes have been produced by the following command:

	git grep -l '%p[fF]' | grep -v '^\(tools\|Documentation\)/' | \
	while read i; do perl -i -pe 's/%pf/%ps/g; s/%pF/%pS/g;' $i; done

And verifying the result.

Link: http://lkml.kernel.org/r/20190325193229.23390-1-sakari.ailus@linux.intel.com
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: xen-devel@lists.xenproject.org
Cc: linux-acpi@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Cc: linux-mmc@vger.kernel.org
Cc: linux-nvdimm@lists.01.org
Cc: linux-pci@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-btrfs@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: linux-mm@kvack.org
Cc: ceph-devel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: David Sterba <dsterba@suse.com> (for btrfs)
Acked-by: Mike Rapoport <rppt@linux.ibm.com> (for mm/memblock.c)
Acked-by: Bjorn Helgaas <bhelgaas@google.com> (for drivers/pci)
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>

d75f773c

09 Feb, 2019 1 commit

PCI: Clean up usage of __u32 type · 20a796a9

Committed by Logan Gunthorpe on Feb 08, 2019

The double underscore types are meant for compatibility in userspace
headers which does not apply here. Therefore, change to use the standard
no-underscore types.

The origin of the double underscore types dates back to before the git era
so I was not able to find a commit to see the original justification.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

20a796a9

15 Dec, 2018 1 commit

PCI: Remove unused attr variable in pci_dma_configure · 66420441

Committed by Nathan Chancellor on Dec 14, 2018

Clang warns:

drivers/pci/pci-driver.c:1603:21: error: unused variable 'attr'
[-Werror,-Wunused-variable]

Commit e5361ca2 ("ACPI / scan: Refactor _CCA enforcement") removed
attr's use and replaced it with its assigned value so it is no longer
needed.

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

66420441

14 Dec, 2018 1 commit

ACPI / scan: Refactor _CCA enforcement · e5361ca2

Committed by Robin Murphy on Dec 06, 2018

Rather than checking the DMA attribute at each callsite, just pass it
through for acpi_dma_configure() to handle directly. That can then deal
with the relatively exceptional DEV_DMA_NOT_SUPPORTED case by explicitly
installing dummy DMA ops instead of just skipping setup entirely. This
will then free up the dev->dma_ops == NULL case for some valuable
fastpath optimisations.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Tony Luck <tony.luck@intel.com>

e5361ca2

13 Dec, 2018 1 commit

PCI / PM: Allow runtime PM without callback functions · c5eb1190

Committed by Jarkko Nikula on Oct 23, 2018

a9c8088c ("i2c: i801: Don't restore config registers on runtime PM")
nullified the runtime PM suspend/resume callback pointers while keeping the
runtime PM enabled.

This caused the SMBus PCI device to stay in D0 with
/sys/devices/.../power/runtime_status showing "error" when the runtime PM
framework attempted to autosuspend the device.  This is due to PCI bus
runtime PM, which checks for driver runtime PM callbacks and returns
-ENOSYS if they are not set.

Since i2c-i801.c doesn't need to do anything device-specific for runtime
PM, Jean Delvare proposed this be fixed in the PCI core rather than adding
dummy runtime PM callback functions in the PCI drivers.

Change pci_pm_runtime_suspend()/pci_pm_runtime_resume() so they allow
changing the PCI device power state during runtime PM transitions even if
the driver supplies no runtime PM callbacks.
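
The before/after behavior can be contrasted in a small user-space model. The rpm_* types and functions below are hypothetical and greatly simplified; only the -ENOSYS return for missing callbacks is taken from the description above.

```c
#include <errno.h>
#include <stddef.h>

enum rpm_power { RPM_D0, RPM_D3HOT };

struct rpm_dev;

/* Hypothetical stand-in for a driver's runtime PM callbacks. */
struct rpm_ops {
	int (*runtime_suspend)(struct rpm_dev *dev);
};

struct rpm_dev {
	const struct rpm_ops *pm; /* NULL when the driver supplies no PM ops */
	enum rpm_power state;
};

/* Models the old bus-level suspend: refuse when no callback is set. */
int rpm_bus_suspend_old(struct rpm_dev *dev)
{
	int err;

	if (!dev->pm || !dev->pm->runtime_suspend)
		return -ENOSYS;

	err = dev->pm->runtime_suspend(dev);
	if (err)
		return err;

	dev->state = RPM_D3HOT;
	return 0;
}

/* Models the new behavior: the callback is optional. */
int rpm_bus_suspend_new(struct rpm_dev *dev)
{
	if (dev->pm && dev->pm->runtime_suspend) {
		int err = dev->pm->runtime_suspend(dev);

		if (err)
			return err;
	}

	/* The bus still changes the device power state on its own. */
	dev->state = RPM_D3HOT;
	return 0;
}
```

For a device with no callbacks, the old variant returns -ENOSYS and leaves the device in D0, while the new variant succeeds and moves it to the low-power state.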

This fixes the runtime PM regression on i2c-i801.c.

It is not obvious why the code previously required the runtime PM
callbacks.  The test has been there since the code was introduced by
6cbf8214 ("PCI PM: Run-time callbacks for PCI bus type").

On the other hand, a similar change was done to generic runtime PM
callbacks in 05aa55dd ("PM / Runtime: Lenient generic runtime pm
callbacks").

Fixes: a9c8088c ("i2c: i801: Don't restore config registers on runtime PM")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org	# v4.18+

c5eb1190

31 Jul, 2018 1 commit

PCI: Call dma_debug_add_bus() for pci_bus_type from PCI core · a8651194

Committed by Christoph Hellwig on Jul 30, 2018

There is nothing arch-specific about PCI or dma-debug, so call
dma_debug_add_bus() from the PCI core just after registering the bus type.

Most of dma-debug is already generic; this just adds reporting of pending
dma-allocations on driver unload for arches other than powerpc, sh, and
x86.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)

a8651194

30 Jun, 2018 1 commit

PCI/IOV: Reset total_VFs limit after detaching PF driver · 38972375

Committed by Jakub Kicinski on Jun 29, 2018

The TotalVFs register in the SR-IOV capability is the hardware limit on the
number of VFs.  A PF driver can limit the number of VFs further with
pci_sriov_set_totalvfs().  When the PF driver is removed, reset any VF
limit that was imposed by the driver because that limit may not apply to
other drivers.

Before 8d85a7a4 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0"),
pci_sriov_set_totalvfs(pdev, 0) meant "we can enable TotalVFs virtual
functions", and the nfp driver used that to remove the VF limit when the
driver unloads.

8d85a7a4 broke that because instead of removing the VF limit,
pci_sriov_set_totalvfs(pdev, 0) actually sets the limit to zero, and that
limit persists even if another driver is loaded.

We could fix that by making the nfp driver reset the limit when it unloads,
but it seems more robust to do it in the PCI core instead of relying on the
driver.

The regression scenario is:

  nfp_pci_probe (driver 1)
  ...
  nfp_pci_remove
    pci_sriov_set_totalvfs(pf->pdev, 0)   # limits VFs to 0

  ...
  nfp_pci_probe (driver 2)
    nfp_rtsym_read_le("nfd_vf_cfg_max_vfs")
    # no VF limit from firmware

Now driver 2 is broken because the VF limit is still 0 from driver 1.
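
The regression and the core-side fix can be illustrated with a toy model. The struct, its fields, and the model_* functions are hypothetical simplifications and not the kernel's SR-IOV data structures; the bounds check returning -EINVAL is likewise an assumption of this sketch.

```c
#include <errno.h>

/* Hypothetical stand-in for the SR-IOV state of a PF. */
struct iov_model {
	unsigned int total_vfs;      /* hardware limit (TotalVFs register) */
	unsigned int driver_max_vfs; /* limit imposed by the PF driver */
};

/* Models a PF driver lowering the limit; 0 now really means "no VFs". */
int model_set_totalvfs(struct iov_model *iov, unsigned int numvfs)
{
	if (numvfs > iov->total_vfs)
		return -EINVAL;
	iov->driver_max_vfs = numvfs;
	return 0;
}

/* Models querying the limit currently in effect. */
unsigned int model_get_totalvfs(const struct iov_model *iov)
{
	return iov->driver_max_vfs;
}

/*
 * Models the fix: when the PF driver is detached, the core drops the
 * driver-imposed limit so the next driver starts from the hardware one.
 */
void model_driver_detach(struct iov_model *iov)
{
	iov->driver_max_vfs = iov->total_vfs;
}
```

With this reset in the detach path, a limit of 0 set by the first driver during removal no longer leaks into the second driver's probe.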

Fixes: 8d85a7a4 ("PCI/IOV: Allow PF drivers to limit total_VFs to 0")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
[bhelgaas: changelog, rename functions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

38972375

24 May, 2018 1 commit

PCI / PM: Do not clear state_saved for devices that remain suspended · 656088aa

Committed by Rafael J. Wysocki on May 18, 2018

The state_saved flag should not be cleared in pci_pm_suspend() if the
given device is going to remain suspended, or the device's config
space will not be restored properly during the subsequent resume.

Namely, if the device is going to stay in suspend, both the late
and noirq callbacks return early for it, so if its state_saved flag
is cleared in pci_pm_suspend(), it will remain unset throughout the
remaining part of suspend and resume and pci_restore_state() called
for the device going forward will return without doing anything.

For this reason, change pci_pm_suspend() to only clear state_saved
if the given device is not going to remain suspended.  [This is
analogous to what commit ae860a19 (PCI / PM: Do not clear
state_saved in pci_pm_freeze() when smart suspend is set) did for
hibernation.]

Fixes: c4b65157 (PCI / PM: Take SMART_SUSPEND driver flag into account)
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>

656088aa

18 May, 2018 1 commit

PCI/AER: Factor out error reporting to drivers/pci/pcie/err.c · 2e28bc84

Committed by Oza Pawandeep on May 17, 2018

Move the error reporting callbacks from aerdrv_core.c to err.c, where they
can be used by DPC in addition to AER.

As part of aerdrv_core.c, these callbacks were built under CONFIG_PCIEAER.
Moving them to the new err.c means they will now be built under
CONFIG_PCIEPORTBUS, so adjust the definition of pci_uevent_ers() to match.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: in reset_link(), initialize "driver" even if CONFIG_PCIEAER is
unset, update pci_uevent_ers() #ifdef wrapper]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

2e28bc84

03 May, 2018 2 commits

drivers: remove force dma flag from buses · 3d6ce86e

Committed by Christoph Hellwig on May 03, 2018

With each bus implementing its own DMA configuration callback, there is no
need for a bus to explicitly set the force_dma flag.  Modify the
of_dma_configure function to accept an input parameter which specifies if
implicit DMA configuration is required when it is not described by the
firmware.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # PCI parts
Reviewed-by: Rob Herring <robh@kernel.org>
[hch: tweaked the changelog a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>

3d6ce86e

dma-mapping: move dma configuration to bus infrastructure · 07397df2

Committed by Nipun Gupta on Apr 28, 2018

ACPI/OF support for configuration of DMA is a bus specific aspect, and
thus should be configured by the bus.  Introduce a 'dma_configure' bus
method so that buses can control their DMA capabilities.

Also update the PCI, Platform, ACPI and host1x buses to use the new
method.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # PCI parts
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[hch: simplified host1x_dma_configure based on a comment from Thierry,
      rewrote changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>

07397df2
