提交 · cdbeb633ca71a02b7b63bfeb94994bf4e1a0b894 · openeuler / Kernel

16 8月, 2017 1 次提交

PCI: fix oops when try to find Root Port for a PCI device · 0e405232

由 dingtianhong 提交于 8月 15, 2017

Eric report a oops when booting the system after applying
the commit a99b646a ("PCI: Disable PCIe Relaxed..."):

[    4.241029] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[    4.247001] IP: pci_find_pcie_root_port+0x62/0x80
[    4.253011] PGD 0
[    4.253011] P4D 0
[    4.253011]
[    4.258013] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[    4.262015] Modules linked in:
[    4.265005] CPU: 31 PID: 1 Comm: swapper/0 Not tainted 4.13.0-dbx-DEV #316
[    4.271002] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
[    4.279002] task: ffffa2ee38cfa040 task.stack: ffffa51ec0004000
[    4.285001] RIP: 0010:pci_find_pcie_root_port+0x62/0x80
[    4.290012] RSP: 0000:ffffa51ec0007ab8 EFLAGS: 00010246
[    4.295003] RAX: 0000000000000000 RBX: ffffa2ee36bae000 RCX: 0000000000000006
[    4.303002] RDX: 000000000000081c RSI: ffffa2ee38cfa8c8 RDI: ffffa2ee36bae000
[    4.310013] RBP: ffffa51ec0007b58 R08: 0000000000000001 R09: 0000000000000000
[    4.317001] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa51ec0007ad0
[    4.324005] R13: ffffa2ee36bae098 R14: 0000000000000002 R15: ffffa2ee37204818
[    4.331002] FS:  0000000000000000(0000) GS:ffffa2ee3fcc0000(0000) knlGS:0000000000000000
[    4.339002] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.345001] CR2: 0000000000000050 CR3: 000000401000f000 CR4: 00000000001406e0
[    4.351002] Call Trace:
[    4.354012]  ? pci_configure_device+0x19f/0x570
[    4.359002]  ? pci_conf1_read+0xb8/0xf0
[    4.363002]  ? raw_pci_read+0x23/0x40
[    4.366011]  ? pci_read+0x2c/0x30
[    4.370014]  ? pci_read_config_word+0x67/0x70
[    4.374012]  pci_device_add+0x28/0x230
[    4.378012]  ? pci_vpd_f0_read+0x50/0x80
[    4.382014]  pci_scan_single_device+0x96/0xc0
[    4.386012]  pci_scan_slot+0x79/0xf0
[    4.389001]  pci_scan_child_bus+0x31/0x180
[    4.394014]  acpi_pci_root_create+0x1c6/0x240
[    4.398013]  pci_acpi_scan_root+0x15f/0x1b0
[    4.402012]  acpi_pci_root_add+0x2e6/0x400
[    4.406012]  ? acpi_evaluate_integer+0x37/0x60
[    4.411002]  acpi_bus_attach+0xdf/0x200
[    4.415002]  acpi_bus_attach+0x6a/0x200
[    4.418014]  acpi_bus_attach+0x6a/0x200
[    4.422013]  acpi_bus_scan+0x38/0x70
[    4.426011]  acpi_scan_init+0x10c/0x271
[    4.429001]  acpi_init+0x2fa/0x348
[    4.433004]  ? acpi_sleep_proc_init+0x2d/0x2d
[    4.437001]  do_one_initcall+0x43/0x169
[    4.441001]  kernel_init_freeable+0x1d0/0x258
[    4.445003]  ? rest_init+0xe0/0xe0
[    4.449001]  kernel_init+0xe/0x150

====================== cut here =============================

It looks like the pci_find_pcie_root_port() was trying to
find the Root Port for the PCI device which is the Root
Port already, it will return NULL and trigger the problem,
so check the highest_pcie_bridge to fix thie problem.

Fixes: a99b646a ("PCI: Disable PCIe Relaxed Ordering if unsupported")
Fixes: c56d4450 ("PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum")
Reported-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NEric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: NDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0e405232

02 8月, 2017 1 次提交

PCI: Add pci_reset_function_locked() · a477b9cd

由 Marc Zyngier 提交于 8月 01, 2017

The implementation of PCI workarounds may require that the device is reset
from its probe function.  This implies that the PCI device lock is already
held, and makes calling pci_reset_function() impossible (since it will
itself try to take that lock).

Add pci_reset_function_locked(), which is the equivalent of
pci_reset_function(), except that it requires the PCI device lock to be
already held by the caller.
Tested-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
[bhelgaas: folded in fix for conflict with 52354b9d ("PCI: Remove
__pci_dev_reset() and pci_dev_reset()")]
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org	# 4.11: 52354b9d: PCI: Remove __pci_dev_reset() and pci_dev_reset()
Cc: stable@vger.kernel.org	# 4.11

a477b9cd

13 7月, 2017 1 次提交

PCI / PM: Restore PME Enable after config space restoration · 0ce3fcaf

由 Rafael J. Wysocki 提交于 7月 12, 2017

Commit dc15e71e (PCI / PM: Restore PME Enable if skipping wakeup
setup) introduced a mechanism by which the PME Enable bit can be
restored by pci_enable_wake() if dev->wakeup_prepared is set in
case it has been overwritten by PCI config space restoration.

However, that commit overlooked the fact that on some systems (Dell
XPS13 9360 in particular) the AML handling wakeup events checks PME
Status and PME Enable and it won't trigger a Notify() for devices
where those bits are not set while it is running.

That happens during resume from suspend-to-idle when pci_restore_state()
invoked by pci_pm_default_resume_early() clears PME Enable before the
wakeup events are processed by AML, effectively causing those wakeup
events to be ignored.

Fix this issue by restoring the PME Enable configuration right after
pci_restore_state() has been called instead of doing that in
pci_enable_wake().

Fixes: dc15e71e (PCI / PM: Restore PME Enable if skipping wakeup setup)
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>

0ce3fcaf

03 7月, 2017 2 次提交

PCI: Remove __pci_dev_reset() and pci_dev_reset() · 52354b9d

由 Christoph Hellwig 提交于 6月 01, 2017

Implement the reset probing / reset chain directly in
__pci_probe_reset_function() and __pci_reset_function_locked()
respectively.

Link: http://lkml.kernel.org/r/20170601111039.8913-4-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

52354b9d

PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done() · 775755ed

由 Christoph Hellwig 提交于 6月 01, 2017

The pci_error_handlers->reset_notify() method had a flag to indicate
whether to prepare for or clean up after a reset.  The prepare and done
cases have no shared functionality whatsoever, so split them into separate
methods.

[bhelgaas: changelog, update locking comments]
Link: http://lkml.kernel.org/r/20170601111039.8913-3-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

775755ed

01 7月, 2017 1 次提交

PCI/PM: Avoid using device_may_wakeup() for runtime PM · 666ff6f8

由 Rafael J. Wysocki 提交于 6月 23, 2017

pci_target_state() calls device_may_wakeup() which checks whether or not
the device may wake up the system from sleep states, but pci_target_state()
is used for runtime PM too.

Since runtime PM is expected to always enable remote wakeup if possible,
modify pci_target_state() to take additional argument indicating whether or
not it should look for a state from which the device can signal wakeup and
pass either the return value of device_can_wakeup(), or "false" (if the
device itself is not wakeup-capable) to it from the code related to runtime
PM.

While at it, fix the comment in pci_dev_run_wake() which is not about sleep
states.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>

666ff6f8

28 6月, 2017 2 次提交

PM / core: Drop run_wake flag from struct dev_pm_info · de3ef1eb

由 Rafael J. Wysocki 提交于 6月 24, 2017

The run_wake flag in struct dev_pm_info is used to indicate whether
or not the device is capable of generating remote wakeup signals at
run time (or in the system working state), but the distinction
between runtime remote wakeup and system wakeup signaling has always
been rather artificial. The only practical reason for it to exist
at the core level was that ACPI and PCI treated those two cases
differently, but that's not the case any more after recent changes.

For this reason, get rid of the run_wake flag and, when applicable,
use device_set_wakeup_capable() and device_can_wakeup() instead of
device_set_run_wake() and device_run_wake(), respectively.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>

de3ef1eb

PCI / PM: Simplify device wakeup settings code · 0847684c

由 Rafael J. Wysocki 提交于 6月 24, 2017

After previous changes it is not necessary to distinguish between
device wakeup for run time and device wakeup from system sleep states
any more, so rework the PCI device wakeup settings code accordingly.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>

0847684c

17 6月, 2017 1 次提交

PCI: Test INTx masking during enumeration, not at run-time · 99b3c58f

由 Piotr Gregor 提交于 5月 26, 2017

The test for INTx masking via PCI_COMMAND_INTX_DISABLE performed in
pci_intx_mask_supported() should be done before the device can be used.
This is to avoid writing PCI_COMMAND while the driver owns the device, in
case that has any effect on MSI/MSI-X interrupts.

Move the content of pci_intx_mask_supported() to pci_intx_mask_broken() and
call it from pci_setup_device().

The test result can be queried at any time later using the same
pci_intx_mask_supported() interface as before (though with changed
implementation), so callers (uio, vfio) should be unaffected.
Signed-off-by: NPiotr Gregor <piotrgregor@rsyncme.org>
[bhelgaas: changelog, remove quirk check, remove locking, move
dev->broken_intx_masking assignment to caller]
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NAlex Williamson <alex.williamson@redhat.com>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>

99b3c58f

15 6月, 2017 2 次提交

PCI: Protect pci_error_handlers->reset_notify() usage with device_lock() · b014e96d

由 Christoph Hellwig 提交于 6月 01, 2017

Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().

Protect use of pci_error_handlers->reset_notify() by holding the device
lock while calling it.

Note:

  - pci_dev_lock() calls device_lock() in addition to blocking user-space
    config accesses.

  - pci_err_handlers->reset_notify() is used inside
    pci_dev_save_and_disable() and pci_dev_restore().  We could hold the
    device lock directly in pci_reset_notify(), but we expand the region
    since we have several calls following each other.

Without this, ->reset_notify() may race with ->remove() calls, which can be
easily triggered in NVMe.

[bhelgaas: changelog, add pci_reset_notify() comment]
[bhelgaas: fold in fix from Dan Carpenter <dan.carpenter@oracle.com>:
http://lkml.kernel.org/r/20170701135323.x5vaj4e2wcs2mcro@mwanda]
Link: http://lkml.kernel.org/r/20170601111039.8913-2-hch@lst.deReported-by: NRakesh Pandit <rakesh@tuxera.com>
Tested-by: NRakesh Pandit <rakesh@tuxera.com>
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

b014e96d

PCI / PM: Restore PME Enable if skipping wakeup setup · dc15e71e

由 Rafael J. Wysocki 提交于 6月 12, 2017

The wakeup_prepared PCI device flag is used for preventing subsequent
changes of PCI device wakeup settings in the same way (e.g. enabling
device wakeup twice in a row).

However, in some cases PME Enable may be updated by things like PCI
configuration space restoration in the meantime and it may need to be
set again even though the rest of the settings need not change, so
modify __pci_enable_wake() to do that when it is about to return
early.

Also, it is reasonable to expect that __pci_enable_wake() will always
clear PME Status when invoked to disable device wakeup, so make it do
so even if it is going to return early then.
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: NBjorn Helgaas <bhelgaas@google.com>

dc15e71e

31 5月, 2017 1 次提交

PCI: Restore PRI and PASID state after Function-Level Reset · 4ebeb1ec

由 CQ Tang 提交于 5月 30, 2017

After a Function-Level Reset, PCI states need to be restored.  Save PASID
features and PRI reqs cached.

[bhelgaas: search for capability only if PRI/PASID were enabled]
Signed-off-by: NCQ Tang <cq.tang@intel.com>
Signed-off-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jean-Phillipe Brucker <jean-philippe.brucker@arm.com>
Cc: David Woodhouse <dwmw2@infradead.org>

4ebeb1ec

24 5月, 2017 1 次提交

PCI/PM: Add needs_resume flag to avoid suspend complete optimization · 4d071c32

由 Imre Deak 提交于 5月 23, 2017

Some drivers - like i915 - may not support the system suspend direct
complete optimization due to differences in their runtime and system
suspend sequence.  Add a flag that when set resumes the device before
calling the driver's system suspend handlers which effectively disables
the optimization.

Needed by a future patch fixing suspend/resume on i915.

Suggested by Rafael.
Signed-off-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org

4d071c32

18 5月, 2017 1 次提交

PCI: Do not disregard parent resources starting at 0x0 · 31342330

由 Ard Biesheuvel 提交于 4月 11, 2017

Commit f44116ae ("PCI: Remove pci_find_parent_resource() use for
allocation") updated the logic that iterates over all bus resources and
compares them to a given resource, in order to decide whether one is the
parent of the latter.

This change inadvertently causes pci_find_parent_resource() to disregard
resources starting at address 0x0, resulting in an error such as the one
below on ARM systems whose I/O window starts at 0x0.

  pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff window]
  pci_bus 0000:00: root bus resource [io  0x0000-0xffff window]
  pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff window]
  pci_bus 0000:00: root bus resource [bus 00-0f]
  pci 0000:00:01.0: PCI bridge to [bus 01]
  pci 0000:00:02.0: PCI bridge to [bus 02]
  pci 0000:00:03.0: PCI bridge to [bus 03]
  pci 0000:00:03.0: can't claim BAR 13 [io  0x0000-0x0fff]: no compatible bridge window
  pci 0000:03:01.0: can't claim BAR 0 [io  0x0000-0x001f]: no compatible bridge window

While this never happens on x86, it is perfectly legal in general for a PCI
MMIO or IO window to start at address 0x0, and it was supported in the code
before commit f44116ae.

Drop the test for res->start != 0; resource_contains() already checks
whether [start, end) completely covers the resource, and so it should be
redundant.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

31342330

25 4月, 2017 1 次提交

PCI: Implement devm_pci_remap_cfgspace() · 490cb6dd

由 Lorenzo Pieralisi 提交于 4月 19, 2017

The introduction of the pci_remap_cfgspace() interface allows PCI host
controller drivers to map PCI config space through a dedicated kernel
interface. Current PCI host controller drivers use the devm_ioremap_*()
devres interfaces to map PCI configuration space regions so in order to
update them to the new pci_remap_cfgspace() mapping interface a new set of
devres interfaces should be implemented so that PCI host controller drivers
can make use of them.

Introduce two new functions in the PCI kernel layer and Devres
documentation:

- devm_pci_remap_cfgspace()
- devm_pci_remap_cfg_resource()

so that PCI host controller drivers can make use of them to map PCI
configuration space regions.
Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>

490cb6dd

21 4月, 2017 1 次提交

PCI: Export pci_remap_iospace() and pci_unmap_iospace() · f90b0875

由 Brian Norris 提交于 3月 09, 2017

These are useful for PCIe host drivers, and those drivers can be modules.

[bhelgaas: don't remove __weak; it's removed elsewhere]
Signed-off-by: NBrian Norris <briannorris@chromium.org>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NShawn Lin <shawn.lin@rock-chips.com>

f90b0875

20 4月, 2017 6 次提交

PCI: Export pcie_flr() · a60a2b73

由 Christoph Hellwig 提交于 4月 14, 2017

Currently we opencode the FLR sequence in lots of place; export a core
helper instead.  We split out the probing for FLR support as all the
non-core callers already know their hardware.

Note that in the new pci_has_flr() function the quirk check has been moved
before the capability check as there is no point in reading the capability
in this case.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

a60a2b73

PCI: Remove __weak tag from pci_remap_iospace() · 7b309aef

由 Lorenzo Pieralisi 提交于 4月 19, 2017

pci_remap_iospace() is marked as a weak symbol even though no architecture
is currently overriding it; given that its implementation internals have
already code paths that are arch specific (ie PCI_IOBASE and
ioremap_page_range() attributes) there is no need to leave the weak symbol
in the kernel since the same functionality can be achieved by customizing
per-arch the corresponding functionality.

Remove the __weak symbol from pci_remap_iospace().
Signed-off-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>

7b309aef

PCI: Don't resize resources when realigning all devices in system · e3adec72

由 Yongji Xie 提交于 4月 10, 2017

The "pci=resource_alignment" argument aligns BARs of designated devices by
artificially increasing their size.  Increasing the size increases the
alignment and prevents other resources from being assigned in the same
alignment region, e.g., in the same page, but it can break drivers that use
the BAR size to locate things, e.g., ilo_map_device() does this:

  off = pci_resource_len(pdev, bar) - 0x2000;

The new pcibios_default_alignment() interface allows an arch to request
that *all* BARs in the system be aligned to a larger size.  In this case,
we don't need to artificially increase the resource size because we know
every BAR of every device will be realigned, so nothing will share the same
alignment region.

Use IORESOURCE_STARTALIGN to request realignment of PCI BARs when we know
we're realigning all BARs in the system.

[bhelgaas: comment, changelog]
Signed-off-by: NYongji Xie <elohimes@gmail.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

e3adec72

PCI: Don't reassign resources that are already aligned · 0dde1c08

由 Bjorn Helgaas 提交于 4月 17, 2017

The "pci=resource_alignment=" kernel argument designates devices for which
we want alignment greater than is required by the PCI specs.  Previously we
set IORESOURCE_UNSET for every MEM resource of those devices, even if the
resource was *already* sufficiently aligned.

If a resource is already sufficiently aligned, leave it alone and don't try
to reassign it.
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

0dde1c08

PCI: Factor pci_reassigndev_resource_alignment() · 81a5e70e

由 Bjorn Helgaas 提交于 4月 14, 2017

Pull the BAR size adjustment out into a new function,
pci_request_resource_alignment(), and add a comment about how and why we
increase the resource size and alignment.
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

81a5e70e

PCI: Add pcibios_default_alignment() for arch-specific alignment control · 0a701aa6

由 Yongji Xie 提交于 4月 10, 2017

When VFIO passes through a PCI device to a guest, it does not allow the
guest to mmap BARs that are smaller than PAGE_SIZE unless it can reserve
the rest of the page (see vfio_pci_probe_mmaps()). This is because a page
might contain several small BARs for unrelated devices and a guest should
not be able to access all of them.

VFIO emulates guest accesses to non-mappable BARs, which is functional but
slow. On systems with large page sizes, e.g., PowerNV with 64K pages, BARs
are more likely to share a page and performance is more likely to be a
problem.

Add a weak function to set default alignment for all PCI devices.  An arch
can override it to force the PCI core to place memory BARs on their own
pages.
Signed-off-by: NYongji Xie <elohimes@gmail.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

0a701aa6

19 4月, 2017 2 次提交

PCI: Freeze PME scan before suspending devices · ea00353f

由 Lukas Wunner 提交于 4月 18, 2017

Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790)
crashes during suspend tests.  Geert Uytterhoeven managed to reproduce the
issue on an M2-W Koelsch board (r8a7791):

  It occurs when the PME scan runs, once per second.  During PME scan, the
  PCI host bridge (rcar-pci) registers are accessed while its module clock
  has already been disabled, leading to the crash.

One reproducer is to configure s2ram to use "s2idle" instead of "deep"
suspend:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo s2idle > /sys/power/mem_sleep
  # echo mem > /sys/power/state

Another reproducer is to write either "platform" or "processors" to
/sys/power/pm_test.  It does not (or is less likely) to happen during full
system suspend ("core" or "none") because system suspend also disables
timers, and thus the workqueue handling PME scans no longer runs.  Geert
believes the issue may still happen in the small window between disabling
module clocks and disabling timers:

  # echo 0 > /sys/module/printk/parameters/console_suspend
  # echo platform > /sys/power/pm_test    # Or "processors"
  # echo mem > /sys/power/state

(Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.)

Rafael Wysocki agrees that PME scans should be suspended before the host
bridge registers become inaccessible.  To that end, queue the task on a
workqueue that gets frozen before devices suspend.

Rafael notes however that as a result, some wakeup events may be missed if
they are delivered via PME from a device without working IRQ (which hence
must be polled) and occur after the workqueue has been frozen.  If that
turns out to be an issue in practice, it may be possible to solve it by
calling pci_pme_list_scan() once directly from one of the host bridge's
pm_ops callbacks.

Stacktrace for posterity:

  PM: Syncing filesystems ... [   38.566237] done.
  PM: Preparing system for sleep (mem)
  Freezing user space processes ... [   38.579813] (elapsed 0.001 seconds) done.
  Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
  PM: Suspending system (mem)
  PM: suspend of devices complete after 152.456 msecs
  PM: late suspend of devices complete after 2.809 msecs
  PM: noirq suspend of devices complete after 29.863 msecs
  suspend debug: Waiting for 5 second(s).
  Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
  pgd = c0003000
  [00000000] *pgd=80000040004003, *pmd=00000000
  Internal error: : 1211 [#1] SMP ARM
  Modules linked in:
  CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted
  4.9.0-rc1-koelsch-00011-g68db9bc8 #3383
  Hardware name: Generic R8A7791 (Flattened Device Tree)
  Workqueue: events pci_pme_list_scan
  task: eb56e140 task.stack: eb58e000
  PC is at pci_generic_config_read+0x64/0x6c
  LR is at rcar_pci_cfg_base+0x64/0x84
  pc : [<c041d7b4>]    lr : [<c04309a0>]    psr: 600d0093
  sp : eb58fe98  ip : c041d750  fp : 00000008
  r10: c0e2283c  r9 : 00000000  r8 : 600d0013
  r7 : 00000008  r6 : eb58fed6  r5 : 00000002  r4 : eb58feb4
  r3 : 00000000  r2 : 00000044  r1 : 00000008  r0 : 00000000
  Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
  Control: 30c5387d  Table: 6a9f6c80  DAC: 55555555
  Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210)
  Stack: (0xeb58fe98 to 0xeb590000)
  fe80:                                                       00000002 00000044
  fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000
  fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830
  fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc
  ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100
  ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000
  ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380
  ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000
  ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0
  ffa0: c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000
  ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd
  [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>]
  (pci_bus_read_config_word+0x58/0x80)
  [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>]
  (pci_check_pme_status+0x34/0x78)
  [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54)
  [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4)
  [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>]
  (process_one_work+0x1bc/0x308)
  [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0)
  [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc)
  [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c)
  Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000)
  ---[ end trace 667d43ba3aa9e589 ]---

Fixes: df17e62e ("PCI: Add support for polling PME state on suspended legacy PCI devices")
Reported-and-tested-by: NLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reported-and-tested-by: NGeert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org	# 2.6.37+
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Cc: Simon Horman <horms+renesas@verge.net.au>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>

ea00353f

PCI: Ignore requested alignment for IOV BARs · ea629d87

由 Yongji Xie 提交于 2月 15, 2017

We would call pci_reassigndev_resource_alignment() before
pci_init_capabilities().  So the requested alignment would never work for
IOV BARs.

Furthermore, it's meaningless to request additional alignment for IOV BARs,
the IOV BAR alignment is only determined by the VF BAR size.
Signed-off-by: NYongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NGavin Shan <gwshan@linux.vnet.ibm.com>

ea629d87

04 4月, 2017 1 次提交

PCI: Avoid FLR for Intel 82579 NICs · f65fd1aa

由 Sasha Neftin 提交于 4月 03, 2017

Per Intel Specification Update 335553-002 (see link below), some 82579
network adapters advertise a Function Level Reset (FLR) capability, but
they can hang when an FLR is triggered.

To reproduce the problem, attach the device to a VM, then detach and try to
attach again.

Add a quirk to prevent the use of FLR on these devices.

[bhelgaas: changelog, comments]
Link: http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/82579lm-82579v-gigabit-network-connection-spec-update.pdfSigned-off-by: NSasha Neftin <sasha.neftin@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

f65fd1aa

30 3月, 2017 1 次提交

PCI: Short-circuit pci_device_is_present() for disconnected devices · fe2bd75b

由 Keith Busch 提交于 3月 29, 2017

If the PCI device is disconnected, return false immediately from
pci_device_is_present().  pci_device_is_present() uses the bus accessors,
so the early return in the device accessors doesn't help here.
Tested-by: NKrishna Dhulipala <krishnad@fb.com>
Signed-off-by: NKeith Busch <keith.busch@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NWei Zhang <wzhang@fb.com>

fe2bd75b

15 3月, 2017 1 次提交

PCI/PM: Don't sleep at all when d3_delay or d3cold_delay is zero · 50b2b540

由 Adrian Hunter 提交于 3月 14, 2017

msleep() still sleeps 1 jiffy even when told to sleep for zero
milliseconds.  That can end up being 1-2 milliseconds or more.  In the
cases of d3_delay and d3cold_delay, that unnecessarily increases suspend
and/or resume latencies.

Do not sleep at all for the respective cases if d3_delay is zero or
d3cold_delay is zero.
Signed-off-by: NAdrian Hunter <adrian.hunter@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

50b2b540

03 2月, 2017 1 次提交

Revert "PCI: pciehp: Add runtime PM support for PCIe hotplug ports" · d98e0929

由 Bjorn Helgaas 提交于 2月 03, 2017

This reverts commit 68db9bc8.

Yinghai reported that the following manual hotplug sequence:

  # echo 0 > /sys/bus/pci/slots/8/power
  # echo 1 > /sys/bus/pci/slots/8/power

worked in v4.9, but fails in v4.10-rc1, and that reverting 68db9bc8
("PCI: pciehp: Add runtime PM support for PCIe hotplug ports") makes it
work again.

Fixes: 68db9bc8 ("PCI: pciehp: Add runtime PM support for PCIe hotplug ports")
Link: https://lkml.kernel.org/r/CAE9FiQVCMCa7iVyuwp9z6VrY0cE7V_xghuXip28Ft52=8QmTWw@mail.gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=193951Reported-by: NYinghai Lu <yinghai@kernel.org>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

d98e0929

30 11月, 2016 1 次提交

PCI: Remove pci_resource_bar() and pci_iov_resource_bar() · 286c2378

由 Bjorn Helgaas 提交于 11月 28, 2016

pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NGavin Shan <gwshan@linux.vnet.ibm.com>

286c2378

29 11月, 2016 1 次提交

PCI: Ignore BAR updates on virtual functions · 63880b23

由 Bjorn Helgaas 提交于 11月 28, 2016

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NGavin Shan <gwshan@linux.vnet.ibm.com>

63880b23

18 11月, 2016 7 次提交

PCI: pciehp: Add runtime PM support for PCIe hotplug ports · 68db9bc8

由 Lukas Wunner 提交于 10月 28, 2016

Linux 4.8 added support for runtime suspending PCIe ports to D3hot with
commit 006d44e4 ("PCI: Add runtime PM support for PCIe ports"), but
excluded hotplug ports. Those are now afforded runtime PM by the present
commit.

Hotplug ports require a few extra considerations:

- The configuration space of the port remains accessible in D3hot, so all
the functions to read or modify the Slot Status and Slot Control
registers need not be modified. Even turning on slot power doesn't seem
to require the port to be in D0, at least the PCIe spec doesn't say so
and I confirmed that by testing with a Thunderbolt controller.

- However D0 is required to access devices on the secondary bus. This
happens in pciehp_check_link_status() and pciehp_configure_device() (both
called from board_added()) and in pciehp_unconfigure_device() (called
from remove_board()), so acquire a runtime PM ref for their invocation.

- The hotplug port stays active as long as it has active children. If all
hotplugged devices below the port runtime suspend, the port is allowed to
runtime suspend as well. Plug and unplug detection continues to work in
D3hot.

- Hotplug interrupts are delivered in-band, so while the hotplug port
itself is allowed to go to D3hot, its parent ports must stay in D0 for
interrupts to come through. Add a corresponding restriction to
pci_dev_check_d3cold().

- Runtime PM may only be allowed if the hotplug port is handled natively by
the OS. On ACPI systems, the port may alternatively be handled by the
firmware and things break if the OS puts the port into D3 behind the
firmware's back: E.g. Thunderbolt hotplug ports on non-Macs are handled
by Intel's firmware in System Management Mode and the firmware is known
to access devices on the port's secondary bus without checking first if
the port is in D0: https://bugzilla.kernel.org/show_bug.cgi?id=53811Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>

68db9bc8

PCI: Unfold conditions to block runtime PM on PCIe ports · 718a0609

由 Lukas Wunner 提交于 10月 28, 2016

The conditions to block D3 on parent ports are currently condensed into a
single expression in pci_dev_check_d3cold().  Upcoming commits will add
further conditions for hotplug ports, making this expression fairly large
and impenetrable.  Unfold the conditions to maintain readability when they
are amended.

No functional change intended.
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
CC: Mika Westerberg <mika.westerberg@linux.intel.com>
CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

718a0609

PCI: Consolidate conditions to allow runtime PM on PCIe ports · 97a90aee

由 Lukas Wunner 提交于 10月 28, 2016

The conditions to allow runtime PM on PCIe ports are currently spread
across two different files: The condition relating to hotplug ports is
located in portdrv_pci.c whereas all other conditions are located in pci.c.

Consolidate all conditions in a single place in pci.c, thus making it
easier to follow the logic and amend conditions down the road.

Note that the condition relating to hotplug ports is inserted *before* the
condition relating to the "pcie_port_pm=force" command line option, so
runtime PM is not afforded to hotplug ports even if this option is given.
That's exactly how the code behaved up until now. If this is not desired,
the ordering of the conditions can simply be reversed.

No functional change intended.
Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

97a90aee

PCI: Activate runtime PM on a PCIe port only if it can suspend · c6a63307

由 Lukas Wunner 提交于 10月 28, 2016

Currently pcie_portdrv_probe() activates runtime PM on a PCIe port even
if it will never actually suspend because the BIOS is too old or the
"pcie_port_pm=off" option was specified on the kernel command line.

A few CPU cycles can be saved by not activating runtime PM at all in these
cases, because rpm_idle() and rpm_suspend() will bail out right at the
beginning when calling rpm_check_suspend_allowed(), instead of carrying out
various locking and assignments, invoking rpm_callback(), getting back
-EBUSY and rolling everything back.

The conditions checked in pci_bridge_d3_possible() are all static, they
never change during uptime of the system, hence it's safe to call this to
determine if runtime PM should be activated.

No functional change intended.
Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

c6a63307

PCI: Speed up algorithm in pci_bridge_d3_update() · e8559b71

由 Lukas Wunner 提交于 10月 28, 2016

After a device has been added, removed or had its D3cold attributes
changed, we recheck whether its parent bridge may runtime suspend to D3hot
with pci_bridge_d3_update().

The most naive algorithm would be to iterate over the bridge's children and
check if any of them are blocking D3.

The function already tries to be a bit smarter than that by first checking
the device that was changed.  If this device already blocks D3 on the
bridge, then walking over all the other children can be skipped.  A
drawback of this approach is that if the device is *not* blocking D3, it
will be checked a second time by pci_walk_bus().  But that's cheap and is
outweighed by the performance gain of potentially skipping pci_walk_bus()
altogether.

The algorithm can be optimized further by taking into account if D3 is
currently allowed for the bridge, as shown in the following truth table:

(a)  remove &&  bridge_d3:  D3 is currently allowed for the bridge and
                            removing one of its children won't change
                            that.  No action necessary.
(b)  remove && !bridge_d3:  D3 may now be allowed for the bridge if the
                            removed child was the only one blocking it.
                            Check all its siblings to verify that.
(c) !remove &&  bridge_d3:  D3 may now be disallowed but this can only
                            be caused by the added/changed child, not
                            any of its siblings.  Check only that single
                            device.
(d) !remove && !bridge_d3:  D3 may now be allowed for the bridge if the
                            changed child was the only one blocking it.
                            Check all its siblings to verify that.
                            By checking beforehand if the changed child
                            is blocking D3, we may be able to skip
                            checking its siblings.

Currently we do not special-case option (a) and in case of option (c) we
gratuitously call pci_walk_bus().  Speed up the algorithm by adding these
optimizations.  Reword the comments a bit in an attempt to improve clarity.

No functional change intended.
Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

e8559b71

PCI: Autosense device removal in pci_bridge_d3_update() · 1ed276a7

由 Lukas Wunner 提交于 10月 28, 2016

The algorithm to update the flag indicating whether a bridge may go to D3
makes a few optimizations based on whether the update was caused by the
removal of a device on the one hand, versus the addition of a device or the
change of its D3cold flags on the other hand.

The information whether the update pertains to a removal is currently
passed in by the caller, but the function may as well determine that itself
by examining the device in question, thereby allowing for a considerable
simplification and reduction of the code.

Out of several options to determine removal, I've chosen the function
device_is_registered() because it's cheap:  It merely returns the
dev->kobj.state_in_sysfs flag.  That flag is set through device_add() when
the root bus is scanned and cleared through device_remove().  The call to
pci_bridge_d3_update() happens after each of these calls, respectively, so
the ordering is correct.

No functional change intended.
Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

1ed276a7

PCI: Don't acquire ref on parent in pci_bridge_d3_update() · 738a7edb

由 Lukas Wunner 提交于 10月 28, 2016

This function is always called with an existing pci_dev struct, which
holds a reference on the pci_bus struct it resides on, which in turn
holds a reference on pci_bus->bridge, which is the pci_dev's parent.

Hence there's no need to acquire an additional ref on the parent.

More specifically, the pci_dev exists until pci_destroy_dev() drops the
final reference on it, so all calls to pci_bridge_d3_update() must be
finished before that.  It is arguably the caller's responsibility to ensure
that it doesn't call pci_bridge_d3_update() with a pci_dev that might
suddenly disappear, but in any case the existing callers are all safe:

- The call in pci_destroy_dev() happens before the call to put_device().
- The call in pci_bus_add_device() is synchronized with pci_destroy_dev()
  using pci_lock_rescan_remove().
- The calls to pci_d3cold_disable() from the xhci and nouveau drivers
  are safe because a ref on the pci_dev is held as long as it's bound to
  a driver.
- The calls to pci_d3cold_enable() / pci_d3cold_disable() when modifying
  the sysfs "d3cold_allowed" entry are also safe because kernfs_drain()
  waits for existing sysfs users to finish before removing the entry,
  and pci_destroy_dev() is called way after that.

No functional change intended.
Tested-by: NMika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

738a7edb

12 11月, 2016 1 次提交

PCI: Check for PME in targeted sleep state · 6496ebd7

由 Alan Stern 提交于 10月 21, 2016

One some systems, the firmware does not allow certain PCI devices to be put
in deep D-states. This can cause problems for wakeup signalling, if the
device does not support PME# in the deepest allowed suspend state. For
example, Pierre reports that on his system, ACPI does not permit his xHCI
host controller to go into D3 during runtime suspend -- but D3 is the only
state in which the controller can generate PME# signals. As a result, the
controller goes into runtime suspend but never wakes up, so it doesn't work
properly. USB devices plugged into the controller are never detected.

If the device relies on PME# for wakeup signals but is not capable of
generating PME# in the target state, the PCI core should accurately report
that it cannot do wakeup from runtime suspend. This patch modifies the
pci_dev_run_wake() routine to add this check.
Reported-by: NPierre de Villemereuil <flyos@mailoo.org>
Tested-by: NPierre de Villemereuil <flyos@mailoo.org>
Signed-off-by: NAlan Stern <stern@rowland.harvard.edu>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org
CC: Lukas Wunner <lukas@wunner.de>

6496ebd7

29 9月, 2016 2 次提交

PCI: Ignore requested alignment for VF BARs · 62d9a78f

由 Yongji Xie 提交于 9月 13, 2016

Resource allocation for VFs is done via the VF BARx registers in the PF's
SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
(see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).

Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
structs describe the space allocated for the VF (this is a piece of the
space described by the VF BARx register in the PF's SR-IOV capability).

It's meaningless to request additional alignment for a VF: the VF BAR
alignment is completely determined by the alignment of the VF BARx in the
PF and the size of the VF BAR.

Ignore the user's alignment requests for VF devices.
Signed-off-by: NYongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

62d9a78f

PCI: Ignore requested alignment for PROBE_ONLY and fixed resources · f0b99f70

由 Yongji Xie 提交于 9月 13, 2016

Users may request additional alignment of PCI resources, e.g., to align
BARs on page boundaries so they can be shared with guests via VFIO.  This
of course may require reallocation if firmware has already assigned the
BARs with smaller alignments.

If the platform has requested PCI_PROBE_ONLY, we should never change any
PCI BARs, so we can't provide any additional alignment.  Also, if a BAR is
marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
firmware depends on the current BAR value, we can't change the alignment.

In these cases, log a message and ignore the user's alignment requests.

[bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
Signed-off-by: NYongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>

f0b99f70

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功