1. 09 May, 2019 2 commits
  2. 18 Apr, 2019 1 commit
  3. 14 Apr, 2019 1 commit
  4. 13 Apr, 2019 1 commit
  5. 06 Mar, 2019 1 commit
    • A
      PCI: Fix "try" semantics of bus and slot reset · ddefc033
      Alex Williamson committed
      The commit referenced below introduced device locking around save and
      restore of state for each device during a PCI bus "try" reset, making it
      decidedly non-"try" and prone to deadlock in the event that a device is
      already locked.  Restore __pci_reset_bus() and __pci_reset_slot() to their
      advertised locking semantics by pushing the save and restore functions into
      the branch where the entire tree is already locked.  Extend the helper
      function names with "_locked" and update the comment to reflect this
      calling requirement.
      
      Fixes: b014e96d ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()")
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Sinan Kaya <okaya@kernel.org>
      ddefc033
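The "try" semantics described in this commit can be modelled in user-space C. This is only a sketch under stated assumptions, not the kernel code: `struct fake_dev`, `bus_trylock()`, `bus_save_locked()` and `try_reset_bus()` are hypothetical names, and a plain flag stands in for the per-device lock. The point it illustrates is that save and restore run only in the branch where the whole tree is already locked, so the function never blocks on a lock held elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; 'locked' models the device lock. */
struct fake_dev {
    bool locked;
    int state;        /* live device state */
    int saved_state;  /* copy taken before the reset */
};

/* Try to lock every device; back out and fail if any is already held. */
bool bus_trylock(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].locked) {
            while (i-- > 0)          /* release only the locks we took */
                devs[i].locked = false;
            return false;
        }
        devs[i].locked = true;
    }
    return true;
}

void bus_unlock(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].locked = false;
}

/* Caller must hold every device lock, hence the "_locked" suffix. */
void bus_save_locked(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].saved_state = devs[i].state;
}

void bus_restore_locked(struct fake_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        devs[i].state = devs[i].saved_state;
}

/* "Try" reset: never waits; returns -EAGAIN if the tree cannot be locked. */
int try_reset_bus(struct fake_dev *devs, size_t n)
{
    assert(devs != NULL);

    if (!bus_trylock(devs, n))
        return -EAGAIN;

    bus_save_locked(devs, n);
    for (size_t i = 0; i < n; i++)
        devs[i].state = 0;           /* the reset wipes device state */
    bus_restore_locked(devs, n);

    bus_unlock(devs, n);
    return 0;
}
```

If any device is already locked, `try_reset_bus()` returns `-EAGAIN` without touching device state and without leaving any lock it acquired still held.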
  6. 12 Feb, 2019 1 commit
    • B
      PCI/ASPM: Save LTR Capability for suspend/resume · dbbfadf2
      Bjorn Helgaas committed
      Latency Tolerance Reporting (LTR) allows Endpoints and Switch Upstream
      Ports to report their latency requirements to upstream components.  If ASPM
      L1 PM substates are enabled, the LTR information helps determine when a
      Link enters L1.2 [1].
      
      Software must set the maximum latency values in the LTR Capability based on
      characteristics of the platform, then set LTR Mechanism Enable in the
      Device Control 2 register in the PCIe Capability.  The device can then use
      LTR to report its latency tolerance.
      
      If the device reports a maximum latency value of zero, that means the
      device requires the highest possible performance and the ASPM L1.2 substate
      is effectively disabled.
      
      We put devices in D3 for suspend, and we assume their internal state is
      lost.  On resume, previously we did not restore the LTR Capability, but we
      did restore the LTR Mechanism Enable bit, so devices would request the
      highest possible performance and ASPM L1.2 wouldn't be used.
      
      [1] PCIe r4.0, sec 5.5.1
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      dbbfadf2
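The save/restore idea in this commit can be sketched as follows. This is a user-space model under assumptions, not the kernel implementation: `struct fake_dev`, `enter_d3()` and the helper names are hypothetical, and a byte array stands in for the LTR Extended Capability. The two register offsets are the ones Linux defines in `pci_regs.h` for the Max Snoop and Max No-Snoop Latency registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets within the LTR Extended Capability (as in Linux pci_regs.h). */
#define PCI_LTR_MAX_SNOOP_LAT   0x4
#define PCI_LTR_MAX_NOSNOOP_LAT 0x6

/* Hypothetical device model: raw capability bytes plus a save buffer. */
struct fake_dev {
    uint8_t  ltr_cap[8];
    uint16_t saved_ltr[2];
};

uint16_t cap_read16(const struct fake_dev *d, size_t off)
{
    uint16_t v;

    assert(off + sizeof(v) <= sizeof(d->ltr_cap));
    memcpy(&v, &d->ltr_cap[off], sizeof(v));
    return v;
}

void cap_write16(struct fake_dev *d, size_t off, uint16_t v)
{
    assert(off + sizeof(v) <= sizeof(d->ltr_cap));
    memcpy(&d->ltr_cap[off], &v, sizeof(v));
}

/* Called before suspend: remember both maximum latency values. */
void save_ltr_state(struct fake_dev *d)
{
    d->saved_ltr[0] = cap_read16(d, PCI_LTR_MAX_SNOOP_LAT);
    d->saved_ltr[1] = cap_read16(d, PCI_LTR_MAX_NOSNOOP_LAT);
}

/* Called on resume: write the remembered values back. */
void restore_ltr_state(struct fake_dev *d)
{
    cap_write16(d, PCI_LTR_MAX_SNOOP_LAT, d->saved_ltr[0]);
    cap_write16(d, PCI_LTR_MAX_NOSNOOP_LAT, d->saved_ltr[1]);
}

/* Models D3: internal state is assumed lost, so the registers read zero. */
void enter_d3(struct fake_dev *d)
{
    memset(d->ltr_cap, 0, sizeof(d->ltr_cap));
}
```

Without the restore step the maximum latency registers read zero after resume, which is the condition the commit message says effectively disables ASPM L1.2.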
  7. 11 Feb, 2019 1 commit
    • M
      PCI: Blacklist power management of Gigabyte X299 DESIGNARE EX PCIe ports · 85b0cae8
      Mika Westerberg committed
      Gigabyte X299 DESIGNARE EX motherboard has one PCIe root port that is
      connected to an Alpine Ridge Thunderbolt controller.  This port has slot
      implemented bit set in the config space but other than that it is not
      hotplug capable in the sense we are expecting in Linux (it has
      dev->is_hotplug_bridge set to 0):
      
        00:1c.4 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #5
          Bus: primary=00, secondary=05, subordinate=46, sec-latency=0
          Memory behind bridge: 78000000-8fffffff [size=384M]
          Prefetchable memory behind bridge: 00003800f8000000-00003800ffffffff [size=128M]
          ...
          Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
          ...
            SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
      	      Slot #8, PowerLimit 25.000W; Interlock- NoCompl+
            SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
      	      Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
            SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
      	      Changed: MRL- PresDet+ LinkState+
      
      This system is using ACPI based hotplug to notify the OS that it needs to
      rescan the PCI bus (ACPI hotplug).
      
      If there is nothing connected in any of the Thunderbolt ports the root port
      will not have any runtime PM active children and is thus automatically
      runtime suspended pretty soon after boot by PCI PM core.  Now, when a
      device is connected the BIOS SMI handler responsible for enumerating newly
      added devices is not able to find anything because the port is in D3.
      
      Prevent this from happening by blacklisting PCI power management of this
      particular Gigabyte system.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=202031
      Reported-by: Kedar A Dongre <kedar.a.dongre@intel.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      85b0cae8
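A board blacklist of the kind this commit describes can be sketched as a small lookup table. This is an illustrative user-space model, not the kernel's DMI code: `struct board_id`, `bridge_d3_blacklist` and `board_blacklisted()` are hypothetical names, the table strings are taken from the commit title rather than from real firmware tables, and substring matching is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identification record for one motherboard. */
struct board_id {
    const char *vendor;
    const char *board;
};

/* Boards on which PCIe port power management stays disabled. */
const struct board_id bridge_d3_blacklist[] = {
    { "Gigabyte", "X299 DESIGNARE EX" },   /* strings from the commit title */
};

/* True if both identification strings contain a blacklisted entry. */
bool board_blacklisted(const char *vendor, const char *board)
{
    size_t n = sizeof(bridge_d3_blacklist) / sizeof(bridge_d3_blacklist[0]);

    assert(vendor != NULL && board != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct board_id *e = &bridge_d3_blacklist[i];

        if (strstr(vendor, e->vendor) && strstr(board, e->board))
            return true;
    }
    return false;
}
```

A caller would consult `board_blacklisted()` once at boot and, on a match, keep the affected root port out of runtime suspend so that the firmware's hotplug handler can still see devices behind it.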
  8. 31 Jan, 2019 1 commit
  9. 22 Jan, 2019 1 commit
  10. 17 Jan, 2019 1 commit
    • L
      PCI: Fix __initdata issue with "pci=disable_acs_redir" parameter · d2fd6e81
      Logan Gunthorpe committed
      The disable_acs_redir parameter stores a pointer to the string passed to
      pci_setup().  However, the string passed to PCI setup is actually a
      temporary copy allocated in static __initdata memory.  After init, once the
      memory is freed, it is no longer valid to reference this pointer.
      
      This bug was noticed in v5.0-rc1 after a change in commit c5eb1190
      ("PCI / PM: Allow runtime PM without callback functions") caused
      pci_disable_acs_redir() to be called during shutdown, which manifested
      as an "unable to handle kernel paging request" oops at:
      
        RIP: 0010:pci_enable_acs+0x3f/0x1e0
        Call Trace:
           pci_restore_state.part.44+0x159/0x3c0
           pci_restore_standard_config+0x33/0x40
           pci_pm_runtime_resume+0x2b/0xd0
           ? pci_restore_standard_config+0x40/0x40
           __rpm_callback+0xbc/0x1b0
           rpm_callback+0x1f/0x70
           ? pci_restore_standard_config+0x40/0x40
            rpm_resume+0x4f9/0x710
           ? pci_conf1_read+0xb6/0xf0
           ? pci_conf1_write+0xb2/0xe0
           __pm_runtime_resume+0x47/0x70
           pci_device_shutdown+0x1e/0x60
           device_shutdown+0x14a/0x1f0
           kernel_restart+0xe/0x50
           __do_sys_reboot+0x1ee/0x210
           ? __fput+0x144/0x1d0
           do_writev+0x5e/0xf0
           ? do_writev+0x5e/0xf0
           do_syscall_64+0x48/0xf0
           entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      It was also likely possible to trigger this bug when hotplugging PCI
      devices.
      
      To fix this, instead of storing a pointer, we use kstrdup() to copy the
      disable_acs_redir_param to its own buffer which will never be freed.
      
      Fixes: aaca43fd ("PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support")
      Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      d2fd6e81
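The lifetime bug and its fix can be illustrated outside the kernel. The sketch below is a user-space analogy under assumptions: `init_buf` stands in for the temporary `__initdata` copy of the command line, `free_init_memory()` models that memory being released and reused, and `dup_string()` plays the role the commit assigns to `kstrdup()`. All of these names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the temporary command-line buffer. */
char init_buf[64];

/* Long-lived private copy of the parameter value. */
const char *saved_param;

/* Copy a string into its own heap buffer (stand-in for kstrdup()). */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Parse "disable_acs_redir=<value>" and keep a private copy of <value>. */
int setup_param(const char *arg)
{
    static const char key[] = "disable_acs_redir=";

    assert(arg != NULL);

    if (strncmp(arg, key, sizeof(key) - 1) != 0)
        return -1;

    /* Storing 'arg + offset' directly would dangle once init_buf is reused. */
    saved_param = dup_string(arg + sizeof(key) - 1);
    return saved_param ? 0 : -1;
}

/* Models the init memory being released and overwritten after boot. */
void free_init_memory(void)
{
    memset(init_buf, 0xcc, sizeof(init_buf));
}
```

Because `saved_param` points at its own allocation, it is still readable after `free_init_memory()` runs; the copy is deliberately never freed, matching the commit's description.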
  11. 15 Jan, 2019 1 commit
  12. 01 Dec, 2018 1 commit
  13. 03 Oct, 2018 3 commits
  14. 29 Sep, 2018 1 commit
    • F
      PCI: Add support for Immediate Readiness · d6112f8d
      Felipe Balbi committed
      PCIe r4.0, sec 7.5.1.1.4 defines a new bit in the Status Register:
      
        Immediate Readiness – This optional bit, when Set, indicates the Function
        is guaranteed to be ready to successfully complete valid configuration
        accesses at any time following any reset that the host is capable of
        issuing Configuration Requests to this Function.
      
        When this bit is Set, for accesses to this Function, software is exempt
        from all requirements to delay configuration accesses following any type
        of reset, including but not limited to the timing requirements defined in
        Section 6.6.
      
      This means that all delays after a Conventional or Function Reset can be
      skipped.
      
      This patch reads such bit and caches its value in a flag inside struct
      pci_dev to be checked later if we should delay or can skip delays after a
      reset.  While at that, also move the explicit msleep(100) call from
      pcie_flr() and pci_af_flr() to pci_dev_wait().
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      [bhelgaas: rename PCI_STATUS_IMMEDIATE to PCI_STATUS_IMM_READY]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      d6112f8d
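The read-once, check-later pattern this commit describes might look like the sketch below. It is a simplified model, not the kernel code: `struct fake_dev`, `probe_imm_ready()` and `reset_delay_ms()` are hypothetical names. `PCI_STATUS_IMM_READY` is the name given in the commit; its value here is the one Linux uses for that Status Register bit, and the 100 ms default mirrors the `msleep(100)` the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_STATUS_IMM_READY 0x01   /* Immediate Readiness bit */
#define RESET_DELAY_MS       100    /* default wait after a reset */

/* Hypothetical device model: raw Status Register plus the cached flag. */
struct fake_dev {
    uint16_t status;
    bool     imm_ready;
};

/* Read the bit once at enumeration time and cache it in the device. */
void probe_imm_ready(struct fake_dev *d)
{
    assert(d != NULL);
    d->imm_ready = (d->status & PCI_STATUS_IMM_READY) != 0;
}

/* How long to wait after a reset before touching config space. */
unsigned int reset_delay_ms(const struct fake_dev *d)
{
    assert(d != NULL);
    return d->imm_ready ? 0 : RESET_DELAY_MS;
}
```

Keeping the decision in one helper reflects the commit's move of the explicit sleep into a single wait function, so that every reset path honours the cached flag the same way.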
  15. 28 Sep, 2018 1 commit
    • D
      PCI: Reprogram bridge prefetch registers on resume · 08387454
      Daniel Drake committed
      On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
      suspend/resume.  The affected products include multiple generations of
      NVIDIA GPUs and Intel SoCs.  After resume, nouveau logs many errors such
      as:
      
        fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
              [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
        DRM: failed to idle channel 0 [DRM]
      
      Similarly, the NVIDIA proprietary driver also fails after resume (black
      screen, 100% CPU usage in Xorg process).  We shipped a sample to NVIDIA for
      diagnosis, and their response indicated that it's a problem with the parent
      PCI bridge (on the Intel SoC), not the GPU.
      
      Runtime suspend/resume works fine, only S3 suspend is affected.
      
      We found a workaround: on resume, rewrite the Intel PCI bridge
      'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32).  In the
      cases that I checked, this register has value 0 and we just have to rewrite
      that value.
      
      Linux already saves and restores PCI config space during suspend/resume,
      but this register was being skipped because upon resume, it already has
      value 0 (the correct, pre-suspend value).
      
      Intel appear to have previously acknowledged this behaviour and the
      requirement to rewrite this register:
      https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
      
      Based on that, rewrite the prefetch register values even when that appears
      unnecessary.
      
      We have confirmed this solution on all the affected models we have on hand
      (X542UQ, UX533FD, X530UN, V272UN).
      
      Additionally, this solves an issue where r8169 MSI-X interrupts were broken
      after S3 suspend/resume on ASUS X441UAR.  This issue was recently worked
      around in commit 7bb05b85 ("r8169: don't use MSI-X on RTL8106e").  It
      also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
      that we had not yet patched.  I suspect it will also fix the issue that was
      worked around in commit 7c53a722 ("r8169: don't use MSI-X on
      RTL8168g").
      
      Thomas Martitz reports that this change also solves an issue where the AMD
      Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
      suspend/resume.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
      Signed-off-by: Daniel Drake <drake@endlessm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-By: Peter Wu <peter@lekensteyn.nl>
      CC: stable@vger.kernel.org
      08387454
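The difference between "restore only if changed" and "always rewrite" can be shown with a tiny model. This is a sketch under assumptions, not the kernel's restore code: `struct fake_reg` and `restore_reg()` are hypothetical, and a write counter stands in for the side effect that the real hardware apparently needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register model: current value plus a count of writes seen. */
struct fake_reg {
    uint32_t     value;
    unsigned int writes;
};

/*
 * Restore a saved value. Without 'force', the write is skipped when the
 * register already reads back the saved value; with 'force', it is
 * rewritten regardless.
 */
void restore_reg(struct fake_reg *reg, uint32_t saved, bool force)
{
    assert(reg != NULL);

    if (!force && reg->value == saved)
        return;

    reg->value = saved;
    reg->writes++;
}
```

With `force` false and a register that already reads 0, no write reaches the device, which corresponds to the pre-fix behaviour the message describes for `PCI_PREF_BASE_UPPER32`; passing `force` as true issues the write even though the value looks correct.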
  16. 22 Sep, 2018 1 commit
  17. 21 Sep, 2018 1 commit
  18. 19 Sep, 2018 1 commit
    • L
      PCI: hotplug: Constify hotplug_slot_ops · 81c4b5bf
      Lukas Wunner committed
      Hotplug drivers cannot declare their hotplug_slot_ops const, making them
      attractive targets for attackers, because upon registration of a hotplug
      slot, __pci_hp_initialize() writes to the "owner" and "mod_name" members
      in that struct.
      
      Fix by moving these members to struct hotplug_slot and constify every
      driver's hotplug_slot_ops except for pciehp.
      
      pciehp constructs its hotplug_slot_ops at runtime based on the PCIe
      port's capabilities, hence cannot declare them const.  It can be
      converted to __write_rarely once that's mainlined:
      http://www.openwall.com/lists/kernel-hardening/2016/11/16/3
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>  # drivers/pci/hotplug/rpa*
      Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> # drivers/platform/x86
      Cc: Len Brown <lenb@kernel.org>
      Cc: Scott Murray <scott@spiteful.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Oliver OHalloran <oliveroh@au1.ibm.com>
      Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Corentin Chary <corentin.chary@gmail.com>
      Cc: Darren Hart <dvhart@infradead.org>
      81c4b5bf
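The restructuring this commit describes, moving per-registration fields out of the ops table so the table can be `const`, can be sketched as below. The struct layouts are heavily simplified assumptions: the real structures have more members, `owner` is shown here as a plain string for illustration, and `hp_initialize()`, `demo_enable()` and `demo_ops` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct hotplug_slot;

/* Operations table: holds only function pointers, so it can be const. */
struct hotplug_slot_ops {
    int (*enable_slot)(struct hotplug_slot *slot);
    int (*disable_slot)(struct hotplug_slot *slot);
};

/* Per-slot data: the fields written at registration time now live here. */
struct hotplug_slot {
    const struct hotplug_slot_ops *ops;
    const char *owner;      /* simplified; illustration only */
    const char *mod_name;
    int enabled;
};

int demo_enable(struct hotplug_slot *slot)
{
    slot->enabled = 1;
    return 0;
}

int demo_disable(struct hotplug_slot *slot)
{
    slot->enabled = 0;
    return 0;
}

/* A driver can now place its ops table in read-only memory. */
const struct hotplug_slot_ops demo_ops = {
    .enable_slot  = demo_enable,
    .disable_slot = demo_disable,
};

/* Registration writes to the slot, never to the ops table. */
void hp_initialize(struct hotplug_slot *slot,
                   const struct hotplug_slot_ops *ops,
                   const char *owner, const char *mod_name)
{
    assert(slot != NULL && ops != NULL);
    slot->ops = ops;
    slot->owner = owner;
    slot->mod_name = mod_name;
}
```

Since registration no longer writes through the `ops` pointer, the compiler can reject any accidental modification of a `const` table, and the function pointers are no longer sitting in writable memory.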
  19. 18 Sep, 2018 2 commits
  20. 12 Sep, 2018 2 commits
  21. 11 Aug, 2018 1 commit
    • A
      PCI: Check for PCIe Link downtraining · 2d1ce5ec
      Alexandru Gagniuc committed
      When both ends of a PCIe Link are capable of a higher bandwidth than is
      currently in use, the Link is said to be "downtrained".  A downtrained Link
      may indicate hardware or configuration problems in the system, but it's
      hard to identify such Links from userspace.
      
      Refactor pcie_print_link_status() so it continues to always print PCIe
      bandwidth information, as several NIC drivers desire.
      
      Add a new internal __pcie_print_link_status() to emit a message only when a
      device's bandwidth is constrained by the fabric and call it from the PCI
      core for all devices, which identifies all downtrained Links.  It also
      emits messages for a few cases that are technically not downtrained, such
      as a x4 device in an open-ended x1 slot.
      Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
      [bhelgaas: changelog, move __pcie_print_link_status() declaration to
      drivers/pci/, rename pcie_check_upstream_link() to
      pcie_report_downtraining()]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      2d1ce5ec
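The check behind the downtraining message comes down to comparing the bandwidth a device is capable of with the bandwidth it actually gets. The sketch below is a simplified model with assumptions: it looks at a single link rather than the whole path to the root, the function and enum names are hypothetical, and the per-lane figures are the usual encoding-adjusted rates (8b/10b for 2.5 and 5 GT/s, 128b/130b for 8 and 16 GT/s).

```c
#include <assert.h>
#include <stdbool.h>

enum link_speed { SPEED_2_5GT, SPEED_5GT, SPEED_8GT, SPEED_16GT };

/* Usable bandwidth of one lane in Mb/s after encoding overhead. */
unsigned int lane_mbps(enum link_speed s)
{
    switch (s) {
    case SPEED_2_5GT: return 2000;
    case SPEED_5GT:   return 4000;
    case SPEED_8GT:   return 7877;
    case SPEED_16GT:  return 15754;
    }
    return 0;
}

/* Total link bandwidth in Mb/s for a given speed and lane count. */
unsigned int link_mbps(enum link_speed s, unsigned int width)
{
    assert(width > 0);
    return lane_mbps(s) * width;
}

/* True when the current link offers less than the device could use. */
bool bandwidth_limited(enum link_speed cap_speed, unsigned int cap_width,
                       enum link_speed cur_speed, unsigned int cur_width)
{
    return link_mbps(cur_speed, cur_width) < link_mbps(cap_speed, cap_width);
}
```

This comparison also flags the case the message calls technically not downtrained: an x4 device in an x1 slot is limited by the slot even though the link trained to the best width available.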
  22. 10 Aug, 2018 5 commits
  23. 07 Aug, 2018 1 commit
  24. 01 Aug, 2018 2 commits
    • L
      PCI: Whitelist Thunderbolt ports for runtime D3 · 47a8e237
      Lukas Wunner committed
      Thunderbolt controllers can be runtime suspended to D3cold to save ~1.5W.
      This requires that runtime D3 is allowed on its PCIe ports, so whitelist
      them.
      
      The 2015 BIOS cutoff that we've instituted for runtime D3 on PCIe ports
      is unnecessary on Thunderbolt because we know that even the oldest
      controller, Light Ridge (2010), is able to suspend its ports to D3 just
      fine -- specifically including its hotplug ports.  And the power saving
      should be afforded to machines even if their BIOS predates 2015.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Andreas Noever <andreas.noever@gmail.com>
      47a8e237
    • L
      PCI: Whitelist native hotplug ports for runtime D3 · eb3b5bf1
      Lukas Wunner committed
      Previously we blacklisted PCIe hotplug ports for runtime D3 because:
      
      (a) Ports handled by the firmware must not be transitioned to D3 by the
          OS behind the firmware's back:
          https://bugzilla.kernel.org/show_bug.cgi?id=53811
      
      (b) Ports handled natively by the OS lacked runtime D3 support in the
          pciehp driver.
      
      We've just rectified the latter, so allow users to manually enable and
      test it by passing pcie_port_pm=force on the command line.  Vendors are
      thus put in a position to validate hotplug ports for runtime D3 and
      perhaps we can someday enable it by default, but with a BIOS cutoff date.
      
      Ashok Raj tested runtime D3 on hotplug ports of a SkyLake Xeon-SP in
      2017 and encountered Hardware Error NMIs, so this feature clearly cannot
      be enabled for everyone yet:
      https://lkml.kernel.org/r/20170503180426.GA4058@otc-nc-03
      
      While at it, remove an erroneous code comment I added with 97a90aee
      ("PCI: Consolidate conditions to allow runtime PM on PCIe ports") which
      claims that parents of a hotplug port must stay awake lest interrupts
      cannot be delivered.  That has turned out to be wrong at least for
      Thunderbolt hotplug ports.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      eb3b5bf1
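Taken together, the two runtime D3 commits of 01 Aug, 2018 describe a decision policy for PCIe ports that can be written as one predicate. The sketch below is a reading of the commit messages, not the kernel function: `struct fake_bridge` and `bridge_d3_possible()` are hypothetical, `pm_off` and `pm_force` model the `pcie_port_pm=off` and `pcie_port_pm=force` command-line settings, and the order of the checks is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the properties the policy looks at. */
struct fake_bridge {
    bool is_hotplug_bridge;
    bool hotplug_is_native;   /* hotplug handled by the OS, not firmware */
    bool is_thunderbolt;
    int  bios_year;
};

bool bridge_d3_possible(const struct fake_bridge *b, bool pm_off, bool pm_force)
{
    assert(b != NULL);

    if (pm_off)
        return false;

    /* (a) Firmware-handled hotplug ports must never be put into D3. */
    if (b->is_hotplug_bridge && !b->hotplug_is_native)
        return false;

    /* Users may opt in manually to test native hotplug ports. */
    if (pm_force)
        return true;

    /* Thunderbolt ports are whitelisted regardless of BIOS date. */
    if (b->is_thunderbolt)
        return true;

    /* (b) Native hotplug ports are not yet enabled by default. */
    if (b->is_hotplug_bridge)
        return false;

    /* Everything else is subject to the 2015 BIOS cutoff. */
    return b->bios_year >= 2015;
}
```

Placing the firmware-handled check ahead of the force check keeps `pcie_port_pm=force` from overriding reason (a), which the message presents as a hard constraint rather than a validation gap.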
  25. 20 Jul, 2018 5 commits
  26. 19 Jul, 2018 1 commit
    • S
      PCI: OF: Fix I/O space page leak · a5fb9fb0
      Sergei Shtylyov committed
      When testing the R-Car PCIe driver on the Condor board, if the PCIe PHY
      driver was left disabled, the kernel crashed with this BUG:
      
        kernel BUG at lib/ioremap.c:72!
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 0 PID: 39 Comm: kworker/0:1 Not tainted 4.17.0-dirty #1092
        Hardware name: Renesas Condor board based on r8a77980 (DT)
        Workqueue: events deferred_probe_work_func
        pstate: 80000005 (Nzcv daif -PAN -UAO)
        pc : ioremap_page_range+0x370/0x3c8
        lr : ioremap_page_range+0x40/0x3c8
        sp : ffff000008da39e0
        x29: ffff000008da39e0 x28: 00e8000000000f07
        x27: ffff7dfffee00000 x26: 0140000000000000
        x25: ffff7dfffef00000 x24: 00000000000fe100
        x23: ffff80007b906000 x22: ffff000008ab8000
        x21: ffff000008bb1d58 x20: ffff7dfffef00000
        x19: ffff800009c30fb8 x18: 0000000000000001
        x17: 00000000000152d0 x16: 00000000014012d0
        x15: 0000000000000000 x14: 0720072007200720
        x13: 0720072007200720 x12: 0720072007200720
        x11: 0720072007300730 x10: 00000000000000ae
        x9 : 0000000000000000 x8 : ffff7dffff000000
        x7 : 0000000000000000 x6 : 0000000000000100
        x5 : 0000000000000000 x4 : 000000007b906000
        x3 : ffff80007c61a880 x2 : ffff7dfffeefffff
        x1 : 0000000040000000 x0 : 00e80000fe100f07
        Process kworker/0:1 (pid: 39, stack limit = 0x        (ptrval))
        Call trace:
         ioremap_page_range+0x370/0x3c8
         pci_remap_iospace+0x7c/0xac
         pci_parse_request_of_pci_ranges+0x13c/0x190
         rcar_pcie_probe+0x4c/0xb04
         platform_drv_probe+0x50/0xbc
         driver_probe_device+0x21c/0x308
         __device_attach_driver+0x98/0xc8
         bus_for_each_drv+0x54/0x94
         __device_attach+0xc4/0x12c
         device_initial_probe+0x10/0x18
         bus_probe_device+0x90/0x98
         deferred_probe_work_func+0xb0/0x150
         process_one_work+0x12c/0x29c
         worker_thread+0x200/0x3fc
         kthread+0x108/0x134
         ret_from_fork+0x10/0x18
        Code: f9004ba2 54000080 aa0003fb 17ffff48 (d4210000)
      
      It turned out that pci_remap_iospace() wasn't undone when the driver's
      probe failed, and since devm_phy_optional_get() returned -EPROBE_DEFER,
      the probe was retried, finally causing the BUG due to trying to remap
      already remapped pages.
      
      Introduce the devm_pci_remap_iospace() managed API and replace the
      pci_remap_iospace() call with it to fix the bug.
      
      Fixes: dbf9826d ("PCI: generic: Convert to DT resource parsing API")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      [lorenzo.pieralisi@arm.com: split commit/updated the commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      a5fb9fb0
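The managed-resource idea behind `devm_pci_remap_iospace()` can be modelled in a few lines. This is a user-space sketch under assumptions, not the kernel's devres implementation: `struct fake_device`, `add_release_action()`, `release_all()` and `probe_once()` are hypothetical names, a boolean stands in for the I/O mapping, and `EPROBE_DEFER` is defined locally with the kernel's value because it is not part of user-space `errno.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EPROBE_DEFER 517   /* kernel-internal errno, defined here for the model */
#define MAX_ACTIONS  8

typedef void (*release_fn)(void *data);

/* Hypothetical device with a stack of cleanup actions. */
struct fake_device {
    release_fn release[MAX_ACTIONS];
    void      *data[MAX_ACTIONS];
    size_t     count;
};

bool io_mapped;   /* models whether the I/O window is currently mapped */

int add_release_action(struct fake_device *dev, release_fn fn, void *data)
{
    assert(fn != NULL);
    if (dev->count == MAX_ACTIONS)
        return -1;
    dev->release[dev->count] = fn;
    dev->data[dev->count] = data;
    dev->count++;
    return 0;
}

/* Undo everything in reverse order of registration. */
void release_all(struct fake_device *dev)
{
    while (dev->count > 0) {
        dev->count--;
        dev->release[dev->count](dev->data[dev->count]);
    }
}

/* Unmanaged mapping: mapping twice is the error the BUG reports. */
int remap_iospace(void)
{
    if (io_mapped)
        return -1;
    io_mapped = true;
    return 0;
}

void unmap_iospace(void *unused)
{
    (void)unused;
    io_mapped = false;
}

/* Managed variant: the mapping is undone when the device is released. */
int managed_remap_iospace(struct fake_device *dev)
{
    int ret = remap_iospace();

    if (ret)
        return ret;
    ret = add_release_action(dev, unmap_iospace, NULL);
    if (ret)
        unmap_iospace(NULL);
    return ret;
}

/* One probe attempt; 'phy_ready' models the optional PHY lookup. */
int probe_once(struct fake_device *dev, bool phy_ready)
{
    int ret = managed_remap_iospace(dev);

    if (!ret && !phy_ready)
        ret = -EPROBE_DEFER;
    if (ret)
        release_all(dev);   /* cleanup on failed probe */
    return ret;
}
```

In this model a deferred probe leaves `io_mapped` false, so the retry can map the window again; with the unmanaged call and no cleanup, the second attempt would hit the already-mapped case.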