1. 18 Nov, 2016 1 commit
    • PCI: Don't acquire ref on parent in pci_bridge_d3_update() · 738a7edb
      Lukas Wunner committed
      This function is always called with an existing pci_dev struct, which
      holds a reference on the pci_bus struct it resides on, which in turn
      holds a reference on pci_bus->bridge, which is the pci_dev's parent.
      
      Hence there's no need to acquire an additional ref on the parent.
      
      More specifically, the pci_dev exists until pci_destroy_dev() drops the
      final reference on it, so all calls to pci_bridge_d3_update() must be
      finished before that.  It is arguably the caller's responsibility to ensure
      that it doesn't call pci_bridge_d3_update() with a pci_dev that might
      suddenly disappear, but in any case the existing callers are all safe:
      
      - The call in pci_destroy_dev() happens before the call to put_device().
      - The call in pci_bus_add_device() is synchronized with pci_destroy_dev()
        using pci_lock_rescan_remove().
      - The calls to pci_d3cold_disable() from the xhci and nouveau drivers
        are safe because a ref on the pci_dev is held as long as it's bound to
        a driver.
      - The calls to pci_d3cold_enable() / pci_d3cold_disable() when modifying
        the sysfs "d3cold_allowed" entry are also safe because kernfs_drain()
        waits for existing sysfs users to finish before removing the entry,
        and pci_destroy_dev() is called way after that.
      
      No functional change intended.
      Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  2. 12 Nov, 2016 1 commit
    • PCI: Check for PME in targeted sleep state · 6496ebd7
      Alan Stern committed
      On some systems, the firmware does not allow certain PCI devices to be put
      in deep D-states.  This can cause problems for wakeup signalling, if the
      device does not support PME# in the deepest allowed suspend state.  For
      example, Pierre reports that on his system, ACPI does not permit his xHCI
      host controller to go into D3 during runtime suspend -- but D3 is the only
      state in which the controller can generate PME# signals.  As a result, the
      controller goes into runtime suspend but never wakes up, so it doesn't work
      properly.  USB devices plugged into the controller are never detected.
      
      If the device relies on PME# for wakeup signals but is not capable of
      generating PME# in the target state, the PCI core should accurately report
      that it cannot do wakeup from runtime suspend.  This patch modifies the
      pci_dev_run_wake() routine to add this check.
      Reported-by: Pierre de Villemereuil <flyos@mailoo.org>
      Tested-by: Pierre de Villemereuil <flyos@mailoo.org>
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      CC: stable@vger.kernel.org
      CC: Lukas Wunner <lukas@wunner.de>
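The check described in this commit can be modelled in user space roughly as follows. This is a minimal sketch, not the kernel code: `struct wake_dev`, `pme_capable()` and `dev_run_wake()` are hypothetical names, and the target state is supplied as a field instead of being computed from firmware data.

```c
#include <stdbool.h>

/* D-states as bit positions: D0=0, D1=1, D2=2, D3hot=3, D3cold=4. */
enum wake_state { WK_D0, WK_D1, WK_D2, WK_D3HOT, WK_D3COLD };

/* Hypothetical model of the fields the check needs; not struct pci_dev. */
struct wake_dev {
    unsigned int pme_support;   /* bit n set: PME# can be generated in state n */
    bool platform_run_wake;     /* platform can signal runtime wakeup itself */
    enum wake_state target;     /* deepest state allowed for runtime suspend */
};

/* PME# is usable only if the device can generate it in the given state. */
static bool pme_capable(const struct wake_dev *dev, enum wake_state state)
{
    return (dev->pme_support & (1u << state)) != 0;
}

/* Report whether the device can wake itself from runtime suspend. */
static bool dev_run_wake(const struct wake_dev *dev)
{
    if (dev->platform_run_wake)
        return true;
    if (dev->pme_support == 0)
        return false;
    /* PME-capable in principle, but maybe not from the targeted state. */
    return pme_capable(dev, dev->target);
}
```

With a device that can only signal PME# from D3 but whose target state is limited to D2, `dev_run_wake()` returns false, which corresponds to the xHCI case reported above.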
  3. 29 Sep, 2016 5 commits
    • PCI: Ignore requested alignment for VF BARs · 62d9a78f
      Yongji Xie committed
      Resource allocation for VFs is done via the VF BARx registers in the PF's
      SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
      (see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).
      
      Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
      structs describe the space allocated for the VF (this is a piece of the
      space described by the VF BARx register in the PF's SR-IOV capability).
      
      It's meaningless to request additional alignment for a VF: the VF BAR
      alignment is completely determined by the alignment of the VF BARx in the
      PF and the size of the VF BAR.
      
      Ignore the user's alignment requests for VF devices.
      Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    • PCI: Ignore requested alignment for PROBE_ONLY and fixed resources · f0b99f70
      Yongji Xie committed
      Users may request additional alignment of PCI resources, e.g., to align
      BARs on page boundaries so they can be shared with guests via VFIO.  This
      of course may require reallocation if firmware has already assigned the
      BARs with smaller alignments.
      
      If the platform has requested PCI_PROBE_ONLY, we should never change any
      PCI BARs, so we can't provide any additional alignment.  Also, if a BAR is
      marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
      firmware depends on the current BAR value, we can't change the alignment.
      
      In these cases, log a message and ignore the user's alignment requests.
      
      [bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
      Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
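The rules from this commit and the preceding VF commit amount to a small filter on the user's alignment request. The sketch below illustrates that decision only, under stated assumptions: `struct align_dev`, `RES_FIXED` and `effective_alignment()` are hypothetical stand-ins (the flag models IORESOURCE_PCI_FIXED), and the real code works on struct pci_dev and its resources.

```c
#include <stdbool.h>
#include <stdio.h>

#define RES_FIXED 0x1u              /* hypothetical; models IORESOURCE_PCI_FIXED */

/* Hypothetical model of one device with a single BAR. */
struct align_dev {
    bool is_virtfn;                 /* device is an SR-IOV VF */
    unsigned int bar_flags;         /* flags of the BAR being aligned */
};

/* Return the alignment to apply, or 0 when the request must be ignored. */
static unsigned long effective_alignment(const struct align_dev *dev,
                                         bool probe_only,
                                         unsigned long requested)
{
    if (requested == 0)
        return 0;
    if (dev->is_virtfn)
        return 0;   /* VF BAR alignment is determined by the PF's VF BARx */
    if (probe_only) {
        fprintf(stderr, "PCI_PROBE_ONLY set, ignoring alignment %#lx\n",
                requested);
        return 0;
    }
    if (dev->bar_flags & RES_FIXED) {
        fprintf(stderr, "BAR is fixed, ignoring alignment %#lx\n", requested);
        return 0;
    }
    return requested;
}
```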
    • PCI: Recognize D3cold in pci_update_current_state() · a6a64026
      Lukas Wunner committed
      Whenever a device is resumed or its power state is changed using the
      platform, its new power state is read from the PM Control & Status Register
      and cached in pci_dev->current_state by calling pci_update_current_state().
      
      If the device is in D3cold, reading from config space typically results in
      a fabricated "all ones" response.  But if it's in D3hot, the two bits
      representing the power state in the PMCSR are *also* set to 1.  Thus D3hot
      and D3cold are not discernible by just reading the PMCSR.
      
      To account for this, pci_update_current_state() uses two workarounds:
      
      - When transitioning to D3cold using pci_platform_power_transition(), the
        new power state is set blindly by pci_update_current_state(), i.e.
        without verifying that the device actually *is* in D3cold.  This is
        achieved by setting the "state" argument to PCI_D3cold.  The "state"
        argument was originally intended to convey the new state in case the
        device doesn't have the PM capability.  It is *also* used to convey the
        device state if the PM capability is present and the new state is D3cold,
        but this was never explained in the kerneldoc.
      
      - Once the current_state is set to D3cold, further invocations of
        pci_update_current_state() will blindly assume that the device is still
        in D3cold and leave the current_state unmodified.  To get out of this
        impasse, the current_state has to be set directly, typically by calling
        pci_raw_set_power_state() or pci_enable_device().
      
      It would be desirable if pci_update_current_state() could reliably detect
      D3cold by itself.  That would allow us to do away with these workarounds,
      and it would allow for a smarter, more energy conserving runtime resume
      strategy after system sleep:  Currently devices which utilize
      direct_complete are mandatorily runtime resumed in their ->complete stage.
      This can be avoided if their power state after system sleep is the same as
      before, but it requires a mechanism to detect the power state reliably.
      
      We've just gained the ability to query the platform firmware for its
      opinion on the device's power state.  On platforms conforming to ACPI 4.0
      or newer, this allows recognition of D3cold.  Pre-4.0 platforms lack _PR3
      and therefore the deepest power state that will ever be reported is D3hot,
      even though the device may actually be in D3cold.  To detect D3cold in
      those cases, accessibility of the vendor ID in config space is probed using
      pci_device_is_present().  This also works for devices which are not
      platform-power-manageable at all, but can be suspended to D3cold using a
      nonstandard mechanism (e.g. some hybrid graphics laptops or Thunderbolt on
      the Mac).
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
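The detection order described above (platform opinion first, then a config space presence probe, then the PMCSR) can be sketched in plain C. This is an illustrative model, not the kernel function: `struct pm_snapshot`, `snapshot_is_present()` and `detect_state()` are hypothetical names, and a vendor ID of 0xffff stands in for an inaccessible config space.

```c
#include <stdbool.h>
#include <stdint.h>

enum pwr_state { PS_D0, PS_D1, PS_D2, PS_D3HOT, PS_D3COLD };

#define PMCSR_STATE_MASK 0x0003u    /* low two bits of the PMCSR hold the D-state */

/* Hypothetical snapshot of what the PCI core can observe about a device. */
struct pm_snapshot {
    enum pwr_state platform_state;  /* firmware's opinion (at most D3hot pre-ACPI 4.0) */
    uint16_t vendor_id;             /* 0xffff models an inaccessible config space */
    uint16_t pmcsr;                 /* raw PM Control & Status Register value */
};

static bool snapshot_is_present(const struct pm_snapshot *s)
{
    return s->vendor_id != 0xffff;
}

static enum pwr_state detect_state(const struct pm_snapshot *s)
{
    /* Trust the platform if it says D3cold; otherwise probe the vendor ID. */
    if (s->platform_state == PS_D3COLD || !snapshot_is_present(s))
        return PS_D3COLD;
    /* An all-ones read would decode as D3hot here, hence the checks above. */
    return (enum pwr_state)(s->pmcsr & PMCSR_STATE_MASK);
}
```

The presence probe is what separates the two D3 flavours: a fabricated all-ones PMCSR and a genuine D3hot value both carry state bits 0b11.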
    • PCI: Query platform firmware for device power state · cc7cc02b
      Lukas Wunner committed
      Usually the most accurate way to determine a PCI device's power state is to
      read its PM Control & Status Register.  There are two cases however when
      this is not an option:  If the device doesn't have the PM capability at
      all, or if it is in D3cold (in which case its config space is
      inaccessible).
      
      In both cases, we can alternatively query the platform firmware for its
      opinion on the device's power state.  To facilitate this, augment struct
      pci_platform_pm_ops with a ->get_power callback and implement it for
      acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
      
      It is used by a forthcoming commit to let pci_update_current_state()
      recognize D3cold.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
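An ops table with an optional power-state query, as described above, might look like the following. This is a sketch under assumptions: the struct and function names are hypothetical, the callback is named `get_power` only because the commit message calls it that, and the "unknown" fallback for a missing callback is this sketch's choice.

```c
#include <stddef.h>

enum fw_state { FW_D0, FW_D1, FW_D2, FW_D3HOT, FW_D3COLD, FW_UNKNOWN };

/* Hypothetical device model: just the state the firmware would report. */
struct fw_dev {
    enum fw_state fw_state;
};

/* Platform PM operations; get_power is the callback this sketch adds. */
struct platform_pm_ops {
    enum fw_state (*get_power)(const struct fw_dev *dev);
};

/* The currently registered platform backend, if any. */
static const struct platform_pm_ops *platform_pm;

/* Ask the platform for its opinion, or report "unknown" without a backend. */
static enum fw_state platform_get_power_state(const struct fw_dev *dev)
{
    if (platform_pm && platform_pm->get_power)
        return platform_pm->get_power(dev);
    return FW_UNKNOWN;
}

/* Stand-in for a firmware-backed implementation such as the ACPI one. */
static enum fw_state acpi_like_get_power(const struct fw_dev *dev)
{
    return dev->fw_state;
}

static const struct platform_pm_ops acpi_like_ops = {
    .get_power = acpi_like_get_power,
};
```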
    • PCI: Afford direct-complete to devices with non-standard PM · 4132a577
      Lukas Wunner committed
      There are devices not power-manageable by the platform, but still able to
      runtime suspend to D3cold with a non-standard mechanism.  One example is
      laptop hybrid graphics where the discrete GPU and its built-in HDA
      controller are power-managed either with a _DSM (AMD PowerXpress, Nvidia
      Optimus) or a separate gmux controller (MacBook Pro).  Another example is
      Thunderbolt on Macs which is power-managed with custom ACPI methods.
      
      When putting the system to sleep, we currently handle such devices
      improperly by transitioning them from D3cold to D3hot (the default power
      state defined at the top of pci_target_state()).  This wastes energy and
      prolongs the suspend sequence (powering up the Thunderbolt controller takes
      2 seconds).
      
      Avoid that by assuming that a non-standard PM mechanism is at work if the
      device is not platform-power-manageable but currently in D3cold.
      
      If the device is wakeup enabled, we might still have to wake it up from
      D3cold if PME cannot be signaled from that power state.
      
      The check for devices without PM capability comes before the check for
      D3cold since such devices could in theory also be powered down by
      non-standard means and should then be afforded direct-complete as well.
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
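The ordering of the checks described above can be illustrated with a small model of the target-state decision for a device that is not platform-power-manageable. It is a sketch only: `struct nonstd_dev` and `suspend_target()` are hypothetical, and the platform-manageable branch of the real function is left out.

```c
#include <stdbool.h>

enum pm_state { PM_D0, PM_D1, PM_D2, PM_D3HOT, PM_D3COLD };

/* Hypothetical model of a device that the platform cannot power-manage. */
struct nonstd_dev {
    bool has_pm_cap;            /* device has the PCI PM capability */
    enum pm_state current;      /* cached current power state */
    bool may_wakeup;            /* device is enabled for system wakeup */
    unsigned int pme_support;   /* bit n set: PME can be signaled from state n */
};

static enum pm_state suspend_target(const struct nonstd_dev *dev)
{
    enum pm_state target = PM_D3HOT;    /* default target state */

    /* Checked first, so a PM-less device found in D3cold still stays there. */
    if (!dev->has_pm_cap)
        target = PM_D0;

    /* Already in D3cold: assume a non-standard mechanism and leave it alone. */
    if (dev->current == PM_D3COLD)
        target = PM_D3COLD;

    /* A wakeup-enabled device needs a state from which PME can be signaled. */
    if (dev->may_wakeup && dev->pme_support) {
        while (target != PM_D0 && !(dev->pme_support & (1u << target)))
            target--;
    }
    return target;
}
```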
  4. 17 Sep, 2016 1 commit
  5. 26 Jul, 2016 1 commit
  6. 22 Jul, 2016 1 commit
  7. 20 Jul, 2016 1 commit
  8. 22 Jun, 2016 1 commit
    • PCI: Extending pci=resource_alignment to specify device/vendor IDs · 644a544f
      Koehrer Mathias (ETAS/ESW5) committed
      Some uio-based PCI drivers, e.g., uio_cif, do not work if the assigned PCI
      memory resources are not page aligned.
      
      By using the kernel option "pci=resource_alignment" it is possible to force
      single PCI boards to use page alignment for their memory resources.
      However, this is fairly cumbersome if several of these boards are in use
      as the specification of the cards has to be done via PCI bus/slot/function
      number which might change, e.g., by adding another board.
      
      Extend the kernel option "pci=resource_alignment" to allow specification of
      relevant devices via PCI device/vendor (and subdevice/subvendor) IDs.  The
      specification of the devices via device/vendor is indicated by a leading
      string "pci:" as argument to "pci=resource_alignment".  The format of the
      specification is pci:<vendor>:<device>[:<subvendor>:<subdevice>]
      Signed-off-by: Mathias Koehrer <mathias.koehrer@etas.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
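Parsing the `pci:<vendor>:<device>[:<subvendor>:<subdevice>]` form can be sketched with `sscanf()`. This is a simplified illustration: `struct id_match` and `parse_id_spec()` are hypothetical names, only the ID part of the option is handled, and storing 0 for omitted sub-IDs is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type; sub-IDs of 0 mean "not specified" here. */
struct id_match {
    unsigned short vendor, device, subvendor, subdevice;
};

/* Parse "pci:<vendor>:<device>[:<subvendor>:<subdevice>]" (hex fields). */
static bool parse_id_spec(const char *spec, struct id_match *out)
{
    int n;

    /* The leading "pci:" marks an ID-based rather than address-based spec. */
    if (strncmp(spec, "pci:", 4) != 0)
        return false;
    spec += 4;

    n = sscanf(spec, "%hx:%hx:%hx:%hx",
               &out->vendor, &out->device, &out->subvendor, &out->subdevice);
    if (n == 4)
        return true;
    if (n == 2) {
        /* Only vendor and device were given. */
        out->subvendor = 0;
        out->subdevice = 0;
        return true;
    }
    return false;
}
```

A bus/slot/function string such as `0000:00:1c.0` lacks the `pci:` prefix, so this sketch rejects it and leaves it to the address-based parser.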
  9. 14 Jun, 2016 1 commit
    • PCI: Put PCIe ports into D3 during suspend · 9d26d3a8
      Mika Westerberg committed
      Currently the Linux PCI core does not touch the power state of PCI bridges
      and PCIe ports when system suspend is entered.  Leaving them in D0 consumes
      power unnecessarily and may prevent the CPU from entering deeper C-states.

      With recent PCIe hardware we can power down the ports to save power,
      provided that we take a few restrictions into account:
      
        - The PCIe port hardware is recent enough, starting from 2015.
      
        - Devices connected to PCIe ports are effectively in D3cold once the port
          is transitioned to D3 (the config space is not accessible anymore and
          the link may be powered down).
      
        - Devices behind the PCIe port need to be allowed to transition to D3cold
          and back.  There is a way both drivers and userspace can forbid this.
      
        - If the device behind the PCIe port is capable of waking the system, it
          needs to be able to do so from D3cold.
      
      This patch adds a new flag called 'bridge_d3' to struct pci_dev.  The
      PCI core sets and clears this flag whenever the power management state
      of any device behind the PCIe port changes.  When the system is later
      suspended, we only need to check this flag: if it is true, we transition
      the port to D3; otherwise we leave it in D0.

      Also provide an override mechanism via the command line parameter
      "pcie_port_pm=[off|force]" that can be used to disable or enable the
      feature regardless of the BIOS manufacturing date.
      Tested-by: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
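How the 'bridge_d3' flag could be derived from the devices behind a port is sketched below. The model is deliberately small and hypothetical: `struct behind_port`, `child_allows_d3()` and `compute_bridge_d3()` are illustrative names, and the hardware-age check and the `pcie_port_pm=` override are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state relevant to the restrictions listed above. */
struct behind_port {
    bool no_d3cold;         /* a driver forbids D3cold */
    bool d3cold_allowed;    /* userspace allows D3cold */
    bool may_wakeup;        /* device is configured to wake the system */
    bool pme_from_d3cold;   /* device can signal wakeup from D3cold */
};

/* A device permits powering down its port only if D3cold is acceptable. */
static bool child_allows_d3(const struct behind_port *c)
{
    if (c->no_d3cold || !c->d3cold_allowed)
        return false;
    if (c->may_wakeup && !c->pme_from_d3cold)
        return false;
    return true;
}

/* The port may go to D3 only if every device behind it agrees. */
static bool compute_bridge_d3(const struct behind_port *children, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!child_allows_d3(&children[i]))
            return false;
    }
    return true;
}
```

Recomputing the flag whenever one of these inputs changes keeps the suspend path down to a single flag check, which is the design the commit message describes.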
  10. 11 Jun, 2016 5 commits
  11. 17 May, 2016 1 commit
  12. 12 May, 2016 1 commit
  13. 20 Apr, 2016 1 commit
  14. 12 Apr, 2016 3 commits
  15. 06 Apr, 2016 1 commit
  16. 11 Mar, 2016 1 commit
  17. 08 Mar, 2016 2 commits
    • PCI: Allow a NULL "parent" pointer in pci_bus_assign_domain_nr() · 54c6e2dd
      Krzysztof Hałasa committed
      pci_create_root_bus() passes a "parent" pointer to
      pci_bus_assign_domain_nr().  When CONFIG_PCI_DOMAINS_GENERIC is defined,
      pci_bus_assign_domain_nr() dereferences that pointer.  Many callers of
      pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL
      pointer dereference error.
      
      7c674700 ("PCI: Move domain assignment from arm64 to generic code")
      moved the "parent" dereference from arm64 to generic code.  Only arm64 used
      that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it
      always supplied a valid "parent" pointer.  Other arches supplied NULL
      "parent" pointers but didn't define CONFIG_PCI_DOMAINS_GENERIC, so they
      used a no-op version of pci_bus_assign_domain_nr().
      
      8c7d1474 ("ARM/PCI: Move to generic PCI domains") defined
      CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use
      pci_common_init(), which supplies a NULL "parent" pointer.
      These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash
      with a NULL pointer dereference like this while probing PCI:
      
        Unable to handle kernel NULL pointer dereference at virtual address 000000a4
        PC is at pci_bus_assign_domain_nr+0x10/0x84
        LR is at pci_create_root_bus+0x48/0x2e4
        Kernel panic - not syncing: Attempted to kill init!
      
      [bhelgaas: changelog, add "Reported:" and "Fixes:" tags]
      Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1
      Fixes: 8c7d1474 ("ARM/PCI: Move to generic PCI domains")
      Fixes: 7c674700 ("PCI: Move domain assignment from arm64 to generic code")
      Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      CC: stable@vger.kernel.org	# v4.0+
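The essence of such a fix is to guard the dereference. A minimal sketch, assuming the function only needs the parent's firmware node: `struct fake_node`, `struct fake_device` and `parent_node()` are hypothetical, and what the caller does when no node is available is not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a firmware node and a struct device. */
struct fake_node {
    int domain;
};

struct fake_device {
    const struct fake_node *of_node;
};

/* Return the parent's firmware node, or NULL when there is no parent. */
static const struct fake_node *parent_node(const struct fake_device *parent)
{
    /* Without the check, a NULL parent would be dereferenced here. */
    return parent ? parent->of_node : NULL;
}
```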
    • PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h · fe537670
      Bjorn Helgaas committed
      Christoph added a generic include/linux/pci-dma-compat.h, so now there's
      one place with most of the PCI DMA interfaces.  Move more PCI DMA-related
      things there:
      
        - The PCI_DMA_* direction constants from linux/pci.h
        - The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
          CONFIG_PCI implementations from drivers/pci/pci.c
        - The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
          !CONFIG_PCI stubs from linux/pci.h
        - The pci_set_dma_mask() and pci_set_consistent_dma_mask()
          !CONFIG_PCI stubs from linux/pci.h
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  18. 06 Feb, 2016 1 commit
  19. 09 Jan, 2016 2 commits
  20. 09 Dec, 2015 1 commit
  21. 30 Oct, 2015 4 commits
  22. 23 Oct, 2015 1 commit
    • PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum · c56d4450
      Hariprasad Shenai committed
      The Chelsio T5 has a PCIe compliance erratum that causes Malformed TLP or
      Unexpected Completion errors in some systems, which may cause device access
      timeouts.
      
      Per PCIe r3.0, sec 2.2.9, "Completion headers must supply the same values
      for the Attribute as were supplied in the header of the corresponding
      Request, except as explicitly allowed when IDO is used."
      
      Instead of copying the Attributes from the Request to the Completion, the
      T5 always generates Completions with zero Attributes.  The receiver of a
      Completion whose Attributes don't match the Request may accept it (which
      itself seems non-compliant based on sec 2.3.2), or it may handle it as a
      Malformed TLP or an Unexpected Completion, which will probably lead to a
      device access timeout.
      
      Work around this by disabling "Relaxed Ordering" and "No Snoop" in the Root
      Port so it always generates Requests with zero Attributes.
      
      This does affect all other devices which are downstream of that Root Port,
      but these are performance optimizations that should not make a functional
      difference.
      
      Note that Configuration Space accesses are never supposed to have TLP
      Attributes, so we're safe waiting till after any Configuration Space
      accesses to do the Root Port "fixup".
      
      Based on original work by Casey Leedom <leedom@chelsio.com>
      
      [bhelgaas: changelog, comments, rename to pci_find_pcie_root_port(), rework
      to use pci_upstream_bridge() and check for Root Port device type, edit
      diagnostics to clarify intent and devices affected]
      Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
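In register terms, the workaround clears two enable bits in the Root Port's PCIe Device Control register. The sketch below shows only the bit manipulation on a plain 16-bit value. The macro and function names are local to this example; the bit positions (bit 4 for Enable Relaxed Ordering, bit 11 for Enable No Snoop) follow the PCIe Device Control register layout.

```c
#include <stdint.h>

#define DEVCTL_RELAX_EN   0x0010u   /* Enable Relaxed Ordering (bit 4) */
#define DEVCTL_NOSNOOP_EN 0x0800u   /* Enable No Snoop (bit 11) */

/* Return the Device Control value with both attribute enables cleared. */
static uint16_t clear_request_attributes(uint16_t devctl)
{
    return devctl & (uint16_t)~(DEVCTL_RELAX_EN | DEVCTL_NOSNOOP_EN);
}
```

All other bits of the register are preserved, so unrelated settings of the Root Port stay untouched.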
  23. 13 Oct, 2015 1 commit
    • PCI / PM: Avoid resuming more devices during system suspend · 2cef548a
      Rafael J. Wysocki committed
      Commit bac2a909 (PCI / PM: Avoid resuming PCI devices during
      system suspend) introduced a mechanism by which some PCI devices that
      were runtime-suspended at the system suspend time might be left in
      that state for the duration of the system suspend-resume cycle.
      However, it overlooked devices that were marked as capable of waking
      up the system just because PME support was detected in their PCI
      config space.
      
      Namely, in that case, device_can_wakeup(dev) returns 'true' for the
      device and if the device is not configured for system wakeup,
      device_may_wakeup(dev) returns 'false' and it will be resumed during
      system suspend even though configuring it for system wakeup may not
      really make sense at all.
      
      To avoid this problem, simply disable PME for PCI devices that have
      not been configured for system wakeup and are runtime-suspended at
      the system suspend time for the duration of the suspend-resume cycle.
      
      If the device is in D3cold, its config space is not available and it
      shouldn't be written to, but that's only possible if the device
      has platform PM support and the platform code is responsible for
      checking whether or not the device's configuration is suitable for
      system suspend in that case.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  24. 17 Sep, 2015 1 commit
  25. 14 Sep, 2015 1 commit