1. 19 Jul, 2010 1 commit
    • PM: Make it possible to avoid races between wakeup and system sleep · c125e96f
      Rafael J. Wysocki committed
      One of the arguments during the suspend blockers discussion was that
      the mainline kernel didn't contain any mechanisms making it possible
      to avoid races between wakeup and system suspend.
      
      Generally, there are two problems in that area.  First, if a wakeup
      event occurs exactly when /sys/power/state is being written to, it
      may be delivered to user space right before the freezer kicks in, so
      the user space consumer of the event may not be able to process it
      before the system is suspended.  Second, if a wakeup event occurs
      after user space has been frozen, it is not generally guaranteed that
      the ongoing transition of the system into a sleep state will be
      aborted.
      
      To address these issues introduce a new global sysfs attribute,
      /sys/power/wakeup_count, associated with a running counter of wakeup
      events and three helper functions, pm_stay_awake(), pm_relax(), and
      pm_wakeup_event(), that may be used by kernel subsystems to control
      the behavior of this attribute and to request the PM core to abort
      system transitions into a sleep state already in progress.
      
      The /sys/power/wakeup_count file may be read from or written to by
      user space.  Reads will always succeed (unless interrupted by a
      signal) and return the current value of the wakeup events counter.
      Writes, however, will only succeed if the written number is equal to
      the current value of the wakeup events counter.  If a write is
      successful, it will cause the kernel to save the current value of the
      wakeup events counter and to abort the subsequent system transition
      into a sleep state if any wakeup events are reported after the write
      has returned.
      
      [The assumption is that before writing to /sys/power/state user space
      will first read from /sys/power/wakeup_count.  Next, user space
      consumers of wakeup events will have a chance to acknowledge or
      veto the upcoming system transition to a sleep state.  Finally, if
      the transition is allowed to proceed, /sys/power/wakeup_count will
      be written to and if that succeeds, /sys/power/state will be written
      to as well.  Still, if any wakeup events are reported to the PM core
      by kernel subsystems after that point, the transition will be
      aborted.]
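
The handshake described in the bracketed paragraph above can be modeled in user space. The following C sketch is a toy model under stated assumptions, not the kernel implementation: `struct wakeup_model` and the `model_*` functions are hypothetical names that only mirror the semantics described in this commit message (reads return the counter, writes succeed only on a match, and any event reported after a successful write aborts the transition).

```c
#include <stdbool.h>

/* Toy user-space model of the wakeup_count handshake (all names hypothetical). */
struct wakeup_model {
    unsigned long events;  /* running counter of wakeup events */
    unsigned long saved;   /* value stored by the last successful write */
    bool armed;            /* true once a write has succeeded */
};

/* Models a read of /sys/power/wakeup_count. */
unsigned long model_read_count(const struct wakeup_model *m)
{
    return m->events;
}

/* Models a write: it succeeds only if the value matches the current counter. */
bool model_write_count(struct wakeup_model *m, unsigned long value)
{
    if (value != m->events)
        return false;
    m->saved = value;
    m->armed = true;
    return true;
}

/* Models a kernel subsystem reporting a wakeup event. */
void model_report_event(struct wakeup_model *m)
{
    m->events++;
}

/* The transition may proceed only if no event arrived after the write. */
bool model_suspend_may_proceed(const struct wakeup_model *m)
{
    return m->armed && m->events == m->saved;
}
```

In this model, a writer holding a stale count is rejected and has to re-read the counter, which is the step that gives user space consumers a chance to process the pending event first.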
      
      Additionally, put a wakeup events counter into struct dev_pm_info and
      make these per-device wakeup event counters available via sysfs,
      so that it's possible to check the activity of various wakeup event
      sources within the kernel.
      
      To illustrate how subsystems can use pm_wakeup_event(), make the
      low-level PCI runtime PM wakeup-handling code use it.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Acked-by: markgross <markgross@thegnar.org>
      Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
  2. 11 May, 2010 1 commit
  3. 23 Feb, 2010 4 commits
    • PCI PM: Run-time callbacks for PCI bus type · 6cbf8214
      Rafael J. Wysocki committed
      Introduce run-time PM callbacks for the PCI bus type.  Make the new
      callbacks work in analogy with the existing system sleep PM
      callbacks, so that the drivers already converted to struct dev_pm_ops
      can use their suspend and resume routines for run-time PM without
      modifications.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI / ACPI / PM: Platform support for PCI PME wake-up · b67ea761
      Rafael J. Wysocki committed
      Although the majority of PCI devices can generate PMEs that in
      principle may be used to wake up devices suspended at run time,
      platform support is generally necessary to convert PMEs into wake-up
      events that can be delivered to the kernel.  If ACPI is used for this
      purpose, PME signals generated by a PCI device will trigger the ACPI
      GPE associated with the device to generate an ACPI wake-up event that
      we can set up a handler for, provided that everything is configured
      correctly.
      
      Unfortunately, the subset of PCI devices that have GPEs associated
      with them is quite limited.  The devices without dedicated GPEs have
      to rely on the GPEs associated with other devices (in the majority of
      cases their upstream bridges and, possibly, the root bridge) to
      generate ACPI wake-up events in response to PME signals from them.
      
      Add ACPI platform support for PCI PME wake-up:
      o Add a framework making it possible to use ACPI system notify
        handlers for run-time PM.
      o Add new PCI platform callback ->run_wake() to struct
        pci_platform_pm_ops allowing us to enable/disable the platform to
        generate wake-up events for a given device.  Implement this callback
        for the ACPI platform.
      o Define ACPI wake-up handlers for PCI devices and PCI root buses and
        make the PCI-ACPI binding code register wake-up notifiers for all
        PCI devices present in the ACPI tables.
      o Add function pci_dev_run_wake() which can be used by PCI drivers to
        check if a given device is capable of generating wake-up events at
        run time.
      
      Developed in cooperation with Matthew Garrett <mjg@redhat.com>.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI PM: Add function for checking PME status of devices · 58ff4633
      Rafael J. Wysocki committed
      Add function pci_check_pme_status() that will check the PME status
      bit of a given device and clear it along with the PME enable bit.  It
      will be necessary for PCI run-time power management.
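
The check-and-clear behavior described above can be sketched against a simulated register. This is a minimal user-space illustration, not the kernel function: `struct fake_pmcsr`, `fake_read()`, `fake_write()` and `check_pme_status()` are hypothetical, and the bit masks and the write-1-to-clear behavior of the status bit are assumptions of this sketch about the PCI power management control/status register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of the PM control/status register. */
#define PME_ENABLE 0x0100u
#define PME_STATUS 0x8000u

/* Simulated register; the status bit is modeled as write-1-to-clear. */
struct fake_pmcsr {
    uint16_t value;
};

static uint16_t fake_read(const struct fake_pmcsr *r)
{
    return r->value;
}

static void fake_write(struct fake_pmcsr *r, uint16_t v)
{
    uint16_t status = r->value & PME_STATUS;

    if (v & PME_STATUS)
        status = 0; /* writing 1 clears the sticky status bit */
    r->value = (uint16_t)((v & ~PME_STATUS) | status);
}

/* Returns true if PME status was set; clears it and disables PME. */
bool check_pme_status(struct fake_pmcsr *r)
{
    uint16_t v = fake_read(r);

    if (!(v & PME_STATUS))
        return false;

    v |= PME_STATUS;   /* write 1 to clear the status bit */
    v &= ~PME_ENABLE;  /* and turn off PME generation */
    fake_write(r, v);
    return true;
}
```

A second call on the same simulated register returns false, because the first call has already cleared the status bit.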
      
      Based on a patch from Shaohua Li <shaohua.li@intel.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: Clean up build for CONFIG_PCI_QUIRKS unset · 93177a74
      Rafael J. Wysocki committed
      Currently, drivers/pci/quirks.c is built unconditionally, but if
      CONFIG_PCI_QUIRKS is unset, the only things actually built in this
      file are definitions of global variables and empty functions (due to
      the #ifdef CONFIG_PCI_QUIRKS embracing all of the code inside the
      file).  This is not particularly nice and if someone overlooks
      the #ifdef CONFIG_PCI_QUIRKS, build errors are introduced.
      
      To clean that up, move the definitions of the global variables in
      quirks.c that are always built to pci.c, move the definitions of
      the empty functions (compiled when CONFIG_PCI_QUIRKS is unset) to
      headers (additionally make these functions static inline) and modify
      drivers/pci/Makefile so that quirks.c is only built if
      CONFIG_PCI_QUIRKS is set.
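
The header-stub pattern described above can be sketched as follows. This is a generic illustration rather than the actual kernel headers: `CONFIG_EXAMPLE_QUIRKS`, `example_apply_quirks()` and `example_probe()` are hypothetical names. When the option is unset, callers get an empty static inline function from the header, so the .c file that holds the real implementation does not need to be built at all.

```c
/* Header part: CONFIG_EXAMPLE_QUIRKS and example_apply_quirks() are hypothetical. */
#ifdef CONFIG_EXAMPLE_QUIRKS
/* The real implementation lives in a .c file built only when the option is set. */
int example_apply_quirks(int device_id);
#else
/* Empty stub: compiles away, so no #ifdef is needed at the call sites. */
static inline int example_apply_quirks(int device_id)
{
    (void)device_id;
    return 0;
}
#endif

/* Caller part: identical whether or not the option is enabled. */
int example_probe(int device_id)
{
    return example_apply_quirks(device_id);
}
```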
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  4. 01 Jan, 2010 1 commit
  5. 17 Dec, 2009 1 commit
  6. 05 Nov, 2009 1 commit
    • PCI: acs p2p upstream forwarding enabling · ae21ee65
      Allen Kay committed
      Note: dom0 checking in v4 has been separated out into 2/2.
      
      This patch enables P2P upstream forwarding in ACS capable PCIe switches.
      It solves two potential problems in a virtualization environment where a PCIe
      device is assigned to a guest domain using a HW iommu such as VT-d:
      
      1) Unintentional failure caused by guest physical address programmed
         into the device's DMA that happens to match the memory address range
         of other downstream ports in the same PCIe switch.  This causes the PCI
         transaction to go to the matching downstream port instead of going to the
         root complex to get translated by VT-d as it should be.
      
      2) Malicious guest software intentionally attacks another downstream
         PCIe device by programming the DMA address into the assigned device
         that matches memory address range of the downstream PCIe port.
      
      We are in the process of implementing device filtering software in KVM/XEN
      management software to allow device assignment of PCIe devices behind a PCIe
      switch only if it has ACS capability and with the P2P upstream forwarding bits
      enabled.  This patch is intended to work for both KVM and Xen environments.
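
As a rough illustration of enabling only the forwarding controls that a switch port actually advertises, the C sketch below combines a capability mask with a control value. It is not the patch itself: `acs_enable_upstream_forwarding()` is a hypothetical helper, and both the bit values and the choice of which controls to turn on are assumptions of this sketch.

```c
#include <stdint.h>

/* Assumed ACS capability/control bit positions. */
#define ACS_SV 0x01u /* Source Validation */
#define ACS_RR 0x04u /* P2P Request Redirect */
#define ACS_CR 0x08u /* P2P Completion Redirect */
#define ACS_UF 0x10u /* Upstream Forwarding */

/*
 * Return the new control value: every wanted control that the capability
 * register advertises is switched on; other control bits are left untouched.
 */
uint16_t acs_enable_upstream_forwarding(uint16_t cap, uint16_t ctrl)
{
    const uint16_t wanted = ACS_SV | ACS_RR | ACS_CR | ACS_UF;

    return (uint16_t)(ctrl | (cap & wanted));
}
```

Masking with the capability value means a port that lacks one of the controls is never asked to enable it.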
      Signed-off-by: Allen Kay <allen.m.kay@intel.com>
      Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
      Reviewed-by: Chris Wright <chris@sous-sol.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  7. 10 Sep, 2009 2 commits
    • PCI: Simplify hotplug mch quirk. · 0ba379ec
      Eric W. Biederman committed
      There is a very old quirk for the Intel E7502, E7320 and E7525 memory
      controller hubs that disables usage of MSI interrupts on PCIe hotplug
      bridges of those devices, and disables changing the affinity of IRQs.
      
      Today all we have to do to disable msi on a specific device is to set
      dev->no_msi, which is much more straightforward than the previous
      logic.
      
      The re-running of this fixup after pci hotplug happens below these
      devices is totally bogus.  All of the state we change is pure software
      state and we don't change the hardware at all.  Which means hotplug on
      the lower devices doesn't have a chance to change this state.  So we
      can safely remove the special case from the pciehp driver and the pcie
      portdriver.
      
      I suspect the special case was someone's experimental debug code that
      slipped in.  Certainly it isn't mentioned in commit
      6fb8880a61510295aece04a542767161f624dffe aka BKrev:
      41966101LJ_ogfOU0m2aE6teZfQnuQ where the code first appears.
      Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    • PCI: expose function reset capability in sysfs · 711d5779
      Michael S. Tsirkin committed
      Some devices allow an individual function to be reset without affecting
      other functions in the same device: that's what pci_reset_function does.
      For devices that have this support, expose a reset attribute in sysfs.
      
      This is useful e.g. for virtualization, where a qemu userspace
      process wants to reset the device when the guest is reset,
      to emulate machine reboot as closely as possible.
      Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  8. 30 Aug, 2009 1 commit
    • PCI SR-IOV: correct broken resource alignment calculations · 6faf17f6
      Chris Wright committed
      An SR-IOV capable device includes an SR-IOV PCIe capability which
      describes the Virtual Function (VF) BAR requirements.  A typical SR-IOV
      device can support multiple VFs whose BARs must be in a contiguous region,
      effectively an array of VF BARs.  The BAR reports the size requirement
      for a single VF.  We calculate the full range needed by simply multiplying
      the VF BAR size by the number of possible VFs and create a resource
      spanning the full range.
      
      This all seems sane enough, except that it artificially inflates the
      alignment requirement for the VF BAR.  The VF BAR need only be aligned to
      the size of a single BAR, not the contiguous range of VF BARs.  This can
      cause us to fail to allocate resources for the BAR despite the fact that we
      actually have enough space.
      
      This patch adds a thin PCI specific layer over the generic
      resource_alignment() function which is aware of the special nature of
      VF BARs and does sorting and allocation based on the smaller alignment
      requirement.
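
The effect of the inflated alignment can be shown with a small worked example. The C sketch below is an illustration only, not the kernel's resource allocator: `align_up()`, `fits_in_window()` and `vf_bar_span()` are hypothetical helpers, and the window addresses used in the usage note are made up. It assumes power-of-two alignments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x up to a power-of-two alignment. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Can `size` bytes, aligned to `align`, be placed inside [win_start, win_end)? */
bool fits_in_window(uint64_t win_start, uint64_t win_end,
                    uint64_t size, uint64_t align)
{
    uint64_t start = align_up(win_start, align);

    return start <= win_end && size <= win_end - start;
}

/* Full range needed for all VFs: the per-VF BAR size times the VF count. */
uint64_t vf_bar_span(uint64_t vf_bar_size, unsigned int num_vfs)
{
    return vf_bar_size * num_vfs;
}
```

With a 1 MiB VF BAR and 4 VFs the span is 4 MiB. In a window from 0x100000 to 0x580000, aligning to the whole span pushes the start to 0x400000 and the request no longer fits, while aligning to a single VF BAR lets it start at 0x100000 and fit.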
      
      I recognize that while resource_alignment is generic, it's basically a
      PCI helper.  An alternative to this patch is to add PCI VF BAR specific
      information to struct resource.  I opted for the extra layer rather than
      adding such PCI specific information to struct resource.  This does
      have the slight downside that we don't cache the BAR size and re-read
      for each alignment query (happens a small handful of times during boot
      for each VF BAR).
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Yu Zhao <yu.zhao@intel.com>
      Cc: stable@kernel.org
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  9. 18 May, 2009 2 commits
  10. 31 Mar, 2009 1 commit
  11. 21 Mar, 2009 8 commits
  12. 14 Feb, 2009 1 commit
  13. 17 Jan, 2009 1 commit
    • PCI PM: Restore standard config registers of all devices early · aa8c6c93
      Rafael J. Wysocki committed
      There is a problem in our handling of suspend-resume of PCI devices:
      many of them have their standard config registers restored with
      interrupts enabled, and they are put into the full power state with
      interrupts enabled as well.  This may lead to the following scenario:
        * an interrupt vector is shared between two or more devices
        * one device is resumed earlier and generates an interrupt
        * the interrupt handler of another device tries to handle it and
          attempts to access the device the config space of which hasn't been
          restored yet and/or which still is in a low power state
        * the system crashes as a result
      
      To prevent this from happening we should restore the standard
      configuration registers of all devices with interrupts disabled and we
      should put them into the D0 power state right after that.
      Unfortunately, this cannot be done using the existing
      pci_set_power_state(), because it can sleep.  Also, to do it we have to
      make sure that the config spaces of all devices were actually saved
      during suspend.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
  14. 08 Jan, 2009 11 commits
  15. 23 Oct, 2008 1 commit
  16. 21 Oct, 2008 3 commits