1. 27 Oct, 2016 1 commit
    • vfio/pci: Fix integer overflows, bitmask check · 05692d70
      Vlad Tsyrklevich committed
      The VFIO_DEVICE_SET_IRQS ioctl did not sufficiently sanitize
      user-supplied integers, potentially allowing memory corruption. This
      patch adds appropriate integer overflow checks, checks the range bounds
      for VFIO_IRQ_SET_DATA_NONE, and also verifies that only a single element
      in the VFIO_IRQ_SET_DATA_TYPE_MASK bitmask is set.
      VFIO_IRQ_SET_ACTION_TYPE_MASK is already correctly checked later in
      vfio_pci_set_irqs_ioctl().
      
      Furthermore, a kzalloc is changed to a kcalloc because the use of a
      kzalloc with an integer multiplication allowed an integer overflow
      condition to be reached without this patch. kcalloc checks for overflow
      and should prevent a similar occurrence.
      Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
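
The three kinds of check this commit message describes (exactly one bit set within a mask, a start/count range that cannot wrap, and an overflow-safe array allocation) can be sketched in userspace C. This is a minimal illustration, not the kernel patch itself: the flag macros are stand-ins, and `exactly_one_bit`, `range_ok`, and `alloc_array` are hypothetical helper names. `calloc` plays the role that `kcalloc` plays in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-ins for the VFIO_IRQ_SET_DATA_* flag bits. */
#define DATA_NONE      (1u << 0)
#define DATA_BOOL      (1u << 1)
#define DATA_EVENTFD   (1u << 2)
#define DATA_TYPE_MASK (DATA_NONE | DATA_BOOL | DATA_EVENTFD)

/* True when exactly one bit of `mask` is set in `flags`. */
static bool exactly_one_bit(uint32_t flags, uint32_t mask)
{
    uint32_t v = flags & mask;

    return v != 0 && (v & (v - 1)) == 0;
}

/* True when [start, start + count) lies inside [0, max).
 * Written so that start + count is never computed and cannot wrap. */
static bool range_ok(uint32_t start, uint32_t count, uint32_t max)
{
    return start < max && count <= max - start;
}

/* Allocate count * size zeroed bytes, or return NULL if the product
 * would overflow size_t. calloc already performs this check; it is
 * spelled out here only to make the idea visible. */
static void *alloc_array(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;
    return calloc(count, size);
}
```

A plain `malloc(count * size)` would silently wrap for large user-supplied values, which is the class of bug the commit message attributes to the old `kzalloc` call.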
  2. 09 Jul, 2016 1 commit
    • vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive · 05f0c03f
      Yongji Xie committed
      The current vfio-pci implementation does not allow mmap of
      sub-page (size < PAGE_SIZE) MMIO BARs because the MMIO page of
      such a BAR may be shared with other BARs. This causes
      performance issues when we pass through a PCI device with
      this kind of BAR: the guest cannot access the BAR directly,
      so every MMIO access is emulated in the host.
      
      However, not all sub-page BARs share a page with other BARs.
      We should allow mmap of the sub-page MMIO BARs that we can
      guarantee will not share a page with other BARs.
      
      This patch adds support for this case. We also try to add a
      dummy resource to reserve the remainder of the page, into which
      a hot-added device's BAR might otherwise be assigned. It is not
      necessary to handle the case where the BAR is not page aligned,
      because we cannot expect the BAR to be assigned to the same
      location within a page in the guest when we pass it through, and
      it is hard to access such a BAR from userspace because we have
      no way to get the BAR's location within a page.
      Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
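
The eligibility rule described in this commit message can be sketched as a small predicate. This is a simplified model under stated assumptions: a fixed 4 KiB page size, a hypothetical `rest_of_page_reserved` flag standing in for "the dummy resource covering the remainder of the page was successfully reserved", and a hypothetical function name. It is not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for the sketch; the real value is per-architecture. */
#define SKETCH_PAGE_SIZE 4096u

/* Decide whether a BAR may be mmapped directly.
 *  - BARs of at least one page are mappable, as before the patch.
 *  - Sub-page BARs are mappable only if they start on a page boundary
 *    and nothing else can be placed in the remainder of that page. */
static bool bar_can_mmap(uint64_t start, uint64_t size,
                         bool rest_of_page_reserved)
{
    if (size == 0)
        return false;               /* unimplemented BAR */
    if (size >= SKETCH_PAGE_SIZE)
        return true;
    if (start % SKETCH_PAGE_SIZE != 0)
        return false;               /* unaligned sub-page BAR: not handled */
    return rest_of_page_reserved;
}
```

The unaligned case is rejected outright, matching the commit's reasoning that userspace has no way to learn the BAR's offset within the page.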
  3. 29 Apr, 2016 1 commit
    • vfio/pci: Hide broken INTx support from user · 45074405
      Alex Williamson committed
      INTx masking has two components.  The first is that we need the ability
      to prevent the device from continuing to assert INTx.  This is
      provided via the DisINTx bit in the command register and is the only
      thing we can really probe for when testing if INTx masking is
      supported.  The second component is that the device needs to indicate
      if INTx is asserted via the interrupt status bit in the device status
      register.  With these two features we can generically determine if one
      of the devices we own is asserting INTx, signal the user, and mask the
      interrupt while the user services the device.
      
      Generally if one or both of these components is broken we resort to
      APIC level interrupt masking, which requires an exclusive interrupt
      since we have no way to determine the source of the interrupt in a
      shared configuration.  This often makes it difficult or impossible to
      configure the system for userspace use of the device, for an interrupt
      mode that the user may not need.
      
      One possible configuration of broken INTx masking is that the DisINTx
      support is fully functional, but the interrupt status bit never
      signals interrupt assertion.  In this case we do have the ability to
      prevent the device from asserting INTx, but lack the ability to
      identify the interrupt source.  For this case we can simply pretend
      that the device lacks INTx support entirely, keeping DisINTx set on
      the physical device, virtualizing this bit for the user, and
      virtualizing the interrupt pin register to indicate no INTx support.
      We already support virtualization of the DisINTx bit and already
      virtualize the interrupt pin for platforms without INTx support.  By
      tying these components together, setting DisINTx on open and reset,
      and identifying devices broken in this particular way, we can provide
      support for them w/o the handicap of APIC level INTx masking.
      
      Intel i40e (XL710/X710) 10/20/40GbE NICs have been identified as being
      broken in this specific way.  We leave the vfio-pci.nointxmask option
      as a mechanism to bypass this support, enabling INTx on the device
      with all the requirements of APIC level masking.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: John Ronciak <john.ronciak@intel.com>
      Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
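
The two components of INTx masking described above map onto two register bits: the INTx Disable bit (bit 10) of the PCI command register and the Interrupt Status bit (bit 3) of the PCI status register. The sketch below works on plain register values rather than real config-space accesses. The macro values follow the PCI specification, while the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_INTX_DISABLE 0x400 /* command register, bit 10 */
#define PCI_STATUS_INTERRUPT     0x08  /* status register, bit 3 */

/* Component 2: is this device the one asserting INTx right now?
 * Only meaningful if the device reports the status bit correctly. */
static bool intx_asserted(uint16_t command, uint16_t status)
{
    return !(command & PCI_COMMAND_INTX_DISABLE) &&
           (status & PCI_STATUS_INTERRUPT);
}

/* Component 1: stop the device from asserting INTx. */
static uint16_t intx_mask(uint16_t command)
{
    return command | PCI_COMMAND_INTX_DISABLE;
}
```

On a device broken in the way this commit describes, `intx_asserted` would never return true even while the line is asserted, which is why the commit keeps DisINTx set and hides INTx from the user instead.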
  4. 28 Feb, 2016 1 commit
  5. 26 Feb, 2016 1 commit
  6. 23 Feb, 2016 5 commits
  7. 22 Dec, 2015 1 commit
    • vfio: Include No-IOMMU mode · 03a76b60
      Alex Williamson committed
      There is really no way to safely give a user full access to a DMA
      capable device without an IOMMU to protect the host system.  There is
      also no way to provide DMA translation, for use cases such as device
      assignment to virtual machines.  However, there are still those users
      that want userspace drivers even under those conditions.  The UIO
      driver exists for this use case, but does not provide the degree of
      device access and programming that VFIO has.  In an effort to avoid
      code duplication, this introduces a No-IOMMU mode for VFIO.
      
      This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling
      the "enable_unsafe_noiommu_mode" option on the vfio driver.  This
      should make it very clear that this mode is not safe.  Additionally,
      CAP_SYS_RAWIO privileges are necessary to work with groups and
      containers using this mode.  Groups making use of this support are
      named /dev/vfio/noiommu-$GROUP and can only make use of the special
      VFIO_NOIOMMU_IOMMU for the container.  Use of this mode, specifically
      binding a device without a native IOMMU group to a VFIO bus driver
      will taint the kernel and should therefore not be considered
      supported.  This patch includes no-iommu support for the vfio-pci bus
      driver only.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
  8. 04 Dec, 2015 1 commit
  9. 20 Nov, 2015 1 commit
  10. 05 Nov, 2015 1 commit
    • vfio: Include No-IOMMU mode · 033291ec
      Alex Williamson committed
      There is really no way to safely give a user full access to a DMA
      capable device without an IOMMU to protect the host system.  There is
      also no way to provide DMA translation, for use cases such as device
      assignment to virtual machines.  However, there are still those users
      that want userspace drivers even under those conditions.  The UIO
      driver exists for this use case, but does not provide the degree of
      device access and programming that VFIO has.  In an effort to avoid
      code duplication, this introduces a No-IOMMU mode for VFIO.
      
      This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling
      the "enable_unsafe_noiommu_mode" option on the vfio driver.  This
      should make it very clear that this mode is not safe.  Additionally,
      CAP_SYS_RAWIO privileges are necessary to work with groups and
      containers using this mode.  Groups making use of this support are
      named /dev/vfio/noiommu-$GROUP and can only make use of the special
      VFIO_NOIOMMU_IOMMU for the container.  Use of this mode, specifically
      binding a device without a native IOMMU group to a VFIO bus driver
      will taint the kernel and should therefore not be considered
      supported.  This patch includes no-iommu support for the vfio-pci bus
      driver only.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
  11. 10 Jun, 2015 1 commit
    • vfio/pci: Fix racy vfio_device_get_from_dev() call · 20f30017
      Alex Williamson committed
      Testing the driver for a PCI device is racy: it can be all but
      complete in the release path and still report the driver as ours.
      Therefore we can't trust drvdata to be valid.  This race can sometimes
      be seen when one port of a multifunction device is being unbound from
      the vfio-pci driver while another function is being released by the
      user and attempting a bus reset.  The device in the remove path is
      found as a dependent device for the bus reset of the release path
      device, the driver is still set to vfio-pci, but the drvdata has
      already been cleared, resulting in a null pointer dereference.
      
      To resolve this, fix vfio_device_get_from_dev() to not take the
      dev_get_drvdata() shortcut and instead traverse through the
      iommu_group, vfio_group, vfio_device path to get a reference we
      can trust.  Once we have that reference, we know the device isn't
      in transition and we can test to make sure the driver is still what
      we expect, so that we don't interfere with devices we don't own.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  12. 02 May, 2015 1 commit
  13. 08 Apr, 2015 6 commits
  14. 17 Mar, 2015 2 commits
  15. 11 Feb, 2015 1 commit
  16. 08 Jan, 2015 1 commit
  17. 08 Nov, 2014 1 commit
  18. 30 Sep, 2014 1 commit
    • vfio-pci: Fix remove path locking · 93899a67
      Alex Williamson committed
      Locking both the remove() and release() path results in a deadlock
      that should have been obvious.  To fix this we can get and hold the
      vfio_device reference as we evaluate whether to do a bus/slot reset.
      This will automatically block any remove() calls, allowing us to
      remove the explicit lock.  Fixes 61d79256.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Cc: stable@vger.kernel.org	[3.17]
  19. 09 Aug, 2014 1 commit
  20. 08 Aug, 2014 3 commits
    • vfio-pci: Attempt bus/slot reset on release · bc4fba77
      Alex Williamson committed
      Each time a device is released, mark whether a local reset was
      successful or whether a bus/slot reset is needed.  If a reset is
      needed and all of the affected devices are bound to vfio-pci and
      unused, allow the reset.  This is most useful when the userspace
      driver is killed and releases all the devices in an unclean state,
      such as when a QEMU VM quits.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
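
The release-time decision described above (reset only if one is needed and every affected device is bound to vfio-pci and unused) can be modelled as a simple predicate over the set of affected devices. The `dev_state` structure and `can_bus_reset` function are hypothetical names for this sketch and do not mirror the driver's internal data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device view of what the release path needs to know. */
struct dev_state {
    bool bound_to_vfio_pci;   /* is vfio-pci the bound driver? */
    unsigned int open_count;  /* users currently holding the device open */
};

/* Allow a bus/slot reset only when it is needed and every device the
 * reset would affect is owned by vfio-pci and idle. */
static bool can_bus_reset(const struct dev_state *devs, size_t n,
                          bool needs_reset)
{
    if (!needs_reset)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].bound_to_vfio_pci || devs[i].open_count != 0)
            return false;
    }
    return true;
}
```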
    • vfio-pci: Use mutex around open, release, and remove · 61d79256
      Alex Williamson committed
      Serializing open/release allows us to fix a refcnt error if we fail
      to enable the device and lets us prevent devices from being unbound
      or opened, giving us an opportunity to do bus resets on release.  No
      restriction is added to serialize binding devices to vfio-pci while
      the mutex is held, though.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    • vfio-pci: Release devices with BusMaster disabled · 9c22e660
      Alex Williamson committed
      Our current open/release path looks like this:
      
      vfio_pci_open
        vfio_pci_enable
          pci_enable_device
          pci_save_state
          pci_store_saved_state
      
      vfio_pci_release
        vfio_pci_disable
          pci_disable_device
          pci_restore_state
      
      pci_enable_device() doesn't modify PCI_COMMAND_MASTER, so if a device
      comes to us with it enabled, it persists through the open and gets
      stored as part of the device saved state.  We then restore that saved
      state when released, which can allow the device to attempt to continue
      to do DMA.  When the group is disconnected from the domain, this will
      get caught by the IOMMU, but if there are other devices in the group,
      the device may continue running and interfere with the user.  Even in
      the former case, IOMMUs don't necessarily behave well and a stream of
      blocked DMA can result in unpleasant behavior on the host.
      
      Explicitly disable Bus Master as we're enabling the device and
      slightly re-work release to make sure that pci_disable_device() is
      the last thing that touches the device.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
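
Why clearing Bus Master before saving state matters can be shown with a toy model of the command register. The structure and function names below are hypothetical; only the bit value of `PCI_COMMAND_MASTER` (bit 2 of the command register) comes from the PCI specification. The model is deliberately reduced to the one register the commit message discusses.

```c
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x4 /* command register, bit 2 */

/* Toy device: the live command register and the copy saved at open. */
struct fake_pci_dev {
    uint16_t command;
    uint16_t saved_command;
};

/* Open path: clear Bus Master first, then save state, so the saved
 * copy can never re-enable DMA when it is restored later. */
static void sketch_enable(struct fake_pci_dev *d)
{
    d->command &= (uint16_t)~PCI_COMMAND_MASTER;
    d->saved_command = d->command;
}

/* Release path: restore whatever was saved at open. */
static void sketch_release(struct fake_pci_dev *d)
{
    d->command = d->saved_command;
}
```

Without the clearing step in `sketch_enable`, a device that arrived with Bus Master set would get it back on release, which is the behavior the commit sets out to remove.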
  21. 05 Aug, 2014 1 commit
  22. 31 May, 2014 2 commits
  23. 16 Jan, 2014 1 commit
    • vfio-pci: Use pci "try" reset interface · 890ed578
      Alex Williamson committed
      PCI resets will attempt to take the device_lock for any device to be
      reset.  This is a problem if that lock is already held, for instance
      in the device remove path.  It's not sufficient to simply kill the
      user process or skip the reset if called after .remove as a race could
      result in the same deadlock.  Instead, we handle all resets as "best
      effort" using the PCI "try" reset interfaces.  This prevents the user
      from being able to induce a deadlock by triggering a reset.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
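
The "best effort" idea can be illustrated with a try-lock: instead of blocking on a lock that the remove path may already hold, the reset gives up immediately and reports that it could not run. The sketch uses a C11 `atomic_flag` as a stand-in for the kernel's device lock. The `fake_device` type, the `try_reset` name, and the choice of `-EAGAIN` as the "lock busy" result are assumptions made for illustration.

```c
#include <errno.h>
#include <stdatomic.h>

/* Toy device with a lock and a counter recording completed resets. */
struct fake_device {
    atomic_flag lock;
    int reset_count;
};

/* Attempt a reset without ever blocking on the lock.
 * Returns 0 on success or -EAGAIN if the lock is already held. */
static int try_reset(struct fake_device *d)
{
    if (atomic_flag_test_and_set(&d->lock))
        return -EAGAIN;          /* someone else holds the lock: skip */

    d->reset_count++;            /* stands in for the actual reset */

    atomic_flag_clear(&d->lock);
    return 0;
}
```

Because the caller never waits, a user who triggers a reset while the device is being removed gets an error back rather than a deadlock.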
  24. 15 Jan, 2014 1 commit
  25. 05 Sep, 2013 1 commit
    • vfio-pci: PCI hot reset interface · 8b27ee60
      Alex Williamson committed
      The current VFIO_DEVICE_RESET interface only maps to PCI use cases
      where we can isolate the reset to the individual PCI function.  This
      means the device must support FLR (PCIe or AF), PM reset on D3hot->D0
      transition, device specific reset, or be a singleton device on a bus
      for a secondary bus reset.  FLR does not have widespread support,
      PM reset is not very reliable, and bus topology is dictated by the
      system and device design.  We need to provide a means for a user to
      induce a bus reset in cases where the existing mechanisms are not
      available or not reliable.
      
      This device specific extension to VFIO provides the user with this
      ability.  Two new ioctls are introduced:
       - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO
       - VFIO_DEVICE_PCI_HOT_RESET
      
      The first provides the user with information about the extent of
      devices affected by a hot reset.  This is essentially a list of
      devices and the IOMMU groups they belong to.  The user may then
      initiate a hot reset by calling the second ioctl.  We must be
      careful that the user has ownership of all the affected devices
      found via the first ioctl, so the second ioctl takes a list of file
      descriptors for the VFIO groups affected by the reset.  Each group
      must have IOMMU protection established for the ioctl to succeed.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
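
A userspace caller would typically use the first ioctl in two passes: once with no room for devices to learn how many are affected, then again with a buffer of the right size. The sketch below assumes a Linux system with `<linux/vfio.h>` installed, assumes the header spells the request `VFIO_DEVICE_GET_PCI_HOT_RESET_INFO` (the commit message above orders the words differently), and assumes the kernel reports `ENOSPC` together with the required `count` when the buffer is too small. `query_hot_reset_info` is a hypothetical helper name.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Return a heap-allocated list of devices affected by a hot reset of
 * the device behind device_fd, or NULL on failure. Caller frees it. */
static struct vfio_pci_hot_reset_info *query_hot_reset_info(int device_fd)
{
    struct vfio_pci_hot_reset_info *info, *bigger;
    size_t size = sizeof(*info);

    info = calloc(1, size);
    if (!info)
        return NULL;
    info->argsz = (__u32)size;

    /* Pass 1: no room for any device entries. */
    if (ioctl(device_fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info) == 0)
        return info;                       /* nothing else is affected */
    if (errno != ENOSPC) {
        free(info);
        return NULL;
    }

    /* Pass 2: retry with room for info->count entries. */
    size = sizeof(*info) + info->count * sizeof(info->devices[0]);
    bigger = realloc(info, size);
    if (!bigger) {
        free(info);
        return NULL;
    }
    info = bigger;
    memset(info, 0, size);
    info->argsz = (__u32)size;

    if (ioctl(device_fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info) != 0) {
        free(info);
        return NULL;
    }
    return info;
}
```

The returned entries identify the IOMMU groups whose file descriptors the caller must then pass to the second ioctl, as the commit message explains.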
  26. 25 Jul, 2013 1 commit
    • vfio-pci: Avoid deadlock on remove · d24cdbfd
      Alex Williamson committed
      If an attempt is made to unbind a device from vfio-pci while that
      device is in use, the request is blocked until the device becomes
      unused.  Unfortunately, that unbind path still grabs the device_lock,
      which certain things like __pci_reset_function() also want to take.
      This means we need to try to acquire the locks ourselves and use the
      pre-locked version, __pci_reset_function_locked().
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  27. 29 Jun, 2013 1 commit