1. 24 Aug 2018 — 2 commits
  2. 21 Aug 2018 — 1 commit
    • vfio/spapr: Allow backing bigger guest IOMMU pages with smaller physical pages · c26bc185
      Committed by Alexey Kardashevskiy
      At the moment the PPC64/pseries guest only supports 4K/64K/16M IOMMU
      pages, and the POWER8 CPU supports exactly the same set of page sizes,
      so far things have worked fine.
      
      However, POWER9 supports a different set of sizes - 4K/64K/2M/1G - and
      the last two - 2M and 1G - are not even allowed in the paravirt interface
      (RTAS DDW), so we always end up using 64K IOMMU pages, although we could
      back the guest's 16MB IOMMU pages with 2MB pages on the host.
      
      This stores the supported host IOMMU page sizes in VFIOContainer and uses
      them later when creating a new DMA window. This uses the system page size
      (64K normally, 2M/16M/1G if hugepages are used) as the upper limit of
      the IOMMU page size.
      
      This changes the type of @pagesize to uint64_t as this is what
      memory_region_iommu_get_min_page_size() returns and clz64() takes.
      
      There should be no behavioral changes on platforms other than pseries.
      The guest will keep using the IOMMU page size selected by the PHB pagesize
      property as this only changes the underlying hardware TCE table
      granularity.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      c26bc185
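The page-size selection described above can be sketched as follows; `pick_iommu_pagesize`, its bitmask argument, and the use of `__builtin_clzll` are illustrative stand-ins for the VFIOContainer field and clz64() mentioned in the commit, not the actual QEMU code:

```c
#include <stdint.h>

/* Sketch: from a bitmask of host-supported IOMMU page sizes, pick the
 * largest size not exceeding the system-page-size limit. */
static uint64_t pick_iommu_pagesize(uint64_t host_pgsizes, uint64_t limit)
{
    uint64_t mask = host_pgsizes & ((limit << 1) - 1); /* keep sizes <= limit */

    if (!mask) {
        return 0; /* no usable page size */
    }
    return 1ULL << (63 - __builtin_clzll(mask)); /* highest remaining bit */
}
```

With host sizes 4K/64K/2M and a 64K system page size this yields 64K; raising the limit to 2M (hugepages) yields 2M.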
  3. 17 Aug 2018 — 2 commits
    • vfio/ccw/pci: Allow devices to opt-in for ballooning · 238e9172
      Committed by Alex Williamson
      If a vfio assigned device makes use of a physical IOMMU, then memory
      ballooning is necessarily inhibited due to the page pinning, the lack of
      page-level granularity at the IOMMU, and the lack of sufficient notifiers
      to both remove the page on balloon inflation and add it back on deflation.
      However, not all devices are backed by a physical IOMMU.  In the case
      of mediated devices, if a vendor driver is well synchronized with the
      guest driver, such that only pages actively used by the guest driver
      are pinned by the host mdev vendor driver, then there should be no
      overlap between pages available for the balloon driver and pages
      actively in use by the device.  Under these conditions, ballooning
      should be safe.
      
      vfio-ccw devices are always mediated devices and always operate under
      the constraints above.  Therefore we can consider all vfio-ccw devices
      as balloon compatible.
      
      The situation is far from straightforward with vfio-pci.  These
      devices can be physical devices with physical IOMMU backing or
      mediated devices where it is unknown whether a physical IOMMU is in
      use or whether the vendor driver is well synchronized to the working
      set of the guest driver.  The safest approach is therefore to assume
      all vfio-pci devices are incompatible with ballooning, but allow user
      opt-in should they have further insight into mediated devices.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      238e9172
    • vfio: Inhibit ballooning based on group attachment to a container · c65ee433
      Committed by Alex Williamson
      We use a VFIOContainer to associate an AddressSpace to one or more
      VFIOGroups.  The VFIOContainer represents the DMA context for that
      AddressSpace for those VFIOGroups and is synchronized to changes in
      that AddressSpace via a MemoryListener.  For IOMMU backed devices,
      maintaining the DMA context for a VFIOGroup generally involves
      pinning a host virtual address in order to create a stable host
      physical address and then mapping a translation from the associated
      guest physical address to that host physical address into the IOMMU.
      
      While the above maintains the VFIOContainer synchronized to the QEMU
      memory API of the VM, memory ballooning occurs outside of that API.
      Inflating the memory balloon (ie. cooperatively capturing pages from
      the guest for use by the host) simply uses MADV_DONTNEED to "zap"
      pages from QEMU's host virtual address space.  The page pinning and
      IOMMU mapping above remains in place, negating the host's ability to
      reuse the page, but the host virtual to host physical mapping of the
      page is invalidated outside of QEMU's memory API.
      
      When the balloon is later deflated, attempting to cooperatively
      return pages to the guest, the page is simply freed by the guest
      balloon driver, allowing it to be used in the guest and incurring a
      page fault when that occurs.  The page fault maps a new host physical
      page backing the existing host virtual address, meanwhile the
      VFIOContainer still maintains the translation to the original host
      physical address.  At this point the guest vCPU and any assigned
      devices will map different host physical addresses to the same guest
      physical address.  Badness.
      
      The IOMMU typically does not have page-level granularity with which
      it can track this mapping without also incurring inefficiencies in
      using page-size mappings throughout.  MMU notifiers in the host
      kernel provide indicators only for invalidating the mapping on
      balloon inflation, not for updating the mapping when the balloon is
      deflated.  For these reasons we assume a default behavior that the
      mapping of each VFIOGroup into the VFIOContainer is incompatible
      with memory ballooning, and increment the balloon inhibitor to match
      the attached VFIOGroups.
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      c65ee433
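The inhibitor counting described in the last paragraph can be sketched as a plain refcount; the function names mirror QEMU's qemu_balloon_inhibit() in shape, but this stand-alone version is only an illustration:

```c
#include <stdbool.h>

/* Each VFIOGroup attached to a container bumps the count, each detach
 * drops it; ballooning stays disabled while any group holds a reference. */
static int balloon_inhibit_count;

static void balloon_inhibit(bool state) /* true on attach, false on detach */
{
    balloon_inhibit_count += state ? 1 : -1;
}

static bool balloon_inhibited(void)
{
    return balloon_inhibit_count > 0;
}
```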
  4. 12 Jul 2018 — 1 commit
    • vfio/pci: do not set the PCIDevice 'has_rom' attribute · 26c0ae56
      Committed by Cédric Le Goater
      PCI devices needing a ROM allocate an optional MemoryRegion with
      pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
      device is destroyed. The only action taken by this routine is to call
      vmstate_unregister_ram(), which clears the id string of the optional
      ROM RAMBlock and now also flags the RAMBlock as non-migratable, a
      behavior recently added by commit b895de50 ("migration: discard
      non-migratable RAMBlocks").
      
      VFIO devices do their own loading of the PCI option ROM in
      vfio_pci_size_rom(). The memory region is switched to an I/O region
      and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
      region is not allocated. When the associated PCI device is deleted,
      pci_del_option_rom() calls vmstate_unregister_ram() which tries to
      flag a NULL RAMBlock, leading to a SEGV.
      
      It seems that 'has_rom' was set to have memory_region_destroy()
      called, but since commit 469b046e ("memory: remove
      memory_region_destroy") this is not necessary anymore as the
      MemoryRegion is freed automagically.
      
      Remove the PCIDevice 'has_rom' attribute setting in vfio.
      
      Fixes: b895de50 ("migration: discard non-migratable RAMBlocks")
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      26c0ae56
  5. 02 Jul 2018 — 1 commit
  6. 18 Jun 2018 — 1 commit
    • vfio-ccw: add force unlimited prefetch property · 9a51c9ee
      Committed by Halil Pasic
      There is at least one guest (OS) that, although it does not rely on
      the guarantees tied to ORB 1 word 9 bit (aka unlimited prefetch, aka
      the P bit) being unset, fails to tell this to the machine.
      
      Usually this ain't a big deal, as the original purpose of the P bit is to
      allow for performance optimizations. vfio-ccw, however, cannot provide the
      guarantees required if the bit is not set.
      
      It is not possible to implement support for the P bit not being set
      without transitioning to lower-level protocols for vfio-ccw.  So let's
      give the user the opportunity to force setting the P bit, if the user
      knows this is safe.  For self-modifying channel programs forcing the P
      bit is not safe.  If the P bit is forced for a self-modifying channel
      program, things are expected to break in strange ways.
      
      Let's also avoid warning multiple times about the P bit not being set in
      the ORB in case the P bit is not forced, and name the affected vfio-ccw
      device.
      Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
      Suggested-by: Dong Jia Shi <bjsdjshi@linux.ibm.com>
      Acked-by: Jason J. Herne <jjherne@linux.ibm.com>
      Tested-by: Jason J. Herne <jjherne@linux.ibm.com>
      Message-Id: <20180524175828.3143-2-pasic@linux.ibm.com>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      9a51c9ee
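A minimal sketch of the two behaviors this commit adds — the force property and the once-per-device warning — with illustrative names, not the actual vfio-ccw fields or return semantics:

```c
#include <stdbool.h>

typedef struct {
    bool force_orb_pfch;  /* the new user-settable property */
    bool warned_orb_pfch; /* ensures the warning fires only once */
} CcwDeviceSketch;

/* Returns true if the ORB can be handled; counts warnings via *warnings
 * as a stand-in for a warning message naming the device. */
static bool check_orb_pfch(CcwDeviceSketch *d, bool p_bit_set, int *warnings)
{
    if (p_bit_set || d->force_orb_pfch) {
        return true; /* either the guest set it, or the user forced it */
    }
    if (!d->warned_orb_pfch) {
        d->warned_orb_pfch = true;
        (*warnings)++;
    }
    return false;
}
```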
  7. 15 Jun 2018 — 1 commit
    • iommu: Add IOMMU index argument to notifier APIs · cb1efcf4
      Committed by Peter Maydell
      Add support for multiple IOMMU indexes to the IOMMU notifier APIs.
      When initializing a notifier with iommu_notifier_init(), the caller
      must pass the IOMMU index that it is interested in. When a change
      happens, the IOMMU implementation must pass
      memory_region_notify_iommu() the IOMMU index that has changed and
      that notifiers must be called for.
      
      IOMMUs which support only a single index don't need to change.
      Callers which only really support working with IOMMUs with a single
      index can use the result of passing MEMTXATTRS_UNSPECIFIED to
      memory_region_iommu_attrs_to_index().
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
      Message-id: 20180604152941.20374-3-peter.maydell@linaro.org
      cb1efcf4
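The index-filtered dispatch the commit describes can be sketched like this; the structures are illustrative stand-ins, not QEMU's IOMMUNotifier:

```c
typedef struct {
    int iommu_idx; /* the index the caller passed at init time */
    int fired;     /* how many events this notifier has seen */
} IOMMUNotifierSketch;

/* An event carries the index that changed; only notifiers registered
 * for that index are called. */
static void notify_iommu_index(IOMMUNotifierSketch *list, int n,
                               int changed_idx)
{
    for (int i = 0; i < n; i++) {
        if (list[i].iommu_idx == changed_idx) {
            list[i].fired++;
        }
    }
}
```

An IOMMU supporting a single index would register everything with index 0, matching the unchanged behavior the commit mentions.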
  8. 05 Jun 2018 — 5 commits
  9. 01 Jun 2018 — 1 commit
  10. 31 May 2018 — 1 commit
  11. 30 Apr 2018 — 1 commit
    • vfio-ccw: introduce vfio_ccw_get_device() · c96f2c2a
      Committed by Greg Kurz
      A recent patch fixed leaks of the dynamically allocated vcdev->vdev.name
      field in vfio_ccw_realize(), but we now have three freeing sites for it.
      This is unfortunate and seems to indicate something is wrong with its
      life cycle.
      
      The root issue is that vcdev->vdev.name is set before vfio_get_device()
      is called, which theoretically prevents us from calling vfio_put_device()
      to do the freeing. Well, actually, we could call it anyway, because
      vfio_put_base_device() is a nop if the device isn't attached, but this
      would be confusing.
      
      This patch hence moves all the logic of attaching the device, including
      the "already attached" check, to a separate vfio_ccw_get_device() function,
      counterpart of vfio_put_device(). While here, vfio_put_device() is renamed
      to vfio_ccw_put_device() for consistency.
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <152326891065.266543.9487977590811413472.stgit@bahia.lan>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      c96f2c2a
  12. 27 Apr 2018 — 1 commit
    • ui: introduce vfio_display_reset · 8983e3e3
      Committed by Tina Zhang
      During guest OS reboot, the guest framebuffer is invalid. This can
      cause bugs if the invalid guest framebuffer is still used by the host.
      
      This patch introduces vfio_display_reset, which is invoked during
      vfio display reset. This function is used to release the invalid
      display resources, disable scanout mode, and replace the invalid
      surface with QemuConsole's DisplaySurface.
      
      This patch fixes the GPU hang issue caused by gd_egl_draw during
      guest OS reboot.
      
      Changes v3->v4:
       - Move dma-buf based display check into the vfio_display_reset().
         (Gerd)
      
      Changes v2->v3:
       - Limit vfio_display_reset to dma-buf based vfio display. (Gerd)
      
      Changes v1->v2:
       - Use dpy_gfx_update_full() update screen after reset. (Gerd)
       - Remove dpy_gfx_switch_surface(). (Gerd)
      Signed-off-by: Tina Zhang <tina.zhang@intel.com>
      Message-id: 1524820266-27079-3-git-send-email-tina.zhang@intel.com
      Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
      8983e3e3
  13. 09 Apr 2018 — 1 commit
    • vfio-ccw: fix memory leaks in vfio_ccw_realize() · be4d026f
      Committed by Greg Kurz
      If the subchannel is already attached or if vfio_get_device() fails, the
      code jumps to the 'out_device_err' label and doesn't free the string it
      has just allocated.
      
      The code should be reworked so that vcdev->vdev.name only gets set when
      the device has been attached, and freed when it is about to be detached.
      This could be achieved with the addition of a vfio_ccw_get_device()
      function that would be the counterpart of vfio_put_device(). But this is
      a more elaborate cleanup that should be done in a follow-up. For now,
      let's just add calls to g_free() on the buggy error paths.
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Message-Id: <152311222681.203086.8874800175539040298.stgit@bahia>
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      be4d026f
  14. 06 Apr 2018 — 1 commit
    • vfio: Use a trace point when a RAM section cannot be DMA mapped · 5c086005
      Committed by Eric Auger
      Commit 567b5b30 ("vfio/pci: Relax DMA map errors for MMIO regions")
      added an error message if a passed memory section address or size
      is not aligned to the page size and thus cannot be DMA mapped.
      
      This patch fixes the trace by printing the region name and the
      memory region section offset within the address space (instead of
      offset_within_region).
      
      We also turn the error_report into a trace event. Indeed, in some
      cases the messages can confuse non-expert end-users into thinking
      the use case does not work (whereas it works as before).
      
      This is the case where a BAR is successively mapped at different
      GPAs and its sections are not compatible with DMA mapping. The listener
      is called several times and traces are issued for each intermediate
      mapping.  The end-user cannot easily match those GPAs against the
      final GPA output by lspci. So let's keep this information for
      informed users. In the mid term, the plan is to advise the user about
      BAR relocation relevance.
      
      Fixes: 567b5b30 ("vfio/pci: Relax DMA map errors for MMIO regions")
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      5c086005
  15. 14 Mar 2018 — 7 commits
  16. 08 Mar 2018 — 1 commit
  17. 06 Mar 2018 — 1 commit
  18. 09 Feb 2018 — 2 commits
  19. 07 Feb 2018 — 9 commits
    • vfio: listener unregister before unset container · 36968626
      Committed by Peter Xu
      After the next patch, listener unregister will need the container to be
      alive.  Let's move this unregister phase to before unsetting the
      container, since that operation will free the backend container in the
      kernel; otherwise we'll get these errors after the next patch:
      
      qemu-system-x86_64: VFIO_UNMAP_DMA: -22
      qemu-system-x86_64: vfio_dma_unmap(0x559bf53a4590, 0x0, 0xa0000) = -22 (Invalid argument)
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20180122060244.29368-4-peterx@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      36968626
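The required ordering can be sketched with a toy container whose unmap path asserts the container is still alive; all names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool alive;
    int unmaps;
} ContainerSketch;

static void vfio_dma_unmap_sketch(ContainerSketch *c)
{
    assert(c->alive); /* unmapping via a dead container is the -22 failure */
    c->unmaps++;
}

static void listener_unregister_sketch(ContainerSketch *c)
{
    vfio_dma_unmap_sketch(c); /* teardown may still issue unmaps */
}

static void container_unset_sketch(ContainerSketch *c)
{
    c->alive = false; /* the kernel frees the backend container here */
}

static void teardown(ContainerSketch *c)
{
    listener_unregister_sketch(c); /* first, while the container is alive */
    container_unset_sketch(c);     /* then release the container */
}
```

Reversing the two calls in teardown() would trip the assertion, which is the analogue of the VFIO_UNMAP_DMA errors quoted above.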
    • vfio/pci: Add option to disable GeForce quirks · db32d0f4
      Committed by Alex Williamson
      These quirks are necessary for GeForce, but not for Quadro/GRID/Tesla
      assignment.  Leaving them enabled is fully functional and provides the
      most compatibility, but due to the unique NVIDIA MSI ACK behavior[1],
      it also introduces latency in re-triggering the MSI interrupt.  This
      overhead is typically negligible, but has been shown to adversely
      affect some (very) high interrupt rate applications.  This adds the
      vfio-pci device option "x-no-geforce-quirks=" which can be set to
      "on" to disable this additional overhead.
      
      A follow-on optimization for GeForce might be to make use of an
      ioeventfd to allow KVM to trigger an irqfd in the kernel vfio-pci
      driver, avoiding the bounce through userspace to handle this device
      write.
      
      [1] Background: the NVIDIA driver has been observed to issue a write
      to the MMIO mirror of PCI config space in BAR0 in order to allow the
      MSI interrupt for the device to retrigger.  Older reports indicated a
      write of 0xff to the (read-only) MSI capability ID register, while
      more recently a write of 0x0 is observed at config space offset 0x704,
      non-architected, extended config space of the device (BAR0 offset
      0x88704).  Virtualization of this range is only required for GeForce.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      db32d0f4
    • vfio/common: Remove redundant copy of local variable · a5b04f7c
      Committed by Alexey Kardashevskiy
      There is already @hostwin in vfio_listener_region_add() so there is no
      point in having the other one.
      
      Fixes: 2e4109de ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)")
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      a5b04f7c
    • hw/vfio/platform: Init the interrupt mutex · 89202c6f
      Committed by Eric Auger
      Add the initialization of the mutex protecting the interrupt list.
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      89202c6f
    • vfio/pci: Allow relocating MSI-X MMIO · 89d5202e
      Committed by Alex Williamson
      Recently proposed vfio-pci kernel changes (v4.16) remove the
      restriction preventing userspace from mmap'ing PCI BARs in areas
      overlapping the MSI-X vector table.  This change is primarily intended
      to benefit host platforms which make use of system page sizes larger
      than the PCI spec recommendation for alignment of MSI-X data
      structures (ie. not x86_64).  In the case of POWER systems, the SPAPR
      spec requires the VM to program MSI-X using hypercalls, rendering the
      MSI-X vector table unused in the VM view of the device.  However,
      ARM64 platforms also support 64KB pages and rely on QEMU emulation of
      MSI-X.  Regardless of the kernel driver allowing mmaps overlapping
      the MSI-X vector table, emulation of the MSI-X vector table also
      prevents direct mapping of device MMIO spaces overlapping this page.
      Thanks to the fact that PCI devices have a standard self discovery
      mechanism, we can try to resolve this by relocating the MSI-X data
      structures, either by creating a new PCI BAR or extending an existing
      BAR and updating the MSI-X capability for the new location.  There's
      even a very slim chance that this could benefit devices which do not
      adhere to the PCI spec alignment guidelines on x86_64 systems.
      
      This new x-msix-relocation option accepts the following choices:
      
        off: Disable MSI-X relocation, use native device config (default)
        auto: Use a known good combination for the platform/device (none yet)
        bar0..bar5: Specify the target BAR for MSI-X data structures
      
      If compatible, the target BAR will either be created or extended and
      the new portion will be used for MSI-X emulation.
      
      The first obvious user question with this option is how to determine
      whether a given platform and device might benefit from this option.
      In most cases, the answer is that it won't, especially on x86_64.
      Devices often dedicate an entire BAR to MSI-X and therefore no
      performance sensitive registers overlap the MSI-X area.  Take for
      example:
      
      # lspci -vvvs 0a:00.0
      0a:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection
      	...
      	Region 0: Memory at db680000 (32-bit, non-prefetchable) [size=512K]
      	Region 3: Memory at db7f8000 (32-bit, non-prefetchable) [size=16K]
      	...
      	Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
      		Vector table: BAR=3 offset=00000000
      		PBA: BAR=3 offset=00002000
      
      This device uses the 16K bar3 for MSI-X with the vector table at
      offset zero and the pending bits array at offset 8K, fully honoring
      the PCI spec alignment guidance.  The data sheet specifically refers
      to this as an MSI-X BAR.  This device would not see a benefit from
      MSI-X relocation regardless of the platform, regardless of the page
      size.
      
      However, here's another example:
      
      # lspci -vvvs 02:00.0
      02:00.0 Serial Attached SCSI controller: xxxxxxxx
      	...
      	Region 0: I/O ports at c000 [size=256]
      	Region 1: Memory at ef640000 (64-bit, non-prefetchable) [size=64K]
      	Region 3: Memory at ef600000 (64-bit, non-prefetchable) [size=256K]
      	...
      	Capabilities: [c0] MSI-X: Enable+ Count=16 Masked-
      		Vector table: BAR=1 offset=0000e000
      		PBA: BAR=1 offset=0000f000
      
      Here the MSI-X data structures are placed on separate 4K pages at the
      end of a 64KB BAR.  If our host page size is 4K, we're likely fine,
      but at 64KB page size, MSI-X emulation at that location prevents the
      entire BAR from being directly mapped into the VM address space.
      Overlapping performance sensitive registers then starts to be a very
      likely scenario on such a platform.  At this point, the user could
      enable tracing on vfio_region_read and vfio_region_write to determine
      more conclusively if device accesses are being trapped through QEMU.
      
      Upon finding a device and platform in need of MSI-X relocation, the
      next problem is how to choose target PCI BAR to host the MSI-X data
      structures.  A few key rules to keep in mind for this selection
      include:
      
       * There are only 6 BAR slots, bar0..bar5
       * 64-bit BARs occupy two BAR slots, 'lspci -vvv' lists the first slot
       * PCI BARs are always a power of 2 in size, extending == doubling
       * The maximum size of a 32-bit BAR is 2GB
       * MSI-X data structures must reside in an MMIO BAR
      
      Using these rules, we can evaluate each BAR of the second example
      device above as follows:
      
       bar0: I/O port BAR, incompatible with MSI-X tables
       bar1: BAR could be extended, incurring another 64KB of MMIO
       bar2: Unavailable, bar1 is 64-bit, this register is used by bar1
       bar3: BAR could be extended, incurring another 256KB of MMIO
       bar4: Unavailable, bar3 is 64-bit, this register is used by bar3
       bar5: Available, empty BAR, minimum additional MMIO
      
      A secondary optimization we might wish to make in relocating MSI-X
      is to minimize the additional MMIO required for the device, therefore
      we might test the available choices in order of preference as bar5,
      bar1, and finally bar3.  The original proposal for this feature
      included an 'auto' option which would choose bar5 in this case, but
      various drivers have been found that make assumptions about the
      properties of the "first" BAR or the size of BARs such that there
      appears to be no foolproof automatic selection available, requiring
      known good combinations to be sourced from users.  This patch is
      pre-enabled for an 'auto' selection making use of a validated lookup
      table, but no entries are yet identified.
      Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      89d5202e
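The selection rules above can be sketched as a cost-based scan; this is only an illustration of the rules, not QEMU's actual algorithm (which, as the commit explains, relies on user-specified choices or a validated lookup table rather than automatic selection):

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool present;
    bool is_io;
    bool is_64;
    uint64_t size;
} BarSketch;

/* Skip I/O BARs and slots consumed as the upper half of a 64-bit BAR,
 * then prefer the candidate adding the least MMIO: extending an existing
 * BAR doubles it, while an empty BAR costs only the MSI-X region itself. */
static int pick_msix_bar(const BarSketch bars[6], uint64_t msix_size)
{
    int best = -1;
    uint64_t best_cost = UINT64_MAX;

    for (int i = 0; i < 6; i++) {
        if (i > 0 && bars[i - 1].present && bars[i - 1].is_64) {
            continue; /* upper half of the previous 64-bit BAR */
        }
        if (bars[i].present && bars[i].is_io) {
            continue; /* MSI-X data structures must live in an MMIO BAR */
        }
        uint64_t cost = bars[i].present ? bars[i].size : msix_size;
        if (cost < best_cost) {
            best_cost = cost;
            best = i;
        }
    }
    return best;
}
```

Applied to the SAS controller example above (I/O bar0, 64-bit 64K bar1, 64-bit 256K bar3), the scan lands on the empty bar5, matching the preference order bar5, bar1, bar3 described in the commit.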
    • vfio/pci: Emulate BARs · 04f336b0
      Committed by Alex Williamson
      The kernel provides similar emulation of PCI BAR register access to
      QEMU, so up until now we've used that for things like BAR sizing and
      storing the BAR address.  However, if we intend to resize BARs or add
      BARs that don't exist on the physical device, we need to switch to the
      pure QEMU emulation of the BAR.
      Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      04f336b0
    • vfio/pci: Add base BAR MemoryRegion · 3a286732
      Committed by Alex Williamson
      Add one more layer to our stack of MemoryRegions, this base region
      allows us to register BARs independently of the vfio region or to
      extend the size of BARs which do map to a region.  This will be
      useful when we want hypervisor defined BARs or sections of BARs,
      for purposes such as relocating MSI-X emulation.  We therefore call
      msix_init() based on this new base MemoryRegion, while the quirks,
      which only modify regions, still operate on those sub-MemoryRegions.
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      3a286732
    • vfio/pci: Fixup VFIOMSIXInfo comment · edd09278
      Committed by Alex Williamson
      The fields were removed in the referenced commit, but the comment
      still mentions them.
      
      Fixes: 2fb9636e ("vfio-pci: Remove unused fields from VFIOMSIXInfo")
      Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      edd09278
    • vfio/spapr: Use iommu memory region's get_attr() · 07bc681a
      Committed by Alexey Kardashevskiy
      In order to enable TCE operations support in KVM, we have to inform
      KVM about VFIO groups being attached to specific LIOBNs. KVM
      already knows about VFIO groups; the only bit missing is which
      in-kernel TCE table (the one with user-visible TCEs) should update
      the attached groups. There is a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
      attribute of the VFIO KVM device which receives a groupfd/tablefd couple.
      
      This uses a new memory_region_iommu_get_attr() helper to get the IOMMU fd
      and calls KVM to establish the link.
      
      As get_attr() is not implemented yet, this should cause no behavioural
      change.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      07bc681a