1. 03 Mar, 2022 3 commits
    • vfio: Extend the device migration protocol with RUNNING_P2P · 8cb3d83b
      Committed by Jason Gunthorpe
      The RUNNING_P2P state is designed to support multiple devices in the same
      VM that are doing P2P transactions between themselves. When in RUNNING_P2P
      the device must be able to accept incoming P2P transactions but should not
      generate outgoing P2P transactions.
      
      As an optional extension to the mandatory states it is defined as
      in between STOP and RUNNING:
         STOP -> RUNNING_P2P -> RUNNING -> RUNNING_P2P -> STOP
      
      For drivers that are unable to support RUNNING_P2P the core code
      silently merges RUNNING_P2P and RUNNING together. Unless driver support
      is present, the new state cannot be used in SET_STATE.
      Drivers that support this will be required to implement 4 FSM arcs
      beyond the basic FSM. 2 of the basic FSM arcs become combination
      transitions.
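
A minimal sketch of the chain described above, assuming the three states are laid out as STOP, RUNNING_P2P, RUNNING. The names `mig_state`, `step_toward` and `p2p_next_step` are hypothetical and are not the kernel's identifiers. With driver support, STOP to RUNNING becomes a two-arc combination transition; without it, the intermediate state is skipped, which is one way to picture the "silent merge".

```c
#include <stdbool.h>

/* Hypothetical state names for this sketch; not the kernel's uAPI values. */
enum mig_state {
	MIG_STOP,
	MIG_RUNNING_P2P,
	MIG_RUNNING,
};

/* Move one position along the STOP <-> RUNNING_P2P <-> RUNNING chain. */
static enum mig_state step_toward(enum mig_state cur, enum mig_state target)
{
	if (cur < target)
		return (enum mig_state)(cur + 1);
	if (cur > target)
		return (enum mig_state)(cur - 1);
	return cur;
}

/*
 * Return the next state to enter when moving from cur toward target.
 * Assumption of this sketch: when the driver lacks RUNNING_P2P support,
 * the caller never passes MIG_RUNNING_P2P as cur or target, and the
 * intermediate state is skipped so STOP <-> RUNNING is a single arc again.
 */
enum mig_state p2p_next_step(enum mig_state cur, enum mig_state target,
			     bool p2p_supported)
{
	enum mig_state next = step_toward(cur, target);

	if (!p2p_supported && next == MIG_RUNNING_P2P)
		next = step_toward(next, target);
	return next;
}
```

Calling `p2p_next_step()` repeatedly until it returns the target walks the whole combination transition one arc at a time.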
      
      Compared to the v1 clarification, NDMA is redefined into FSM states and is
      described in terms of the desired P2P quiescent behavior, noting that
      halting all DMA is an acceptable implementation.
      
      Link: https://lore.kernel.org/all/20220224142024.147653-11-yishaih@nvidia.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    • vfio: Define device migration protocol v2 · 115dcec6
      Committed by Jason Gunthorpe
      Replace the existing region based migration protocol with an ioctl based
      protocol. The two protocols have the same general semantic behaviors, but
      the way the data is transported is changed.
      
      This is the STOP_COPY portion of the new protocol. It defines the 5 states
      for basic stop and copy migration and the protocol to move the migration
      data in/out of the kernel.
      
      Compared to the clarification of the v1 protocol Alex proposed:
      
      https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen
      
      This has a few deliberate functional differences:
      
       - ERROR arcs allow the device function to remain unchanged.
      
       - The protocol is not required to return to the original state on
         transition failure. Instead userspace can execute an unwind back to
         the original state, reset, or do something else without needing kernel
         support. This simplifies the kernel design and, should userspace choose
         a policy like always reset, avoids doing useless work in the kernel
         on error handling paths.
      
       - PRE_COPY is made optional; userspace must discover it before using it.
         This reflects the fact that the majority of drivers we are aware of
         right now will not implement PRE_COPY.
      
       - Segmentation is not part of the data stream protocol; the receiver
         does not have to reproduce the framing boundaries.
      
      The hybrid FSM for the device_state is described as a Mealy machine by
      documenting each of the arcs the driver is required to implement. Defining
      the remaining set of old/new device_state transitions as 'combination
      transitions' which are naturally defined as taking multiple FSM arcs along
      the shortest path within the FSM's digraph allows a complete matrix of
      transitions.
      
      A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE is
      defined to replace writing to the device_state field in the region. This
      allows returning a brand new FD whenever the requested transition opens
      a data transfer session.
      
      The VFIO core code implements the new feature and provides a helper
      function to the driver. Using the helper the driver only has to
      implement 6 of the FSM arcs and the other combination transitions are
      elaborated consistently from those arcs.
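
The shortest-path elaboration of combination transitions can be sketched as a breadth-first search over the arcs a driver implements. This is only an illustration: the commit text says there are 6 arcs but does not list them here, so the arc set below (STOP as the hub for RUNNING, STOP_COPY and RESUMING) is an assumption, the ERROR state is left out, and `next_state` and the `ST_*` names are hypothetical rather than the core helper's real interface.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; the values are not the uAPI values. */
enum { ST_STOP, ST_RUNNING, ST_STOP_COPY, ST_RESUMING, ST_COUNT };

/* arcs[from][to] is true when the driver implements that arc directly. */
static const bool arcs[ST_COUNT][ST_COUNT] = {
	[ST_STOP]      = { [ST_RUNNING] = true, [ST_STOP_COPY] = true,
			   [ST_RESUMING] = true },
	[ST_RUNNING]   = { [ST_STOP] = true },
	[ST_STOP_COPY] = { [ST_STOP] = true },
	[ST_RESUMING]  = { [ST_STOP] = true },
};

/*
 * Breadth-first search from cur to target over the implemented arcs.
 * Returns the first state to enter on a shortest path, cur itself when
 * cur == target, or -1 when target cannot be reached.
 */
int next_state(int cur, int target)
{
	int first_hop[ST_COUNT];
	int queue[ST_COUNT];
	int head = 0, tail = 0;

	if (cur == target)
		return cur;

	for (int i = 0; i < ST_COUNT; i++)
		first_hop[i] = -1;		/* -1 means "not visited yet" */
	first_hop[cur] = cur;
	queue[tail++] = cur;

	while (head < tail) {
		int s = queue[head++];

		for (int n = 0; n < ST_COUNT; n++) {
			if (!arcs[s][n] || first_hop[n] != -1)
				continue;
			/* Remember which neighbour of cur this path started with. */
			first_hop[n] = (s == cur) ? n : first_hop[s];
			if (n == target)
				return first_hop[n];
			queue[tail++] = n;
		}
	}
	return -1;
}
```

With this arc set, a request to go from RUNNING to STOP_COPY first yields STOP, and a second call from STOP yields STOP_COPY, so the driver only ever sees arcs it implements.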
      
      A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIGRATION is defined to
      report the capability for migration and indicate which set of states and
      arcs are supported by the device. The FSM provides a lot of flexibility to
      make backwards compatible extensions but the VFIO_DEVICE_FEATURE also
      allows for future breaking extensions for scenarios that cannot support
      even the basic STOP_COPY requirements.
      
      The VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE with the GET option (i.e.
      VFIO_DEVICE_FEATURE_GET) can be used to read the current migration state
      of the VFIO device.
      
      Data transfer sessions are now carried over a file descriptor, instead of
      the region. The FD functions for the lifetime of the data transfer
      session. read() and write() transfer the data with normal Linux stream FD
      semantics. This design allows future expansion to support poll(),
      io_uring, and other performance optimizations.
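
As an illustration of "normal Linux stream FD semantics", here is a small userspace sketch that drains a stream FD with read(). It assumes end of data is signalled by read() returning 0, as on an ordinary pipe, and the helper name `read_stream` is hypothetical. It uses only POSIX calls and nothing VFIO-specific.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from a stream fd until EOF or until buf is full.
 * Returns the number of bytes stored, or -1 with errno set on error.
 */
ssize_t read_stream(int fd, void *buf, size_t cap)
{
	size_t done = 0;

	while (done < cap) {
		ssize_t n = read(fd, (char *)buf + done, cap - done);

		if (n == 0)
			break;			/* EOF: no more data */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		done += (size_t)n;		/* short reads are normal */
	}
	return (ssize_t)done;
}
```

The loop never assumes that one read() corresponds to one write() on the other side, which matches the point above that the receiver does not have to reproduce framing boundaries.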
      
      The complicated mmap mode for data transfer is discarded as current qemu
      doesn't take meaningful advantage of it, and the new qemu implementation
      avoids substantially all the performance penalty of using a read() on the
      region.
      
      Link: https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
      Reviewed-by: Kevin Tian <kevin.tian@intel.com>
      Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    • vfio: Have the core code decode the VFIO_DEVICE_FEATURE ioctl · 445ad495
      Committed by Jason Gunthorpe
      Invoke a new device op 'device_feature' to handle just the data array
      portion of the command. This lifts the ioctl validation to the core code
      and makes it simpler for either the core code, or layered drivers, to
      implement their own feature values.
      
      Provide vfio_check_feature() to consolidate checking the flags/etc against
      what the driver supports.
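
A rough userspace sketch of what a consolidated flag check of this kind might look like. The flag bits, the function name `check_feature` and the three-way return convention are assumptions made for illustration; they are not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch. */
#define FEAT_GET   (1u << 0)
#define FEAT_SET   (1u << 1)
#define FEAT_PROBE (1u << 2)

/*
 * Validate a feature request against what a driver supports.
 * Returns -EINVAL for an unsupported or malformed request, 0 when a
 * PROBE has been answered and nothing more needs doing, and 1 when the
 * caller should go on to handle the GET or SET itself.
 */
int check_feature(uint32_t flags, size_t argsz, uint32_t supported_ops,
		  size_t minsz)
{
	uint32_t ops = flags & (FEAT_GET | FEAT_SET);

	if (ops & ~supported_ops)
		return -EINVAL;		/* asked for an op the driver lacks */
	if (flags & FEAT_PROBE)
		return 0;		/* probe only: supported, stop here */
	if (!ops)
		return -EINVAL;		/* neither GET nor SET requested */
	if (argsz < minsz)
		return -EINVAL;		/* buffer too small for the payload */
	return 1;
}
```

Keeping these checks in one helper means each feature handler can start with a single call and only deal with its own data array afterwards.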
      
      Link: https://lore.kernel.org/all/20220224142024.147653-9-yishaih@nvidia.com
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
      Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
      Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
  2. 01 Dec, 2021 1 commit
    • vfio: remove all kernel-doc notation · 3b9a2d57
      Committed by Randy Dunlap
      vfio.c abuses (misuses) "/**", which indicates the beginning of
      kernel-doc notation in the kernel tree. This causes a bunch of
      kernel-doc complaints about this source file, so quieten all of
      them by changing all "/**" to "/*".
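
To illustrate the distinction, the sketch below contrasts a plain section banner, which should open with "/*", with a comment that really is kernel-doc and therefore opens with "/**". The function `clamp_to_byte` is a made-up example and is not part of vfio.c.

```c
/*
 * Helper section: an ordinary banner comment, so it opens with a
 * single asterisk and the kernel-doc tooling ignores it.
 */

/**
 * clamp_to_byte() - Limit a value to the range 0..255.
 * @value: Input value, may be negative or larger than 255.
 *
 * Return: @value clamped to the range 0..255.
 */
int clamp_to_byte(int value)
{
	if (value < 0)
		return 0;
	if (value > 255)
		return 255;
	return value;
}
```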
      
      vfio.c:236: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * IOMMU driver registration
      vfio.c:236: warning: missing initial short description on line:
        * IOMMU driver registration
      vfio.c:295: warning: expecting prototype for Container objects(). Prototype was for vfio_container_get() instead
      vfio.c:317: warning: expecting prototype for Group objects(). Prototype was for __vfio_group_get_from_iommu() instead
      vfio.c:496: warning: Function parameter or member 'device' not described in 'vfio_device_put'
      vfio.c:496: warning: expecting prototype for Device objects(). Prototype was for vfio_device_put() instead
      vfio.c:599: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * Async device support
      vfio.c:599: warning: missing initial short description on line:
        * Async device support
      vfio.c:693: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * VFIO driver API
      vfio.c:693: warning: missing initial short description on line:
        * VFIO driver API
      vfio.c:835: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * Get a reference to the vfio_device for a device.  Even if the
      vfio.c:835: warning: missing initial short description on line:
        * Get a reference to the vfio_device for a device.  Even if the
      vfio.c:969: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * VFIO base fd, /dev/vfio/vfio
      vfio.c:969: warning: missing initial short description on line:
        * VFIO base fd, /dev/vfio/vfio
      vfio.c:1187: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * VFIO Group fd, /dev/vfio/$GROUP
      vfio.c:1187: warning: missing initial short description on line:
        * VFIO Group fd, /dev/vfio/$GROUP
      vfio.c:1540: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * VFIO Device fd
      vfio.c:1540: warning: missing initial short description on line:
        * VFIO Device fd
      vfio.c:1615: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * External user API, exported by symbols to be linked dynamically.
      vfio.c:1615: warning: missing initial short description on line:
        * External user API, exported by symbols to be linked dynamically.
      vfio.c:1663: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * External user API, exported by symbols to be linked dynamically.
      vfio.c:1663: warning: missing initial short description on line:
        * External user API, exported by symbols to be linked dynamically.
      vfio.c:1742: warning: Function parameter or member 'caps' not described in 'vfio_info_cap_add'
      vfio.c:1742: warning: Function parameter or member 'size' not described in 'vfio_info_cap_add'
      vfio.c:1742: warning: Function parameter or member 'id' not described in 'vfio_info_cap_add'
      vfio.c:1742: warning: Function parameter or member 'version' not described in 'vfio_info_cap_add'
      vfio.c:1742: warning: expecting prototype for Sub(). Prototype was for vfio_info_cap_add() instead
      vfio.c:2276: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
        * Module/class support
      vfio.c:2276: warning: missing initial short description on line:
        * Module/class support
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Eric Auger <eric.auger@redhat.com>
      Cc: Cornelia Huck <cohuck@redhat.com>
      Cc: kvm@vger.kernel.org
      Link: https://lore.kernel.org/r/38a9cb92-a473-40bf-b8f9-85cc5cfc2da4@infradead.org
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  3. 16 Oct, 2021 5 commits
  4. 01 Oct, 2021 10 commits
  5. 11 Aug, 2021 3 commits
  6. 16 Jun, 2021 1 commit
  7. 07 Apr, 2021 6 commits
  8. 02 Feb, 2021 1 commit
  9. 11 Dec, 2020 1 commit
    • vfio/type1: Add vfio_group_iommu_domain() · bdfae1c9
      Committed by Lu Baolu
      Add the API for getting the domain from a vfio group. This could be used
      by the physical device drivers which rely on the vfio/mdev framework for
      mediated device user level access. A typical use case looks like this:
      
      	int pasid;	/* signed, so the "pasid < 0" error check below can fire */
      	struct vfio_group *vfio_group;
      	struct iommu_domain *iommu_domain;
      	struct device *dev = mdev_dev(mdev);
      	struct device *iommu_device = mdev_get_iommu_device(dev);
      
      	if (!iommu_device ||
      	    !iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
      		return -EINVAL;
      
      	vfio_group = vfio_group_get_external_user_from_dev(dev);
      	if (IS_ERR_OR_NULL(vfio_group))
      		return -EFAULT;
      
      	iommu_domain = vfio_group_iommu_domain(vfio_group);
      	if (IS_ERR_OR_NULL(iommu_domain)) {
      		vfio_group_put_external_user(vfio_group);
      		return -EFAULT;
      	}
      
      	pasid = iommu_aux_get_pasid(iommu_domain, iommu_device);
      	if (pasid < 0) {
      		vfio_group_put_external_user(vfio_group);
      		return -EFAULT;
      	}
      
      	/* Program device context with pasid value. */
      	...
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  10. 23 Sep, 2020 1 commit
  11. 22 Sep, 2020 1 commit
    • vfio: add a singleton check for vfio_group_pin_pages · 7ef32e52
      Committed by Yan Zhao
      Page pinning is used both to translate and pin device mappings for DMA
      purposes, as well as to indicate to the IOMMU backend to limit the dirty
      page scope to those pages that have been pinned, in the case of an IOMMU
      backed device.
      To support this, the vfio_pin_pages() interface limits itself to only
      singleton groups such that the IOMMU backend can consider dirty page
      scope only at the group level.  Implement the same requirement for the
      vfio_group_pin_pages() interface.
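
The requirement can be pictured with a tiny sketch. The struct `demo_group`, its `dev_count` field and the function `demo_group_pin_check` are hypothetical stand-ins for illustration, not the kernel's data structures; the only point is that a group with more than one device is rejected before any pinning happens.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a group; only the device count matters here. */
struct demo_group {
	int dev_count;
};

/*
 * Returns 0 when pinning may proceed, or -EINVAL when the group is
 * missing or is not a singleton (exactly one device).
 */
int demo_group_pin_check(const struct demo_group *group)
{
	if (group == NULL || group->dev_count != 1)
		return -EINVAL;
	return 0;
}
```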
      
      Fixes: 95fc87b4 ("vfio: Selective dirty page tracking if IOMMU backed device pins pages")
      Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
  12. 28 Jul, 2020 1 commit
  13. 29 May, 2020 1 commit
  14. 24 Mar, 2020 4 commits
  15. 23 Oct, 2019 1 commit