- 03 Mar, 2022 3 commits
-
-
Committed by Jason Gunthorpe
The RUNNING_P2P state is designed to support multiple devices in the same VM that are doing P2P transactions between themselves. When in RUNNING_P2P the device must be able to accept incoming P2P transactions but should not generate outgoing P2P transactions.

As an optional extension to the mandatory states it is defined as in between STOP and RUNNING:

   STOP -> RUNNING_P2P -> RUNNING -> RUNNING_P2P -> STOP

For drivers that are unable to support RUNNING_P2P the core code silently merges RUNNING_P2P and RUNNING together. Unless driver support is present, the new state cannot be used in SET_STATE. Drivers that support this will be required to implement 4 FSM arcs beyond the basic FSM. 2 of the basic FSM arcs become combination transitions.

Compared to the v1 clarification, NDMA is redefined into FSM states and is described in terms of the desired P2P quiescent behavior, noting that halting all DMA is an acceptable implementation.

Link: https://lore.kernel.org/all/20220224142024.147653-11-yishaih@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
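The effect of RUNNING_P2P on the transition graph can be illustrated with a small userspace sketch. It is not the kernel's implementation: the enum values, `build_arcs()` and `next_step()` are hypothetical names, and the STOP <-> STOP_COPY and STOP <-> RESUMING arcs are assumed from the basic FSM rather than stated in this commit message. The sketch finds the first hop of a shortest path with a breadth-first search, which shows how STOP -> RUNNING stops being a single arc once the four P2P arcs are present. The silent merging of RUNNING_P2P into RUNNING for drivers without support is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative numbering only; these are NOT the uapi header values. */
enum mig_state { ST_ERROR, ST_STOP, ST_RUNNING, ST_STOP_COPY, ST_RESUMING,
                 ST_RUNNING_P2P, ST_COUNT };

/* Mark arc[a][b] for every single-step arc a -> b the driver implements. */
static void build_arcs(bool arc[ST_COUNT][ST_COUNT], bool p2p)
{
    memset(arc, 0, ST_COUNT * sizeof(*arc));
    /* Assumed basic arcs around STOP (not listed in the text above). */
    arc[ST_STOP][ST_STOP_COPY] = arc[ST_STOP_COPY][ST_STOP] = true;
    arc[ST_STOP][ST_RESUMING] = arc[ST_RESUMING][ST_STOP] = true;
    if (p2p) {
        /* The 4 extra arcs: STOP <-> RUNNING_P2P <-> RUNNING. */
        arc[ST_STOP][ST_RUNNING_P2P] = arc[ST_RUNNING_P2P][ST_STOP] = true;
        arc[ST_RUNNING_P2P][ST_RUNNING] = arc[ST_RUNNING][ST_RUNNING_P2P] = true;
    } else {
        /* Without P2P, STOP <-> RUNNING are direct arcs. */
        arc[ST_STOP][ST_RUNNING] = arc[ST_RUNNING][ST_STOP] = true;
    }
}

/*
 * Return the first state to move to on a shortest path from cur to target,
 * or ST_ERROR if target cannot be reached with the available arcs.
 */
enum mig_state next_step(enum mig_state cur, enum mig_state target, bool p2p)
{
    bool arc[ST_COUNT][ST_COUNT];
    int first_hop[ST_COUNT]; /* first hop used to reach a state, -1 = unvisited */
    int queue[ST_COUNT];
    int head = 0, tail = 0;

    if (cur == target)
        return cur;
    build_arcs(arc, p2p);
    for (int i = 0; i < ST_COUNT; i++)
        first_hop[i] = -1;
    first_hop[cur] = cur;
    queue[tail++] = cur;

    while (head < tail) {
        int s = queue[head++];
        for (int n = 0; n < ST_COUNT; n++) {
            if (!arc[s][n] || first_hop[n] != -1)
                continue;
            first_hop[n] = (s == (int)cur) ? n : first_hop[s];
            if (n == (int)target)
                return (enum mig_state)first_hop[n];
            queue[tail++] = n;
        }
    }
    return ST_ERROR;
}
```

In this sketch `next_step(ST_STOP, ST_RUNNING, false)` is a single arc, while with `p2p` set the same request first passes through `ST_RUNNING_P2P`, which is what "2 of the basic FSM arcs become combination transitions" describes.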
-
Committed by Jason Gunthorpe
Replace the existing region based migration protocol with an ioctl based protocol. The two protocols have the same general semantic behaviors, but the way the data is transported is changed.

This is the STOP_COPY portion of the new protocol. It defines the 5 states for basic stop and copy migration and the protocol to move the migration data in/out of the kernel.

Compared to the clarification of the v1 protocol Alex proposed:

https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen

This has a few deliberate functional differences:

 - ERROR arcs allow the device function to remain unchanged.

 - The protocol is not required to return to the original state on transition failure. Instead userspace can execute an unwind back to the original state, reset, or do something else without needing kernel support. This simplifies the kernel design and, should userspace choose a policy like always reset, avoids doing useless work in the kernel on error handling paths.

 - PRE_COPY is made optional; userspace must discover it before using it. This reflects the fact that the majority of drivers we are aware of right now will not implement PRE_COPY.

 - Segmentation is not part of the data stream protocol; the receiver does not have to reproduce the framing boundaries.

The hybrid FSM for the device_state is described as a Mealy machine by documenting each of the arcs the driver is required to implement. Defining the remaining set of old/new device_state transitions as 'combination transitions', which are naturally defined as taking multiple FSM arcs along the shortest path within the FSM's digraph, allows a complete matrix of transitions.

A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE is defined to replace writing to the device_state field in the region. This allows returning a brand new FD whenever the requested transition opens a data transfer session.

The VFIO core code implements the new feature and provides a helper function to the driver. Using the helper the driver only has to implement 6 of the FSM arcs and the other combination transitions are elaborated consistently from those arcs.

A new VFIO_DEVICE_FEATURE of VFIO_DEVICE_FEATURE_MIGRATION is defined to report the capability for migration and indicate which set of states and arcs are supported by the device. The FSM provides a lot of flexibility to make backwards compatible extensions but the VFIO_DEVICE_FEATURE also allows for future breaking extensions for scenarios that cannot support even the basic STOP_COPY requirements.

The VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE with the GET option (i.e. VFIO_DEVICE_FEATURE_GET) can be used to read the current migration state of the VFIO device.

Data transfer sessions are now carried over a file descriptor, instead of the region. The FD functions for the lifetime of the data transfer session. read() and write() transfer the data with normal Linux stream FD semantics. This design allows future expansion to support poll(), io_uring, and other performance optimizations.

The complicated mmap mode for data transfer is discarded as current qemu doesn't take meaningful advantage of it, and the new qemu implementation avoids substantially all the performance penalty of using a read() on the region.

Link: https://lore.kernel.org/all/20220224142024.147653-10-yishaih@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
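As a rough illustration of what "normal Linux stream FD semantics" means for a consumer of the data transfer FD, the following userspace sketch drains a stream descriptor with plain read() calls. The function name `read_stream()` is hypothetical, and treating a zero-length read as the end of the device state is an assumption made for the example, not something the commit message specifies. The point is only that short reads are expected and that no framing has to be reconstructed by the receiver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from a stream fd until EOF or until buf is full.
 * Returns the number of bytes stored in buf, or -1 on a read error.
 */
ssize_t read_stream(int fd, void *buf, size_t cap)
{
    size_t total = 0;

    while (total < cap) {
        ssize_t n = read(fd, (char *)buf + total, cap - total);

        if (n == 0)
            break;              /* EOF: assumed to mark the end of the data */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        total += (size_t)n;     /* short reads are normal on a stream */
    }
    return (ssize_t)total;
}
```

Because the function only relies on read(), it behaves the same on a pipe or socket, which is how the sketch can be exercised without a VFIO device.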
-
Committed by Jason Gunthorpe
Invoke a new device op 'device_feature' to handle just the data array portion of the command. This lifts the ioctl validation to the core code and makes it simpler for either the core code, or layered drivers, to implement their own feature values.

Provide vfio_check_feature() to consolidate checking the flags/etc against what the driver supports.

Link: https://lore.kernel.org/all/20220224142024.147653-9-yishaih@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
-
- 01 Dec, 2021 1 commit
-
-
Committed by Randy Dunlap
vfio.c abuses (misuses) "/**", which indicates the beginning of kernel-doc notation in the kernel tree. This causes a bunch of kernel-doc complaints about this source file, so quieten all of them by changing all "/**" to "/*".

vfio.c:236: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * IOMMU driver registration
vfio.c:236: warning: missing initial short description on line:
 * IOMMU driver registration
vfio.c:295: warning: expecting prototype for Container objects(). Prototype was for vfio_container_get() instead
vfio.c:317: warning: expecting prototype for Group objects(). Prototype was for __vfio_group_get_from_iommu() instead
vfio.c:496: warning: Function parameter or member 'device' not described in 'vfio_device_put'
vfio.c:496: warning: expecting prototype for Device objects(). Prototype was for vfio_device_put() instead
vfio.c:599: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Async device support
vfio.c:599: warning: missing initial short description on line:
 * Async device support
vfio.c:693: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * VFIO driver API
vfio.c:693: warning: missing initial short description on line:
 * VFIO driver API
vfio.c:835: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Get a reference to the vfio_device for a device. Even if the
vfio.c:835: warning: missing initial short description on line:
 * Get a reference to the vfio_device for a device. Even if the
vfio.c:969: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * VFIO base fd, /dev/vfio/vfio
vfio.c:969: warning: missing initial short description on line:
 * VFIO base fd, /dev/vfio/vfio
vfio.c:1187: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * VFIO Group fd, /dev/vfio/$GROUP
vfio.c:1187: warning: missing initial short description on line:
 * VFIO Group fd, /dev/vfio/$GROUP
vfio.c:1540: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * VFIO Device fd
vfio.c:1540: warning: missing initial short description on line:
 * VFIO Device fd
vfio.c:1615: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * External user API, exported by symbols to be linked dynamically.
vfio.c:1615: warning: missing initial short description on line:
 * External user API, exported by symbols to be linked dynamically.
vfio.c:1663: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * External user API, exported by symbols to be linked dynamically.
vfio.c:1663: warning: missing initial short description on line:
 * External user API, exported by symbols to be linked dynamically.
vfio.c:1742: warning: Function parameter or member 'caps' not described in 'vfio_info_cap_add'
vfio.c:1742: warning: Function parameter or member 'size' not described in 'vfio_info_cap_add'
vfio.c:1742: warning: Function parameter or member 'id' not described in 'vfio_info_cap_add'
vfio.c:1742: warning: Function parameter or member 'version' not described in 'vfio_info_cap_add'
vfio.c:1742: warning: expecting prototype for Sub(). Prototype was for vfio_info_cap_add() instead
vfio.c:2276: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Module/class support
vfio.c:2276: warning: missing initial short description on line:
 * Module/class support

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Eric Auger <eric.auger@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/38a9cb92-a473-40bf-b8f9-85cc5cfc2da4@infradead.org
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 16 Oct, 2021 5 commits
-
-
Committed by Jason Gunthorpe
Modernize how vfio is creating the group char dev and sysfs presence.

These days drivers with state should use cdev_device_add() and cdev_device_del() to manage the cdev and sysfs lifetime.

This API requires the driver to put the struct device and struct cdev inside its state struct (vfio_group), and then use the usual device_initialize()/cdev_device_add()/cdev_device_del() sequence.

Split the code to make this possible:

 - vfio_group_alloc()/vfio_group_release() are paired functions to alloc/free the vfio_group. release is done under the struct device kref.

 - vfio_create_group()/vfio_group_put() are pairs that manage the sysfs/cdev lifetime. Once the user count is zero the vfio group's userspace presence is destroyed.

 - The IDR is replaced with an IDA. container_of(inode->i_cdev) is used to get back to the vfio_group during fops open. The IDA assigns unique minor numbers.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v3-2fdfe4ca2cc6+18c-vfio_group_cdev_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
The next patch adds a struct device to the struct vfio_group, and it is confusing/bad practice to have two krefs in the same struct. This kref is controlling the period when the vfio_group is registered in sysfs, and visible in the internal lookup. Switch it to a refcount_t instead.

The refcount_dec_and_mutex_lock() is still required because we need atomicity of the list searches and sysfs presence.

Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v3-2fdfe4ca2cc6+18c-vfio_group_cdev_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
If vfio_create_group() searches the group list and returns an already existing group it does not put back the iommu_group reference that the caller passed in.

Change the semantic of vfio_create_group() to not move the reference in from the caller, but instead obtain a new reference inside and leave the caller's reference alone. The two callers must now call iommu_group_put().

This is an unlikely race as the only caller that could hit it has already searched the group list before attempting to create the group.

Fixes: cba3345c ("vfio: VFIO core")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v3-2fdfe4ca2cc6+18c-vfio_group_cdev_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
Split vfio_group_get_from_iommu() into __vfio_group_get_from_iommu() so that vfio_create_group() can call it to consolidate this duplicated code.

Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v3-2fdfe4ca2cc6+18c-vfio_group_cdev_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
iommu_group_register_notifier()/iommu_group_unregister_notifier() are built using a blocking_notifier_chain which integrates a rwsem. The notifier function cannot be running outside its registration.

When considering how the notifier function interacts with create/destroy of the group there are two fringe cases: the notifier starts before list_add(&vfio.group_list) and the notifier runs after the kref becomes 0.

Prior to vfio_create_group() unlocking and returning we have

   container_users == 0
   device_list == empty

And this cannot change until the mutex is unlocked.

After the kref goes to zero we must also have

   container_users == 0
   device_list == empty

Both are required because they are balanced operations and a 0 kref means some caller became unbalanced. Add the missing assertion that container_users must be zero as well.

These two facts are important because when checking each operation we see:

 - IOMMU_GROUP_NOTIFY_ADD_DEVICE
   Empty device_list avoids the WARN_ON in vfio_group_nb_add_dev()
   0 container_users ends the call

 - IOMMU_GROUP_NOTIFY_BOUND_DRIVER
   0 container_users ends the call

Finally, we have IOMMU_GROUP_NOTIFY_UNBOUND_DRIVER, which only deletes items from the unbound list. During creation this list is empty, during kref == 0 nothing can read this list, and it will be freed soon.

Since the vfio_group_release() doesn't hold the appropriate lock to manipulate the unbound_list and could race with the notifier, move the cleanup to directly before the kfree.

This allows deleting all of the deferred group put code.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v3-2fdfe4ca2cc6+18c-vfio_group_cdev_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 01 Oct, 2021 10 commits
-
-
Committed by Christoph Hellwig
Pass the group flags to ->attach_group and remove the messy check for the bus type.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-12-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Create a new private drivers/vfio/vfio.h header for the interface between the VFIO core and the iommu drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-10-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
The read, write and mmap methods are never implemented, so remove them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-9-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Reuse the logic in vfio_noiommu_group_alloc to allocate a fake single-device iommu group for mediated devices by factoring out a common function, and replacing the noiommu boolean field in struct vfio_group with an enum to distinguish the three different kinds of groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-8-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Just pass a noiommu argument to vfio_create_group and set up the ->noiommu flag directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-7-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Split the actual noiommu group creation from vfio_iommu_group_get into a new helper, and open code the rest of vfio_iommu_group_get in its only caller. This creates an entirely separate and clear code path for the noiommu group creation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-6-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Factor out a helper to find or allocate the vfio_group to reduce the spaghetti code in vfio_register_group_dev a little.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-5-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
vfio_noiommu_attach_group has two callers:

 1) __vfio_container_attach_groups is called by vfio_ioctl_set_iommu, which just called vfio_iommu_driver_allowed

 2) vfio_group_set_container already checks ->noiommu on the vfio_group, which is propagated from the iommudata in vfio_create_group

so this check is entirely superfluous and can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-4-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Christoph Hellwig
Factor out a little helper to make the checks for the noiommu driver less ugly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-3-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
We don't need to hold a reference to the group in the driver as well as obtain a reference to the same group as the first thing vfio_register_group_dev() does.

Since the drivers never use the group, move this all into the core code.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-2-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 11 Aug, 2021 3 commits
-
-
Committed by Jason Gunthorpe
Nothing uses this anymore, delete it.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/14-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
Currently the driver ops have an open/release pair that is called once each time a device FD is opened or closed. Add an additional set of open/close_device() ops which are called when the device FD is opened for the first time and closed for the last time.

An analysis shows that all of the drivers require this semantic. Some are open coding it as part of their reflck implementation, and some are just buggy and miss it completely.

To retain the current semantics PCI and FSL depend on, introduce the idea of a "device set" which is a grouping of vfio_device's that share the same lock around opening.

The device set is established by providing a 'set_id' pointer. All vfio_device's that provide the same pointer will be joined to the same singleton memory and lock across the whole set. This effectively replaces the oddly named reflck.

After conversion the set_id will be sourced from:

 - A struct device from a fsl_mc_device (fsl)
 - A struct pci_slot (pci)
 - A struct pci_bus (pci)
 - The struct vfio_device (everything)

The design ensures that the above pointers are live as long as the vfio_device is registered, so they form reliable unique keys to group vfio_devices into sets.

This implementation uses xarray instead of searching through the driver core structures, which simplifies the somewhat tricky locking in this area.

Following patches convert all the drivers.

Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
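The first-open/last-close semantic described above can be sketched in userspace C as follows. All names (`sketch_device`, `sketch_ops`, `sketch_open()`, `sketch_close()`, the `demo_*` callbacks) are hypothetical, and the shared device-set lock and the set_id lookup are deliberately left out, so this only illustrates the counting rule, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_device;

/* Hypothetical ops: called only on the first open and the last close. */
struct sketch_ops {
    int  (*open_device)(struct sketch_device *dev);
    void (*close_device)(struct sketch_device *dev);
};

struct sketch_device {
    const struct sketch_ops *ops;
    unsigned int open_count;   /* number of currently open FDs */
    int enable_calls;          /* demo bookkeeping only */
    int disable_calls;
};

/* Called for every FD open; a real device set would hold its lock here. */
int sketch_open(struct sketch_device *dev)
{
    if (dev->open_count == 0 && dev->ops->open_device) {
        int ret = dev->ops->open_device(dev);

        if (ret)
            return ret;        /* first open failed, count stays at zero */
    }
    dev->open_count++;
    return 0;
}

/* Called for every FD close; the last one triggers close_device(). */
void sketch_close(struct sketch_device *dev)
{
    assert(dev->open_count > 0);
    if (--dev->open_count == 0 && dev->ops->close_device)
        dev->ops->close_device(dev);
}

static int demo_open_device(struct sketch_device *dev)
{
    dev->enable_calls++;
    return 0;
}

static void demo_close_device(struct sketch_device *dev)
{
    dev->disable_calls++;
}

const struct sketch_ops demo_ops = {
    .open_device = demo_open_device,
    .close_device = demo_close_device,
};
```

Opening the same device twice in this sketch runs `demo_open_device()` once, and `demo_close_device()` runs only after the second close, which is the behavior drivers previously open coded or missed.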
-
Committed by Max Gurtovoy
This pairs with vfio_init_group_dev() and allows undoing any state that is stored in the vfio_device unrelated to registration. Add appropriately placed calls to all the drivers.

The following patch will use this to add pre-registration state for the device set.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 16 Jun, 2021 1 commit
-
-
Committed by Max Gurtovoy
Remove code duplication and move module refcounting to the subsystem module.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20210518192133.59195-2-mgurtovoy@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 07 Apr, 2021 6 commits
-
-
Committed by Jason Gunthorpe
There are no longer any users, so it can go away. Everything is using container_of now.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <14-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
This is the standard kernel pattern: the ops associated with a struct get the struct pointer in for type safety. The expected design is to use container_of to cleanly go from the subsystem level type to the driver level type without having any type erasure in a void *.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <12-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
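A minimal userspace sketch of the container_of pattern mentioned above, assuming stand-in types: `core_device` plays the role of the subsystem-level struct, `my_driver_device` the driver-level struct that embeds it, and `core_ops` the ops table. None of these names come from the kernel, and the macro here is a simplified version of the kernel's `container_of()` without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the subsystem-level type. */
struct core_device {
    int id;
};

/* Ops receive the subsystem-level struct pointer, not a void *. */
struct core_ops {
    int (*get_id_plus_private)(struct core_device *core);
};

/* Driver-level type: embeds the core struct instead of pointing to it. */
struct my_driver_device {
    int private_value;
    struct core_device core;
};

static int my_op(struct core_device *core)
{
    /* Recover the driver struct without any cast through void *. */
    struct my_driver_device *my =
        container_of(core, struct my_driver_device, core);

    return core->id + my->private_value;
}

const struct core_ops my_ops = {
    .get_id_plus_private = my_op,
};
```

The design choice is that the compiler checks the op's argument type, and the driver reaches its own data by pointer arithmetic on a known member offset rather than through an untyped private pointer.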
-
Committed by Jason Gunthorpe
mdev gets little benefit because it doesn't actually do anything, however it is the last user, so move the vfio_init/register/unregister_group_dev() code here for now.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <10-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
This makes the struct vfio_device part of the public interface so it can be used with container_of and so forth, as is typical for a Linux subsystem.

This is the first step to bring some type-safety to the vfio interface by allowing the replacement of 'void *' and 'struct device *' inputs with a simple and clear 'struct vfio_device *'.

For now the self-allocating vfio_add_group_dev() interface is kept so each user can be updated as a separate patch.

The expected usage pattern is

  driver core probe() function:
     my_device = kzalloc(sizeof(*my_device), GFP_KERNEL);
     vfio_init_group_dev(&my_device->vdev, dev, ops, my_device);
     /* other driver specific prep */
     vfio_register_group_dev(&my_device->vdev);
     dev_set_drvdata(dev, my_device);

  driver core remove() function:
     my_device = dev_get_drvdata(dev);
     vfio_unregister_group_dev(&my_device->vdev);
     /* other driver specific tear down */
     kfree(my_device);

This allows the driver to use the drvdata and vfio_device to go to/from its own data.

The pattern also makes it clear that vfio_register_group_dev() must be last in the sequence, as once it is called the core code can immediately start calling ops. The init/register gap is provided to allow for the driver to do setup before ops can be called and thus avoid races.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Liu Yi L <yi.l.liu@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <3-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Jason Gunthorpe
The vfio_device is using a 'sleep until all refs go to zero' pattern for its lifetime, but it is indirectly coded by repeatedly scanning the group list waiting for the device to be removed on its own.

Switch this around to be a direct representation: use a refcount to count the number of places that are blocking destruction and sleep directly on a completion until that counter goes to zero. kfree the device after other accesses have been excluded in vfio_del_group_dev(). This is a fairly common Linux idiom.

Due to this we can now remove kref_put_mutex(), which is very rarely used in the kernel. Here it is being used to prevent a zero ref device from being seen in the group list. Instead allow the zero ref device to continue to exist in the device_list and use refcount_inc_not_zero() to exclude it once refs go to zero.

This patch is organized so the next patch will be able to alter the API to allow drivers to provide the kfree.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <2-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
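The "increment only if not already zero" rule can be sketched in userspace with C11 atomics. This is an analogy rather than the kernel code: `sketch_dev`, `sketch_try_get()` and `sketch_put()` are hypothetical names, and the `released` flag stands in for the kernel completion that the unregistering thread would sleep on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_dev {
    atomic_uint refs;      /* number of users blocking destruction */
    atomic_bool released;  /* stand-in for completing a kernel completion */
};

void sketch_init(struct sketch_dev *d)
{
    atomic_init(&d->refs, 1);          /* registration holds the first ref */
    atomic_init(&d->released, false);
}

/* Analogous to refcount_inc_not_zero(): fails once the count has hit zero. */
bool sketch_try_get(struct sketch_dev *d)
{
    unsigned int old = atomic_load(&d->refs);

    while (old != 0) {
        /* On failure, old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&d->refs, &old, old + 1))
            return true;
    }
    return false;                      /* device is already going away */
}

/* Drop a reference; the last put signals whoever waits for teardown. */
void sketch_put(struct sketch_dev *d)
{
    if (atomic_fetch_sub(&d->refs, 1) == 1)
        atomic_store(&d->released, true);
}
```

With this rule a device whose count has reached zero can stay on a lookup list for a while, because any lookup that finds it fails in `sketch_try_get()` and simply skips it.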
-
Committed by Jason Gunthorpe
The vfio_device->group value has a get obtained during vfio_add_group_dev() which gets moved from the stack to vfio_device->group in vfio_group_create_device().

The reference remains until we reach the end of vfio_del_group_dev() when it is put back.

Thus anything that already has a kref on the vfio_device is guaranteed a valid group pointer. Remove all the extra reference traffic.

It is tricky to see, but the get at the start of vfio_del_group_dev() is actually pairing with the put hidden inside vfio_device_put() a few lines below.

A later patch merges vfio_group_create_device() into vfio_add_group_dev() which makes the ownership and error flow on the create side easier to follow.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <1-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 02 Feb, 2021 1 commit
-
-
Committed by Steve Sistare
Define a vfio_iommu_driver_ops notify callback, for sending events to the driver. Drivers are not required to provide the callback, and may ignore any events. The handling of events is driver specific.

Define the CONTAINER_CLOSE event, called when the container's file descriptor is closed. This event signifies that no further state changes will occur via container ioctl's.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 11 Dec, 2020 1 commit
-
-
Committed by Lu Baolu
Add the API for getting the domain from a vfio group. This could be used by the physical device drivers which rely on the vfio/mdev framework for mediated device user level access. A typical use case looks like this:

	int pasid;	/* signed, so the error check below can trigger */
	struct vfio_group *vfio_group;
	struct iommu_domain *iommu_domain;
	struct device *dev = mdev_dev(mdev);
	struct device *iommu_device = mdev_get_iommu_device(dev);

	if (!iommu_device ||
	    !iommu_dev_feature_enabled(iommu_device, IOMMU_DEV_FEAT_AUX))
		return -EINVAL;

	vfio_group = vfio_group_get_external_user_from_dev(dev);
	if (IS_ERR_OR_NULL(vfio_group))
		return -EFAULT;

	iommu_domain = vfio_group_iommu_domain(vfio_group);
	if (IS_ERR_OR_NULL(iommu_domain)) {
		vfio_group_put_external_user(vfio_group);
		return -EFAULT;
	}

	pasid = iommu_aux_get_pasid(iommu_domain, iommu_device);
	if (pasid < 0) {
		vfio_group_put_external_user(vfio_group);
		return -EFAULT;
	}

	/* Program device context with pasid value. */
	...

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 23 Sep, 2020 1 commit
-
-
Committed by Yan Zhao
When an error occurs, the vfio group must be put after a successful get.

Fixes: 95fc87b4 ("vfio: Selective dirty page tracking if IOMMU backed device pins pages")
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 22 Sep, 2020 1 commit
-
-
Committed by Yan Zhao
Page pinning is used both to translate and pin device mappings for DMA purposes, as well as to indicate to the IOMMU backend to limit the dirty page scope to those pages that have been pinned, in the case of an IOMMU backed device.

To support this, the vfio_pin_pages() interface limits itself to only singleton groups such that the IOMMU backend can consider dirty page scope only at the group level. Implement the same requirement for the vfio_group_pin_pages() interface.

Fixes: 95fc87b4 ("vfio: Selective dirty page tracking if IOMMU backed device pins pages")
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 28 Jul, 2020 1 commit
-
-
Committed by Alex Williamson
No functional change, avoid non-inclusive naming schemes.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 29 May, 2020 1 commit
-
-
Committed by Kirti Wankhede
Added a check such that only singleton IOMMU groups can pin pages. From the point when the vendor driver pins any pages, consider the IOMMU group dirty page scope to be limited to pinned pages.

To avoid walking the list often, added the flag pinned_page_dirty_scope to indicate whether the dirty page scope of all of the vfio_groups for each vfio_domain in the domain_list is limited to pinned pages. This flag is updated on the first pinned pages request for that IOMMU group and on attaching/detaching a group.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 24 Mar, 2020 4 commits
-
-
Committed by Alex Williamson
Allow bus drivers to provide their own callback to match a device to the user provided string.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Yan Zhao
vfio_group_pin_pages() and vfio_group_unpin_pages() are introduced to avoid the inefficient search/check/ref/deref operations on the VFIO group that each call into vfio_pin_pages() and vfio_unpin_pages() performs.

The VFIO group is taken as an argument directly. The callers combine the search/check/ref/deref operations associated with the VFIO group by calling vfio_group_get_external_user()/vfio_group_get_external_user_from_dev() beforehand, and vfio_group_put_external_user() afterwards.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Yan Zhao
vfio_dma_rw will read/write a range of user space memory pointed to by IOVA into/from a kernel buffer without enforcing pinning of the user space memory.

TODO: mark the IOVAs to user space memory dirty if they are written in vfio_dma_rw().

Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
Committed by Yan Zhao
An external user calls vfio_group_get_external_user_from_dev() with a device pointer to get the VFIO group associated with this device. The VFIO group is checked to be viable and to have an IOMMU set. Then the container user counter is increased and a VFIO group reference is held to prevent the VFIO group from disposal before the external user exits.

When the external user finishes using the VFIO group, it calls vfio_group_put_external_user() to drop the VFIO group reference and decrease the container user counter.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 23 Oct, 2019 1 commit
-
-
Committed by Arnd Bergmann
Each of these drivers has a copy of the same trivial helper function to convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-