Commits · 23b99237f86df34cbcefa81d1fa45bc316b4a124 · openeuler / Kernel

22 Sep, 2022 1 commit

s390/vfio-ap: bypass unnecessary processing of AP resources · 1918f2b2

Committed by Tony Krowiak on Aug 23, 2022

It is not necessary to go through the process of validation, linking of
queues to mdev and vice versa and filtering the APQNs assigned to the
matrix mdev to build an AP configuration for a guest if an adapter or
domain being assigned is already assigned to the matrix mdev. Likewise, it
is not necessary to go through the process of unassigning an adapter,
domain or control domain if it is not assigned to the matrix mdev.

Since it is not necessary to process assignment of a resource already
assigned or process unassignment of a resource that is not assigned,
this patch will bypass all assignment/unassignment operations for an
adapter, domain or control domain under these circumstances.

Not only is assignment of a duplicate adapter or domain unnecessary, it
will also cause a hang situation when removing the matrix mdev to which it is
assigned. The reason is that the same vfio_ap_queue objects with an
APQN containing the APID of the adapter or APQI of the domain being
assigned will get added multiple times to the hashtable that holds them.
This results in the pprev and next pointers of the hlist_node (mdev_qnode
field in the vfio_ap_queue object) pointing to the queue object itself,
resulting in an interminable loop when the mdev is removed and the queue
table is iterated to reset the queues.
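The double-add failure described above can be reproduced outside the kernel. The sketch below is a minimal userspace model, not the driver's code: `demo_hlist_add_head()` mirrors the pointer updates of the kernel's `hlist_add_head()`, while `demo_count_bounded()` and `demo_assign()` are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's hlist types (illustration only). */
struct demo_hlist_node {
	struct demo_hlist_node *next;
	struct demo_hlist_node **pprev;
};

struct demo_hlist_head {
	struct demo_hlist_node *first;
};

/* Same pointer updates as the kernel's hlist_add_head(). */
static void demo_hlist_add_head(struct demo_hlist_node *n,
				struct demo_hlist_head *h)
{
	struct demo_hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Walk the list, giving up after 'limit' nodes.
 * Returns the node count, or -1 if the walk did not terminate in time.
 */
static int demo_count_bounded(const struct demo_hlist_head *h, int limit)
{
	const struct demo_hlist_node *n;
	int count = 0;

	assert(limit > 0);
	for (n = h->first; n; n = n->next) {
		if (++count > limit)
			return -1;
	}
	return count;
}

/*
 * Hypothetical guard in the spirit of the patch: skip all processing when
 * the resource is already marked as assigned.
 * Returns true if the node was added, false if the request was bypassed.
 */
static bool demo_assign(bool *assigned, struct demo_hlist_node *n,
			struct demo_hlist_head *h)
{
	if (*assigned)
		return false;
	*assigned = true;
	demo_hlist_add_head(n, h);
	return true;
}
```

Calling `demo_hlist_add_head()` twice with the same node leaves the node's `next` pointer referring to the node itself, so `demo_count_bounded()` gives up. Going through `demo_assign()` twice adds the node only once.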

Cc: stable@vger.kernel.org
Fixes: 11cb2419 ("s390/vfio-ap: manage link between queue struct and matrix mdev")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


26 Jul, 2022 3 commits

vfio: Replace phys_pfn with pages for vfio_pin_pages() · 34a255e6

Committed by Nicolin Chen on Jul 22, 2022

Most of the callers of vfio_pin_pages() want "struct page *" and the
low-level mm code to pin pages returns a list of "struct page *" too.
So there's no gain in converting "struct page *" to PFN in between.

Replace the output parameter "phys_pfn" list with a "pages" list, to
simplify callers. This also allows us to replace the vfio_iommu_type1
implementation with a more efficient one.

Also drop the pfn_valid check in the gvt code, as there is no need to
do such a check on a page-backed struct page pointer.

For now, also update vfio_iommu_type1 to fit this new parameter too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-11-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


vfio/ap: Change saved_pfn to saved_iova · 3fad3a26

Committed by Nicolin Chen on Jul 22, 2022

The vfio_ap_ops code maintains both the nib address and its PFN, which
is redundant; the PFN was kept only because the vfio_pin/unpin_pages()
API wanted a pfn. Since vfio_pin/unpin_pages() now accept "iova", change
"saved_pfn" to "saved_iova" and remove the pfn from vfio_ap_validate_nib().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-7-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


vfio: Pass in starting IOVA to vfio_pin/unpin_pages API · 44abdd16

Committed by Nicolin Chen on Jul 22, 2022

The vfio_pin/unpin_pages() so far accepted arrays of PFNs of user IOVA.
Among all three callers, there was only one caller possibly passing in
a non-contiguous PFN list, which is now ensured to have contiguous PFN
inputs too.

Pass in the starting address with "iova" alone to simplify things, so
callers no longer need to maintain a PFN list or to pin/unpin one page
at a time. This also allows VFIO to use more efficient implementations
of pin/unpin_pages.
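To illustrate why a starting address is enough for a contiguous range, here is a small sketch. `DEMO_PAGE_SIZE`, `demo_page_iova()` and `demo_npages()` are hypothetical names for this example and are not part of the VFIO API; a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* IOVA of the i-th page in a contiguous range that starts at 'iova'. */
static uint64_t demo_page_iova(uint64_t iova, unsigned int i)
{
	assert(iova % DEMO_PAGE_SIZE == 0);	/* sketch expects page alignment */
	return iova + (uint64_t)i * DEMO_PAGE_SIZE;
}

/* Number of pages needed to cover 'len' bytes from a page-aligned start. */
static unsigned int demo_npages(uint64_t len)
{
	return (unsigned int)((len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE);
}
```

With a starting address and a page count, every page address in the range can be derived on demand, so the caller has no per-page list to build or keep in sync.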

For now, also update vfio_iommu_type1 to fit this new parameter too,
while keeping its input intact (being user_iova) since we don't want
to spend too much effort swapping its parameters and local variables
at that level.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-6-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


23 Jul, 2022 1 commit

vfio/ap: Pass in physical address of ind to ap_aqic() · 10e19d49

Committed by Nicolin Chen on Jul 22, 2022

ap_aqic() is called by vfio_ap_irq_enable(), which passes in a virt
value that is cast from the physical address "h_nib". Inside ap_aqic(),
virt_to_phys() is then applied to it again.

Since ap_aqic() needs a physical address, let's just pass in the pa of
ind directly. So change the "ind" to "pa_ind".

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


21 Jul, 2022 1 commit

vfio: Replace the DMA unmapping notifier with a callback · ce4b4657

Committed by Jason Gunthorpe on Jul 19, 2022

Instead of having drivers register the notifier with explicit code, just
have them provide a dma_unmap callback op in their driver ops and rely on
the core code to wire it up.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
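
The callback-in-ops pattern this commit describes can be sketched in plain C as follows. This is a simplified userspace model under stated assumptions, not the VFIO interface: every `demo_*` name, the structure layout and the pin bookkeeping are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_device;

/* Driver ops table: the driver only fills in the callback. */
struct demo_device_ops {
	void (*dma_unmap)(struct demo_device *dev, uint64_t iova,
			  uint64_t length);
};

struct demo_device {
	const struct demo_device_ops *ops;
	uint64_t pinned_iova;	/* address of the single page pinned here */
	int pinned;		/* non-zero while the page is pinned */
};

/* "Core" side: on an unmap, call the driver's op if one was provided. */
static void demo_core_notify_unmap(struct demo_device *dev, uint64_t iova,
				   uint64_t length)
{
	assert(dev != NULL);
	if (dev->ops && dev->ops->dma_unmap)
		dev->ops->dma_unmap(dev, iova, length);
}

/* "Driver" side: drop the pin if it falls inside the unmapped range. */
static void demo_driver_dma_unmap(struct demo_device *dev, uint64_t iova,
				  uint64_t length)
{
	if (dev->pinned && dev->pinned_iova >= iova &&
	    dev->pinned_iova - iova < length)
		dev->pinned = 0;
}

static const struct demo_device_ops demo_driver_ops = {
	.dma_unmap = demo_driver_dma_unmap,
};
```

The driver carries no registration or unregistration code of its own; supplying the function pointer is enough for the core to reach it.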


19 Jul, 2022 16 commits

s390/vfio-ap: handle config changed and scan complete notification · eeb386ae

Committed by Tony Krowiak on Apr 04, 2022

This patch implements two new AP driver callbacks:

void (*on_config_changed)(struct ap_config_info *new_config_info,
                  struct ap_config_info *old_config_info);

void (*on_scan_complete)(struct ap_config_info *new_config_info,
                 struct ap_config_info *old_config_info);

The on_config_changed callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan.

The vfio_ap device driver registers a callback function for this callback
that performs the following operations:

1. Unplugs the adapters, domains and control domains removed from the
host's AP configuration from the guests to which they are
assigned in a single operation.

2. Stores bitmaps identifying the adapters, domains and control domains
added to the host's AP configuration with the structure representing
the mediated device. When the vfio_ap device driver's probe callback is
subsequently invoked, the probe function will recognize that the
queue is being probed due to a change in the host's AP configuration
and the plugging of the queue into the guest will be bypassed.
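
The bookkeeping in the two steps above amounts to comparing the old and new configuration bitmaps. Here is a minimal sketch, assuming each bitmap fits in a single 64-bit word and ignoring the bit numbering used on s390. `struct demo_ap_config`, `demo_removed()` and `demo_added()` are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 64-bit word stands in for each bitmap; the real ones are wider. */
struct demo_ap_config {
	uint64_t apm;	/* adapters */
	uint64_t aqm;	/* usage domains */
	uint64_t adm;	/* control domains */
};

/* Resources present in 'old_cfg' but missing from 'new_cfg'. */
static struct demo_ap_config
demo_removed(const struct demo_ap_config *new_cfg,
	     const struct demo_ap_config *old_cfg)
{
	struct demo_ap_config r;

	assert(new_cfg != NULL && old_cfg != NULL);
	r.apm = old_cfg->apm & ~new_cfg->apm;
	r.aqm = old_cfg->aqm & ~new_cfg->aqm;
	r.adm = old_cfg->adm & ~new_cfg->adm;
	return r;
}

/* Resources present in 'new_cfg' but missing from 'old_cfg'. */
static struct demo_ap_config
demo_added(const struct demo_ap_config *new_cfg,
	   const struct demo_ap_config *old_cfg)
{
	struct demo_ap_config a;

	assert(new_cfg != NULL && old_cfg != NULL);
	a.apm = new_cfg->apm & ~old_cfg->apm;
	a.aqm = new_cfg->aqm & ~old_cfg->aqm;
	a.adm = new_cfg->adm & ~old_cfg->adm;
	return a;
}
```

In this model the "removed" set is what would be unplugged from guests right away, and the "added" set is what would be remembered until the scan completes.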

The on_scan_complete callback is invoked after the AP bus scan is
completed if the host AP configuration data has changed. The vfio_ap
device driver registers a callback function for this callback that hot
plugs each queue and control domain added to the AP configuration for each
guest using them in a single hot plug operation.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: sysfs attribute to display the guest's matrix · f7f795c5

Committed by Tony Krowiak on Sep 10, 2019

The matrix of adapters and domains configured in a guest's APCB may
differ from the matrix of adapters and domains assigned to the matrix mdev,
so this patch introduces a sysfs attribute to display the matrix of
adapters and domains that are or will be assigned to the APCB of a guest
that is or will be using the matrix mdev. For a matrix mdev denoted by
$uuid, the guest matrix can be displayed as follows:

   cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: implement in-use callback for vfio_ap driver · 3f85d1df

Committed by Tony Krowiak on Sep 30, 2021

Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed is assigned to
any of the matrix mdevs under the driver's control.
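
As a rough sketch of such an in-use check, assume 64-bit bitmaps and the rule that an APQN is affected only when both its adapter and its domain are covered. `struct demo_mdev` and `demo_in_use()` are hypothetical names used for illustration and do not reproduce the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adapters (apm) and domains (aqm) assigned to one matrix mdev. */
struct demo_mdev {
	uint64_t apm;
	uint64_t aqm;
};

/*
 * 'apm' and 'aqm' describe the queues that would be taken away from the
 * driver. A queue is identified by an (adapter, domain) pair, so an mdev
 * is affected only if it overlaps in both dimensions.
 */
static bool demo_in_use(const struct demo_mdev *mdevs, size_t n,
			uint64_t apm, uint64_t aqm)
{
	size_t i;

	assert(mdevs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if ((mdevs[i].apm & apm) && (mdevs[i].aqm & aqm))
			return true;
	}
	return false;
}
```

A `true` result corresponds to the driver reporting that the resource is in use.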

There is potential for a deadlock condition between the
matrix_dev->guests_lock used to lock the guest during assignment of
adapters and domains and the ap_perms_mutex locked by the AP bus when
changes are made to the sysfs apmask/aqmask attributes.

The AP Perms lock controls access to the objects that store the adapter
numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
/sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
identify which queues are reserved for the zcrypt default device drivers.
Before allowing a bit to be removed from either mask, the AP bus must check
with the vfio_ap device driver to verify that none of the queues are
assigned to any of its mediated devices.

The apmask/aqmask attributes can be written or read at any time from
userspace, so care must be taken to prevent a deadlock with asynchronous
operations that might be taking place in the vfio_ap device driver. For
example, consider the following:

1. A system administrator assigns an adapter to a mediated device under the
   control of the vfio_ap device driver. The driver will need to first take
   the matrix_dev->guests_lock to potentially hot plug the adapter into
   the KVM guest.
2. At the same time, a system administrator sets a bit in the sysfs
   /sys/bus/ap/apmask attribute. To complete the operation, the AP bus
   must:
   a. Take the ap_perms_mutex lock to update the object storing the values
      for the /sys/bus/ap/apmask attribute.
   b. Call the vfio_ap device driver's in-use callback to verify that the
      queues now being reserved for the default zcrypt drivers are not
      assigned to a mediated device owned by the vfio_ap device driver. To
      do the verification, the in-use callback function takes the
      matrix_dev->guests_lock, but has to wait because it is already held
      by the operation in 1 above.
3. The vfio_ap device driver calls an AP bus function to verify that the
   new queues resulting from the assignment of the adapter in step 1 are
   not reserved for the default zcrypt device driver. This AP bus function
   tries to take the ap_perms_mutex lock but gets stuck waiting for the
   lock due to step 2a above.

Consequently, we have the following deadlock situation:

matrix_dev->guests_lock locked (1)
ap_perms_mutex lock locked (2a)
Waiting for matrix_dev->guests_lock (2b) which is currently held (1)
Waiting for ap_perms_mutex lock (3) which is currently held (2a)
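
The circular wait above arises because two paths take the same two locks in opposite orders. As a rough illustration (not the driver's code), the toy checker below models a fixed lock hierarchy and rejects any acquisition that goes against it. The `demo_*` names are hypothetical, and ranking ap_perms_mutex ahead of guests_lock is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Lower value means the lock must be taken earlier. */
enum demo_lock {
	DEMO_AP_PERMS_MUTEX = 0,
	DEMO_GUESTS_LOCK = 1,
	DEMO_NR_LOCKS
};

/* Locks currently held by one execution context. */
struct demo_ctx {
	bool held[DEMO_NR_LOCKS];
};

/*
 * Record the acquisition and return true if it respects the hierarchy,
 * i.e. no lock of equal or higher rank is already held. Return false and
 * record nothing if the order would be inverted.
 */
static bool demo_acquire(struct demo_ctx *ctx, enum demo_lock lock)
{
	int i;

	assert(lock < DEMO_NR_LOCKS);
	for (i = lock; i < DEMO_NR_LOCKS; i++) {
		if (ctx->held[i])
			return false;
	}
	ctx->held[lock] = true;
	return true;
}

static void demo_release(struct demo_ctx *ctx, enum demo_lock lock)
{
	assert(lock < DEMO_NR_LOCKS);
	ctx->held[lock] = false;
}
```

A context that holds `DEMO_GUESTS_LOCK` and then asks for `DEMO_AP_PERMS_MUTEX` is rejected, which is the order used by steps 1 and 3. Taking the two locks in rank order is accepted.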

To prevent this deadlock scenario, the function called in step 3 will no
longer take the ap_perms_mutex lock; instead, its caller is required to
take it. The adapter/domain assignment functions in the vfio_ap device
driver will take this lock first to maintain the proper locking order.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: reset queues after adapter/domain unassignment · 70aeefe5

Committed by Tony Krowiak on Aug 25, 2021

When an adapter or domain is unassigned from an mdev attached to a KVM
guest, one or more of the guest's queues may get dynamically removed. Since
the removed queues could get re-assigned to another mdev, they need to be
reset. So, when an adapter or domain is unassigned from the mdev, the
queues that are removed from the guest's AP configuration (APCB) will be
reset.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: hot plug/unplug of AP devices when probed/removed · 09d31ff7

Committed by Tony Krowiak on Feb 01, 2022

When an AP queue device is probed or removed, if the mediated device is
attached to a KVM guest, the mediated device's adapter, domain and
control domain bitmaps must be filtered to update the guest's APCB and if
any changes are detected, the guest's APCB must then be hot plugged into
the guest to reflect those changes to the guest.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: allow hot plug/unplug of AP devices when assigned/unassigned · 51dc562a

Committed by Tony Krowiak on Jan 31, 2022

Let's hot plug an adapter, domain or control domain into the guest when it
is assigned to a matrix mdev that is attached to a KVM guest. Likewise,
let's hot unplug an adapter, domain or control domain from the guest when
it is unassigned from a matrix mdev that is attached to a KVM guest.

Whenever an assignment or unassignment of an adapter, domain or control
domain is performed, the APQNs and control domains assigned to the matrix
mdev will be filtered and assigned to the AP control block
(APCB) that supplies the AP configuration to the guest. The filtering
ensures that the APCB is assigned neither an adapter, domain or control
domain that is missing from the host's AP configuration nor an APQN that
does not reference a queue device bound to the vfio_ap device driver.

After updating the APCB, if the mdev is in use by a KVM guest, it is
hot plugged into the guest to dynamically provide access to the adapters,
domains and control domains provided via the newly refreshed APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: prepare for dynamic update of guest's APCB on queue probe/remove · 2c1ee898

Committed by Tony Krowiak on Mar 17, 2022

The callback functions for probing and removing a queue device must take
and release the locks required to perform a dynamic update of a guest's
APCB in the proper order.

The proper order for taking the locks is:

        matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock

The proper order for releasing the locks is:

        matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock
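
The two orders listed above can be sketched with POSIX mutexes. This is a userspace illustration under stated assumptions, not the helper or macro the patch introduces: `struct demo_locks` and both functions are hypothetical, and the `has_kvm` flag models a mediated device that may not be attached to a guest.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-ins for the three locks named in the commit message. */
struct demo_locks {
	pthread_mutex_t guests_lock;
	pthread_mutex_t kvm_lock;
	pthread_mutex_t mdevs_lock;
};

/* Acquire: guests_lock, then kvm lock (if any), then mdevs_lock. */
static int demo_get_update_locks(struct demo_locks *l, int has_kvm)
{
	int rc;

	assert(l != NULL);
	rc = pthread_mutex_lock(&l->guests_lock);
	if (rc)
		return rc;
	if (has_kvm) {
		rc = pthread_mutex_lock(&l->kvm_lock);
		if (rc)
			goto err_guests;
	}
	rc = pthread_mutex_lock(&l->mdevs_lock);
	if (rc)
		goto err_kvm;
	return 0;

err_kvm:
	if (has_kvm)
		pthread_mutex_unlock(&l->kvm_lock);
err_guests:
	pthread_mutex_unlock(&l->guests_lock);
	return rc;
}

/* Release in the reverse order of acquisition. */
static void demo_release_update_locks(struct demo_locks *l, int has_kvm)
{
	assert(l != NULL);
	pthread_mutex_unlock(&l->mdevs_lock);
	if (has_kvm)
		pthread_mutex_unlock(&l->kvm_lock);
	pthread_mutex_unlock(&l->guests_lock);
}
```

Keeping the acquire and release sequences in one pair of helpers means every caller follows the same order.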

A new helper function is introduced to be used by the probe callback to
acquire the required locks. Since the probe callback only has
access to a queue device when it is called, the helper function will find
the ap_matrix_mdev object to which the queue device's APQN is assigned and
return it so the KVM guest to which the mdev is attached can be dynamically
updated.

Note that in order to find the ap_matrix_mdev (matrix_mdev) object, it is
necessary to search the matrix_dev->mdev_list. This presents a
locking order dilemma because the matrix_dev->mdevs_lock can't be taken to
protect against changes to the list while searching for the matrix_mdev to
which a queue device's APQN is assigned. This is due to the fact that the
proper locking order requires that the matrix_dev->mdevs_lock be taken
after both the matrix_dev->guests_lock and the matrix_mdev->kvm->lock.
Consequently, the matrix_dev->guests_lock will be used to protect against
removal of a matrix_mdev object from the list while a queue device is
being probed. This necessitates changes to the mdev probe/remove
callback functions to take the matrix_dev->guests_lock prior to removing
a matrix_mdev object from the list.

A new macro is also introduced to acquire the locks required to dynamically
update the guest's APCB in the proper order when a queue device is
removed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: prepare for dynamic update of guest's APCB on assign/unassign · 8ee13ad9

Committed by Tony Krowiak on Mar 16, 2022

The functions backing the matrix mdev's sysfs attribute interfaces to
assign/unassign adapters, domains and control domains must take and
release the locks required to perform a dynamic update of a guest's APCB
in the proper order.

The proper order for taking the locks is:

matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock

The proper order for releasing the locks is:

matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock

Two new macros are introduced for this purpose: One to take the locks and
the other to release the locks. These macros will be used by the
assignment/unassignment functions to prepare for dynamic update of
the KVM guest's APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: use proper locking order when setting/clearing KVM pointer · b84eb8e0

Committed by Tony Krowiak on Mar 16, 2022

The group notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM event must
use the required locks in proper locking order to dynamically update the
guest's APCB. The proper locking order is:

       1. matrix_dev->guests_lock: required to use the KVM pointer to
          update a KVM guest's APCB.

       2. matrix_mdev->kvm->lock: required to update a KVM guest's APCB.

       3. matrix_dev->mdevs_lock: required to store or access the data
          stored in a struct ap_matrix_mdev instance.

Two macros are introduced to acquire and release the locks in the proper
order. These macros are now used by the group notifier functions.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: rename matrix_dev->lock mutex to matrix_dev->mdevs_lock · d0786556

Committed by Tony Krowiak on Mar 16, 2022

The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to
better reflect its purpose, which is to control access to the state of the
mediated devices under the control of the vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: allow assignment of unavailable AP queues to mdev device · e2126a73

Committed by Tony Krowiak on Dec 14, 2020

The current implementation does not allow assignment of an AP adapter or
domain to an mdev device if each APQN resulting from the assignment
does not reference an AP queue device that is bound to the vfio_ap device
driver. This patch allows assignment of AP resources to the matrix mdev as
long as the APQNs resulting from the assignment:
   1. Are not reserved by the AP bus for use by the zcrypt device drivers.
   2. Are not assigned to another matrix mdev.
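
The two conditions above can be sketched as a single predicate for the adapter case. This is a minimal model assuming 64-bit bitmaps in which a set bit in `reserved_apm`/`reserved_aqm` marks an adapter or domain reserved for the zcrypt drivers. `struct demo_matrix` and `demo_may_assign_adapter()` are hypothetical names, not the driver's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adapters (apm) and domains (aqm) assigned to one matrix mdev. */
struct demo_matrix {
	uint64_t apm;
	uint64_t aqm;
};

/*
 * May adapter 'apid' be assigned to 'self'? The new APQNs are the pairs
 * of 'apid' with every domain already assigned to 'self'.
 */
static bool demo_may_assign_adapter(unsigned int apid,
				    const struct demo_matrix *self,
				    const struct demo_matrix *others,
				    size_t n_others,
				    uint64_t reserved_apm,
				    uint64_t reserved_aqm)
{
	uint64_t apid_bit;
	size_t i;

	assert(apid < 64);
	apid_bit = UINT64_C(1) << apid;

	/* Condition 1: no resulting APQN is reserved for zcrypt. */
	if ((reserved_apm & apid_bit) && (reserved_aqm & self->aqm))
		return false;

	/* Condition 2: no resulting APQN belongs to another mdev. */
	for (i = 0; i < n_others; i++) {
		if ((others[i].apm & apid_bit) && (others[i].aqm & self->aqm))
			return false;
	}
	return true;
}
```

Nothing in the predicate asks whether a queue device exists for the new APQNs, which is the relaxation this patch describes.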

The rationale behind this is that the AP architecture does not preclude
assignment of APQNs to an AP configuration profile that are not available
to the system.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev · 48cae940

Committed by Tony Krowiak on Aug 11, 2021

Refresh the guest's APCB by filtering the APQNs and control domain numbers
assigned to the matrix mdev.

Filtering of APQNs:
-----------------
APQNs that do not reference an AP queue device bound to the vfio_ap device
driver must be filtered from the APQNs assigned to the matrix mdev before
they can be assigned to the guest's APCB. Given that the APQNs are
configured in the guest's APCB as a matrix of APIDs (adapters) and APQIs
(domains), it is not possible to filter an individual APQN. For example,
suppose the matrix of APQNs is structured as follows:

                   APIDs
             3      4      5
        0  (3,0)  (4,0)  (5,0)
APQIs   1  (3,1)  (4,1)  (5,1)
        2  (3,2)  (4,2)  (5,2)

Now suppose APQN (4,1) does not reference a queue device bound to the
vfio_ap device driver. If we filter APID 4, the APQNs (4,0), (4,1) and
(4,2) will be removed. Similarly, if we filter domain 1, APQNs (3,1),
(4,1) and (5,1) will be removed.

To resolve this dilemma, the choice was made to filter the APID - in this
case 4 - from the guest's APCB. The reason for this design decision is
that the APID references an AP adapter which is a real hardware device
that can be physically installed, removed, enabled or disabled; whereas, a
domain is a partition within the adapter. It therefore better reflects
reality to remove the APID from the guest's APCB.
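
The adapter-level filtering described above can be sketched as follows. This is a simplified model, not the driver's code: bitmaps are single 64-bit words with bit N standing for ID N, and `demo_filter_apm()`, `demo_bound_fn` and `demo_example_is_bound()` are hypothetical names. The example predicate encodes the scenario from the text, in which only APQN (4,1) lacks a bound queue device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if a queue device for (apid, apqi) is bound to the driver. */
typedef bool (*demo_bound_fn)(unsigned int apid, unsigned int apqi);

/*
 * Return the adapters that may be passed to the guest: an adapter is
 * dropped entirely as soon as one of its APQNs has no bound queue.
 */
static uint64_t demo_filter_apm(uint64_t apm, uint64_t aqm,
				demo_bound_fn is_bound)
{
	uint64_t guest_apm = apm;
	unsigned int apid, apqi;

	assert(is_bound != NULL);
	for (apid = 0; apid < 64; apid++) {
		if (!(apm & (UINT64_C(1) << apid)))
			continue;
		for (apqi = 0; apqi < 64; apqi++) {
			if (!(aqm & (UINT64_C(1) << apqi)))
				continue;
			if (!is_bound(apid, apqi)) {
				/* Filter the whole adapter, not one APQN. */
				guest_apm &= ~(UINT64_C(1) << apid);
				break;
			}
		}
	}
	return guest_apm;
}

/* Scenario from the text: every queue is bound except APQN (4,1). */
static bool demo_example_is_bound(unsigned int apid, unsigned int apqi)
{
	return !(apid == 4 && apqi == 1);
}
```

With adapters 3, 4 and 5 and domains 0, 1 and 2, the filter keeps adapters 3 and 5, so APQNs (4,0) and (4,2) are removed along with (4,1).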

Filtering of control domains:
----------------------------
Any control domains that are not assigned to the host's AP configuration
will be filtered from those assigned to the matrix mdev before assigning
them to the guest's APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: introduce shadow APCB · 49b0109f

Committed by Tony Krowiak on Dec 11, 2019

The APCB is a field within the CRYCB that provides the AP configuration
to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and
maintain it for the lifespan of the guest.

The shadow APCB serves the following purposes:

1. The shadow APCB can be maintained even when the mediated device is not
currently in use by a KVM guest. Since the mediated device's AP
configuration is filtered to ensure that no AP queues are passed through
to the KVM guest that are not bound to the vfio_ap device driver or
available to the host, the mediated device's AP configuration may differ
from the guest's. Having a shadow of a guest's APCB allows us to provide
a sysfs interface to view the guest's APCB even if the mediated device
is not currently passed through to a KVM guest. This can aid in
problem determination when the guest is unexpectedly missing AP
resources.

2. If filtering were done in place on the real APCB, the guest could pick
up a transient state. By doing the filtering on a shadow and transferring
the AP configuration to the real APCB after the guest is started, when
AP resources are assigned to or unassigned from the mediated device, or
when the host configuration changes, we ensure that the guest's AP
configuration is never in a transient state.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: manage link between queue struct and matrix mdev · 11cb2419

Committed by Tony Krowiak on Feb 01, 2021

Let's create links between each queue device bound to the vfio_ap device
driver and the matrix mdev to which the queue's APQN is assigned. The idea
is to facilitate efficient retrieval of the objects representing the queue
devices and matrix mdevs as well as to verify that a queue assigned to
a matrix mdev is bound to the driver.

The links will be created as follows:

 * When the queue device is probed, if its APQN is assigned to a matrix
   mdev, the structures representing the queue device and the matrix mdev
   will be linked.

 * When an adapter or domain is assigned to a matrix mdev, for each new
   APQN assigned that references a queue device bound to the vfio_ap
   device driver, the structures representing the queue device and the
   matrix mdev will be linked.

The links will be removed as follows:

 * When the queue device is removed, if its APQN is assigned to a matrix
   mdev, the link from the structure representing the matrix mdev to the
   structure representing the queue will be removed. Since the storage
   allocated for the vfio_ap_queue will be freed, there is no need to
   remove the link to the matrix_mdev to which the queue's APQN is
   assigned.

 * When an adapter or domain is unassigned from a matrix mdev, for each
   APQN unassigned that references a queue device bound to the vfio_ap
   device driver, the structures representing the queue device and the
   matrix mdev will be unlinked.

 * When an mdev is removed, the link from any queues assigned to the mdev
   to the mdev will be removed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c · 260f3ea1

Committed by Tony Krowiak on Oct 14, 2020

Let's move the probe and remove callbacks into the vfio_ap_ops.c
file to keep all code related to managing queues in a single file. This
way, all functions related to queue management can be removed from the
vfio_ap_private.h header file defining the public interfaces for the
vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


s390/vfio-ap: use new AP bus interface to search for queue devices · 034921cd

Committed by Tony Krowiak on Jan 28, 2021

This patch refactors the vfio_ap device driver to use the AP bus's
ap_get_qdev() function to retrieve the vfio_ap_queue struct containing
information about a queue that is bound to the vfio_ap device driver.
The bus's ap_get_qdev() function retrieves the queue device from a
hashtable keyed by APQN, which is several orders of magnitude more
efficient than looping over the list of devices attached to the AP bus.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>


24 May, 2022 1 commit

vfio: remove VFIO_GROUP_NOTIFY_SET_KVM · 421cfe65

Committed by Matthew Rosato on May 19, 2022

Rather than relying on a notifier for associating the KVM with
the group, let's assume that the association has already been
made prior to device_open.  The first time a device is opened,
associate the group KVM with the device.

This fixes a user-triggerable oops in GVT.

Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20220519183311.582380-2-mjrosato@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


12 May, 2022 2 commits

vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages() · 8e432bb0

Committed by Jason Gunthorpe on May 11, 2022

Every caller has a readily available vfio_device pointer, so use that instead
of passing in a generic struct device. The struct vfio_device already
contains the group we need so this avoids complexity, extra refcountings,
and a confusing lifecycle model.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


vfio: Make vfio_(un)register_notifier accept a vfio_device · 09ea48ef

Committed by Jason Gunthorpe on May 11, 2022

All callers have a struct vfio_device trivially available; pass it in
directly and avoid calling the expensive vfio_group_get_from_dev().

Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>


21 Apr, 2022 1 commit

vfio/mdev: Remove mdev_parent_ops · 6b42f491

Committed by Jason Gunthorpe on Apr 11, 2022

The last useful member in this struct is supported_type_groups; move
it to the mdev_driver and delete mdev_parent_ops.

Replace it with mdev_driver as an argument to mdev_register_device().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20220411141403.86980-33-hch@lst.de
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>


28 Mar, 2022 1 commit

s390/vfio-ap: fix kernel doc and signature of group notifier functions · 71078220

Committed by Tony Krowiak on Mar 18, 2022

The vfio_ap device driver registers a group notifier function to handle
the VFIO_GROUP_NOTIFY_SET_KVM event signalling the KVM pointer has been
set or cleared. There are two helper functions invoked by the handler
function: One called when the KVM pointer has been set, and the other
when the pointer is cleared.

The kernel doc for both of these functions contains a comment introduced
by commit 0cc00c8d (s390/vfio-ap: fix circular lockdep when
setting/clearing crypto masks) that is no longer valid. This patch removes
this comment from the kernel doc of each helper function.

Commit 86956e70 (s390/vfio-ap: replace open coded locks for
VFIO_GROUP_NOTIFY_SET_KVM notification) added a parameter to the signature
of the helper function that handles the event indicating the KVM pointer
has been cleared. The parameter added was the KVM pointer itself.
One of the function's primary purposes is to clear the KVM pointer from the
ap_matrix_mdev instance in which it is stored. Since the callers of this
function derive the KVM pointer passed to the function from the
ap_matrix_mdev object itself, it is completely unnecessary to include this
parameter in the function's signature since it can simply be retrieved from
the ap_matrix_mdev object which is also passed in. This patch removes the
KVM pointer from the function's signature.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>


07 Feb, 2022 2 commits

s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function · 783f0a3c

Committed by Tony Krowiak on Jan 04, 2022

This patch adds s390dbf logging to the function that executes the
PQAP(AQIC) instruction on behalf of a guest. The guest in question is the
one attached to the queue for which interrupts are being enabled or
disabled.

Currently, the vfio_ap_irq_enable function sets status response code 06
(notification indicator byte address (nib) invalid) in the status word
when the vfio_pin_pages function - called to pin the page containing the
nib - returns an error or a different number of pages pinned than
requested.

Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction.

In addition to logging a warning for the situation above, this patch adds
the following:

* A function that validates the nib address, invoked before calling the
  vfio_pin_pages function. This allows for logging a message informing
  the reader of the reason the page containing the nib cannot be pinned
  if the nib address is not valid. Response code 06 (invalid nib address)
  will be set in the status word returned to the guest from the
  instruction.

* A check of the return value from the kvm_s390_gisc_register function
  that logs a message informing the reader of any failure. Status response
  code 08 (invalid gisa) will be set in the status word returned to the
  guest from the PQAP(AQIC) instruction.

* A check of the status response code returned from execution of the
  PQAP(AQIC) instruction that logs a message informing the reader if the
  code indicates an error.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

783f0a3c
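
The checks described in commit 783f0a3c can be pictured roughly as follows. This is a userspace sketch, not the driver's code: `irq_enable_sketch`, `nib_is_valid`, `log_warn` and the placeholder validity rule (non-zero address below a guest memory size) are assumptions made for illustration. Only the response codes 06 and 08 and the order of the checks come from the commit message.

```c
#include <stdint.h>
#include <stdio.h>

#define RC_NORMAL        0x00  /* normal completion */
#define RC_INVALID_NIB   0x06  /* invalid nib address (from the commit text) */
#define RC_INVALID_GISA  0x08  /* invalid gisa (from the commit text) */

/* Hypothetical stand-in for the s390dbf log: writes to stderr. */
static void log_warn(const char *reason, uint64_t nib)
{
    fprintf(stderr, "vfio_ap sketch: %s (nib=0x%llx)\n",
            reason, (unsigned long long)nib);
}

/*
 * Placeholder validity rule (an assumption, not the driver's check):
 * the nib must be non-zero and lie below guest_mem_size.
 */
static int nib_is_valid(uint64_t nib, uint64_t guest_mem_size)
{
    return nib != 0 && nib < guest_mem_size;
}

/*
 * Returns the response code to place in the status word.
 * pinned simulates the page-pinning result: the number of pages pinned
 * (one page is requested) or a negative error.
 * gisc_rc simulates the result of registering the guest ISC; negative
 * means failure.
 */
static int irq_enable_sketch(uint64_t nib, uint64_t guest_mem_size,
                             int pinned, int gisc_rc)
{
    if (!nib_is_valid(nib, guest_mem_size)) {
        log_warn("nib address not valid, page cannot be pinned", nib);
        return RC_INVALID_NIB;
    }
    if (pinned != 1) {
        log_warn("pinning the nib page failed", nib);
        return RC_INVALID_NIB;
    }
    if (gisc_rc < 0) {
        log_warn("registering the guest ISC failed", nib);
        return RC_INVALID_GISA;
    }
    return RC_NORMAL;
}
```

Because every failure path logs its own reason before setting the code, a reader of the log can tell a driver-detected error apart from a code returned by the firmware.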

s390/vfio-ap: add s390dbf logging to the handle_pqap function · 68f554b7

Committed by Tony Krowiak on Nov 04, 2021

This patch adds s390dbf logging to the function that handles interception
of the PQAP(AQIC) instruction. The handler validates several items of data
before calling the functions that execute the PQAP(AQIC) instruction on
behalf of the guest to which the affected queue is attached (the queue for
which interrupts are being enabled or disabled).

Currently, the handle_pqap function sets status response code 01 (queue not
available) in the status word that is normally returned from the
PQAP(AQIC) instruction under the following conditions:

* The function pointer to the handler is not set in the
  kvm_s390_crypto object (i.e., the PQAP hook is not registered).

* The KVM pointer is not set in the ap_matrix_mdev object
  (i.e., the matrix mdev is not passed through to a guest).

* The queue for which interrupts are being enabled or disabled is
  either not bound to the vfio_ap device driver or not assigned
  to the matrix mdev.

Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction, so this patch logs a message to the s390dbf log for the
vfio_ap device driver for each of the situations described above.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

68f554b7
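
The three conditions listed in commit 68f554b7 all produce the same response code 01, which is why a per-condition log message is useful. A minimal userspace sketch of that idea follows. The struct `pqap_state`, the function names and the message strings are hypothetical; only the three conditions and the response code come from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

#define RC_NORMAL       0x00  /* normal completion */
#define RC_Q_NOT_AVAIL  0x01  /* queue not available (from the commit text) */

/* Hypothetical summary of the state the handler inspects. */
struct pqap_state {
    bool hook_registered;  /* handler pointer set in kvm_s390_crypto */
    bool kvm_set;          /* KVM pointer set in ap_matrix_mdev */
    bool queue_usable;     /* queue bound to vfio_ap and assigned to the mdev */
};

/* Returns the reason for rejecting the request, or NULL if none applies. */
static const char *pqap_reject_reason(const struct pqap_state *s)
{
    if (!s->hook_registered)
        return "PQAP hook not registered";
    if (!s->kvm_set)
        return "matrix mdev not passed through to a guest";
    if (!s->queue_usable)
        return "queue not bound to vfio_ap or not assigned to the matrix mdev";
    return NULL;
}

/* Logs the specific reason (stderr stands in for s390dbf) and picks the code. */
static int handle_pqap_sketch(const struct pqap_state *s)
{
    const char *reason = pqap_reject_reason(s);

    if (reason) {
        fprintf(stderr, "vfio_ap sketch: response 01: %s\n", reason);
        return RC_Q_NOT_AVAIL;
    }
    return RC_NORMAL;
}
```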

26 Oct, 2021 1 commit

s390/vfio-ap: s390/crypto: fix all kernel-doc warnings · 5ef4f710

Committed by Tony Krowiak on Oct 19, 2021

Fixes the kernel-doc warnings in the following source files:

* drivers/s390/crypto/vfio_ap_private.h
* drivers/s390/crypto/vfio_ap_drv.c
* drivers/s390/crypto/vfio_ap_ops.c

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

5ef4f710

01 Oct, 2021 1 commit

vfio: simplify iommu group allocation for mediated devices · c68ea0d0

Committed by Christoph Hellwig on Sep 24, 2021

Reuse the logic in vfio_noiommu_group_alloc to allocate a fake
single-device iommu group for mediated devices by factoring out a common
function, and replacing the noiommu boolean field in struct vfio_group
with an enum to distinguish the three different kinds of groups.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20210924155705.4258-8-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

c68ea0d0

25 Sep, 2021 1 commit

vfio/ap_ops: Add missed vfio_uninit_group_dev() · 42de956c

Committed by Jason Gunthorpe on Sep 21, 2021

Without this call an xarray entry is leaked when the vfio_ap device is
unprobed. It was missed when the below patch was rebased across the
dev_set patch. Keep the remove function in the same order as the error
unwind in probe.

Fixes: eb0feefd ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/0-v3-f9b50340cdbb+e4-ap_uninit_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

42de956c

26 Aug, 2021 1 commit

vfio/ap_ops: Convert to use vfio_register_group_dev() · eb0feefd

Committed by Jason Gunthorpe on Aug 23, 2021

This is a straightforward conversion, the ap_matrix_mdev is actually serving
as the vfio_device and we can replace all the mdev_get_drvdata()'s with a
simple container_of() or a dev_get_drvdata() for sysfs paths.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v4-0203a4ab0596+f7-vfio_ap_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

eb0feefd
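
The container_of() replacement mentioned in commit eb0feefd relies on the vfio_device being embedded in the driver's own structure. A minimal userspace sketch of the pattern follows. The struct layouts are heavily reduced stand-ins, and the helper name `to_matrix_mdev` and the `id` field are hypothetical.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-in for the core structure handed to driver callbacks. */
struct vfio_device {
    int placeholder;
};

/* Reduced stand-in for the driver structure that embeds it. */
struct ap_matrix_mdev {
    int id;                   /* illustrative field only */
    struct vfio_device vdev;  /* embedded, so no drvdata lookup is needed */
};

/* Recover the enclosing ap_matrix_mdev from the embedded vfio_device. */
static struct ap_matrix_mdev *to_matrix_mdev(struct vfio_device *vdev)
{
    return container_of(vdev, struct ap_matrix_mdev, vdev);
}
```

The conversion is pure pointer arithmetic on a known offset, so it cannot return a stale or unset pointer the way a separate drvdata slot can.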

25 Aug, 2021 3 commits

s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c · 0c1abe7c

Committed by Randy Dunlap on Aug 05, 2021

The 0day bot reported some kernel-doc warnings in this file so clean up
all of the kernel-doc and use proper kernel-doc formatting.
There are no more kernel-doc errors or warnings reported in this file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/20210806050149.9614-1-rdunlap@infradead.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>

0c1abe7c

s390/vfio-ap: replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification · 86956e70

Committed by Tony Krowiak on Aug 23, 2021

It was pointed out during an unrelated patch review that locks should not
be open coded - i.e., writing the algorithm of a standard lock in a
function instead of using a lock from the standard library. The setting and
testing of a busy flag and sleeping on a wait_event is the same thing
a lock does. The open coded locks are invisible to lockdep, so potential
locking problems are not detected.

This patch removes the open coded locks used during
VFIO_GROUP_NOTIFY_SET_KVM notification. The busy flag
and wait queue were introduced to resolve a possible circular locking
dependency reported by lockdep when starting a secure execution guest
configured with AP adapters and domains. Reversing the order in which
the kvm->lock mutex and matrix_dev->lock mutex are locked resolves the
issue reported by lockdep, thus enabling the removal of the open coded
locks.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-3-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

86956e70
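
To make the point of commit 86956e70 concrete, the following userspace sketch contrasts an open-coded lock (busy flag plus wait queue) with two real mutexes taken in one fixed order. It uses POSIX threads as an analogy for the kernel primitives. All names here are hypothetical, and the order shown (kvm lock outside, matrix_dev lock inside) is an illustrative assumption, since the commit message only says the order was reversed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Open-coded variant: a busy flag plus a wait queue acts as a hidden lock. */
struct open_coded {
    pthread_mutex_t guard;
    pthread_cond_t  wait_q;
    bool            busy;
};

static void open_coded_enter(struct open_coded *o)
{
    pthread_mutex_lock(&o->guard);
    while (o->busy)                       /* sleep until the flag clears */
        pthread_cond_wait(&o->wait_q, &o->guard);
    o->busy = true;                       /* "acquire" without a real lock */
    pthread_mutex_unlock(&o->guard);
}

static void open_coded_exit(struct open_coded *o)
{
    pthread_mutex_lock(&o->guard);
    o->busy = false;                      /* "release" and wake any waiters */
    pthread_cond_broadcast(&o->wait_q);
    pthread_mutex_unlock(&o->guard);
}

/* Replacement: two real locks, always taken in the same order. */
static pthread_mutex_t kvm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t matrix_dev_lock = PTHREAD_MUTEX_INITIALIZER;
static int masks_set;                     /* state protected by both locks */

static void set_masks(void)
{
    pthread_mutex_lock(&kvm_lock);        /* outer lock first */
    pthread_mutex_lock(&matrix_dev_lock); /* inner lock second */
    masks_set++;
    pthread_mutex_unlock(&matrix_dev_lock);
    pthread_mutex_unlock(&kvm_lock);
}
```

The busy flag behaves like a lock, but a lock checker has no way to know that. With real mutexes and a single acquisition order, a tool such as lockdep can see and validate the dependency.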

s390/vfio-ap: r/w lock for PQAP interception handler function pointer · 1e753732

Committed by Tony Krowiak on Aug 23, 2021

The function pointer to the interception handler for the PQAP instruction
can get changed during the interception process. Let's add a
semaphore to struct kvm_s390_crypto to control read/write access to the
function pointer contained therein.

The semaphore must be locked for write access by the vfio_ap device driver
when notified that the KVM pointer has been set or cleared. It must be
locked for read access by the interception framework when the PQAP
instruction is intercepted.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-2-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

1e753732
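
The read/write protection described in commit 1e753732 can be sketched in userspace with a POSIX rwlock standing in for the kernel semaphore. The struct `crypto_sketch`, the function names and the `-EOPNOTSUPP` fallback are assumptions for illustration, not the kernel's definitions.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes pthread_rwlock_t in strict -std= modes */
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

typedef int (*pqap_hook_fn)(int fc);

/* Hypothetical reduced stand-in for struct kvm_s390_crypto. */
struct crypto_sketch {
    pthread_rwlock_t hook_rwlock;  /* plays the role of the semaphore */
    pqap_hook_fn     pqap_hook;    /* NULL while no handler is registered */
};

/* Writer side: the driver sets or clears the hook under the write lock. */
static void set_hook(struct crypto_sketch *c, pqap_hook_fn fn)
{
    pthread_rwlock_wrlock(&c->hook_rwlock);
    c->pqap_hook = fn;
    pthread_rwlock_unlock(&c->hook_rwlock);
}

/*
 * Reader side: the interception path holds the read lock across the
 * call, so the pointer cannot change while the handler is running.
 */
static int intercept_pqap(struct crypto_sketch *c, int fc)
{
    int rc = -EOPNOTSUPP;

    pthread_rwlock_rdlock(&c->hook_rwlock);
    if (c->pqap_hook)
        rc = c->pqap_hook(fc);
    pthread_rwlock_unlock(&c->hook_rwlock);
    return rc;
}

/* Trivial handler used only to exercise the sketch. */
static int demo_hook(int fc)
{
    return fc + 1;
}
```

Several interceptions can hold the read lock at once, while a set or clear of the pointer waits until all of them have finished.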

11 Aug, 2021 1 commit

vfio/ap,ccw: Fix open/close when multiple device FDs are open · 9b0d6b7e

Committed by Jason Gunthorpe on Aug 05, 2021

The user can open multiple device FDs if it likes, however these open()
functions call vfio_register_notifier() on some device global
state. Calling vfio_register_notifier() twice will trigger a WARN_ON
from notifier_chain_register() and the first close will wrongly delete the
notifier and more.

Since these really want the new open/close_device() semantics just change
the functions over.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/12-v4-9ea22c5e6afb+1adf-vfio_reflck_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

9b0d6b7e
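
The open/close_device() semantics that commit 9b0d6b7e switches to can be illustrated with a small counting sketch: per-device setup runs only on the first open and teardown only on the last close. The struct and function names below are hypothetical, and a counter stands in for the notifier registration.

```c
/* Hypothetical per-device state. */
struct device_sketch {
    unsigned int open_count;     /* number of device FDs currently open */
    unsigned int notifier_regs;  /* times the notifier is currently registered */
};

/* Runs only for the first open: the safe place to register a notifier. */
static void open_device(struct device_sketch *d)
{
    d->notifier_regs++;
}

/* Runs only for the last close: the matching place to unregister it. */
static void close_device(struct device_sketch *d)
{
    d->notifier_regs--;
}

/* Every open of a device FD goes through here. */
static void fd_open(struct device_sketch *d)
{
    if (d->open_count++ == 0)
        open_device(d);
}

/* Every close of a device FD goes through here. */
static void fd_close(struct device_sketch *d)
{
    if (--d->open_count == 0)
        close_device(d);
}
```

With this shape a second open never registers the notifier again, and the first close never removes a notifier that another FD still depends on.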

21 Jun, 2021 1 commit

s390/vfio-ap: clean up mdev resources when remove callback invoked · 8c0795d2

Committed by Tony Krowiak on Jun 09, 2021

The mdev remove callback for the vfio_ap device driver bails out with
-EBUSY if the mdev is in use by a KVM guest (i.e., the KVM pointer in the
struct ap_matrix_mdev is not NULL). The intended purpose was
to prevent the mdev from being removed while in use. There are two
problems with this scenario:

1. Returning a non-zero return code from the remove callback does not
   prevent the removal of the mdev.

2. The KVM pointer in the struct ap_matrix_mdev will always be NULL because
   the remove callback will not get invoked until the mdev fd is closed.
   When the mdev fd is closed, the mdev release callback is invoked and
   clears the KVM pointer from the struct ap_matrix_mdev.

Let's go ahead and remove the check for KVM in the remove callback and
allow the cleanup of mdev resources to proceed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20210609224634.575156-2-akrowiak@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

8c0795d2

13 Apr, 2021 2 commits

vfio/mdev: Correct the function signatures for the mdev_type_attributes · 9169cff1

Committed by Jason Gunthorpe on Apr 06, 2021

The driver core standard is to pass in the properly typed object, the
properly typed attribute and the buffer data. It stems from the root
kobject method:

  ssize_t (*show)(struct kobject *kobj, struct kobj_attribute *attr,..)

Each subclass of kobject should provide their own function with the same
signature but more specific types, e.g. struct device uses:

  ssize_t (*show)(struct device *dev, struct device_attribute *attr,..)

In this case the existing signature is:

  ssize_t (*show)(struct kobject *kobj, struct device *dev,..)

Where kobj is a 'struct mdev_type *' and dev is 'mdev_type->parent->dev'.

Change the mdev_type related sysfs attribute functions to:

  ssize_t (*show)(struct mdev_type *mtype, struct mdev_type_attribute *attr,..)

in order to restore type safety and match the driver core standard.

There are no current users of 'attr', but if it is ever needed it would be
hard to add in retroactively, so do it now.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <18-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

9169cff1
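
The typed-callback pattern that commit 9169cff1 adopts can be sketched in userspace as follows: a generic kobject-level entry point converts its arguments with container_of() and dispatches to a callback that receives properly typed objects. The struct layouts are reduced stand-ins, and `mdev_type_attr_show`, the `available` field and `SKETCH_BUF_SIZE` are hypothetical names chosen for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>  /* ssize_t */

#define SKETCH_BUF_SIZE 64

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins for the generic driver core types. */
struct kobject   { const char *name; };
struct attribute { const char *name; };

/* Reduced stand-in for the subclass that embeds the kobject. */
struct mdev_type {
    struct kobject kobj;
    int available;       /* illustrative field only */
};

/* Typed attribute: show() receives the subclass, not a bare kobject. */
struct mdev_type_attribute {
    struct attribute attr;
    ssize_t (*show)(struct mdev_type *mtype,
                    struct mdev_type_attribute *attr, char *buf);
};

/* Generic entry point: recover the typed objects, then dispatch. */
static ssize_t mdev_type_attr_show(struct kobject *kobj,
                                   struct attribute *raw_attr, char *buf)
{
    struct mdev_type *mtype = container_of(kobj, struct mdev_type, kobj);
    struct mdev_type_attribute *attr =
        container_of(raw_attr, struct mdev_type_attribute, attr);

    if (!attr->show)
        return -EIO;
    return attr->show(mtype, attr, buf);
}

/* Example typed callback: no casts needed inside the handler. */
static ssize_t available_instances_show(struct mdev_type *mtype,
                                        struct mdev_type_attribute *attr,
                                        char *buf)
{
    (void)attr;  /* unused today, but part of the signature */
    return snprintf(buf, SKETCH_BUF_SIZE, "%d\n", mtype->available);
}
```

The casts live in one dispatcher instead of in every handler, so a handler written against the wrong object type fails to compile rather than misbehaving at run time.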

vfio/mdev: Remove kobj from mdev_parent_ops->create() · c2ef2f50

Committed by Jason Gunthorpe on Apr 06, 2021

The kobj here is a type-erased version of mdev_type, which is already
stored in the struct mdev_device being passed in. It was only ever used to
compute the type_group_id, which is now extracted directly from the mdev.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <17-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

c2ef2f50
