1. 19 7月, 2022 15 次提交
    • T
      s390/vfio-ap: implement in-use callback for vfio_ap driver · 3f85d1df
      Tony Krowiak 提交于
      Let's implement the callback to indicate when an APQN
      is in use by the vfio_ap device driver. The callback is
      invoked whenever a change to the apmask or aqmask would
      result in one or more queue devices being removed from the driver. The
      vfio_ap device driver will indicate a resource is in use
      if the APQN of any of the queue devices to be removed are assigned to
      any of the matrix mdevs under the driver's control.
      
      There is potential for a deadlock condition between the
      matrix_dev->guests_lock used to lock the guest during assignment of
      adapters and domains and the ap_perms_mutex locked by the AP bus when
      changes are made to the sysfs apmask/aqmask attributes.
      
      The AP Perms lock controls access to the objects that store the adapter
      numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
      /sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
      identify which queues are reserved for the zcrypt default device drivers.
      Before allowing a bit to be removed from either mask, the AP bus must check
      with the vfio_ap device driver to verify that none of the queues are
      assigned to any of its mediated devices.
      
      The apmask/aqmask attributes can be written or read at any time from
      userspace, so care must be taken to prevent a deadlock with asynchronous
      operations that might be taking place in the vfio_ap device driver. For
      example, consider the following:
      
      1. A system administrator assigns an adapter to a mediated device under the
         control of the vfio_ap device driver. The driver will need to first take
         the matrix_dev->guests_lock to potentially hot plug the adapter into
         the KVM guest.
      2. At the same time, a system administrator sets a bit in the sysfs
         /sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
         must:
         a. Take the ap_perms_mutex lock to update the object storing the values
            for the /sys/bus/ap/ap_mask attribute.
         b. Call the vfio_ap device driver's in-use callback to verify that the
            queues now being reserved for the default zcrypt drivers are not
            assigned to a mediated device owned by the vfio_ap device driver. To
            do the verification, the in-use callback function takes the
            matrix_dev->guests_lock, but has to wait because it is already held
            by the operation in 1 above.
      3. The vfio_ap device driver calls an AP bus function to verify that the
         new queues resulting from the assignment of the adapter in step 1 are
         not reserved for the default zcrypt device driver. This AP bus function
         tries to take the ap_perms_mutex lock but gets stuck waiting for the
         waiting for the lock due to step 2a above.
      
      Consequently, we have the following deadlock situation:
      
      matrix_dev->guests_lock locked (1)
      ap_perms_mutex lock locked (2a)
      Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
      Waiting for ap_perms_mutex lock (3) which is currently held (2a)
      
      To prevent this deadlock scenario, the function called in step 3 will no
      longer take the ap_perms_mutex lock and require the caller to take the
      lock. The lock will be the first taken by the adapter/domain assignment
      functions in the vfio_ap device driver to maintain the proper locking
      order.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      3f85d1df
    • T
      s390/vfio-ap: reset queues after adapter/domain unassignment · 70aeefe5
      Tony Krowiak 提交于
      When an adapter or domain is unassigned from an mdev attached to a KVM
      guest, one or more of the guest's queues may get dynamically removed. Since
      the removed queues could get re-assigned to another mdev, they need to be
      reset. So, when an adapter or domain is unassigned from the mdev, the
      queues that are removed from the guest's AP configuration (APCB) will be
      reset.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      70aeefe5
    • T
      s390/vfio-ap: hot plug/unplug of AP devices when probed/removed · 09d31ff7
      Tony Krowiak 提交于
      When an AP queue device is probed or removed, if the mediated device is
      attached to a KVM guest, the mediated device's adapter, domain and
      control domain bitmaps must be filtered to update the guest's APCB and if
      any changes are detected, the guest's APCB must then be hot plugged into
      the guest to reflect those changes to the guest.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      09d31ff7
    • T
      s390/vfio-ap: allow hot plug/unplug of AP devices when assigned/unassigned · 51dc562a
      Tony Krowiak 提交于
      Let's hot plug an adapter, domain or control domain into the guest when it
      is assigned to a matrix mdev that is attached to a KVM guest. Likewise,
      let's hot unplug an adapter, domain or control domain from the guest when
      it is unassigned from a matrix_mdev that is attached to a KVM guest.
      
      Whenever an assignment or unassignment of an adapter, domain or control
      domain is performed, the APQNs and control domains assigned to the matrix
      mdev will be filtered and assigned to the AP control block
      (APCB) that supplies the AP configuration to the guest so that no
      adapter, domain or control domain that is not in the host's AP
      configuration nor any APQN that does not reference a queue device bound
      to the vfio_ap device driver is assigned.
      
      After updating the APCB, if the mdev is in use by a KVM guest, it is
      hot plugged into the guest to dynamically provide access to the adapters,
      domains and control domains provided via the newly refreshed APCB.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      51dc562a
    • T
      s390/vfio-ap: prepare for dynamic update of guest's APCB on queue probe/remove · 2c1ee898
      Tony Krowiak 提交于
      The callback functions for probing and removing a queue device must take
      and release the locks required to perform a dynamic update of a guest's
      APCB in the proper order.
      
      The proper order for taking the locks is:
      
              matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock
      
      The proper order for releasing the locks is:
      
              matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock
      
      A new helper function is introduced to be used by the probe callback to
      acquire the required locks. Since the probe callback only has
      access to a queue device when it is called, the helper function will find
      the ap_matrix_mdev object to which the queue device's APQN is assigned and
      return it so the KVM guest to which the mdev is attached can be dynamically
      updated.
      
      Note that in order to find the ap_matrix_mdev (matrix_mdev) object, it is
      necessary to search the matrix_dev->mdev_list. This presents a
      locking order dilemma because the matrix_dev->mdevs_lock can't be taken to
      protect against changes to the list while searching for the matrix_mdev to
      which a queue device's APQN is assigned. This is due to the fact that the
      proper locking order requires that the matrix_dev->mdevs_lock be taken
      after both the matrix_mdev->kvm->lock and the matrix_dev->mdevs_lock.
      Consequently, the matrix_dev->guests_lock will be used to protect against
      removal of a matrix_mdev object from the list while a queue device is
      being probed. This necessitates changes to the mdev probe/remove
      callback functions to take the matrix_dev->guests_lock prior to removing
      a matrix_mdev object from the list.
      
      A new macro is also introduced to acquire the locks required to dynamically
      update the guest's APCB in the proper order when a queue device is
      removed.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      2c1ee898
    • T
      s390/vfio-ap: prepare for dynamic update of guest's APCB on assign/unassign · 8ee13ad9
      Tony Krowiak 提交于
      The functions backing the matrix mdev's sysfs attribute interfaces to
      assign/unassign adapters, domains and control domains must take and
      release the locks required to perform a dynamic update of a guest's APCB
      in the proper order.
      
      The proper order for taking the locks is:
      
      matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock
      
      The proper order for releasing the locks is:
      
      matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock
      
      Two new macros are introduced for this purpose: One to take the locks and
      the other to release the locks. These macros will be used by the
      assignment/unassignment functions to prepare for dynamic update of
      the KVM guest's APCB.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      8ee13ad9
    • T
      s390/vfio-ap: use proper locking order when setting/clearing KVM pointer · b84eb8e0
      Tony Krowiak 提交于
      The group notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM event must
      use the required locks in proper locking order to dynamically update the
      guest's APCB. The proper locking order is:
      
             1. matrix_dev->guests_lock: required to use the KVM pointer to
                update a KVM guest's APCB.
      
             2. matrix_mdev->kvm->lock: required to update a KVM guest's APCB.
      
             3. matrix_dev->mdevs_lock: required to store or access the data
                stored in a struct ap_matrix_mdev instance.
      
      Two macros are introduced to acquire and release the locks in the proper
      order. These macros are now used by the group notifier functions.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      b84eb8e0
    • T
      s390/vfio-ap: introduce new mutex to control access to the KVM pointer · 21195eb0
      Tony Krowiak 提交于
      The vfio_ap device driver registers for notification when the pointer to
      the KVM object for a guest is set. Recall that the KVM lock (kvm->lock)
      mutex must be taken outside of the matrix_dev->lock mutex to prevent the
      reporting by lockdep of a circular locking dependency (a.k.a., a lockdep
      splat):
      
      * see commit 0cc00c8d ("Fix circular lockdep when setting/clearing
        crypto masks")
      
      * see commit 86956e70 ("replace open coded locks for
        VFIO_GROUP_NOTIFY_SET_KVM notification")
      
      With the introduction of support for hot plugging/unplugging AP devices
      passed through to a KVM guest, a new guests_lock mutex is introduced to
      ensure the proper locking order is maintained:
      
      struct ap_matrix_dev {
              ...
              struct mutex guests_lock;
             ...
      }
      
      The matrix_dev->guests_lock controls access to the matrix_mdev instances
      that hold the state for AP devices that have been passed through to a
      KVM guest. This lock must be held to control access to the KVM pointer
      (matrix_mdev->kvm) while the vfio_ap device driver is using it to
      plug/unplug AP devices passed through to the KVM guest.
      
      Keep in mind, the proper locking order must be maintained whenever
      dynamically updating a KVM guest's APCB to plug/unplug adapters, domains
      and control domains:
      
          1. matrix_dev->guests_lock: required to use the KVM pointer - stored in
             a struct ap_matrix_mdev instance - to update a KVM guest's APCB
      
          2. matrix_mdev->kvm->lock: required to update a guest's APCB
      
          3. matrix_dev->mdevs_lock: required to access data stored in a
             struct ap_matrix_mdev instance.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      21195eb0
    • T
      s390/vfio-ap: rename matrix_dev->lock mutex to matrix_dev->mdevs_lock · d0786556
      Tony Krowiak 提交于
      The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to
      better reflect its purpose, which is to control access to the state of the
      mediated devices under the control of the vfio_ap device driver.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      d0786556
    • T
      s390/vfio-ap: allow assignment of unavailable AP queues to mdev device · e2126a73
      Tony Krowiak 提交于
      The current implementation does not allow assignment of an AP adapter or
      domain to an mdev device if each APQN resulting from the assignment
      does not reference an AP queue device that is bound to the vfio_ap device
      driver. This patch allows assignment of AP resources to the matrix mdev as
      long as the APQNs resulting from the assignment:
         1. Are not reserved by the AP BUS for use by the zcrypt device drivers.
         2. Are not assigned to another matrix mdev.
      
      The rationale behind this is that the AP architecture does not preclude
      assignment of APQNs to an AP configuration profile that are not available
      to the system.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      e2126a73
    • T
      s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev · 48cae940
      Tony Krowiak 提交于
      Refresh the guest's APCB by filtering the APQNs and control domain numbers
      assigned to the matrix mdev.
      
      Filtering of APQNs:
      -----------------
      APQNs that do not reference an AP queue device bound to the vfio_ap device
      driver must be filtered from the APQNs assigned to the matrix mdev before
      they can be assigned to the guest's APCB. Given that the APQNs are
      configured in the guest's APCB as a matrix of APIDs (adapters) and APQIs
      (domains), it is not possible to filter an individual APQN. For example,
      suppose the matrix of APQNs is structured as follows:
      
                         APIDs
                   3      4      5
              0  (3,0)  (4,0)  (5,0)
      APQIs   1  (3,1)  (4,1)  (5,1)
              2  (3,2)  (4,2)  (5,2)
      
      Now suppose APQN (4,1) does not reference a queue device bound to the
      vfio_ap device driver. If we filter APID 4, the APQNs (4,0), (4,1) and
      (4,2) will be removed. Similarly, if we filter domain 1, APQNs (3,1),
      (4,1) and (5,1) will be removed.
      
      To resolve this dilemma, the choice was made to filter the APID - in this
      case 4 - from the guest's APCB. The reason for this design decision is
      because the APID references an AP adapter which is a real hardware device
      that can be physically installed, removed, enabled or disabled; whereas, a
      domain is a partition within the adapter. It therefore better reflects
      reality to remove the APID from the guest's APCB.
      
      Filtering of control domains:
      ----------------------------
      Any control domains that are not assigned to the host's AP configuration
      will be filtered from those assigned to the matrix mdev before assigning
      them to the guest's APCB.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      48cae940
    • T
      s390/vfio-ap: introduce shadow APCB · 49b0109f
      Tony Krowiak 提交于
      The APCB is a field within the CRYCB that provides the AP configuration
      to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and
      maintain it for the lifespan of the guest.
      
      The shadow APCB serves the following purposes:
      
      1. The shadow APCB can be maintained even when the mediated device is not
         currently in use by a KVM guest. Since the mediated device's AP
         configuration is filtered to ensure that no AP queues are passed through
         to the KVM guest that are not bound to the vfio_ap device driver or
         available to the host, the mediated device's AP configuration may differ
         from the guest's. Having a shadow of a guest's APCB allows us to provide
         a sysfs interface to view the guest's APCB even if the mediated device
         is not currently passed through to a KVM guest. This can aid in
         problem determination when the guest is unexpectedly missing AP
         resources.
      
      2. If filtering was done in-place for the real APCB, the guest could pick
         up a transient state. Doing the filtering on a shadow and transferring
         the AP configuration to the real APCB after the guest is started or when
         AP resources are assigned to or unassigned from the mediated device, or
         when the host configuration changes, the guest's AP configuration will
         never be in a transient state.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      49b0109f
    • T
      s390/vfio-ap: manage link between queue struct and matrix mdev · 11cb2419
      Tony Krowiak 提交于
      Let's create links between each queue device bound to the vfio_ap device
      driver and the matrix mdev to which the queue's APQN is assigned. The idea
      is to facilitate efficient retrieval of the objects representing the queue
      devices and matrix mdevs as well as to verify that a queue assigned to
      a matrix mdev is bound to the driver.
      
      The links will be created as follows:
      
       * When the queue device is probed, if its APQN is assigned to a matrix
         mdev, the structures representing the queue device and the matrix mdev
         will be linked.
      
       * When an adapter or domain is assigned to a matrix mdev, for each new
         APQN assigned that references a queue device bound to the vfio_ap
         device driver, the structures representing the queue device and the
         matrix mdev will be linked.
      
      The links will be removed as follows:
      
       * When the queue device is removed, if its APQN is assigned to a matrix
         mdev, the link from the structure representing the matrix mdev to the
         structure representing the queue will be removed. Since the storage
         allocated for the vfio_ap_queue will be freed, there is no need to
         remove the link to the matrix_mdev to which the queue's APQN is
         assigned.
      
       * When an adapter or domain is unassigned from a matrix mdev, for each
         APQN unassigned that references a queue device bound to the vfio_ap
         device driver, the structures representing the queue device and the
         matrix mdev will be unlinked.
      
       * When an mdev is removed, the link from any queues assigned to the mdev
         to the mdev will be removed.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      11cb2419
    • T
      s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c · 260f3ea1
      Tony Krowiak 提交于
      Let's move the probe and remove callbacks into the vfio_ap_ops.c
      file to keep all code related to managing queues in a single file. This
      way, all functions related to queue management can be removed from the
      vfio_ap_private.h header file defining the public interfaces for the
      vfio_ap device driver.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      260f3ea1
    • T
      s390/vfio-ap: use new AP bus interface to search for queue devices · 034921cd
      Tony Krowiak 提交于
      This patch refactors the vfio_ap device driver to use the AP bus's
      ap_get_qdev() function to retrieve the vfio_ap_queue struct containing
      information about a queue that is bound to the vfio_ap device driver.
      The bus's ap_get_qdev() function retrieves the queue device from a
      hashtable keyed by APQN. This is much more efficient than looping over
      the list of devices attached to the AP bus by several orders of
      magnitude.
      Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
      Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
      Reviewed-by: NJason J. Herne <jjherne@linux.ibm.com>
      Signed-off-by: NAlexander Gordeev <agordeev@linux.ibm.com>
      034921cd
  2. 05 7月, 2022 1 次提交
  3. 20 6月, 2022 1 次提交
  4. 19 6月, 2022 19 次提交
  5. 18 6月, 2022 4 次提交