1. 17 Dec 2019, 2 commits
  2. 15 Oct 2019, 3 commits
  3. 03 Sep 2019, 1 commit
  4. 30 Aug 2019, 2 commits
  5. 23 Aug 2019, 6 commits
  6. 30 Jul 2019, 1 commit
    • iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync() · 56f8af5e
      Committed by Will Deacon
      To allow IOMMU drivers to batch up TLB flushing operations and postpone
      them until ->iotlb_sync() is called, extend the prototypes for the
      ->unmap() and ->iotlb_sync() IOMMU ops callbacks to take a pointer to
      the current iommu_iotlb_gather structure.
      
      All affected IOMMU drivers are updated, but there should be no
      functional change since the extra parameter is ignored for now.
      Signed-off-by: Will Deacon <will@kernel.org>
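
      For reference, a minimal sketch of how the extended callback prototypes
      look from a driver's side after this change; the my_smmu_* names and the
      stub bodies are made up for illustration and do nothing useful:

      static size_t my_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
                                  size_t size, struct iommu_iotlb_gather *gather)
      {
              /* The gather pointer is accepted but ignored for now; a driver
               * can later record the range here instead of flushing at once. */
              return size;
      }

      static void my_smmu_iotlb_sync(struct iommu_domain *domain,
                                     struct iommu_iotlb_gather *gather)
      {
              /* Issue whatever invalidations were recorded in *gather. */
      }

      static const struct iommu_ops my_smmu_ops = {
              .unmap      = my_smmu_unmap,
              .iotlb_sync = my_smmu_iotlb_sync,
              /* ... other callbacks ... */
      };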
  7. 24 Jul 2019, 2 commits
    • iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes · a7d20dc1
      Committed by Will Deacon
      To permit batching of TLB flushes across multiple calls to the IOMMU
      driver's ->unmap() implementation, introduce a new structure for
      tracking the address range to be flushed and the granularity at which
      the flushing is required.
      
      This is hooked into the IOMMU API and its callers are updated to make use
      of the new structure. Subsequent patches will plumb this into the IOMMU
      drivers as well, but for now the gathering information is ignored.
      Signed-off-by: Will Deacon <will@kernel.org>
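
      A rough sketch of the structure and of the caller-side batching pattern,
      assuming the field and helper names of that kernel generation
      (iommu_iotlb_gather_init(), iommu_unmap_fast(), iommu_tlb_sync()):

      struct iommu_iotlb_gather {
              unsigned long   start;
              unsigned long   end;
              size_t          pgsize;     /* page size used for the range */
      };

      /* Caller side: unmap a range, then flush once for the whole batch. */
      struct iommu_iotlb_gather gather;
      size_t unmapped;

      iommu_iotlb_gather_init(&gather);       /* start = ULONG_MAX, end = 0 */
      unmapped = iommu_unmap_fast(domain, iova, size, &gather);
      iommu_tlb_sync(domain, &gather);        /* one flush for everything gathered */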
    • iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops · 6d1bcb95
      Committed by Will Deacon
      Commit add02cfd ("iommu: Introduce Interface for IOMMU TLB Flushing")
      added three new TLB flushing operations to the IOMMU API so that the
      underlying driver operations can be batched when unmapping large regions
      of IO virtual address space.
      
      However, the ->iotlb_range_add() callback has never been given a real
      implementation by any IOMMU driver (amd_iommu.c provides only an empty
      stub, which still incurs the overhead of an indirect branch). Instead,
      drivers either flush the entire IOTLB in the ->iotlb_sync() callback or
      perform the necessary invalidation during ->unmap().
      
      Attempting to implement ->iotlb_range_add() for arm-smmu-v3.c revealed
      two major issues:
      
        1. The page size used to map the region in the page-table is not known,
           and so it is not generally possible to issue TLB flushes in the most
           efficient manner.
      
        2. The only mutable state passed to the callback is a pointer to the
           iommu_domain, which can be accessed concurrently and therefore
           requires expensive synchronisation to keep track of the outstanding
           flushes.
      
      Remove the callback entirely in preparation for extending ->unmap() and
      ->iotlb_sync() to update a token on the caller's stack.
      Signed-off-by: Will Deacon <will@kernel.org>
  8. 12 Jun 2019, 4 commits
    • iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions · adfd3738
      Committed by Eric Auger
      Introduce a new type of reserved region. It corresponds to directly
      mapped regions which are known to be relaxable under certain specific
      conditions, such as the device assignment use case. Well-known examples
      are the regions used by USB controllers providing PS/2 keyboard
      emulation for pre-boot BIOS and early boot, and the RMRRs associated
      with IGD working in legacy mode.
      
      Since commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs
      from IOMMU API domains") and commit 18436afd ("iommu/vt-d: Allow
      RMRR on graphics devices too"), those regions are currently
      considered "safe" with respect to the device assignment use case,
      which requires a non-direct mapping at the IOMMU physical level
      (RAM GPA -> HPA mapping).
      
      Those RMRRs currently exist, and sometimes the device attempts to
      access them, but this has not been considered an issue until now.
      
      At the moment, however, iommu_get_group_resv_regions() cannot
      distinguish between directly mapped regions that must be strictly
      enforced and those, like the ones above, that are known to be
      relaxable.

      This is a blocker for reporting severe conflicts between
      non-relaxable RMRRs (such as MSI doorbells) and guest GPA space.
      
      With this new reserved region type we will be able to use
      iommu_get_group_resv_regions() to enumerate the IOVA space
      that is usable through the IOMMU API without introducing
      regressions with respect to existing device assignment
      use cases (USB and IGD).
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
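
      As a hedged sketch (not taken from any real driver), this is roughly how
      a driver's ->get_resv_regions() callback could report such a relaxable
      direct-mapped region; the function name, address and size are invented
      for illustration:

      static void my_iommu_get_resv_regions(struct device *dev,
                                            struct list_head *head)
      {
              struct iommu_resv_region *region;

              /* A directly mapped range that may be relaxed for assignment. */
              region = iommu_alloc_resv_region(0x9d000000, 0x100000,
                                               IOMMU_READ | IOMMU_WRITE,
                                               IOMMU_RESV_DIRECT_RELAXABLE);
              if (region)
                      list_add_tail(&region->list, head);
      }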
    • iommu: Fix a leak in iommu_insert_resv_region · ad0834de
      Committed by Eric Auger
      When we expand an existing region, we unlink the old one and insert
      the larger one. In that case we should free the original region after
      the insertion. We can also return immediately at that point.
      
      Fixes: 6c65fb31 ("iommu: iommu_get_group_resv_regions")
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
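
      A hedged illustration of the corrected flow, not the exact upstream
      code; the function name, the overlap test and the merge arithmetic are
      simplified for the example:

      static int insert_or_merge_region(struct iommu_resv_region *new,
                                        struct list_head *regions)
      {
              struct iommu_resv_region *entry, *merged;

              list_for_each_entry(entry, regions, list) {
                      phys_addr_t start = min(entry->start, new->start);
                      phys_addr_t end = max(entry->start + entry->length,
                                            new->start + new->length);

                      if (new->start > entry->start + entry->length ||
                          entry->start > new->start + new->length)
                              continue;       /* no overlap, keep scanning */

                      merged = iommu_alloc_resv_region(start, end - start,
                                                       entry->prot, entry->type);
                      if (!merged)
                              return -ENOMEM;

                      list_replace(&entry->list, &merged->list);
                      kfree(entry);   /* freeing the unlinked region was the missing step */
                      return 0;       /* and we can return immediately */
              }

              list_add_tail(&new->list, regions);
              return 0;
      }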
    • iommu: Add recoverable fault reporting · bf3255b3
      Committed by Jean-Philippe Brucker
      Some IOMMU hardware features, for example PCI PRI and Arm SMMU Stall,
      enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page
      Requests and Stall events through the new fault reporting API. The
      consumer of the fault can be either an I/O page fault handler in the host,
      or a guest OS.
      
      Once handled, the fault must be completed by sending a page response back
      to the IOMMU. Add an iommu_page_response() function to complete a page
      fault.
      
      There are two ways to extend the userspace API:
      * Add a field to iommu_page_response and a flag to
        iommu_page_response::flags describing the validity of this field.
      * Introduce a new iommu_page_response_X structure with a different version
        number. The kernel must then support both versions.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
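
      A hedged sketch of a host-side consumer completing a recoverable page
      request; the handler name is invented, the constant and field names
      follow the uapi of that kernel generation, and the actual fault
      resolution is elided:

      static int my_fault_handler(struct iommu_fault *fault, void *data)
      {
              struct device *dev = data;
              struct iommu_page_response resp = {
                      .version = IOMMU_PAGE_RESP_VERSION_1,
                      .flags   = IOMMU_PAGE_RESP_PASID_VALID,
                      .pasid   = fault->prm.pasid,
                      .grpid   = fault->prm.grpid,
                      .code    = IOMMU_PAGE_RESP_SUCCESS,
              };

              if (fault->type != IOMMU_FAULT_PAGE_REQ)
                      return -EOPNOTSUPP;     /* only page requests are recoverable */

              /* ... resolve the fault, e.g. pin or fault in the page ... */

              return iommu_page_response(dev, &resp);
      }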
    • iommu: Introduce device fault report API · 0c830e6b
      Committed by Jacob Pan
      Traditionally, device-specific faults are detected and handled within
      their own device drivers. When an IOMMU is enabled, faults in DMA
      transactions are detected by the IOMMU, but there is no generic
      mechanism to report such faults back to the in-kernel device driver
      or, in the case of assigned devices, to the guest OS.
      
      This patch introduces a registration API for device-specific fault
      handlers. It differs from the existing iommu_set_fault_handler/
      report_iommu_fault infrastructure in several ways:
      - it allows reporting more sophisticated fault events (both
        unrecoverable faults and page request faults), thanks to the
        richer iommu_fault struct
      - it is device-specific rather than domain-specific.
      
      The current iommu_report_device_fault() implementation only handles
      the "fire and forget" unrecoverable fault case. Handling of page
      request faults and stall faults will come later.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
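
      A hedged sketch of how a driver registers for fault reports on its
      device; my_fault_handler is the hypothetical handler from the entry
      above, and error handling is trimmed:

      int ret;

      ret = iommu_register_device_fault_handler(dev, my_fault_handler, dev);
      if (ret)
              dev_err(dev, "failed to register IOMMU fault handler: %d\n", ret);

      /* ... on teardown ... */
      iommu_unregister_device_fault_handler(dev);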
  9. 05 Jun 2019, 1 commit
  10. 27 May 2019, 3 commits
  11. 11 Apr 2019, 2 commits
    • iommu: Bind process address spaces to devices · 26b25a2b
      Committed by Jean-Philippe Brucker
      Add bind() and unbind() operations to the IOMMU API.
      iommu_sva_bind_device() binds a device to an mm, and returns a handle to
      the bond, which is released by calling iommu_sva_unbind_device().
      
      Each mm bound to devices gets a PASID (by convention, a 20-bit system-wide
      ID representing the address space), which can be retrieved with
      iommu_sva_get_pasid(). When programming DMA addresses, device drivers
      include this PASID in a device-specific manner, to let the device access
      the given address space. Since process memory may be paged out, the
      device and the IOMMU must support I/O page faults (e.g. PCI PRI).
      
      Using iommu_sva_set_ops(), device drivers provide an mm_exit() callback
      that is called by the IOMMU driver if the process exits before the
      device driver has called unbind(). In mm_exit(), the device driver
      should disable DMA from the given context so that the core IOMMU code
      can reallocate the PASID. Whether or not the process exited, the device
      driver should always release the handle with unbind().
      
      To use these functions, the device driver must first enable the
      IOMMU_DEV_FEAT_SVA device feature with iommu_dev_enable_feature().
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
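
      A hedged sketch of the flow from a device driver's point of view,
      assuming the API of that kernel generation; my_sva_ops (with its
      .mm_exit() callback) is hypothetical and error handling is abbreviated:

      struct iommu_sva *handle;
      int pasid, ret;

      ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA);
      if (ret)
              return ret;

      handle = iommu_sva_bind_device(dev, current->mm, NULL);
      if (IS_ERR(handle))
              return PTR_ERR(handle);

      iommu_sva_set_ops(handle, &my_sva_ops);   /* provides .mm_exit() */
      pasid = iommu_sva_get_pasid(handle);      /* programmed into the device */

      /* ... issue PASID-tagged DMA on behalf of the bound mm ... */

      iommu_sva_unbind_device(handle);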
    • iommu: Add APIs for multiple domains per device · a3a19592
      Committed by Lu Baolu
      Sharing a physical PCI device at finer granularity is becoming an
      industry consensus, and IOMMU vendors are working to support such
      sharing as well as possible. A common requirement across these
      efforts, driven by security considerations, is finer-granularity DMA
      isolation: subsets of a PCI function can be isolated from each other
      by the IOMMU. As a result, software needs to be able to attach
      multiple domains to a physical PCI device. One example of such a
      usage model is Intel Scalable IOV [1] [2]. The Intel VT-d 3.0 spec
      [3] introduces a scalable mode which enables PASID-granularity DMA
      isolation.
      
      This adds the APIs to support multiple domains per device. To ease
      the discussion, when multiple domains are attached to a physical
      device we call each such domain 'a domain in auxiliary mode', or
      simply an 'auxiliary domain'.
      
      The APIs include:
      
      * iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
        - Detects whether both the IOMMU and the PCI endpoint device
          support the feature (aux-domain here), without any host driver
          dependency.
      
      * iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
        - Checks whether the feature (aux-domain here) is enabled. The
          aux-domain interfaces are available only if this returns true.
      
      * iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
        - Enable/disable the device-specific aux-domain feature.
      
      * iommu_aux_attach_device(domain, dev)
        - Attaches @domain to @dev in auxiliary mode. Multiple domains
          can be attached to a single device in auxiliary mode, with each
          domain representing an isolated address space for an assignable
          subset of the device.
      
      * iommu_aux_detach_device(domain, dev)
        - Detaches @domain, which has been attached to @dev in auxiliary
          mode.
      
      * iommu_aux_get_pasid(domain, dev)
        - Returns the ID used for finer-granularity DMA translation. For
          the Intel Scalable IOV usage model this will be a PASID. A
          device supporting Scalable IOV needs to write this ID to its
          device registers so that DMA requests can be tagged with the
          right PASID prefix.
      
      This has been updated with the latest proposal from Joerg, posted
      here [5].

      Many people were involved in discussions of this design:
      
      Kevin Tian <kevin.tian@intel.com>
      Liu Yi L <yi.l.liu@intel.com>
      Ashok Raj <ashok.raj@intel.com>
      Sanjay Kumar <sanjay.k.kumar@intel.com>
      Jacob Pan <jacob.jun.pan@linux.intel.com>
      Alex Williamson <alex.williamson@redhat.com>
      Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Joerg Roedel <joro@8bytes.org>
      
      and some discussions can be found here [4] [5].
      
      [1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
      [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
      [3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
      [4] https://lkml.org/lkml/2018/7/26/4
      [5] https://www.spinics.net/lists/iommu/msg31874.html
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Cc: Liu Yi L <yi.l.liu@intel.com>
      Suggested-by: Kevin Tian <kevin.tian@intel.com>
      Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Suggested-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
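
      A hedged sketch that puts the calls listed above together for one
      assignable subset of a device; domain allocation and the error paths
      are simplified:

      struct iommu_domain *domain;
      int pasid;

      if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX))
              return -ENODEV;

      if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX))
              return -ENODEV;

      domain = iommu_domain_alloc(dev->bus);        /* one domain per subset */
      if (!domain)
              return -ENOMEM;

      if (iommu_aux_attach_device(domain, dev))
              return -EINVAL;

      pasid = iommu_aux_get_pasid(domain, dev);     /* programmed into the device */

      /* ... map IOVAs in 'domain' and issue PASID-tagged DMA ... */

      iommu_aux_detach_device(domain, dev);
      iommu_domain_free(domain);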
  12. 25 Mar 2019, 2 commits
  13. 11 Feb 2019, 1 commit
    • iommu: Use dev_printk() when possible · 780da9e4
      Committed by Bjorn Helgaas
      Use dev_printk() when possible so the IOMMU messages are more consistent
      with other messages related to the device.
      
      E.g., I think these messages related to surprise hotplug:
      
        pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
        iommu: Removing device 0000:87:00.0 from group 12
        pciehp 0000:80:10.0:pcie004: Slot(36): Card present
        pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
      
      would be easier to read as these (also requires some PCI changes not
      included here):
      
        pci 0000:80:10.0: Slot(36): Link Down
        pci 0000:87:00.0: Removing from iommu group 12
        pci 0000:80:10.0: Slot(36): Card present
        pci 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
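
      A hedged illustration of the substitution: when a struct device pointer
      is at hand, the dev_*() variants prefix the message with the device name
      automatically, so the explicit name (and the "iommu:" prefix) can be
      dropped:

      pr_info("Removing device %s from group %d\n", dev_name(dev), group->id);
      /* becomes */
      dev_info(dev, "Removing from iommu group %d\n", group->id);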
  14. 16 Jan 2019, 1 commit
  15. 20 Dec 2018, 1 commit
  16. 17 Dec 2018, 2 commits
  17. 03 Dec 2018, 1 commit
    • iommu: Audit and remove any unnecessary uses of module.h · c1af7b40
      Committed by Paul Gortmaker
      Historically a lot of these existed because we did not have a
      distinction between what was modular code and what was providing
      support to modules via EXPORT_SYMBOL and friends. That changed when
      we forked out support for the latter into the export.h file. This
      means we should be able to reduce the usage of module.h in code that
      is always built in (obj-y in the Makefile, or a bool Kconfig option).
      
      The advantage of removing such instances is that module.h itself pulls
      in about 15 other headers, adding significantly to what we feed cpp,
      and it can obscure which headers we are effectively using.
      
      Since module.h might have been the implicit source for init.h
      (for __init) and for export.h (for EXPORT_SYMBOL), we check each
      instance for the presence of either and replace as needed.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: iommu@lists.linux-foundation.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
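
      A hedged example of the typical substitution in always-built-in code
      (the exact replacement depends on which facilities the file actually
      uses):

      -#include <linux/module.h>
      +#include <linux/export.h>     /* for EXPORT_SYMBOL_GPL() */
      +#include <linux/init.h>       /* for __init */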
  18. 06 Nov 2018, 1 commit
    • iommu: Do physical merging in iommu_map_sg() · 5d95f40e
      Committed by Robin Murphy
      The original motivation for iommu_map_sg() was to give IOMMU drivers the
      chance to map an IOVA-contiguous scatterlist as efficiently as they
      could. It turns out that there isn't really much driver-specific
      business involved there, so now that the default implementation is
      mandatory let's just improve that - the main thing we're after is to use
      larger pages wherever possible, and as long as domain->pgsize_bitmap
      reflects reality, iommu_map() can already do that in a generic way. All
      we need to do is detect physically-contiguous segments and batch them
      into a single map operation, since whatever we do here is transparent to
      our caller and not bound by any segment-length restrictions on the list
      itself.
      
      Speaking of efficiency, there's really very little point in duplicating
      the checks that iommu_map() is going to do anyway, so those get cleared
      up in the process.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
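
      A hedged sketch of the merging idea, simplified from the generic
      implementation of that era (the function name is invented, and the
      error unwinding performed by the real code is omitted):

      static size_t my_map_sg(struct iommu_domain *domain, unsigned long iova,
                              struct scatterlist *sg, unsigned int nents, int prot)
      {
              size_t len = 0, mapped = 0;
              phys_addr_t cur = 0;
              unsigned int i;

              for (i = 0; i < nents; i++, sg = sg_next(sg)) {
                      phys_addr_t phys = sg_phys(sg);

                      /* A discontiguity ends the current run: map it in one go. */
                      if (len && phys != cur + len) {
                              if (iommu_map(domain, iova + mapped, cur, len, prot))
                                      return mapped;  /* real code also unmaps on error */
                              mapped += len;
                              len = 0;
                      }
                      if (!len)
                              cur = phys;     /* start a new physically-contiguous run */
                      len += sg->length;
              }

              /* Map whatever is left in the final run. */
              if (len && !iommu_map(domain, iova + mapped, cur, len, prot))
                      mapped += len;

              return mapped;
      }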
  19. 01 Oct 2018, 1 commit
  20. 25 Sep 2018, 3 commits
    • iommu: Fix a typo · 35449adc
      Committed by Rami Rosen
      This patch fixes a typo in iommu.c.
      Signed-off-by: Rami Rosen <ramirose@gmail.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Tidy up window attributes · 701d8a62
      Committed by Robin Murphy
      The external interface to get/set window attributes is already
      abstracted behind iommu_domain_{get,set}_attr(), so there's no real
      reason for the internal interface to be different. Since we only have
      one window-based driver anyway, clean up the core code by just moving
      the DOMAIN_ATTR_WINDOWS handling directly into the PAMU driver.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Add fast hook for getting DMA domains · 6af588fe
      Committed by Robin Murphy
      While iommu_get_domain_for_dev() is the robust way for arbitrary IOMMU
      API callers to retrieve the domain pointer, for DMA ops domains it
      doesn't scale well for large systems and multi-queue devices, since the
      momentary refcount adjustment will lead to exclusive cacheline contention
      when multiple CPUs are operating in parallel on different mappings for
      the same device.
      
      In the case of DMA ops domains, however, this refcounting is actually
      unnecessary, since they already imply that the group exists and is
      managed by platform code and IOMMU internals (by virtue of
      iommu_group_get_for_dev()) such that a reference will already be held
      for the lifetime of the device. Thus we can avoid the bottleneck by
      providing a fast lookup specifically for the DMA code to retrieve the
      default domain it already knows it has set up - a simple read-only
      dereference plays much nicer with cache-coherency protocols.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Tested-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
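
      A hedged sketch of how such a fast hook can look (close to the one-line
      helper of that era): the group's default domain is read directly, with
      no refcount traffic:

      struct iommu_domain *iommu_get_dma_domain(struct device *dev)
      {
              /* DMA ops domains already pin the group for the device's
               * lifetime, so a plain dereference is safe here. */
              return dev->iommu_group->default_domain;
      }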