1. 15 Oct 2019, 1 commit
  2. 23 Aug 2019, 1 commit
  3. 30 Jul 2019, 1 commit
    • iommu: Pass struct iommu_iotlb_gather to ->unmap() and ->iotlb_sync() · 56f8af5e
      Authored by Will Deacon
      To allow IOMMU drivers to batch up TLB flushing operations and postpone
      them until ->iotlb_sync() is called, extend the prototypes for the
      ->unmap() and ->iotlb_sync() IOMMU ops callbacks to take a pointer to
      the current iommu_iotlb_gather structure.
      
      All affected IOMMU drivers are updated, but there should be no
      functional change since the extra parameter is ignored for now.
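      A minimal sketch of the extended prototypes (paraphrased, not the
      verbatim header):

      /*
       * Sketch of the relevant iommu_ops callbacks after this change. Both
       * now take the iommu_iotlb_gather that the caller keeps on its stack,
       * so a driver can record ranges in ->unmap() and defer the actual
       * invalidation until ->iotlb_sync().
       */
      struct iommu_ops {
              /* ... other callbacks elided ... */
              size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
                              size_t size,
                              struct iommu_iotlb_gather *iotlb_gather);
              void (*iotlb_sync)(struct iommu_domain *domain,
                                 struct iommu_iotlb_gather *iotlb_gather);
      };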
      Signed-off-by: Will Deacon <will@kernel.org>
  4. 24 Jul 2019, 3 commits
    • iommu: Introduce iommu_iotlb_gather_add_page() · 4fcf8544
      Authored by Will Deacon
      Introduce a helper function for drivers to use when updating an
      iommu_iotlb_gather structure in response to an ->unmap() call, rather
      than having to open-code the logic in every page-table implementation.
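      A sketch of the helper's merging logic, paraphrased rather than quoted
      from the implementation: if the new page cannot be merged with what has
      already been gathered, the pending flush is synced first; otherwise the
      gathered range is simply extended.

      static inline void
      iommu_iotlb_gather_add_page(struct iommu_domain *domain,
                                  struct iommu_iotlb_gather *gather,
                                  unsigned long iova, size_t size)
      {
              unsigned long start = iova, end = start + size;

              /*
               * If the new page is disjoint from the current range, or is
               * mapped at a different granularity, flush what has been
               * gathered so far so the structure can be restarted.
               */
              if (gather->pgsize != size ||
                  end < gather->start || start > gather->end) {
                      if (gather->pgsize)
                              iommu_tlb_sync(domain, gather);
                      gather->pgsize = size;
              }

              /* Grow the gathered range to cover the new page. */
              if (gather->end < end)
                      gather->end = end;
              if (gather->start > start)
                      gather->start = start;
      }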
      Signed-off-by: Will Deacon <will@kernel.org>
    • iommu: Introduce struct iommu_iotlb_gather for batching TLB flushes · a7d20dc1
      Authored by Will Deacon
      To permit batching of TLB flushes across multiple calls to the IOMMU
      driver's ->unmap() implementation, introduce a new structure for
      tracking the address range to be flushed and the granularity at which
      the flushing is required.
      
      This is hooked into the IOMMU API and its callers are updated to make use
      of the new structure. Subsequent patches will plumb this into the IOMMU
      drivers as well, but for now the gathering information is ignored.
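      In sketch form, the new structure and its initializer look roughly like
      this:

      /*
       * Sketch of the gather structure: it accumulates the address range to
       * be invalidated and the page size (granule) at which it was unmapped.
       */
      struct iommu_iotlb_gather {
              unsigned long   start;
              unsigned long   end;
              size_t          pgsize;
      };

      static inline void iommu_iotlb_gather_init(struct iommu_iotlb_gather *gather)
      {
              *gather = (struct iommu_iotlb_gather) {
                      .start = ULONG_MAX,
              };
      }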
      Signed-off-by: Will Deacon <will@kernel.org>
    • iommu: Remove empty iommu_tlb_range_add() callback from iommu_ops · 6d1bcb95
      Authored by Will Deacon
      Commit add02cfd ("iommu: Introduce Interface for IOMMU TLB Flushing")
      added three new TLB flushing operations to the IOMMU API so that the
      underlying driver operations can be batched when unmapping large regions
      of IO virtual address space.
      
      However, the ->iotlb_range_add() callback has not been implemented by
      any IOMMU drivers (amd_iommu.c implements it as an empty function, which
      incurs the overhead of an indirect branch). Instead, drivers either flush
      the entire IOTLB in the ->iotlb_sync() callback or perform the necessary
      invalidation during ->unmap().
      
      Attempting to implement ->iotlb_range_add() for arm-smmu-v3.c revealed
      two major issues:
      
        1. The page size used to map the region in the page-table is not known,
           and so it is not generally possible to issue TLB flushes in the most
           efficient manner.
      
        2. The only mutable state passed to the callback is a pointer to the
           iommu_domain, which can be accessed concurrently and therefore
           requires expensive synchronisation to keep track of the outstanding
           flushes.
      
      Remove the callback entirely in preparation for extending ->unmap() and
      ->iotlb_sync() to update a token on the caller's stack.
      Signed-off-by: Will Deacon <will@kernel.org>
  5. 19 Jun 2019, 1 commit
    • iommu/io-pgtable-arm: Add support to use system cache · 90ec7a76
      Authored by Vivek Gautam
      A few Qualcomm platforms, such as sdm845, have an additional outer
      cache called the system cache, also known as the last-level cache
      (LLC), which allows non-coherent devices to upgrade to using caching.
      This cache sits right in front of the DDR and is tightly coupled
      with the memory controller. Clients request their slices from this
      system cache, make them active, and can then start using it.
      
      There is a fundamental assumption that non-coherent devices cannot
      access caches. This change adds an exception where they *can* use
      some level of cache despite still being non-coherent overall.
      Coherent devices that use cacheable memory, and the CPU, make use of
      this system cache by default.
      
      Looking at memory types, we have the following:
      a) Normal uncached:   MAIR 0x44, inner non-cacheable,
                            outer non-cacheable;
      b) Normal cached:     MAIR 0xff, inner read write-back non-transient,
                            outer read write-back non-transient;
                            attribute setting for coherent I/O devices.
      and, for non-coherent I/O devices that can allocate in the system cache,
      another type gets added:
      c) Normal sys-cached: MAIR 0xf4, inner non-cacheable,
                            outer read write-back non-transient.
      
      Coherent I/O devices use the system cache by marking the memory as
      normal cached. Non-coherent I/O devices should mark the memory as
      normal sys-cached in their page tables in order to use the system
      cache; see the sketch below.
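      The attribute values above can be summarized in the following sketch;
      the macro names are illustrative, only the values come from the
      description:

      /*
       * Illustrative MAIR attribute encodings for the three memory types
       * described above (names are hypothetical; values from the text).
       */
      #define MAIR_ATTR_NORMAL_NC             0x44    /* a) inner NC, outer NC        */
      #define MAIR_ATTR_NORMAL_WB             0xff    /* b) inner WB, outer WB        */
      #define MAIR_ATTR_NORMAL_SYS_CACHED     0xf4    /* c) inner NC, outer WB (LLC)  */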
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  6. 12 Jun 2019, 4 commits
    • iommu: Introduce IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions · adfd3738
      Authored by Eric Auger
      Introduce a new type of reserved region. It corresponds to directly
      mapped regions that are known to be relaxable under specific
      conditions, such as the device assignment use case. Well-known
      examples are the regions used by USB controllers to provide PS/2
      keyboard emulation for the pre-boot BIOS and early boot, and the
      RMRRs associated with an IGD working in legacy mode.
      
      Since commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs
      from IOMMU API domains") and commit 18436afd ("iommu/vt-d: Allow
      RMRR on graphics devices too"), those regions are considered "safe"
      with respect to the device assignment use case, which requires a
      non-direct mapping at the IOMMU physical level (RAM GPA -> HPA
      mapping).
      
      Those RMRRs currently exist, and sometimes the device attempts to
      access them, but this has not been considered an issue until now.
      
      At the moment, however, iommu_get_group_resv_regions() cannot
      distinguish between the two kinds of directly mapped regions: those
      which must be strictly enforced and those, like the ones above,
      which are known to be relaxable.
      
      This is a blocker for reporting severe conflicts between
      non-relaxable RMRRs (such as MSI doorbells) and guest GPA space.
      
      With this new reserved region type we will be able to use
      iommu_get_group_resv_regions() to enumerate the IOVA space
      that is usable through the IOMMU API without introducing
      regressions with respect to existing device assignment
      use cases (USB and IGD).
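      As a hedged sketch, a driver could describe such a relaxable region
      roughly as follows, assuming the existing iommu_alloc_resv_region()
      helper; the address, size and protection flags are illustrative:

      /* Sketch: report a known-relaxable direct mapping (e.g. a USB RMRR). */
      struct iommu_resv_region *region;

      region = iommu_alloc_resv_region(0xbf000000, SZ_1M,
                                       IOMMU_READ | IOMMU_WRITE,
                                       IOMMU_RESV_DIRECT_RELAXABLE);
      if (region)
              list_add_tail(&region->list, head);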
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Add recoverable fault reporting · bf3255b3
      Authored by Jean-Philippe Brucker
      Some IOMMU hardware features, for example PCI PRI and Arm SMMU Stall,
      enable recoverable I/O page faults. Allow IOMMU drivers to report PRI Page
      Requests and Stall events through the new fault reporting API. The
      consumer of the fault can be either an I/O page fault handler in the host,
      or a guest OS.
      
      Once handled, the fault must be completed by sending a page response back
      to the IOMMU. Add an iommu_page_response() function to complete a page
      fault.
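      A hedged sketch of a consumer completing a recoverable fault; the field
      and constant names are paraphrased from the new UAPI and should be
      treated as illustrative:

      /* Sketch: after servicing a page request, send a response so the
       * device can retry the access.
       */
      static int handle_recoverable_fault(struct device *dev,
                                          struct iommu_fault_page_request *prm)
      {
              struct iommu_page_response resp = {
                      .version = IOMMU_PAGE_RESP_VERSION_1,
                      .flags   = IOMMU_PAGE_RESP_PASID_VALID,
                      .pasid   = prm->pasid,
                      .grpid   = prm->grpid,
                      .code    = IOMMU_PAGE_RESP_SUCCESS,
              };

              /* ... resolve the fault, e.g. fix up the page tables ... */

              return iommu_page_response(dev, &resp);
      }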
      
      There are two ways to extend the userspace API:
      * Add a field to iommu_page_response and a flag to
        iommu_page_response::flags describing the validity of this field.
      * Introduce a new iommu_page_response_X structure with a different version
        number. The kernel must then support both versions.
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Introduce device fault report API · 0c830e6b
      Authored by Jacob Pan
      Traditionally, device-specific faults are detected and handled within
      the device's own driver. When an IOMMU is enabled, faults such as
      those on DMA transactions are detected by the IOMMU, but there is no
      generic mechanism to report them back to the in-kernel device driver,
      or to the guest OS in the case of assigned devices.
      
      This patch introduces a registration API for device-specific fault
      handlers. It differs from the existing iommu_set_fault_handler/
      report_iommu_fault infrastructure in several ways:
      - it allows reporting more sophisticated fault events (both
        unrecoverable faults and page request faults), owing to the nature
        of the iommu_fault struct
      - it is device specific rather than domain specific.
      
      The current iommu_report_device_fault() implementation only handles
      the "shoot and forget" unrecoverable fault case. Handling of page
      request faults or stalled faults will come later.
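      A hedged usage sketch of the registration API; the handler and data
      names are hypothetical:

      /* Sketch: a driver registering a device-specific fault handler. */
      static int my_fault_handler(struct iommu_fault *fault, void *data)
      {
              /* Inspect fault->type and the per-type payload, then react. */
              return 0;
      }

      static int my_register_faults(struct device *dev, void *my_driver_data)
      {
              return iommu_register_device_fault_handler(dev, my_fault_handler,
                                                         my_driver_data);
      }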
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Introduce device fault data · 4e32348b
      Authored by Jacob Pan
      Device faults detected by the IOMMU can be reported outside the IOMMU
      subsystem for further processing. This patch introduces a generic
      device fault data structure.
      
      The fault can be either an unrecoverable fault or a page request,
      also referred to as a recoverable fault.
      
      We only care about non-internal faults, which are the ones likely to
      be reported to an external subsystem.
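      In sketch form, the reported record looks roughly like this
      (simplified; field names paraphrased from the UAPI header):

      /* Simplified sketch of the generic fault record: a type discriminator
       * plus a per-type payload.
       */
      struct iommu_fault {
              __u32   type;           /* IOMMU_FAULT_DMA_UNRECOV or IOMMU_FAULT_PAGE_REQ */
              __u32   padding;
              union {
                      struct iommu_fault_unrecoverable event; /* unrecoverable fault   */
                      struct iommu_fault_page_request prm;    /* recoverable page req. */
              };
      };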
      Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Liu, Yi L <yi.l.liu@linux.intel.com>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  7. 05 Jun 2019, 1 commit
  8. 27 May 2019, 1 commit
    • iommu: Add API to request DMA domain for device · 7423e017
      Authored by Lu Baolu
      Normally, when the IOMMU probes a device, a default domain is
      allocated and attached to it. The type of the default domain is
      statically defined, which can result in a situation where the
      allocated default domain isn't suitable for the device due to some
      limitation. We already have the API iommu_request_dm_for_dev() to
      replace a DMA domain with an identity one. This adds
      iommu_request_dma_domain_for_dev() to request a DMA domain when the
      allocated identity domain isn't suitable for the device in question.
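      A hedged sketch of the intended call site; the predicate used to decide
      is hypothetical:

      /* Sketch: fall back to a DMA domain when the statically chosen identity
       * default domain turns out to be unsuitable for this device.
       * device_needs_dma_translation() is a hypothetical predicate.
       */
      if (device_needs_dma_translation(dev)) {
              if (iommu_request_dma_domain_for_dev(dev))
                      dev_warn(dev, "could not replace identity domain with DMA domain\n");
      }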
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  9. 23 Apr 2019, 1 commit
  10. 11 Apr 2019, 2 commits
    • iommu: Bind process address spaces to devices · 26b25a2b
      Authored by Jean-Philippe Brucker
      Add bind() and unbind() operations to the IOMMU API.
      iommu_sva_bind_device() binds a device to an mm, and returns a handle to
      the bond, which is released by calling iommu_sva_unbind_device().
      
      Each mm bound to devices gets a PASID (by convention, a 20-bit system-wide
      ID representing the address space), which can be retrieved with
      iommu_sva_get_pasid(). When programming DMA addresses, device drivers
      include this PASID in a device-specific manner, to let the device access
      the given address space. Since the process memory may be paged out, device
      and IOMMU must support I/O page faults (e.g. PCI PRI).
      
      Using iommu_sva_set_ops(), device drivers provide an mm_exit() callback
      that is called by the IOMMU driver if the process exits before the device
      driver has called unbind(). In mm_exit(), the device driver should
      disable DMA from the given context so that the core IOMMU code can
      reallocate the PASID. Whether or not the process exited, the device
      driver should always release the handle with unbind().
      
      To use these functions, the device driver must first enable the
      IOMMU_DEV_FEAT_SVA device feature with iommu_dev_enable_feature().
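      A hedged sketch of the bind/unbind flow from a driver's point of view;
      error handling is trimmed and the way the PASID is programmed into the
      device is device specific:

      /* Sketch of the SVA lifecycle: enable the feature, bind the current mm,
       * program the returned PASID into the device, and unbind when done.
       */
      struct iommu_sva *handle;
      int pasid;

      if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA))
              return -ENODEV;

      handle = iommu_sva_bind_device(dev, current->mm, NULL);
      if (IS_ERR(handle))
              return PTR_ERR(handle);

      pasid = iommu_sva_get_pasid(handle);
      /* ... program pasid into the device and issue DMA tagged with it ... */

      iommu_sva_unbind_device(handle);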
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu: Add APIs for multiple domains per device · a3a19592
      Authored by Lu Baolu
      Sharing a physical PCI device at a finer granularity is becoming
      an industry consensus, and IOMMU vendors are working to support
      such sharing as well as possible. Among these efforts, the
      capability to support finer-granularity DMA isolation is a common
      requirement for security reasons. With finer-granularity DMA
      isolation, subsets of a PCI function can be isolated from each
      other by the IOMMU. As a result, software needs to be able to
      attach multiple domains to a physical PCI device. One example of
      such a usage model is Intel Scalable IOV [1] [2]. The Intel VT-d
      3.0 spec [3] introduces scalable mode, which enables
      PASID-granularity DMA isolation.
      
      This adds the APIs to support multiple domains per device.
      To ease the discussion, we call such a domain 'a domain in
      auxiliary mode', or simply an 'auxiliary domain', when multiple
      domains are attached to a physical device.
      
      The APIs include:
      
      * iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
        - Detect both IOMMU and PCI endpoint devices supporting
          the feature (aux-domain here) without the host driver
          dependency.
      
      * iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
        - Check the enabling status of the feature (aux-domain
          here). The aux-domain interfaces are available only
          if this returns true.
      
      * iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
        - Enable/disable device specific aux-domain feature.
      
      * iommu_aux_attach_device(domain, dev)
        - Attaches @domain to @dev in the auxiliary mode. Multiple
          domains could be attached to a single device in the
          auxiliary mode with each domain representing an isolated
          address space for an assignable subset of the device.
      
      * iommu_aux_detach_device(domain, dev)
        - Detach @domain which has been attached to @dev in the
          auxiliary mode.
      
      * iommu_aux_get_pasid(domain, dev)
        - Return ID used for finer-granularity DMA translation.
          For the Intel Scalable IOV usage model, this will be
          a PASID. A device which supports Scalable IOV needs
          to write this ID to the device register so that DMA
          requests can be tagged with the right PASID prefix; a usage
          sketch follows this list.
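      Putting the APIs above together, a hedged usage sketch (error paths
      trimmed):

      /* Sketch: attach an extra domain to a device in auxiliary mode and
       * fetch the PASID used to tag DMA for the isolated subset.
       */
      struct iommu_domain *domain;
      int pasid, ret;

      if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX))
              return -ENODEV;

      ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);
      if (ret)
              return ret;

      domain = iommu_domain_alloc(dev->bus);
      ret = iommu_aux_attach_device(domain, dev);
      if (!ret) {
              pasid = iommu_aux_get_pasid(domain, dev);
              /* ... write pasid to the device so its DMA is tagged with it ... */
      }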
      
      This has been updated with the latest proposal from Joerg
      posted here [5].
      
      Many people were involved in discussions of this design:
      
      Kevin Tian <kevin.tian@intel.com>
      Liu Yi L <yi.l.liu@intel.com>
      Ashok Raj <ashok.raj@intel.com>
      Sanjay Kumar <sanjay.k.kumar@intel.com>
      Jacob Pan <jacob.jun.pan@linux.intel.com>
      Alex Williamson <alex.williamson@redhat.com>
      Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Joerg Roedel <joro@8bytes.org>
      
      and some discussions can be found here [4] [5].
      
      [1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
      [2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
      [3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
      [4] https://lkml.org/lkml/2018/7/26/4
      [5] https://www.spinics.net/lists/iommu/msg31874.html
      
      Cc: Ashok Raj <ashok.raj@intel.com>
      Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
      Cc: Kevin Tian <kevin.tian@intel.com>
      Cc: Liu Yi L <yi.l.liu@intel.com>
      Suggested-by: Kevin Tian <kevin.tian@intel.com>
      Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Suggested-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  11. 26 Feb 2019, 3 commits
  12. 16 Jan 2019, 1 commit
  13. 17 Dec 2018, 2 commits
  14. 06 Dec 2018, 1 commit
  15. 01 Oct 2018, 1 commit
    • iommu/dma: Add support for non-strict mode · 2da274cd
      Authored by Zhen Lei
      With the flush queue infrastructure already abstracted into IOVA
      domains, hooking it up in iommu-dma is pretty simple. Since there is a
      degree of dependency on the IOMMU driver knowing what to do to play
      along, we key the whole thing off a domain attribute which will be set
      on default DMA ops domains to request non-strict invalidation. That way,
      drivers can indicate the appropriate support by acknowledging the
      attribute, and we can easily fall back to strict invalidation otherwise.
      
      The flush queue callback needs a handle on the iommu_domain which owns
      our cookie, so we have to add a pointer back to that, but neatly, that's
      also sufficient to indicate whether we're using a flush queue or not,
      and thus which way to release IOVAs. The only slight subtlety is
      switching __iommu_dma_unmap() from calling iommu_unmap() to explicit
      iommu_unmap_fast()/iommu_tlb_sync() so that we can elide the sync
      entirely in non-strict mode.
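      A hedged sketch of the resulting unmap pattern; the real code lives in
      the iommu-dma layer and defers the sync through the IOVA flush queue:

      /* Sketch: unmap without an implicit flush, then either sync right away
       * (strict) or let the flush-queue callback sync a whole batch later
       * (non-strict).
       */
      static void dma_unmap_sketch(struct iommu_domain *domain, dma_addr_t iova,
                                   size_t size, bool non_strict)
      {
              size_t unmapped = iommu_unmap_fast(domain, iova, size);

              if (WARN_ON(unmapped != size))
                      return;

              if (!non_strict)
                      iommu_tlb_sync(domain);
              /* else: the IOVA is freed later by the flush-queue callback,
               * which syncs the TLB for a whole batch of ranges. */
      }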
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [rm: convert to domain attribute, tweak comments and commit message]
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  16. 25 Sep 2018, 3 commits
  17. 08 Aug 2018, 1 commit
  18. 06 Jul 2018, 1 commit
    • iommu: Enable debugfs exposure of IOMMU driver internals · bad614b2
      Authored by Gary R Hook
      Provide base enablement for using debugfs to expose internal data of an
      IOMMU driver. When called, create the /sys/kernel/debug/iommu directory.
      
      Emit a strong warning at boot time to indicate that this feature is
      enabled.
      
      This function is called from iommu_init() and creates the initial
      debugfs directory. Drivers may then call iommu_debugfs_new_driver_dir() to
      instantiate a device-specific directory to expose internal data.
      It will return a pointer to the new dentry structure created in
      /sys/kernel/debug/iommu, or NULL in the event of a failure.
      
      Since the IOMMU driver cannot be removed from the running system, there
      is no need for an "off" function.
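      A hedged usage sketch for a driver; the directory name, data pointer
      and file operations are hypothetical:

      /* Sketch: create a per-driver directory under /sys/kernel/debug/iommu
       * and expose a debug file in it.
       */
      struct dentry *dir = iommu_debugfs_new_driver_dir("my-iommu");

      if (dir)
              debugfs_create_file("registers", 0400, dir, my_data, &my_fops);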
      Signed-off-by: Gary R Hook <gary.hook@amd.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  19. 14 Feb 2018, 1 commit
  20. 27 Sep 2017, 1 commit
  21. 31 Aug 2017, 1 commit
    • iommu: Introduce Interface for IOMMU TLB Flushing · add02cfd
      Authored by Joerg Roedel
      With the current IOMMU-API the hardware TLBs have to be
      flushed in every iommu_ops->unmap() call-back.
      
      When unmapping large amounts of address space, as
      happens when a KVM domain with assigned devices is
      destroyed, this causes thousands of unnecessary TLB flushes
      in the IOMMU hardware because the unmap call-back runs for
      every unmapped physical page.
      
      With the TLB Flush Interface and the new iommu_unmap_fast()
      function introduced here the need to clean the hardware TLBs
      is removed from the unmapping code-path. Users of
      iommu_unmap_fast() have to explicitly call the TLB-Flush
      functions to sync the page-table changes to the hardware.
      
      Three functions for TLB-Flushes are introduced:
      
        * iommu_flush_tlb_all() - Flushes all TLB entries associated with
                                  that domain. TLB entries are flushed
                                  when this function returns.

        * iommu_tlb_range_add() - Adds a given range to the flush queue
                                  for this domain.

        * iommu_tlb_sync()      - Flushes all queued ranges from the
                                  hardware TLBs. Returns when the flush
                                  is finished.
      
      The semantics of this interface are intentionally similar to those of
      the iommu_gather_ops from the io-pgtable code.
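      A hedged sketch of how an unmap path uses the new interface; the loop
      structure is illustrative:

      /* Sketch: unmap a large region piecewise without per-page TLB flushes,
       * queue the ranges, and sync the hardware TLBs once at the end.
       */
      size_t unmapped;
      unsigned long iova;

      for (iova = start; iova < start + size; iova += unmapped) {
              unmapped = iommu_unmap_fast(domain, iova, pgsize);
              if (!unmapped)
                      break;
              iommu_tlb_range_add(domain, iova, unmapped);
      }

      iommu_tlb_sync(domain);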
      
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  22. 16 Aug 2017, 1 commit
  23. 15 Aug 2017, 1 commit
    • iommu: Fix wrong freeing of iommu_device->dev · 2926a2aa
      Authored by Joerg Roedel
      The struct iommu_device has a 'struct device' embedded into it,
      not as a pointer but as the whole struct. In the conversion of the
      iommu drivers to use struct iommu_device, it was forgotten that the
      release function for that struct device simply calls kfree() on the
      pointer.
      
      This frees memory that was never allocated and causes memory
      corruption.
      
      To fix this issue, use a pointer to struct device instead of
      embedding the whole struct. This needs some updates in the
      iommu sysfs code as well as in the Intel VT-d and AMD IOMMU
      drivers.
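      In sketch form, the change amounts to the following (field list
      abridged):

      /* Sketch of the fix: keep a pointer to the device rather than
       * embedding it, so the driver core's release path (which kfree()s the
       * device) frees memory that was actually allocated.
       */
      struct iommu_device {
              struct list_head list;
              const struct iommu_ops *ops;
              struct fwnode_handle *fwnode;
              struct device *dev;     /* was: struct device dev; (embedded) */
      };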
      Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Fixes: 39ab9555 ('iommu: Add sysfs bindings for struct iommu_device')
      Cc: stable@vger.kernel.org # >= v4.11
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  24. 26 Jul 2017, 1 commit
  25. 29 Apr 2017, 1 commit
  26. 27 Apr 2017, 2 commits
  27. 06 Apr 2017, 1 commit
  28. 22 Mar 2017, 1 commit
    • iommu: Disambiguate MSI region types · 9d3a4de4
      Authored by Robin Murphy
      The introduction of reserved regions has left a couple of rough edges
      which we could do with sorting out sooner rather than later. Since we
      are not yet addressing the potential dynamic aspect of software-managed
      reservations and presenting them at arbitrary fixed addresses, it is
      incongruous that we end up displaying hardware vs. software-managed MSI
      regions to userspace differently, especially since ARM-based systems may
      actually require one or the other, or even potentially both at once
      (which iommu-dma currently has no hope of dealing with at all). Let's
      resolve the former user-visible inconsistency ASAP before the ABI has
      been baked into a kernel release, in a way that also lays the groundwork
      for the latter shortcoming to be addressed by follow-up patches.
      
      For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use
      IOMMU_RESV_MSI to describe the hardware type, and document everything a
      little bit. Since the x86 MSI remapping hardware falls squarely under
      this meaning of IOMMU_RESV_MSI, apply that type to their regions as well,
      so that we tell the same story to userspace across all platforms.
      
      Secondly, as the various region types require quite different handling,
      and it really makes little sense to ever try combining them, convert the
      bitfield-esque #defines to a plain enum in the process before anyone
      gets the wrong impression.
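      A sketch of the resulting type; the comments paraphrase the
      documentation added by the patch:

      /* Sketch of the reserved-region types after the conversion from
       * bitfield-esque #defines to an enum.
       */
      enum iommu_resv_type {
              IOMMU_RESV_DIRECT,      /* regions which must be mapped 1:1                */
              IOMMU_RESV_RESERVED,    /* ranges that must not be used for IOVA           */
              IOMMU_RESV_MSI,         /* hardware MSI region: translation fixed by HW    */
              IOMMU_RESV_SW_MSI,      /* software-managed MSI translation region         */
      };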
      
      Fixes: d30ddcaa ("iommu: Add a new type field in iommu_resv_region")
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      CC: Alex Williamson <alex.williamson@redhat.com>
      CC: David Woodhouse <dwmw2@infradead.org>
      CC: kvm@vger.kernel.org
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>