1. 17 December 2019 (5 commits)
    • J
      iommu: set group default domain before creating direct mappings · d3602115
      Authored by Jerry Snitselaar
      iommu_group_create_direct_mappings uses group->default_domain, but
      right after it is called, request_default_domain_for_dev calls
      iommu_domain_free for the default domain, and sets the group default
      domain to a different domain. Move the
      iommu_group_create_direct_mappings call to after the group default
      domain is set, so the direct mappings get associated with that domain.
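      The ordering bug can be sketched with a toy model in plain C (illustrative structures only, not the kernel's real iommu_group API): direct mappings created against a default domain that is subsequently replaced are lost with it.

```c
#include <assert.h>

/* Toy model: a group's direct mappings attach to whatever domain is the
 * group default at the time they are created. */
struct toy_domain { int direct_mappings; };
struct toy_group  { struct toy_domain *default_domain; };

static void create_direct_mappings(struct toy_group *g)
{
    g->default_domain->direct_mappings++;
}

/* Buggy order: map into the old default, then swap in the new domain,
 * so the new default ends up with no direct mappings. */
static int mappings_buggy(struct toy_domain *oldd, struct toy_domain *newd,
                          struct toy_group *g)
{
    g->default_domain = oldd;
    create_direct_mappings(g);
    g->default_domain = newd;   /* the old domain is freed in the real code */
    return g->default_domain->direct_mappings;
}

/* Fixed order: install the final default domain first, then map. */
static int mappings_fixed(struct toy_domain *oldd, struct toy_domain *newd,
                          struct toy_group *g)
{
    (void)oldd;
    g->default_domain = newd;
    create_direct_mappings(g);
    return g->default_domain->direct_mappings;
}
```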
      
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: iommu@lists.linux-foundation.org
      Cc: stable@vger.kernel.org
      Fixes: 7423e017 ("iommu: Add API to request DMA domain for device")
      Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      d3602115
    • L
      iommu/vt-d: Fix dmar pte read access not set error · 75d18385
      Authored by Lu Baolu
      If the default DMA domain of a group doesn't fit a device, it
      will still sit in the group but use a private identity domain.
      When map/unmap/iova_to_phys come through iommu API, the driver
      should still serve them, otherwise, other devices in the same
      group will be impacted. Since identity domain has been mapped
      with the whole available memory space and RMRRs, we don't need
      to worry about the impact on it.
      
      Link: https://www.spinics.net/lists/iommu/msg40416.html
      Cc: Jerry Snitselaar <jsnitsel@redhat.com>
      Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Fixes: 942067f1 ("iommu/vt-d: Identify default domains replaced with private")
      Cc: stable@vger.kernel.org # v5.3+
      Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      75d18385
    • A
      iommu/vt-d: Set ISA bridge reserved region as relaxable · d8018a0e
      Authored by Alex Williamson
      Commit d850c2ee ("iommu/vt-d: Expose ISA direct mapping region via
      iommu_get_resv_regions") created a direct-mapped reserved memory region
      in order to replace the static identity mapping of the ISA address
      space, where the latter was then removed in commit df4f3c60
      ("iommu/vt-d: Remove static identity map code").  According to the
      history of this code and the Kconfig option surrounding it, this direct
      mapping exists for the benefit of legacy ISA drivers that are not
      compatible with the DMA API.
      
      In conjunction with commit 9b77e5c7 ("vfio/type1: check dma map
      request is within a valid iova range") this change introduced a
      regression where the vfio IOMMU backend enforces reserved memory regions
      per IOMMU group, preventing userspace from creating IOMMU mappings
      conflicting with prescribed reserved regions.  A necessary prerequisite
      for the vfio change was the introduction of "relaxable" direct mappings
      introduced by commit adfd3738 ("iommu: Introduce
      IOMMU_RESV_DIRECT_RELAXABLE reserved memory regions").  These relaxable
      direct mappings provide the same identity mapping support in the default
      domain, but also indicate that the reservation is software imposed and
      may be relaxed under some conditions, such as device assignment.
      
      Convert the ISA bridge direct-mapped reserved region to relaxable to
      reflect that the restriction is self imposed and need not be enforced
      by drivers such as vfio.
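      A minimal model of the distinction (plain C with made-up names; the real kernel types are struct iommu_resv_region and the vfio type1 iova checks): a user mapping may overlap a relaxable direct region, since the reservation is software-imposed, but not a hard one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical region types mirroring IOMMU_RESV_DIRECT{,_RELAXABLE}. */
enum resv_type { RESV_DIRECT, RESV_DIRECT_RELAXABLE };

struct resv_region { uint64_t start, length; enum resv_type type; };

/* A vfio-style check: overlapping a hard direct region is refused,
 * overlapping a relaxable one is permitted (e.g. for device assignment). */
static int mapping_allowed(const struct resv_region *r,
                           uint64_t iova, uint64_t size)
{
    int overlaps = iova < r->start + r->length && r->start < iova + size;
    return !overlaps || r->type == RESV_DIRECT_RELAXABLE;
}
```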
      
      Fixes: 1c5c59fb ("iommu/vt-d: Differentiate relaxable and non relaxable RMRRs")
      Cc: stable@vger.kernel.org # v5.3+
      Link: https://lore.kernel.org/linux-iommu/20191211082304.2d4fab45@x1.home
      Reported-by: cprt <cprt@protonmail.com>
      Tested-by: cprt <cprt@protonmail.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      d8018a0e
    • R
      iommu/dma: Rationalise types for DMA masks · bd036d2f
      Authored by Robin Murphy
      Since iommu_dma_alloc_iova() combines incoming masks with the u64 bus
      limit, it makes more sense to pass them around in their native u64
      rather than converting to dma_addr_t early. Do that, and resolve the
      remaining type discrepancy against the domain geometry with a cheeky
      cast to keep things simple.
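      The type issue can be seen in a standalone sketch (userspace C, not the kernel code): combining the mask with the bus limit in native u64 preserves the value, while converting to a 32-bit dma_addr_t early silently truncates a wide mask.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Combine in native u64, as iommu_dma_alloc_iova() does after the
 * change: no precision is lost. */
static uint64_t combine_u64(uint64_t dma_mask, uint64_t bus_limit)
{
    return bus_limit && bus_limit < dma_mask ? bus_limit : dma_mask;
}

/* On a config where dma_addr_t is 32-bit, converting early drops the
 * upper bits of a 40-bit mask before the comparison even happens. */
static uint32_t combine_truncated(uint32_t dma_mask, uint64_t bus_limit)
{
    return bus_limit && bus_limit < dma_mask ? (uint32_t)bus_limit : dma_mask;
}
```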
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build
      Reviewed-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      bd036d2f
    • X
      iommu/iova: Init the struct iova to fix the possible memleak · 472d26df
      Authored by Xiaotao Yin
      During an ethernet (Marvell octeontx2) ring buffer resizing test:
      ethtool -G eth1 rx <rx ring size> tx <tx ring size>
      the following kmemleak report happens sometimes:
      
      unreferenced object 0xffff000b85421340 (size 64):
        comm "ethtool", pid 867, jiffies 4295323539 (age 550.500s)
        hex dump (first 64 bytes):
          80 13 42 85 0b 00 ff ff ff ff ff ff ff ff ff ff  ..B.............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000001b204ddf>] kmem_cache_alloc+0x1b0/0x350
          [<00000000d9ef2e50>] alloc_iova+0x3c/0x168
          [<00000000ea30f99d>] alloc_iova_fast+0x7c/0x2d8
          [<00000000b8bb2f1f>] iommu_dma_alloc_iova.isra.0+0x12c/0x138
          [<000000002f1a43b5>] __iommu_dma_map+0x8c/0xf8
          [<00000000ecde7899>] iommu_dma_map_page+0x98/0xf8
          [<0000000082004e59>] otx2_alloc_rbuf+0xf4/0x158
          [<000000002b107f6b>] otx2_rq_aura_pool_init+0x110/0x270
          [<00000000c3d563c7>] otx2_open+0x15c/0x734
          [<00000000a2f5f3a8>] otx2_dev_open+0x3c/0x68
          [<00000000456a98b5>] otx2_set_ringparam+0x1ac/0x1d4
          [<00000000f2fbb819>] dev_ethtool+0xb84/0x2028
          [<0000000069b67c5a>] dev_ioctl+0x248/0x3a0
          [<00000000af38663a>] sock_ioctl+0x280/0x638
          [<000000002582384c>] do_vfs_ioctl+0x8b0/0xa80
          [<000000004e1a2c02>] ksys_ioctl+0x84/0xb8
      
      The reason:
      alloc_iova_mem() does not zero-initialize the allocation, so by chance
      pfn_lo can already equal IOVA_ANCHOR. When __alloc_and_insert_iova_range()
      then fails with -ENOMEM (iova32_full), free_iova_mem() mistakes the new
      iova for the anchor node and does not free it.
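      The fix amounts to zero-initializing the allocation so a fresh iova can never masquerade as the anchor node. A userspace analogue (calloc standing in for kmem_cache_zalloc; toy struct, not the kernel's full struct iova):

```c
#include <assert.h>
#include <stdlib.h>

#define IOVA_ANCHOR (~0UL)

struct iova { unsigned long pfn_lo, pfn_hi; };

/* After the fix, alloc_iova_mem() zeroes the object, so pfn_lo can
 * never be IOVA_ANCHOR by accident. */
static struct iova *alloc_iova_mem(void)
{
    return calloc(1, sizeof(struct iova));
}

/* free_iova_mem() refuses to free the rbtree anchor node; before the
 * fix, stale memory with pfn_lo == IOVA_ANCHOR therefore leaked. */
static int would_free(const struct iova *iova)
{
    return iova->pfn_lo != IOVA_ANCHOR;
}
```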
      
      Fixes: bb68b2fb ("iommu/iova: Add rbtree anchor node")
      Signed-off-by: Xiaotao Yin <xiaotao.yin@windriver.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      472d26df
  2. 22 November 2019 (2 commits)
    • B
      drivers: iommu: hyperv: Make HYPERV_IOMMU only available on x86 · d7f0b2e4
      Authored by Boqun Feng
      Currently hyperv-iommu is implemented in an x86-specific way (for
      example, it uses the APIC), so make the HYPERV_IOMMU Kconfig option
      depend on X86 in preparation for enabling Hyper-V on architectures
      other than x86.
      
      Cc: Lan Tianyu <Tianyu.Lan@microsoft.com>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Cc: linux-hyperv@vger.kernel.org
      Signed-off-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d7f0b2e4
    • N
      dma-mapping: treat dev->bus_dma_mask as a DMA limit · a7ba70f1
      Authored by Nicolas Saenz Julienne
      Using a mask to represent bus DMA constraints has a set of limitations,
      the biggest one being that it can only hold a power of two (minus one).
      The DMA mapping code is already aware of this and treats
      dev->bus_dma_mask as a limit. This quirk is already used by some
      architectures, although it is still rare.
      
      With the introduction of the Raspberry Pi 4 we've found a new contender
      for the use of bus DMA limits, as its PCIe bus can only address the
      lower 3GB of memory (of a total of 4GB). This is impossible to represent
      with a mask. To make things worse the device-tree code rounds non power
      of two bus DMA limits to the next power of two, which is unacceptable in
      this case.
      
      In the light of this, rename dev->bus_dma_mask to dev->bus_dma_limit all
      over the tree and treat it as such. Note that dev->bus_dma_limit should
      contain the higher accessible DMA address.
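      The Raspberry Pi 4 case can be checked numerically: 3GB minus one is not of the form 2^n - 1, so no DMA mask can encode it, while a limit can. A sketch of the arithmetic (assumed helper names, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

/* A DMA mask must be a power of two minus one. */
static int is_representable_mask(uint64_t m)
{
    return ((m + 1) & m) == 0;
}

/* Treating dev->bus_dma_limit as the highest accessible DMA address:
 * clamp the device mask against it, no rounding required. */
static uint64_t effective_limit(uint64_t dev_mask, uint64_t bus_dma_limit)
{
    if (!bus_dma_limit)
        return dev_mask;
    return dev_mask < bus_dma_limit ? dev_mask : bus_dma_limit;
}
```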
      Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      a7ba70f1
  3. 21 November 2019 (2 commits)
  4. 13 November 2019 (1 commit)
  5. 11 November 2019 (17 commits)
  6. 07 November 2019 (1 commit)
    • W
      iommu/io-pgtable-arm: Rename IOMMU_QCOM_SYS_CACHE and improve doc · dd5ddd3c
      Authored by Will Deacon
      The 'IOMMU_QCOM_SYS_CACHE' IOMMU protection flag is exposed to all
      users of the IOMMU API. Despite its name, the idea behind it isn't
      especially tied to Qualcomm implementations and could conceivably be
      used by other systems.
      
      Rename it to 'IOMMU_SYS_CACHE_ONLY' and update the comment to better
      describe the idea behind it.
      
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org>
      Signed-off-by: Will Deacon <will@kernel.org>
      dd5ddd3c
  7. 05 November 2019 (8 commits)
    • R
      iommu/io-pgtable-arm: Rationalise MAIR handling · 205577ab
      Authored by Robin Murphy
      Between VMSAv8-64 and the various 32-bit formats, there is either one
      64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PMRR registers.
      As such, keeping two 64-bit values in io_pgtable_cfg has always been
      overkill.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      205577ab
    • R
      iommu/io-pgtable-arm: Simplify level indexing · 5fb190b0
      Authored by Robin Murphy
      The nature of the LPAE format means that data->pg_shift is always
      redundant with data->bits_per_level, since they represent the size of a
      page and the number of PTEs per page respectively, and the size of a PTE
      is constant. Thus it works out more efficient to only store the latter,
      and derive the former via a trivial addition where necessary.
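      With 64-bit LPAE PTEs the relationship is fixed, so one field can be derived from the other. A sketch of the arithmetic (not the driver's actual macros):

```c
#include <assert.h>

#define PTE_SHIFT 3 /* log2(sizeof(u64)): each PTE is 8 bytes */

/* pg_shift need not be stored: a page of (1 << pg_shift) bytes holds
 * (1 << bits_per_level) 8-byte PTEs, so pg_shift is a trivial addition. */
static int pg_shift_from(int bits_per_level)
{
    return bits_per_level + PTE_SHIFT;
}
```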
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: Reworked granule check in iopte_to_paddr()]
      Signed-off-by: Will Deacon <will@kernel.org>
      5fb190b0
    • R
      iommu/io-pgtable-arm: Simplify PGD size handling · c79278c1
      Authored by Robin Murphy
      We use data->pgd_size directly for the one-off allocation and freeing of
      the top-level table, but otherwise it serves for ARM_LPAE_PGD_IDX() to
      repeatedly re-calculate the effective number of top-level address bits
      it represents. Flip this around so we store the form we most commonly
      need, and derive the lesser-used one instead. This cuts a whole bunch of
      code out of the map/unmap/iova_to_phys fast-paths.
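      Storing the top-level index width directly makes the fast-path index a cheap mask-and-shift, with the allocation size derived only at alloc/free time. Illustrative arithmetic only (hypothetical helper names, 8-byte entries assumed):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One-off: derive the top-level table size from the stored index width. */
static size_t pgd_size(int pgd_bits)
{
    return sizeof(uint64_t) << pgd_bits;
}

/* Fast path: index the top level with a shift and mask, with no repeated
 * recalculation of the effective number of address bits. */
static unsigned pgd_idx(uint64_t iova, unsigned start_shift, int pgd_bits)
{
    return (unsigned)((iova >> start_shift) & ((1u << pgd_bits) - 1));
}
```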
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      c79278c1
    • R
      iommu/io-pgtable-arm: Simplify start level lookup · 594ab90f
      Authored by Robin Murphy
      Beyond a couple of allocation-time calculations, data->levels is only
      ever used to derive the start level. Storing the start level directly
      leads to a small reduction in object code, which should help eke out a
      little more efficiency, and slightly more readable source to boot.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      594ab90f
    • R
      iommu/io-pgtable-arm: Simplify bounds checks · 67f3e53d
      Authored by Robin Murphy
      We're merely checking that the relevant upper bits of each address
      are all zero, so there are cheaper ways to achieve that.
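      The cheaper check is a single shift: an address is in range iff no bits at or above the input address size are set (sketch; the driver pairs this with a WARN_ON):

```c
#include <assert.h>
#include <stdint.h>

/* In bounds iff all bits at or above the input address size are zero.
 * Assumes ias < 64 (shifting by the full width would be undefined). */
static int iova_in_bounds(uint64_t iova, unsigned ias)
{
    return (iova >> ias) == 0;
}
```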
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      67f3e53d
    • R
      iommu/io-pgtable-arm: Rationalise size check · f7b90d2c
      Authored by Robin Murphy
      It makes little sense to only validate the requested size after we think
      we've found a matching block size - making the check up-front is simple,
      and far more logical than waiting to walk off the bottom of the table to
      infer that we must have been passed a bogus size to start with.
      
      We're missing an equivalent check on the unmap path, so add that as well
      for consistency.
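      The up-front validation reduces to: the requested size must be exactly one of the page or block sizes the format supports. A sketch using a page-size bitmap as in io_pgtable_cfg (hedged; the driver's actual check lives in the map/unmap entry points):

```c
#include <assert.h>
#include <stdint.h>

/* Valid iff size is a single power of two that appears in the supported
 * page-size bitmap, checked before any table walk begins. */
static int size_is_valid(uint64_t size, uint64_t pgsize_bitmap)
{
    return size && !(size & (size - 1)) && (size & pgsize_bitmap) == size;
}
```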
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      f7b90d2c
    • R
      iommu/io-pgtable: Make selftest gubbins consistently __init · b5813c16
      Authored by Robin Murphy
      The selftests run as an initcall, but the annotation of the various
      callbacks and data seems to be somewhat arbitrary. Add it consistently
      for everything related to the selftests.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      b5813c16
    • V
      iommu: arm-smmu-impl: Add sdm845 implementation hook · 759aaa10
      Authored by Vivek Gautam
      Add reset hook for sdm845 based platforms to turn off
      the wait-for-safe sequence.
      
      Understanding how wait-for-safe logic affects USB and UFS performance
      on MTP845 and DB845 boards:
      
      Qcom's implementation of arm,mmu-500 adds a WAIT-FOR-SAFE logic
      to address under-performance issues in real-time clients, such as
      Display, and Camera.
      On receiving an invalidation request, the SMMU forwards a SAFE request
      to these clients and waits for a SAFE ack signal from them. The SAFE
      signal from such clients is used to qualify the start of invalidation.
      This logic is controlled by chicken bits, one each for MDP (display),
      IFE0, and IFE1 (camera), that can be accessed only from secure software
      on sdm845.
      
      This configuration, however, degrades the performance of non-real-time
      clients such as USB and UFS. This happens because, with wait-for-safe
      logic enabled, the hardware tries to throttle non-real-time clients
      while waiting for SAFE ack signals from real-time clients.
      
      On mtp845 and db845 devices, with wait-for-safe logic enabled by the
      bootloaders, we see degraded performance of USB and UFS when the kernel
      enables SMMU stage-1 translations for these clients. Turning off this
      wait-for-safe logic from the kernel gets the performance of USB and UFS
      devices back, until we revisit this when we start seeing performance
      issues on display/camera on upstream-supported SDM845 platforms.
      The bootloaders on these boards implement secure monitor callbacks to
      handle a specific command (QCOM_SCM_SVC_SMMU_PROGRAM) with which the
      logic can be toggled.
      
      There are other boards such as cheza whose bootloaders don't enable this
      logic. Such boards don't implement callbacks to handle the specific SCM
      call so disabling this logic for such boards will be a no-op.
      
      This change is inspired by the downstream change from Patrick Daly
      to address performance issues with display and camera by handling
      this wait-for-safe within separate io-pagetable ops to do TLB
      maintenance. So a big thanks to him for the change and for all the
      offline discussions.
      
      Without this change the UFS reads are pretty slow:
      $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
      10+0 records in
      10+0 records out
      10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
      real    0m 22.39s
      user    0m 0.00s
      sys     0m 0.01s
      
      With this change they are back to rock!
      $ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
      300+0 records in
      300+0 records out
      314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
      real    0m 1.03s
      user    0m 0.00s
      sys     0m 0.54s
      Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Stephen Boyd <swboyd@chromium.org>
      Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
      Signed-off-by: Will Deacon <will@kernel.org>
      759aaa10
  8. 02 November 2019 (1 commit)
  9. 30 October 2019 (3 commits)
    • C
      iommu/virtio: Remove unused variable · c1c8058d
      Authored by Cristiane Naves
      Remove the unused return variable. Issue found by
      coccicheck (scripts/coccinelle/misc/returnvar.cocci).
      Signed-off-by: Cristiane Naves <cristianenavescardoso09@gmail.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      c1c8058d
    • L
      iommu/amd: Support multiple PCI DMA aliases in IRQ Remapping · 3c124435
      Authored by Logan Gunthorpe
      Non-Transparent Bridge (NTB) devices (among others) may have many DMA
      aliases, since the hardware will send requests with different device IDs
      depending on their origin across the bridged hardware.
      
      See commit ad281ecf ("PCI: Add DMA alias quirk for Microsemi Switchtec
      NTB") for more information on this.
      
      The AMD IOMMU IRQ remapping functionality ignores all PCI aliases for
      IRQs, so if a device sends an interrupt from one of its aliases, it
      will be blocked on AMD hardware with the IOMMU enabled.
      
      To fix this, ensure IRQ remapping is enabled for all aliases with
      MSI interrupts.
      
      This is analogous to the functionality added to the Intel IRQ remapping
      code in commit 3f0c625c ("iommu/vt-d: Allow interrupts from the entire
      bus for aliased devices")
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      3c124435
    • L
      iommu/amd: Support multiple PCI DMA aliases in device table · 3332364e
      Authored by Logan Gunthorpe
      Non-Transparent Bridge (NTB) devices (among others) may have many DMA
      aliases, since the hardware will send requests with different device IDs
      depending on their origin across the bridged hardware.
      
      See commit ad281ecf ("PCI: Add DMA alias quirk for Microsemi
      Switchtec NTB") for more information on this.
      
      The AMD IOMMU ignores all the PCI aliases except the last one, so DMA
      transfers from these aliases will be blocked on AMD hardware with the
      IOMMU enabled.
      
      To fix this, ensure the DTEs are cloned for every PCI alias. This is
      done by copying the DTE data for each alias as well as the IVRS alias
      every time it is changed.
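      The clone step itself is simple: whenever a device's entry changes, write the same entry at every alias device ID. A toy model (real AMD DTEs are 256-bit and updated via the driver's own setters, not a flat u64 array):

```c
#include <assert.h>
#include <stdint.h>

#define DEV_TABLE_SIZE 65536 /* one entry per 16-bit PCI device ID */

/* Toy model: after devid's entry changes, copy it to every alias so a
 * transaction arriving under any requester ID hits an identical DTE. */
static void clone_dte_to_aliases(uint64_t *dev_table, uint16_t devid,
                                 const uint16_t *aliases, int nr_aliases)
{
    for (int i = 0; i < nr_aliases; i++)
        dev_table[aliases[i]] = dev_table[devid];
}
```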
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      3332364e