1. 08 Jun 2017 (9 commits)
  2. 30 May 2017 (2 commits)
  3. 29 Apr 2017 (2 commits)
    • iommu: Remove pci.h include from trace/events/iommu.h · 461a6946
      Committed by Joerg Roedel
      The include file does not need any PCI specifics, so remove
      that include. Also fix the places that relied on it.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      461a6946
    • iommu/vt-d: Don't print the failure message when booting non-kdump kernel · 8e121884
      Committed by Qiuxu Zhuo
      When booting a new non-kdump kernel, we see the following failure
      messages:
      
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel
      
      In the non-kdump case there is no need to copy the IR table from the
      previous kernel, so nothing has actually failed. To be less alarming
      and misleading, do not print the "DMAR-IR: Failed to copy IR table
      for dmar[0-9] from previous kernel" message when booting a non-kdump
      kernel (see the sketch below).
      Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      8e121884
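      The resulting logic in intel_irq_remapping.c boils down to attempting
      the copy only in kdump mode; a minimal sketch, with helper names
      assumed from the driver at the time, so treat the details as
      approximate:

      	/* Sketch: only attempt (and report) the IR-table copy in kdump mode. */
      	if (ir_pre_enabled(iommu)) {
      		if (!is_kdump_kernel()) {
      			pr_warn("IRQ remapping was enabled on %s but we are not in kdump mode\n",
      				iommu->name);
      			clear_ir_pre_enabled(iommu);
      			iommu_disable_irq_remapping(iommu);
      		} else if (iommu_load_old_irte(iommu)) {
      			pr_err("Failed to copy IR table for %s from previous kernel\n",
      			       iommu->name);
      		}
      	}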
  4. 27 Apr 2017 (2 commits)
    • iommu: Move report_iommu_fault() to iommu.c · 207c6e36
      Committed by Joerg Roedel
      The function is not on any fast path, so there is no need for it to
      be a static inline in a header file. This also removes the need to
      include the IOMMU trace points in iommu.h.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      207c6e36
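      With the move, only the declaration remains in the header; roughly
      (signature as in include/linux/iommu.h, body elided):

      	/* include/linux/iommu.h: declaration only, definition now in iommu.c */
      	extern int report_iommu_fault(struct iommu_domain *domain,
      				      struct device *dev, unsigned long iova,
      				      int flags);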
    • x86, iommu/vt-d: Add an option to disable Intel IOMMU force on · bfd20f1c
      Committed by Shaohua Li
      The IOMMU harms performance significantly when we run very fast
      networking workloads: in a 40Gb networking XDP test, the software
      overhead is barely noticeable, but it is the IOTLB misses (based on
      our analysis) that kill the performance. We observed the same
      performance issue even with software passthrough (identity mapping);
      only hardware passthrough survives. The pps with the IOMMU enabled
      (in software passthrough mode) is only about ~30% of that without
      it. Based on our observation this is a hardware limitation, so we'd
      like to disable the IOMMU force-on, but we do want to use TBOOT and
      can sacrifice the DMA security the IOMMU buys. I must admit I know
      nothing about TBOOT, but the TBOOT folks (cc-ed) think not enabling
      the IOMMU is totally fine.
      
      So introduce a new boot option to disable the force-on (see the
      sketch below). It is somewhat silly that we still have to run into
      intel_iommu_init() even without force-on, but we need it to disable
      the TBOOT PMR registers. For systems without the boot option,
      nothing changes.
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      bfd20f1c
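      A sketch of how the option plugs into the existing intel_iommu=
      parameter parsing; the sub-option name and flag below are my reading
      of the patch, so treat the exact identifiers as assumptions:

      	/* Sketch: new "tboot_noforce" sub-option of intel_iommu= (names assumed). */
      	static int __init intel_iommu_setup(char *str)
      	{
      		while (str && *str) {
      			if (!strncmp(str, "tboot_noforce", 13)) {
      				pr_info("Intel-IOMMU: not forcing on after tboot\n");
      				intel_iommu_tboot_noforce = 1;
      			}
      			str += strcspn(str, ",");	/* advance to next sub-option */
      			while (*str == ',')
      				str++;
      		}
      		return 0;
      	}
      	__setup("intel_iommu=", intel_iommu_setup);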
  5. 26 Apr 2017 (1 commit)
  6. 25 Apr 2017 (1 commit)
  7. 24 Apr 2017 (1 commit)
  8. 20 Apr 2017 (11 commits)
  9. 07 Apr 2017 (1 commit)
    • iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range · 5016bdb7
      Committed by Nate Watterson
      Normally, calling alloc_iova() using an iova_domain with insufficient
      pfns remaining between start_pfn and dma_limit will fail and return a
      NULL pointer. Unexpectedly, if such a "full" iova_domain contains an
      iova with pfn_lo == 0, the alloc_iova() call will instead succeed and
      return an iova containing invalid pfns.
      
      This is caused by an underflow bug in __alloc_and_insert_iova_range()
      that occurs after walking the "full" iova tree when the search ends
      at the iova with pfn_lo == 0 and limit_pfn is then adjusted to be just
      below that (-1). This (now huge) limit_pfn gives the impression that a
      vast amount of space is available between it and start_pfn and thus
      a new iova is allocated with the invalid pfn_hi value, 0xFFF.... .
      
      To remedy this, a check is introduced to ensure that adjustments to
      limit_pfn cannot underflow (sketched after the example below).
      
      This issue has been observed in the wild, and is easily reproduced with
      the following sample code.
      
      	struct iova_domain *iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
      	struct iova *rsvd_iova, *good_iova, *bad_iova;
      	unsigned long limit_pfn = 3;
      	unsigned long start_pfn = 1;
      	unsigned long va_size = 2;
      
      	init_iova_domain(iovad, SZ_4K, start_pfn, limit_pfn);
      	rsvd_iova = reserve_iova(iovad, 0, 0);
      	good_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      	bad_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      
      Prior to the patch, this yielded:
      	*rsvd_iova == {0, 0}   /* Expected */
      	*good_iova == {2, 3}   /* Expected */
      	*bad_iova  == {-2, -1} /* Oh no... */
      
      After the patch, bad_iova is NULL as expected since inadequate
      space remains between limit_pfn and start_pfn after allocating
      good_iova.
      Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      5016bdb7
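      The fix itself is a one-line clamp where limit_pfn is stepped past a
      reserved entry during the tree walk; approximately:

      	/* Sketch: clamp instead of letting (pfn_lo - 1) wrap around zero. */
      	limit_pfn = curr_iova->pfn_lo ? (curr_iova->pfn_lo - 1) : 0;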
  10. 06 Apr 2017 (10 commits)
    • iommu/io-pgtable-arm: Avoid shift overflow in block size · 022f4e4f
      Committed by Robin Murphy
      The recursive nature of __arm_lpae_{map,unmap}() means that
      ARM_LPAE_BLOCK_SIZE() is evaluated for every level, including those
      where block mappings aren't possible. This in itself is harmless enough,
      as we will only ever be called with valid sizes from the pgsize_bitmap,
      and thus always recurse down past any imaginary block sizes. The only
      problem is that most of those imaginary sizes overflow the type used for
      the calculation, and thus trigger warnings under UBSan:
      
      [   63.020939] ================================================================================
      [   63.021284] UBSAN: Undefined behaviour in drivers/iommu/io-pgtable-arm.c:312:22
      [   63.021602] shift exponent 39 is too large for 32-bit type 'int'
      [   63.021909] CPU: 0 PID: 1119 Comm: lkvm Not tainted 4.7.0-rc3+ #819
      [   63.022163] Hardware name: FVP Base (DT)
      [   63.022345] Call trace:
      [   63.022629] [<ffffff900808f258>] dump_backtrace+0x0/0x3a8
      [   63.022975] [<ffffff900808f614>] show_stack+0x14/0x20
      [   63.023294] [<ffffff90086bc9dc>] dump_stack+0x104/0x148
      [   63.023609] [<ffffff9008713ce8>] ubsan_epilogue+0x18/0x68
      [   63.023956] [<ffffff9008714410>] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
      [   63.024365] [<ffffff900890fcb0>] __arm_lpae_map+0x720/0xae0
      [   63.024732] [<ffffff9008910170>] arm_lpae_map+0x100/0x190
      [   63.025049] [<ffffff90089183d8>] arm_smmu_map+0x78/0xc8
      [   63.025390] [<ffffff9008906c18>] iommu_map+0x130/0x230
      [   63.025763] [<ffffff9008bf7564>] vfio_iommu_type1_attach_group+0x4bc/0xa00
      [   63.026156] [<ffffff9008bf3c78>] vfio_fops_unl_ioctl+0x320/0x580
      [   63.026515] [<ffffff9008377420>] do_vfs_ioctl+0x140/0xd28
      [   63.026858] [<ffffff9008378094>] SyS_ioctl+0x8c/0xa0
      [   63.027179] [<ffffff9008086e70>] el0_svc_naked+0x24/0x28
      [   63.027412] ================================================================================
      
      Perform the shift in a 64-bit type to prevent the theoretical overflow
      and keep the peace. As it turns out, this generates identical code for
      32-bit ARM, and marginally shorter AArch64 code, so it's good all round.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      022f4e4f
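      The change itself is simply a matter of widening the shifted
      constant; schematically:

      	/* Sketch: shift a 64-bit value so exponents >= 32 are well-defined. */
      	block_size = (u64)1 << shift;	/* was: 1 << shift, UB for shift >= 32 */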
    • iommu: Allow default domain type to be set on the kernel command line · fccb4e3b
      Committed by Will Deacon
      The IOMMU core currently initialises the default domain for each group
      to IOMMU_DOMAIN_DMA, under the assumption that devices will use
      IOMMU-backed DMA ops by default. However, in some cases it is desirable
      for the DMA ops to bypass the IOMMU for performance reasons, reserving
      use of translation for subsystems such as VFIO that require it for
      enforcing device isolation.
      
      Rather than modify each IOMMU driver to provide different semantics for
      DMA domains, instead we introduce a command line parameter that can be
      used to change the type of the default domain. Passthrough can then be
      specified using "iommu.passthrough=1" on the kernel command line.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fccb4e3b
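      A sketch of how such a parameter can be parsed in
      drivers/iommu/iommu.c; this mirrors my reading of the patch, so
      treat the exact names as assumptions:

      	static int __init iommu_set_def_domain_type(char *str)
      	{
      		bool pt;

      		if (!str || strtobool(str, &pt))
      			return -EINVAL;

      		/* passthrough selects identity, otherwise keep DMA domains */
      		iommu_def_domain_type = pt ? IOMMU_DOMAIN_IDENTITY
      					   : IOMMU_DOMAIN_DMA;
      		return 0;
      	}
      	early_param("iommu.passthrough", iommu_set_def_domain_type);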
    • iommu/arm-smmu-v3: Install bypass STEs for IOMMU_DOMAIN_IDENTITY domains · beb3c6a0
      Committed by Will Deacon
      In preparation for allowing the default domain type to be overridden,
      this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
      ARM SMMUv3 driver.
      
      An identity domain is created by placing the corresponding stream table
      entries into "bypass" mode, which allows transactions to flow through
      the SMMU without any translation.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      beb3c6a0
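      On the allocation path this reduces to accepting the new domain
      type; a sketch, assuming the driver's domain_alloc shape at the
      time:

      	static struct iommu_domain *arm_smmu_domain_alloc(unsigned type)
      	{
      		if (type != IOMMU_DOMAIN_UNMANAGED &&
      		    type != IOMMU_DOMAIN_DMA &&
      		    type != IOMMU_DOMAIN_IDENTITY)
      			return NULL;
      		/* ... allocate; identity domains later get bypass STEs ... */
      	}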
    • iommu/arm-smmu-v3: Make arm_smmu_install_ste_for_dev return void · 67560edc
      Committed by Will Deacon
      arm_smmu_install_ste_for_dev cannot fail and always returns 0.
      However, because it returns int, callers end up implementing
      redundant error-handling code which complicates STE tracking and is
      never executed.
      
      This patch changes the return type of arm_smmu_install_ste_for_dev
      to void, to make it explicit that it cannot fail.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      67560edc
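      Schematically (the parameter type is assumed from the driver at the
      time):

      	/* Before: an int return that was always 0.  After: */
      	static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec);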
    • iommu/arm-smmu: Install bypass S2CRs for IOMMU_DOMAIN_IDENTITY domains · 61bc6711
      Committed by Will Deacon
      In preparation for allowing the default domain type to be overridden,
      this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
      ARM SMMU driver.
      
      An identity domain is created by placing the corresponding S2CR
      registers into "bypass" mode, which allows transactions to flow through
      the SMMU without any translation.
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      61bc6711
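      The bypass itself is just a different type for the group's stream
      mapping entries; roughly (iteration abridged, names per the driver):

      	/* Sketch: point the device's stream-mapping entries at bypass. */
      	for_each_cfg_sme(fwspec, i, idx)
      		smmu->s2crs[idx].type = S2CR_TYPE_BYPASS;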
    • iommu/arm-smmu: Restrict domain attributes to UNMANAGED domains · 0834cc28
      Committed by Will Deacon
      The ARM SMMU drivers provide a DOMAIN_ATTR_NESTING domain attribute,
      which allows callers of the IOMMU API to request that the page table
      for a domain is installed at stage-2, if supported by the hardware.
      
      Since setting this attribute only makes sense for UNMANAGED domains,
      this patch returns -ENODEV if the domain_{get,set}_attr operations are
      called on other domain types.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      0834cc28
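      In code this is a single early check; a sketch of the get side (the
      set side is symmetric):

      	static int arm_smmu_domain_get_attr(struct iommu_domain *domain,
      					    enum iommu_attr attr, void *data)
      	{
      		if (domain->type != IOMMU_DOMAIN_UNMANAGED)
      			return -ENODEV;
      		/* ... existing DOMAIN_ATTR_NESTING handling ... */
      	}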
    • iommu/arm-smmu: Add global SMR masking property · 56fbf600
      Committed by Robin Murphy
      The current SMR masking support using a 2-cell iommu-specifier is
      primarily intended to handle individual masters with large and/or
      complex Stream ID assignments; it quickly gets a bit clunky in other SMR
      use-cases where we just want to consistently mask out the same part of
      every Stream ID (e.g. for MMU-500 configurations where the appended TBU
      number gets in the way unnecessarily). Let's add a new property to allow
      a single global mask value to better fit the latter situation.
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Nipun Gupta <nipun.gupta@nxp.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      56fbf600
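      As I read it, the new property is a single u32 mask read once at
      probe time and applied to every SMR; a sketch (the property name is
      taken from the binding this patch adds, so treat it as an
      assumption):

      	/* Sketch: optional global mask applied to all stream matches. */
      	u32 smr_mask = 0;

      	of_property_read_u32(dev->of_node, "stream-match-mask", &smr_mask);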
    • iommu/arm-smmu: Poll for TLB sync completion more effectively · 8513c893
      Committed by Robin Murphy
      On relatively slow development platforms and software models, the
      inefficiency of our TLB sync loop tends not to show up; for
      instance, on a Juno r1 board I typically see that the TLBI has
      completed of its own accord by the time we get to the sync, such
      that the latter finishes instantly.
      
      However, on larger systems doing real I/O, it's less realistic for the
      TLBs to go idle immediately, and at that point falling into the 1MHz
      polling loop turns out to throw away performance drastically. Let's
      strike a balance by polling more than once between pauses, such that we
      have much more chance of catching normal operations completing before
      committing to the fixed delay, but also backing off exponentially, since
      if a sync really hasn't completed within one or two "reasonable time"
      periods, it becomes increasingly unlikely that it ever will.
      Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8513c893
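      The shape of the resulting loop, as a sketch (constant names assumed
      from the patch):

      	/* Sketch: spin a while between exponentially growing delays. */
      	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
      		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
      			if (!(readl_relaxed(sync_status) & sTLBGSTATUS_GSACTIVE))
      				return;	/* sync completed */
      			cpu_relax();
      		}
      		udelay(delay);
      	}
      	dev_err_ratelimited(smmu->dev, "TLB sync timed out\n");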
    • iommu/arm-smmu: Use per-context TLB sync as appropriate · 11febfca
      Committed by Robin Murphy
      TLB synchronisation typically involves the SMMU blocking all incoming
      transactions until the TLBs report completion of all outstanding
      operations. In the common SMMUv2 configuration of a single distributed
      SMMU serving multiple peripherals, that means that a single unmap
      request has the potential to bring the hammer down on the entire system
      if synchronised globally. Since stage 1 contexts, and stage 2 contexts
      under SMMUv2, offer local sync operations, let's make use of those
      wherever we can in the hope of minimising global disruption.
      
      To that end, rather than add any more branches to the already unwieldy
      monolithic TLB maintenance ops, break them up into smaller, neater,
      functions which we can then mix and match as appropriate.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      11febfca
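      For a context bank, the sync can then touch only that bank's
      registers; a sketch under the same caveats (the poll helper name is
      hypothetical):

      	static void arm_smmu_tlb_sync_context(void *cookie)
      	{
      		struct arm_smmu_domain *smmu_domain = cookie;
      		void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu,
      						 smmu_domain->cfg.cbndx);

      		/* kick off the local sync, then poll the bank's own status */
      		writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
      		arm_smmu_tlb_sync_poll(base + ARM_SMMU_CB_TLBSTATUS);
      	}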
    • iommu/arm-smmu: Tidy up context bank indexing · 452107c7
      Committed by Robin Murphy
      ARM_SMMU_CB() is calculated relative to ARM_SMMU_CB_BASE(), but the
      latter is never of use on its own, and what we end up with is the
      same ARM_SMMU_CB_BASE() + ARM_SMMU_CB() expression being duplicated
      at every callsite. Folding the two together gives us a
      self-contained context bank accessor which is much more pleasant to
      work with.
      
      Secondly, we might as well simplify CB_BASE itself at the same time.
      We use the address space size for its own sake precisely once, at probe
      time, and every other usage is to dynamically calculate CB_BASE over
      and over and over again. Let's flip things around so that we just
      maintain the CB_BASE address directly.
      Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      452107c7
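      Schematically, the folding looks like this (macro bodies
      approximate):

      	/* Before: two macros that only ever appeared summed together. */
      	#define ARM_SMMU_CB_BASE(smmu)	((smmu)->base + ((smmu)->size >> 1))
      	#define ARM_SMMU_CB(smmu, n)	((n) * (1 << (smmu)->pgshift))

      	/* After: one self-contained accessor over a precomputed cb_base. */
      	#define ARM_SMMU_CB(smmu, n)	((smmu)->cb_base + (n) * (1 << (smmu)->pgshift))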