- 08 June 2017, 9 commits
-
-
Submitted by Joerg Roedel
We can use queue_ring_free_flushed() instead, so remove this redundancy. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Add a timer to each dma_ops domain so that we flush unused IOTLB entries regularly, even if the queues don't get full all the time. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
The counters are increased every time the TLB for a given domain is flushed. We also store the current value of that counter into newly added entries of the flush-queue, so that we can tell whether this entry is already flushed. Signed-off-by: Joerg Roedel <jroedel@suse.de>
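A minimal sketch of that idea, using hypothetical names rather than the actual driver structures: each flush-queue entry records the domain's flush counter at enqueue time, so any entry whose recorded value is below the current counter has already been covered by a later flush.

    /* Illustrative sketch only -- names and layout are hypothetical. */
    struct flush_queue_entry {
            unsigned long iova_pfn;
            unsigned long pages;
            u64 counter;            /* domain flush counter when queued */
    };

    static bool entry_already_flushed(u64 domain_flush_counter,
                                      const struct flush_queue_entry *e)
    {
            /* The TLB was flushed at least once after this entry was queued. */
            return e->counter < domain_flush_counter;
    }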
-
Submitted by Joerg Roedel
With locking we can safely access the flush-queues of other cpus. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Fill the flush-queue on unmap and only flush the IOMMU and device TLBs when a per-cpu queue gets full. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Make the flush-queue per dma-ops domain and add code to allocate and free the flush-queues. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
The queue flushing is pretty inefficient when it flushes the queues for all cpus at once. Further, it flushes all domains from all IOMMUs for all CPUs, which is overkill as well. Rip it out to make room for something more efficient. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Tom Lendacky
Currently, if there is no room to add a command to the command buffer, the driver performs a "completion wait" which only returns when all commands on the queue have been processed. There is no need to wait for the entire command queue to be executed before adding the next command. Update the driver to perform the same udelay() loop that the "completion wait" performs, but instead re-read the head pointer to determine if sufficient space is available. The very first time it is found that there is no space available, the udelay() will be skipped to immediately perform the opportunistic read of the head pointer. If it is still found that there is not sufficient space, then the udelay() will be performed. Signed-off-by: Leo Duran <leo.duran@amd.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
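A rough sketch of the retry described above, with hypothetical helper and register names rather than the real driver symbols:

    /* Sketch: wait for command-buffer space by re-reading the head
     * pointer instead of draining the whole queue. Names are made up. */
    static void wait_for_cmd_buf_space(struct iommu_sketch *iommu, u32 needed)
    {
            bool first = true;

            while (cmd_buf_free_space(iommu) < needed) {
                    if (!first)
                            udelay(10);     /* back off before re-checking */
                    first = false;          /* first miss: skip the delay */

                    /* Opportunistically re-read the hardware head pointer. */
                    iommu->cmd_buf_head = readl(iommu->mmio + CMD_HEAD_OFFSET);
            }
    }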
-
Submitted by Tom Lendacky
As newer, higher-speed devices are developed, perf data shows that the amount of MMIO performed when submitting commands to the IOMMU causes performance issues. Currently, the command submission path reads the command buffer head and tail pointers and then writes the tail pointer once the command is ready. The tail pointer is only ever updated by the driver, so it can be tracked by the driver without having to read it from the hardware. The head pointer is updated by the hardware, but can be read opportunistically. Reading the head pointer only when it appears that there might not be room in the command buffer, and then re-checking the available space, reduces the number of times the head pointer has to be read. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 30 May 2017, 2 commits
-
-
Submitted by Tobias Klauser
struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Misbehaving devices can cause an endless chain of io-page-faults, flooding dmesg and making the system log unusable or even preventing the system from booting. So ratelimit the error messages about io-page-faults on a per-device basis. Signed-off-by: Joerg Roedel <jroedel@suse.de>
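Per-device ratelimiting can lean on the kernel's ratelimit helpers; a sketch, assuming a per-device state structure that the real driver may organise differently:

    #include <linux/device.h>
    #include <linux/ratelimit.h>

    struct dev_fault_state {
            struct ratelimit_state rs;      /* one limiter per device */
    };

    static void init_fault_state(struct dev_fault_state *st)
    {
            ratelimit_state_init(&st->rs, DEFAULT_RATELIMIT_INTERVAL,
                                 DEFAULT_RATELIMIT_BURST);
    }

    static void report_io_page_fault(struct device *dev,
                                     struct dev_fault_state *st,
                                     unsigned long address, int flags)
    {
            /* Only a bounded number of messages per interval per device. */
            if (__ratelimit(&st->rs))
                    dev_err(dev, "IO page fault at 0x%lx, flags 0x%x\n",
                            address, flags);
    }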
-
- 29 April 2017, 2 commits
-
-
Submitted by Joerg Roedel
The include file does not need any PCI specifics, so remove that include. Also fix the places that relied on it. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Qiuxu Zhuo
When booting a new non-kdump kernel, we see the following failure messages: [ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in kdump mode [ 0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel [ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in kdump mode [ 0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel [ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode [ 0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel [ 0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in kdump mode [ 0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel In the non-kdump case there is no need to copy the IR table from a previous kernel, so nothing has actually failed. To be less alarming and misleading, do not print the "DMAR-IR: Failed to copy IR table for dmar[0-9] from previous kernel" messages when booting a non-kdump kernel. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
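The gist of the change can be pictured as a simple gate on the warning path (function name hypothetical, only is_kdump_kernel() and pr_warn() are real kernel helpers):

    #include <linux/crash_dump.h>
    #include <linux/printk.h>

    static void warn_ir_table_copy_failed(int dmar_unit)
    {
            /* Non-kdump boot: there is no previous kernel to copy from,
             * so stay silent instead of printing a misleading failure. */
            if (!is_kdump_kernel())
                    return;

            pr_warn("DMAR-IR: Failed to copy IR table for dmar%d from previous kernel\n",
                    dmar_unit);
    }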
-
- 27 April 2017, 2 commits
-
-
Submitted by Joerg Roedel
The function is not in a fast-path, so there is no need for it to be static inline in a header file. This also removes the need to include iommu trace-points in iommu.h. Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Shaohua Li
The IOMMU harms performance significantly when we run very fast networking workloads: on a 40GB networking setup doing an XDP test, the software overhead is barely noticeable, but it is the IOTLB miss (based on our analysis) which kills the performance. We observed the same performance issue even with software passthrough (identity mapping); only hardware passthrough survives. The pps with the IOMMU (with software passthrough) is only about ~30% of that without it. Based on our observation this is a hardware limitation, so we'd like to disable the IOMMU force-on, but we do want to use TBOOT and we can sacrifice the DMA security bought by the IOMMU. I must admit I know nothing about TBOOT, but the TBOOT guys (cc-ed) think not enabling the IOMMU is totally ok. So introduce a new boot option to disable the force-on. It's kind of silly that we need to run into intel_iommu_init even without force-on, but we need to disable the TBOOT PMR registers. For systems without the boot option, nothing is changed. Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 26 April 2017, 1 commit
-
-
Submitted by Sunil Goutham
For software-initiated address translation, when the domain type is IOMMU_DOMAIN_IDENTITY, i.e. the SMMU is bypassed, mimic the HW behaviour, i.e. return the same IOVA as the translated address. This patch is an extension to Will Deacon's patchset "Implement SMMU passthrough using the default domain". Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
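A sketch of that bypass behaviour, not the driver's literal code: with an identity domain the IOVA already is the physical address, so the translation callback can simply hand it back.

    #include <linux/iommu.h>

    static phys_addr_t smmu_iova_to_phys_sketch(struct iommu_domain *domain,
                                                dma_addr_t iova)
    {
            if (domain->type == IOMMU_DOMAIN_IDENTITY)
                    return iova;            /* bypass: IOVA == PA */

            /* ...otherwise walk the page tables as before... */
            return 0;
    }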
-
- 25 April 2017, 1 commit
-
-
Submitted by Peng Fan
The message printed by the code is "SMR mask 0x%x out of range for SMMU", so we need to pass the mask, not the sid. Signed-off-by: Peng Fan <peng.fan@nxp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 24 April 2017, 1 commit
-
-
Submitted by Pan Bian
In amd_iommu_bind_pasid(), the control flow jumps to the out_free label when pasid_state->mm and mm are NULL, and mmput(mm) is then called. mmput() dereferences mm without validating it, which results in a NULL-pointer dereference. This patch fixes the bug. Signed-off-by: Pan Bian <bianpan2016@163.com> Fixes: f0aac63b ('iommu/amd: Don't hold a reference to mm_struct') Signed-off-by: Joerg Roedel <jroedel@suse.de>
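The shape of the fix, in illustrative form only (the surrounding structure is hypothetical): take the mm reference first and never reach mmput() on the path where no reference was taken.

    #include <linux/sched/mm.h>

    static int bind_pasid_sketch(struct task_struct *task)
    {
            struct mm_struct *mm = get_task_mm(task);

            if (!mm)
                    return -EINVAL;         /* no reference taken, no mmput() */

            /* ...set up the PASID state here... */

            mmput(mm);                      /* safe: mm is non-NULL here */
            return 0;
    }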
-
- 20 April 2017, 11 commits
-
-
Submitted by zhichang.yuan
In iommu_bus_notifier(), when the action is BUS_NOTIFY_ADD_DEVICE, the function returns 'ops->add_device(dev)' directly. But ops->add_device() may return an error value such as -ENODEV, and such a value makes notifier_call_chain() stop traversing the remaining nodes in the notifier_block list. This patch revises iommu_bus_notifier() to return NOTIFY_DONE when an error happens in ops->add_device(). Signed-off-by: zhichang.yuan <yuanzhichang@hisilicon.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
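A sketch of the revised return handling (simplified; the real notifier handles more actions than shown here):

    #include <linux/iommu.h>
    #include <linux/notifier.h>

    static int iommu_bus_notifier_sketch(struct notifier_block *nb,
                                         unsigned long action, void *data)
    {
            struct device *dev = data;
            const struct iommu_ops *ops = dev->bus->iommu_ops;

            if (action == BUS_NOTIFY_ADD_DEVICE && ops && ops->add_device) {
                    int err = ops->add_device(dev);

                    /* Do not leak a negative errno into the notifier chain;
                     * report NOTIFY_DONE so the remaining callbacks still run. */
                    return err ? NOTIFY_DONE : NOTIFY_OK;
            }
            return NOTIFY_DONE;
    }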
-
Submitted by Joerg Roedel
Support for IOMMU groups will become mandatory for drivers, so add it to the omap iommu driver. Signed-off-by: Joerg Roedel <jroedel@suse.de> [s-anna@ti.com: minor error cleanups] Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Modify the driver to register individual iommus and establish links between devices and iommus in sysfs. Signed-off-by: Joerg Roedel <jroedel@suse.de> [s-anna@ti.com: fix some cleanup issues during failures] Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
Instead of finding the matching IOMMU for a device using string comparison functions, store the pointer to the iommu_dev in arch_data during the omap_iommu_add_device callback and reset it during the omap_iommu_remove_device callback. Signed-off-by: Joerg Roedel <jroedel@suse.de> [s-anna@ti.com: few minor cleanups] Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel
The internal data-structures are scattered over various header and C files. Consolidate them in omap-iommu.h. While at this, add the missing kerneldoc comment for the iommu domain variable and revise the iommu_arch_data name. Signed-off-by: Joerg Roedel <jroedel@suse.de> [s-anna@ti.com: revise kerneldoc comments] Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Suman Anna
All the supported boards that have OMAP IOMMU devices now support DT boot only, so drop the support for the non-DT legacy-style devices from the OMAP IOMMU driver. A couple of fields in the IOMMU platform data are no longer required, so they have also been cleaned up. The IOMMU platform data is still needed, though, for performing reset management properly in a multi-arch environment. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Suman Anna
Move the registration of the OMAP IOMMU platform driver before setting the IOMMU callbacks on the platform bus. This causes the IOMMU devices to be probed before the .add_device() callback is invoked for all registered devices, and allows the iommu_group support to be added to the OMAP IOMMU driver. While at this, also check the return status from bus_set_iommu. Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Robin Murphy
Now that the appropriate ordering is enforced via probe-deferral of masters in core code, rip it all out and bask in the simplicity. Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [Sricharan: Rebased on top of ACPI IORT SMMU series] Signed-off-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Laurent Pinchart
Failures to look up an IOMMU when parsing the DT iommus property need to be handled separately from the .of_xlate() failures to support deferred probing. The lack of a registered IOMMU can be caused by the lack of a driver for the IOMMU, the IOMMU device probe not having been performed yet, having been deferred, or having failed. The first case occurs when the device tree describes the bus master and IOMMU topology correctly but no device driver exists for the IOMMU yet or the device driver has not been compiled in. Return NULL; the caller will configure the device without an IOMMU. The second and third cases are handled by deferring the probe of the bus master device, which will eventually get reprobed after the IOMMU. The last case is currently handled by deferring the probe of the bus master device as well. A mechanism to either configure the bus master device without an IOMMU or to fail the bus master device probe, depending on whether the IOMMU is optional or mandatory, would be a good enhancement. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Robin Murphy
IOMMU configuration represents unchanging properties of the hardware, and as such should only need to happen once in a device's lifetime, but the necessary interaction with the IOMMU device and driver complicates deciding exactly when that point should be. Since the only reasonable tool available for handling the inter-device dependency is probe deferral, we need to prepare of_iommu_configure() to run later than it is currently called (i.e. at driver probe rather than device creation), to handle being retried, and to tell whether a not-yet-present IOMMU should be waited for or skipped (by virtue of having declared a built-in driver or not). Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Robin Murphy
In preparation for some upcoming cleverness, rework the control flow in of_iommu_configure() to minimise duplication and improve the propagation of errors. It's also as good a time as any to switch over from the now-just-a-compatibility-wrapper of_iommu_get_ops() to using the generic IOMMU instance interface directly. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 07 April 2017, 1 commit
-
-
Submitted by Nate Watterson
Normally, calling alloc_iova() using an iova_domain with insufficient pfns remaining between start_pfn and dma_limit will fail and return a NULL pointer. Unexpectedly, if such a "full" iova_domain contains an iova with pfn_lo == 0, the alloc_iova() call will instead succeed and return an iova containing invalid pfns. This is caused by an underflow bug in __alloc_and_insert_iova_range() that occurs after walking the "full" iova tree when the search ends at the iova with pfn_lo == 0 and limit_pfn is then adjusted to be just below that (-1). This (now huge) limit_pfn gives the impression that a vast amount of space is available between it and start_pfn, and thus a new iova is allocated with the invalid pfn_hi value 0xFFF... . To remedy this, a check is introduced to ensure that adjustments to limit_pfn will not underflow. This issue has been observed in the wild, and is easily reproduced with the following sample code. struct iova_domain *iovad = kzalloc(sizeof(*iovad), GFP_KERNEL); struct iova *rsvd_iova, *good_iova, *bad_iova; unsigned long limit_pfn = 3; unsigned long start_pfn = 1; unsigned long va_size = 2; init_iova_domain(iovad, SZ_4K, start_pfn, limit_pfn); rsvd_iova = reserve_iova(iovad, 0, 0); good_iova = alloc_iova(iovad, va_size, limit_pfn, true); bad_iova = alloc_iova(iovad, va_size, limit_pfn, true); Prior to the patch, this yielded: *rsvd_iova == {0, 0} /* Expected */ *good_iova == {2, 3} /* Expected */ *bad_iova == {-2, -1} /* Oh no... */ After the patch, bad_iova is NULL as expected, since inadequate space remains between limit_pfn and start_pfn after allocating good_iova. Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
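The underflow guard itself can be pictured as follows; this is an illustrative helper with hypothetical names, not the patch's exact code:

    /* Clamp the "previous pfn_lo - 1" adjustment so it can never wrap
     * around below start_pfn. */
    static unsigned long adjust_limit_pfn(unsigned long prev_pfn_lo,
                                          unsigned long start_pfn)
    {
            if (prev_pfn_lo == 0 || prev_pfn_lo - 1 < start_pfn)
                    return start_pfn;       /* nothing usable remains */
            return prev_pfn_lo - 1;
    }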
-
- 06 April 2017, 10 commits
-
-
Submitted by Robin Murphy
The recursive nature of __arm_lpae_{map,unmap}() means that ARM_LPAE_BLOCK_SIZE() is evaluated for every level, including those where block mappings aren't possible. This in itself is harmless enough, as we will only ever be called with valid sizes from the pgsize_bitmap, and thus always recurse down past any imaginary block sizes. The only problem is that most of those imaginary sizes overflow the type used for the calculation, and thus trigger warnings under UBsan: [ 63.020939] ================================================================================ [ 63.021284] UBSAN: Undefined behaviour in drivers/iommu/io-pgtable-arm.c:312:22 [ 63.021602] shift exponent 39 is too large for 32-bit type 'int' [ 63.021909] CPU: 0 PID: 1119 Comm: lkvm Not tainted 4.7.0-rc3+ #819 [ 63.022163] Hardware name: FVP Base (DT) [ 63.022345] Call trace: [ 63.022629] [<ffffff900808f258>] dump_backtrace+0x0/0x3a8 [ 63.022975] [<ffffff900808f614>] show_stack+0x14/0x20 [ 63.023294] [<ffffff90086bc9dc>] dump_stack+0x104/0x148 [ 63.023609] [<ffffff9008713ce8>] ubsan_epilogue+0x18/0x68 [ 63.023956] [<ffffff9008714410>] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc [ 63.024365] [<ffffff900890fcb0>] __arm_lpae_map+0x720/0xae0 [ 63.024732] [<ffffff9008910170>] arm_lpae_map+0x100/0x190 [ 63.025049] [<ffffff90089183d8>] arm_smmu_map+0x78/0xc8 [ 63.025390] [<ffffff9008906c18>] iommu_map+0x130/0x230 [ 63.025763] [<ffffff9008bf7564>] vfio_iommu_type1_attach_group+0x4bc/0xa00 [ 63.026156] [<ffffff9008bf3c78>] vfio_fops_unl_ioctl+0x320/0x580 [ 63.026515] [<ffffff9008377420>] do_vfs_ioctl+0x140/0xd28 [ 63.026858] [<ffffff9008378094>] SyS_ioctl+0x8c/0xa0 [ 63.027179] [<ffffff9008086e70>] el0_svc_naked+0x24/0x28 [ 63.027412] ================================================================================ Perform the shift in a 64-bit type to prevent the theoretical overflow and keep the peace. As it turns out, this generates identical code for 32-bit ARM, and marginally shorter AArch64 code, so it's good all round. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
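Schematically, the fix amounts to performing the shift on a 64-bit value so that exponents of 32 or more stay well defined (the helper below is illustrative, not the driver's exact macro):

    /* Before (undefined for shift >= 31 on a 32-bit signed int):
     *     size = 1 << shift;
     * After: promote to a 64-bit type before shifting. */
    static inline u64 block_size_sketch(unsigned int shift)
    {
            return 1ULL << shift;   /* well defined for shift <= 63 */
    }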
-
Submitted by Will Deacon
The IOMMU core currently initialises the default domain for each group to IOMMU_DOMAIN_DMA, under the assumption that devices will use IOMMU-backed DMA ops by default. However, in some cases it is desirable for the DMA ops to bypass the IOMMU for performance reasons, reserving use of translation for subsystems such as VFIO that require it for enforcing device isolation. Rather than modify each IOMMU driver to provide different semantics for DMA domains, instead we introduce a command-line parameter that can be used to change the type of the default domain. Passthrough can then be specified using "iommu.passthrough=1" on the kernel command line. Signed-off-by: Will Deacon <will.deacon@arm.com>
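One way such a parameter could be wired up, sketched with a hypothetical variable name (the core's actual parsing may differ):

    #include <linux/init.h>
    #include <linux/iommu.h>
    #include <linux/kernel.h>

    static int iommu_def_domain_type_sketch = IOMMU_DOMAIN_DMA;

    static int __init iommu_set_def_domain_type(char *str)
    {
            bool pt;

            if (kstrtobool(str, &pt))
                    return -EINVAL;

            iommu_def_domain_type_sketch = pt ? IOMMU_DOMAIN_IDENTITY
                                              : IOMMU_DOMAIN_DMA;
            return 0;
    }
    early_param("iommu.passthrough", iommu_set_def_domain_type);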
-
Submitted by Will Deacon
In preparation for allowing the default domain type to be overridden, this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the ARM SMMUv3 driver. An identity domain is created by placing the corresponding stream table entries into "bypass" mode, which allows transactions to flow through the SMMU without any translation. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
arm_smmu_install_ste_for_dev cannot fail and always returns 0; however, the fact that it returns int means that callers end up implementing redundant error handling code which complicates STE tracking and is never executed. This patch changes the return type of arm_smmu_install_ste_for_dev to void, to make it explicit that it cannot fail. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
In preparation for allowing the default domain type to be overridden, this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the ARM SMMU driver. An identity domain is created by placing the corresponding S2CR registers into "bypass" mode, which allows transactions to flow through the SMMU without any translation. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Will Deacon
The ARM SMMU drivers provide a DOMAIN_ATTR_NESTING domain attribute, which allows callers of the IOMMU API to request that the page table for a domain is installed at stage-2, if supported by the hardware. Since setting this attribute only makes sense for UNMANAGED domains, this patch returns -ENODEV if the domain_{get,set}_attr operations are called on other domain types. Signed-off-by: Will Deacon <will.deacon@arm.com>
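Schematically, the guard is a single type check at the top of the attribute handlers (a sketch, not the exact driver code):

    #include <linux/iommu.h>

    static int smmu_domain_set_attr_sketch(struct iommu_domain *domain,
                                           enum iommu_attr attr, void *data)
    {
            /* Nesting only makes sense for caller-managed domains. */
            if (domain->type != IOMMU_DOMAIN_UNMANAGED)
                    return -ENODEV;

            /* ...handle DOMAIN_ATTR_NESTING and friends as before... */
            return 0;
    }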
-
Submitted by Robin Murphy
The current SMR masking support using a 2-cell iommu-specifier is primarily intended to handle individual masters with large and/or complex Stream ID assignments; it quickly gets a bit clunky in other SMR use-cases where we just want to consistently mask out the same part of every Stream ID (e.g. for MMU-500 configurations where the appended TBU number gets in the way unnecessarily). Let's add a new property to allow a single global mask value to better fit the latter situation. Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Nipun Gupta <nipun.gupta@nxp.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Robin Murphy
On relatively slow development platforms and software models, the inefficiency of our TLB sync loop tends not to show up - for instance on a Juno r1 board I typically see the TLBI has completed of its own accord by the time we get to the sync, such that the latter finishes instantly. However, on larger systems doing real I/O, it's less realistic for the TLBs to go idle immediately, and at that point falling into the 1MHz polling loop turns out to throw away performance drastically. Let's strike a balance by polling more than once between pauses, such that we have much more chance of catching normal operations completing before committing to the fixed delay, but also backing off exponentially, since if a sync really hasn't completed within one or two "reasonable time" periods, it becomes increasingly unlikely that it ever will. Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
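The polling strategy reads roughly like the sketch below (the constants, register and status bit are illustrative assumptions, not the driver's actual values):

    #include <linux/delay.h>
    #include <linux/io.h>
    #include <linux/kernel.h>

    static int tlb_sync_poll_sketch(void __iomem *status_reg, u32 active_bit)
    {
            unsigned int spin_cnt, delay_us;

            for (delay_us = 1; delay_us < 1000000; delay_us *= 2) {
                    /* Spin a while first: most syncs complete quickly. */
                    for (spin_cnt = 0; spin_cnt < 1000; spin_cnt++) {
                            if (!(readl_relaxed(status_reg) & active_bit))
                                    return 0;       /* sync complete */
                            cpu_relax();
                    }
                    udelay(delay_us);               /* exponential back-off */
            }
            return -ETIMEDOUT;      /* give up after roughly a second */
    }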
-
Submitted by Robin Murphy
TLB synchronisation typically involves the SMMU blocking all incoming transactions until the TLBs report completion of all outstanding operations. In the common SMMUv2 configuration of a single distributed SMMU serving multiple peripherals, that means that a single unmap request has the potential to bring the hammer down on the entire system if synchronised globally. Since stage 1 contexts, and stage 2 contexts under SMMUv2, offer local sync operations, let's make use of those wherever we can in the hope of minimising global disruption. To that end, rather than add any more branches to the already unwieldy monolithic TLB maintenance ops, break them up into smaller, neater functions which we can then mix and match as appropriate. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Submitted by Robin Murphy
ARM_AMMU_CB() is calculated relative to ARM_SMMU_CB_BASE(), but the latter is never of use on its own, and what we end up with is the same ARM_SMMU_CB_BASE() + ARM_AMMU_CB() expression being duplicated at every callsite. Folding the two together gives us a self-contained context bank accessor which is much more pleasant to work with. Secondly, we might as well simplify CB_BASE itself at the same time. We use the address space size for its own sake precisely once, at probe time, and every other usage is to dynamically calculate CB_BASE over and over and over again. Let's flip things around so that we just maintain the CB_BASE address directly. Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-