1. 24 June 2017 (9 commits)
    • iommu/io-pgtable: Introduce explicit coherency · 81b3c252
      Committed by Robin Murphy
      Once we remove the serialising spinlock, a potential race opens up for
      non-coherent IOMMUs whereby a caller of .map() can be sure that cache
      maintenance has been performed on their new PTE, but will have no
      guarantee that such maintenance for table entries above it has actually
      completed (e.g. if another CPU took an interrupt immediately after
      writing the table entry, but before initiating the DMA sync).
      
      Handling this race safely will add some potentially non-trivial overhead
      to installing a table entry, which we would much rather avoid on
      coherent systems where it will be unnecessary, and where we are striving
      to minimise latency by removing the locking in the first place.
      
      To that end, let's introduce an explicit notion of cache-coherency to
      io-pgtable, such that we will be able to avoid penalising IOMMUs which
      know enough to know when they are coherent.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
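
      A minimal sketch of the idea, assuming a hypothetical pte_sync()
      helper in the quirk style of io-pgtable.h (the flag name is an
      assumption, not the verified upstream interface):

      	#include <linux/dma-mapping.h>

      	/* Skip cache maintenance entirely when the walker is coherent. */
      	static void pte_sync(struct io_pgtable_cfg *cfg, struct device *dev,
      			     dma_addr_t pte_dma, size_t size)
      	{
      		/* Coherent table walkers snoop the CPU caches. */
      		if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)	/* assumed flag */
      			return;

      		dma_sync_single_for_device(dev, pte_dma, size, DMA_TO_DEVICE);
      	}
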
    • iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap · b9f1ef30
      Committed by Robin Murphy
      Whilst the short-descriptor format's split_blk_unmap implementation has
      no need to be recursive, it followed the pattern of the LPAE version
      anyway for the sake of consistency. With the latter now reworked for
      both efficiency and future scalability improvements, tweak the former
      similarly, not least to make it less obtuse.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Improve split_blk_unmap · fb3a9579
      Committed by Robin Murphy
      The current split_blk_unmap implementation suffers from some inscrutable
      pointer trickery for creating the tables to replace the block entry, but
      more than that it also suffers from hideous inefficiency. For example,
      the most pathological case of unmapping a level 3 page from a level 1
      block will allocate 513 lower-level tables to remap the entire block at
      page granularity, when only 2 are actually needed (the rest can be
      covered by level 2 block entries).
      
      Also, we would like to be able to relax the spinlock requirement in
      future, for which the roll-back-and-try-again logic for race resolution
      would be pretty hideous under the current paradigm.
      
      Both issues can be resolved most neatly by turning things sideways:
      instead of repeatedly recursing into __arm_lpae_map() to build up an
      entire new sub-table depth-first, we can directly replace the block
      entry with a next-level table of block/page entries, then repeat by
      unmapping at the next level if necessary. With a little refactoring of
      some helper functions, the code ends up not much bigger than before, but
      considerably easier to follow and to adapt in future.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
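
      A rough sketch of the reworked "sideways" shape described above; every
      helper name below is hypothetical, and the real code in
      io-pgtable-arm.c differs in detail:

      	/* Replace the block entry with ONE next-level table, leaving a
      	 * hole over the region being unmapped, then continue the unmap
      	 * one level down instead of building a whole sub-tree first. */
      	static size_t split_blk_unmap(struct io_pgtable *iop, unsigned long iova,
      				      size_t size, u64 blk_pte, int lvl, u64 *ptep)
      	{
      		u64 *tablep = alloc_table(iop, lvl + 1);	/* hypothetical */
      		int i;

      		for (i = 0; i < ptes_per_table(iop); i++) {
      			if (entry_covers(iop, iova, size, lvl + 1, i))
      				continue;			/* leave the hole */
      			tablep[i] = blk_to_next_level_pte(blk_pte, lvl + 1, i);
      		}

      		install_table(tablep, ptep);		/* one atomic PTE write */
      		return unmap_at(iop, iova, size, lvl + 1, tablep);
      	}
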
    • iommu/io-pgtable-arm-v7s: Check table PTEs more precisely · 9db829d2
      Committed by Robin Murphy
      Whilst we don't support the PXN bit at all, and so should never
      encounter a level 1 section or supersection PTE with it set, it would
      still be wise to check both table type bits to resolve any theoretical
      ambiguity.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
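
      In outline: in the short-descriptor format, level 1 bits[1:0] are 01
      for a table and 1x for a section, where bit 0 doubles as PXN. A hedged
      sketch of the stricter check (not the exact patch):

      	#define ARM_V7S_PTE_TYPE_MASK	0x3
      	#define ARM_V7S_PTE_TYPE_TABLE	0x1

      	/* A PXN section reads as 0b11; testing only bit 0 would wrongly
      	 * classify it as a table, so compare against the full mask. */
      	static bool arm_v7s_pte_is_table(u32 pte, int lvl)
      	{
      		return lvl == 1 &&
      		       (pte & ARM_V7S_PTE_TYPE_MASK) == ARM_V7S_PTE_TYPE_TABLE;
      	}
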
    • iommu: arm-smmu: Handle return of iommu_device_register. · 5c2d0218
      Committed by Arvind Yadav
      iommu_device_register returns an error code and, although it currently
      never fails, we should check its return value anyway.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      [will: adjusted to follow arm-smmu.c]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
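
      The pattern being applied, sketched (iommu_device_register() really
      does return int; the surrounding variable names are assumed):

      	ret = iommu_device_register(&smmu->iommu);
      	if (ret) {
      		dev_err(smmu->dev, "Failed to register iommu\n");
      		return ret;
      	}
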
    • iommu: arm-smmu-v3: make of_device_ids const · ebdd13c9
      Committed by Arvind Yadav
      of_device_ids are not supposed to change at runtime. All functions
      working with of_device_ids provided by <linux/of.h> work with const
      of_device_ids. So mark the non-const structs as const.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
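
      For illustration, the shape of the change ("arm,smmu-v3" is the
      driver's real compatible string; the table name is representative):

      	#include <linux/module.h>
      	#include <linux/of.h>

      	/* Immutable, so it can live in .rodata rather than .data. */
      	static const struct of_device_id arm_smmu_of_match[] = {
      		{ .compatible = "arm,smmu-v3" },
      		{ },
      	};
      	MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
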
    • iommu/arm-smmu: Plumb in new ACPI identifiers · 84c24379
      Committed by Robin Murphy
      Revision C of IORT now allows us to identify ARM MMU-401 and the Cavium
      ThunderX implementation. Wire them up so that we can probe these models
      once firmware starts using the new codes in place of generic ones, and
      so that the appropriate features and quirks get enabled when we do.
      
      For the sake of backports and mitigating synchronisation problems with
      the ACPICA headers, we'll carry a backup copy of the new definitions
      locally for the short term to make life simpler.
      
      Cc: stable@vger.kernel.org # 4.10
      Acked-by: Robert Richter <rrichter@cavium.com>
      Tested-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
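
      A hedged outline of the wiring (the ACPI_IORT_SMMU_* names follow
      IORT revision C; the driver-side assignments are illustrative, not
      the exact patch):

      	switch (iort_smmu_model) {
      	case ACPI_IORT_SMMU_CORELINK_MMU401:	/* new in IORT rev. C */
      		smmu->version = ARM_SMMU_V1;	/* plus MMU-40x behaviour */
      		break;
      	case ACPI_IORT_SMMU_CAVIUM_THUNDERX:	/* new in IORT rev. C */
      		smmu->model = CAVIUM_SMMUV2;	/* ThunderX quirk handling */
      		break;
      	}
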
    • iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops. · 60ab7a75
      Committed by Arvind Yadav
      File size before:
         text	   data	    bss	    dec	    hex	filename
         6146	     56	      9	   6211	   1843	drivers/iommu/io-pgtable-arm-v7s.o
      
      File size after adding 'const':
         text	   data	    bss	    dec	    hex	filename
         6170	     24	      9	   6203	   183b	drivers/iommu/io-pgtable-arm-v7s.o
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
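
      The change is the classic constification pattern; a sketch using the
      iommu_gather_ops callbacks of this era (the dummy_* selftest hooks
      are assumed from context):

      	static const struct iommu_gather_ops dummy_tlb_ops = {
      		.tlb_flush_all	= dummy_tlb_flush_all,
      		.tlb_add_flush	= dummy_tlb_add_flush,
      		.tlb_sync	= dummy_tlb_sync,
      	};
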
    • iommu/arm-smmu-v3: Increase CMDQ drain timeout value · b847de4e
      Committed by Sunil Goutham
      Waiting for a CMD_SYNC to be processed involves waiting for the command
      queue to drain, which can take an awful lot longer than waiting for a
      single entry to become available. Consequently, the common timeout value
      of 100us has been observed to be too short on some platforms when a
      CMD_SYNC is issued into a queue full of TLBI commands.
      
      This patch resolves the issue by using a different (1s) timeout when
      waiting for the CMDQ to drain and using a simple back-off mechanism
      when polling the cons pointer in the absence of WFE support.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      [will: rewrote commit message and cosmetic changes]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
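
      An illustrative sketch of the two-timeout scheme with back-off
      (helper and constant names are assumptions in the style of
      arm-smmu-v3.c):

      	static int queue_poll_cons(struct arm_smmu_queue *q, bool drain,
      				   bool use_wfe)
      	{
      		unsigned int delay = 1;
      		ktime_t timeout = ktime_add_us(ktime_get(),
      					       drain ? 1000000 : 100);

      		while (queue_sync_cons(q),
      		       (drain ? !queue_empty(q) : queue_full(q))) {
      			if (ktime_compare(ktime_get(), timeout) > 0)
      				return -ETIMEDOUT;

      			if (use_wfe) {
      				wfe();		/* woken by SEV on an update */
      			} else {
      				udelay(delay);	/* simple back-off */
      				delay *= 2;
      			}
      		}

      		return 0;
      	}
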
  2. 29 April 2017 (2 commits)
    • iommu: Remove pci.h include from trace/events/iommu.h · 461a6946
      Committed by Joerg Roedel
      The include file does not need any PCI specifics, so remove
      that include. Also fix the places that relied on it.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • iommu/vt-d: Don't print the failure message when booting non-kdump kernel · 8e121884
      Committed by Qiuxu Zhuo
      When booting a new non-kdump kernel, we see the failure messages below:
      
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar2 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar2 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar1 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar1 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar0 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar0 from previous kernel
      [    0.004000] DMAR-IR: IRQ remapping was enabled on dmar3 but we are not in kdump mode
      [    0.004000] DMAR-IR: Failed to copy IR table for dmar3 from previous kernel
      
      In the non-kdump case there is no need to copy the IR table from the
      previous kernel, so nothing has actually failed. To be less alarming
      and misleading, do not print the "DMAR-IR: Failed to copy IR table for
      dmar[0-9] from previous kernel" messages when booting a non-kdump
      kernel.
      Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
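
      The shape of the fix, assuming the check happens before the copy is
      attempted (is_kdump_kernel() is the real predicate from
      <linux/crash_dump.h>):

      	if (!is_kdump_kernel()) {
      		pr_warn("IRQ remapping was enabled on %s but we are not in kdump mode\n",
      			iommu->name);
      		return -EINVAL;	/* quiet skip: nothing actually failed */
      	}
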
  3. 27 April 2017 (2 commits)
    • iommu: Move report_iommu_fault() to iommu.c · 207c6e36
      Committed by Joerg Roedel
      The function is not on any fast path, so there is no need for it to
      be static inline in a header file. This also removes the need to
      include the iommu trace points in iommu.h.
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
    • x86, iommu/vt-d: Add an option to disable Intel IOMMU force on · bfd20f1c
      Committed by Shaohua Li
      The IOMMU harms performance significantly when we run very fast
      networking workloads: in a 40Gb networking XDP test, software overhead
      was almost negligible, but IOTLB misses (based on our analysis) killed
      the performance. We observed the same problem even with software
      passthrough (identity mapping); only hardware passthrough was
      unaffected. The pps with the IOMMU enabled (in software passthrough
      mode) is only about 30% of that without it. Based on our observation
      this is a hardware limitation, so we would like to disable the IOMMU
      force-on. We do want to use TBOOT, and we can sacrifice the DMA
      security bought by the IOMMU. I must admit I know nothing about TBOOT,
      but the TBOOT folks (cc-ed) think that not enabling the IOMMU is
      totally fine.
      
      So introduce a new boot option to disable the force-on. It is kind of
      silly that we still need to run into intel_iommu_init even without the
      force-on, but we need it in order to disable the TBOOT PMR registers.
      For systems without the boot option, nothing changes.
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
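
      A hedged sketch of the option plumbing, following the existing
      intel_iommu= parsing style (the variable name and message text are
      assumptions):

      	static int intel_iommu_tboot_noforce;

      	static int __init intel_iommu_setup(char *str)
      	{
      		while (*str) {
      			if (!strncmp(str, "tboot_noforce", 13)) {
      				pr_info("Intel-IOMMU: not forcing on after tboot\n");
      				intel_iommu_tboot_noforce = 1;
      			}
      			str += strcspn(str, ",");
      			while (*str == ',')
      				str++;
      		}
      		return 0;
      	}
      	__setup("intel_iommu=", intel_iommu_setup);
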
  4. 26 April 2017 (1 commit)
  5. 25 April 2017 (1 commit)
  6. 24 April 2017 (1 commit)
  7. 20 April 2017 (11 commits)
  8. 07 April 2017 (1 commit)
    • iommu/iova: Fix underflow bug in __alloc_and_insert_iova_range · 5016bdb7
      Committed by Nate Watterson
      Normally, calling alloc_iova() using an iova_domain with insufficient
      pfns remaining between start_pfn and dma_limit will fail and return a
      NULL pointer. Unexpectedly, if such a "full" iova_domain contains an
      iova with pfn_lo == 0, the alloc_iova() call will instead succeed and
      return an iova containing invalid pfns.
      
      This is caused by an underflow bug in __alloc_and_insert_iova_range()
      that occurs after walking the "full" iova tree when the search ends
      at the iova with pfn_lo == 0 and limit_pfn is then adjusted to be just
      below that (-1). This (now huge) limit_pfn gives the impression that a
      vast amount of space is available between it and start_pfn and thus
      a new iova is allocated with the invalid pfn_hi value, 0xFFF.... .
      
      To remedy this, a check is introduced to ensure that adjustments to
      limit_pfn will not underflow.
      
      This issue has been observed in the wild, and is easily reproduced with
      the following sample code.
      
      	struct iova_domain *iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
      	struct iova *rsvd_iova, *good_iova, *bad_iova;
      	unsigned long limit_pfn = 3;
      	unsigned long start_pfn = 1;
      	unsigned long va_size = 2;
      
      	init_iova_domain(iovad, SZ_4K, start_pfn, limit_pfn);
      	rsvd_iova = reserve_iova(iovad, 0, 0);
      	good_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      	bad_iova = alloc_iova(iovad, va_size, limit_pfn, true);
      
      Prior to the patch, this yielded:
      	*rsvd_iova == {0, 0}   /* Expected */
      	*good_iova == {2, 3}   /* Expected */
      	*bad_iova  == {-2, -1} /* Oh no... */
      
      After the patch, bad_iova is NULL as expected since inadequate
      space remains between limit_pfn and start_pfn after allocating
      good_iova.
      Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
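
      A minimal illustration of the guard (the helper is hypothetical; the
      point is to test before subtracting from an unsigned value):

      	static bool can_fit_below(unsigned long pfn_lo, unsigned long size,
      				  unsigned long *limit_pfn)
      	{
      		if (pfn_lo < size)	/* adjusting would underflow */
      			return false;	/* treat the domain as full */

      		*limit_pfn = pfn_lo - 1; /* keep searching below this iova */
      		return true;
      	}
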
  9. 06 April 2017 (12 commits)
    • iommu/io-pgtable-arm: Avoid shift overflow in block size · 022f4e4f
      Committed by Robin Murphy
      The recursive nature of __arm_lpae_{map,unmap}() means that
      ARM_LPAE_BLOCK_SIZE() is evaluated for every level, including those
      where block mappings aren't possible. This in itself is harmless enough,
      as we will only ever be called with valid sizes from the pgsize_bitmap,
      and thus always recurse down past any imaginary block sizes. The only
      problem is that most of those imaginary sizes overflow the type used for
      the calculation, and thus trigger warnings under UBSan:
      
      [   63.020939] ================================================================================
      [   63.021284] UBSAN: Undefined behaviour in drivers/iommu/io-pgtable-arm.c:312:22
      [   63.021602] shift exponent 39 is too large for 32-bit type 'int'
      [   63.021909] CPU: 0 PID: 1119 Comm: lkvm Not tainted 4.7.0-rc3+ #819
      [   63.022163] Hardware name: FVP Base (DT)
      [   63.022345] Call trace:
      [   63.022629] [<ffffff900808f258>] dump_backtrace+0x0/0x3a8
      [   63.022975] [<ffffff900808f614>] show_stack+0x14/0x20
      [   63.023294] [<ffffff90086bc9dc>] dump_stack+0x104/0x148
      [   63.023609] [<ffffff9008713ce8>] ubsan_epilogue+0x18/0x68
      [   63.023956] [<ffffff9008714410>] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
      [   63.024365] [<ffffff900890fcb0>] __arm_lpae_map+0x720/0xae0
      [   63.024732] [<ffffff9008910170>] arm_lpae_map+0x100/0x190
      [   63.025049] [<ffffff90089183d8>] arm_smmu_map+0x78/0xc8
      [   63.025390] [<ffffff9008906c18>] iommu_map+0x130/0x230
      [   63.025763] [<ffffff9008bf7564>] vfio_iommu_type1_attach_group+0x4bc/0xa00
      [   63.026156] [<ffffff9008bf3c78>] vfio_fops_unl_ioctl+0x320/0x580
      [   63.026515] [<ffffff9008377420>] do_vfs_ioctl+0x140/0xd28
      [   63.026858] [<ffffff9008378094>] SyS_ioctl+0x8c/0xa0
      [   63.027179] [<ffffff9008086e70>] el0_svc_naked+0x24/0x28
      [   63.027412] ================================================================================
      
      Perform the shift in a 64-bit type to prevent the theoretical overflow
      and keep the peace. As it turns out, this generates identical code for
      32-bit ARM, and marginally shorter AArch64 code, so it's good all round.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
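
      A standalone demonstration of the problem and the fix, in plain C:

      	#include <stdio.h>

      	int main(void)
      	{
      		int level_shift = 39;	/* an "imaginary" block-size exponent */

      		/* Broken: 1 << 39 overflows int (UBSan's complaint above). */
      		/* int bad = 1 << level_shift; */

      		/* Fixed: perform the shift in a 64-bit type. */
      		unsigned long long block_size = 1ULL << level_shift;

      		printf("block size: 0x%llx\n", block_size);
      		return 0;
      	}
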
    • iommu: Allow default domain type to be set on the kernel command line · fccb4e3b
      Committed by Will Deacon
      The IOMMU core currently initialises the default domain for each group
      to IOMMU_DOMAIN_DMA, under the assumption that devices will use
      IOMMU-backed DMA ops by default. However, in some cases it is desirable
      for the DMA ops to bypass the IOMMU for performance reasons, reserving
      use of translation for subsystems such as VFIO that require it for
      enforcing device isolation.
      
      Rather than modify each IOMMU driver to provide different semantics for
      DMA domains, instead we introduce a command line parameter that can be
      used to change the type of the default domain. Passthrough can then be
      specified using "iommu.passthrough=1" on the kernel command line.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
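
      A hedged sketch of the hook (early_param() and kstrtobool() are real
      kernel interfaces; the variable name is an assumption):

      	static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;

      	static int __init iommu_set_def_domain_type(char *str)
      	{
      		bool pt;

      		if (kstrtobool(str, &pt))
      			return -EINVAL;

      		iommu_def_domain_type = pt ? IOMMU_DOMAIN_IDENTITY
      					   : IOMMU_DOMAIN_DMA;
      		return 0;
      	}
      	early_param("iommu.passthrough", iommu_set_def_domain_type);
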
    • iommu/arm-smmu-v3: Install bypass STEs for IOMMU_DOMAIN_IDENTITY domains · beb3c6a0
      Committed by Will Deacon
      In preparation for allowing the default domain type to be overridden,
      this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
      ARM SMMUv3 driver.
      
      An identity domain is created by placing the corresponding stream table
      entries into "bypass" mode, which allows transactions to flow through
      the SMMU without any translation.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/arm-smmu-v3: Make arm_smmu_install_ste_for_dev return void · 67560edc
      Committed by Will Deacon
      arm_smmu_install_ste_for_dev cannot fail and always returns 0; however,
      because it returns int, callers end up implementing redundant
      error-handling code which complicates STE tracking and is never
      executed.
      
      This patch changes the return type of arm_smmu_install_ste_for_dev
      to void, to make it explicit that it cannot fail.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/arm-smmu: Install bypass S2CRs for IOMMU_DOMAIN_IDENTITY domains · 61bc6711
      Committed by Will Deacon
      In preparation for allowing the default domain type to be overridden,
      this patch adds support for IOMMU_DOMAIN_IDENTITY domains to the
      ARM SMMU driver.
      
      An identity domain is created by placing the corresponding S2CR
      registers into "bypass" mode, which allows transactions to flow through
      the SMMU without any translation.
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
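
      In outline (S2CR_TYPE_BYPASS is the driver's real S2CR type; the loop
      shape is assumed):

      	for_each_cfg_sme(fwspec, i, idx) {
      		smmu->s2crs[idx].type = S2CR_TYPE_BYPASS;
      		arm_smmu_write_s2cr(smmu, idx);	/* transactions now bypass */
      	}
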
    • iommu/arm-smmu: Restrict domain attributes to UNMANAGED domains · 0834cc28
      Committed by Will Deacon
      The ARM SMMU drivers provide a DOMAIN_ATTR_NESTING domain attribute,
      which allows callers of the IOMMU API to request that the page table
      for a domain is installed at stage-2, if supported by the hardware.
      
      Since setting this attribute only makes sense for UNMANAGED domains,
      this patch returns -ENODEV if the domain_{get,set}_attr operations are
      called on other domain types.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
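
      The guard, as the message describes it:

      	static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
      					    enum iommu_attr attr, void *data)
      	{
      		/* Nesting only makes sense for UNMANAGED domains. */
      		if (domain->type != IOMMU_DOMAIN_UNMANAGED)
      			return -ENODEV;

      		/* ... existing DOMAIN_ATTR_NESTING handling ... */
      		return 0;
      	}
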
    • iommu/arm-smmu: Add global SMR masking property · 56fbf600
      Committed by Robin Murphy
      The current SMR masking support using a 2-cell iommu-specifier is
      primarily intended to handle individual masters with large and/or
      complex Stream ID assignments; it quickly gets a bit clunky in other SMR
      use-cases where we just want to consistently mask out the same part of
      every Stream ID (e.g. for MMU-500 configurations where the appended TBU
      number gets in the way unnecessarily). Let's add a new property to allow
      a single global mask value to better fit the latter situation.
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Nipun Gupta <nipun.gupta@nxp.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
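
      A sketch of consuming the property at probe time
      (of_property_read_u32() is the standard accessor; the property name
      follows this patch's title, and the field name is hypothetical):

      	u32 mask;

      	if (!of_property_read_u32(dev->of_node, "stream-match-mask", &mask))
      		smmu->global_smr_mask = mask;	/* hypothetical field */
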
    • iommu/arm-smmu: Poll for TLB sync completion more effectively · 8513c893
      Committed by Robin Murphy
      On relatively slow development platforms and software models, the
      inefficiency of our TLB sync loop tends not to show up - for instance,
      on a Juno r1 board I typically see that the TLBI has completed of its
      own accord by the time we get to the sync, such that the latter
      finishes instantly.
      
      However, on larger systems doing real I/O, it's less realistic for the
      TLBs to go idle immediately, and at that point falling into the 1MHz
      polling loop turns out to throw away performance drastically. Let's
      strike a balance by polling more than once between pauses, such that we
      have much more chance of catching normal operations completing before
      committing to the fixed delay, but also backing off exponentially, since
      if a sync really hasn't completed within one or two "reasonable time"
      periods, it becomes increasingly unlikely that it ever will.
      Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
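
      The polling shape described above, sketched with illustrative
      constants (sTLBGSTATUS_GSACTIVE is the real status bit; the loop
      bounds are assumptions):

      	unsigned int spin_cnt, delay;

      	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
      		for (spin_cnt = 0; spin_cnt < TLB_SPIN_COUNT; spin_cnt++) {
      			if (!(readl_relaxed(status_reg) & sTLBGSTATUS_GSACTIVE))
      				return;		/* sync has completed */
      			cpu_relax();
      		}
      		udelay(delay);			/* exponential back-off */
      	}
      	dev_err_ratelimited(smmu->dev, "TLB sync timed out\n");
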
    • iommu/arm-smmu: Use per-context TLB sync as appropriate · 11febfca
      Committed by Robin Murphy
      TLB synchronisation typically involves the SMMU blocking all incoming
      transactions until the TLBs report completion of all outstanding
      operations. In the common SMMUv2 configuration of a single distributed
      SMMU serving multiple peripherals, that means that a single unmap
      request has the potential to bring the hammer down on the entire system
      if synchronised globally. Since stage 1 contexts, and stage 2 contexts
      under SMMUv2, offer local sync operations, let's make use of those
      wherever we can in the hope of minimising global disruption.
      
      To that end, rather than add any more branches to the already unwieldy
      monolithic TLB maintenance ops, break them up into smaller, neater,
      functions which we can then mix and match as appropriate.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
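
      An outline of the context-local sync (register offset names as in
      arm-smmu.c; the helper signature is an assumption):

      	static void arm_smmu_tlb_sync_context(void *cookie)
      	{
      		struct arm_smmu_domain *smmu_domain = cookie;
      		struct arm_smmu_device *smmu = smmu_domain->smmu;
      		void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);

      		/* Stalls only this context bank, not the whole SMMU. */
      		writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC);
      		__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
      				    base + ARM_SMMU_CB_TLBSTATUS);
      	}
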
    • iommu/arm-smmu: Tidy up context bank indexing · 452107c7
      Committed by Robin Murphy
      ARM_SMMU_CB() is calculated relative to ARM_SMMU_CB_BASE(), but the
      latter is never of use on its own, and what we end up with is the same
      ARM_SMMU_CB_BASE() + ARM_SMMU_CB() expression being duplicated at every
      callsite. Folding the two together gives us a self-contained context
      bank accessor which is much more pleasant to work with.
      
      Secondly, we might as well simplify CB_BASE itself at the same time.
      We use the address space size for its own sake precisely once, at probe
      time, and every other usage is to dynamically calculate CB_BASE over
      and over and over again. Let's flip things around so that we just
      maintain the CB_BASE address directly.
      Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
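
      The folded accessor, approximately (the pgshift-based stride is an
      assumption):

      	/* cb_base is now stored directly; one step to any context bank. */
      	#define ARM_SMMU_CB(smmu, n)	((smmu)->cb_base + ((n) << (smmu)->pgshift))
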
    • iommu/arm-smmu: Simplify ASID/VMID handling · 280b683c
      Committed by Robin Murphy
      Calculating ASIDs/VMIDs dynamically from arm_smmu_cfg was a neat trick,
      but the global uniqueness workaround makes it somewhat more awkward, and
      means we end up having to pass extra state around in certain cases just
      to keep a handle on the offset.
      
      We already have 16 bits going spare in arm_smmu_cfg; let's just
      precalculate an ASID/VMID, plop it in there, and tidy up the users
      accordingly. We'd also need something like this anyway if we ever get
      near to thinking about SVM, so it's no bad thing.
      Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
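
      A sketch of the precalculation at init time (the cavium_id_base
      offset is the existing global-uniqueness workaround; the exact
      arithmetic is assumed):

      	if (stage1)
      		cfg->asid = cfg->cbndx + smmu->cavium_id_base;
      	else
      		cfg->vmid = cfg->cbndx + 1 + smmu->cavium_id_base;
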
    • iommu/arm-smmu: Fix 16-bit ASID configuration · 125458ab
      Committed by Sunil Goutham
      The 16-bit ASID must be enabled before TTBR0/1 are initialised,
      otherwise only the lower 8 bits of the ASID will be honoured. Hence,
      move the configuration of the TTBCR register ahead of TTBR0/1 when
      initialising a context bank.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      [will: rewrote comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
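
      The reordering in outline (register offset names as in arm-smmu.c;
      the value variables are placeholders):

      	/* TTBCR first, so the full 16-bit ASID width is enabled... */
      	writel_relaxed(ttbcr, cb_base + ARM_SMMU_CB_TTBCR);
      	/* ...then the TTBRs, whose upper bits carry the ASID. */
      	writeq_relaxed(ttbr0, cb_base + ARM_SMMU_CB_TTBR0);
      	writeq_relaxed(ttbr1, cb_base + ARM_SMMU_CB_TTBR1);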