Commits · 518d9b450387a3508363af58d1f62db9fc92d438 · openeuler / raspberrypi-kernel

13 Jul, 2016 8 commits

iommu/amd: Remove special mapping code for dma_ops path · 518d9b45

Authored by Joerg Roedel on Jul 05, 2016

Use the iommu-api map/unmap functions instead. This will be
required anyway when IOVA code is used for address
allocation.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

518d9b45

iommu/amd: Pass gfp-flags to iommu_map_page() · b911b89b

Authored by Joerg Roedel on Jul 05, 2016

Make this function ready to be used in the DMA-API path.
Reorder parameters a bit while at it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

b911b89b

iommu/amd: Implement apply_dm_region call-back · 8d54d6c8

Authored by Joerg Roedel on Jul 05, 2016

It is used to reserve the dm-regions in the iova-tree.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

8d54d6c8

iommu/amd: Create a list of reserved iova addresses · 81cd07b9

Authored by Joerg Roedel on Jul 07, 2016

Put the MSI-range, the HT-range and the MMIO ranges of PCI
devices into that range, so that these addresses are not
allocated for DMA.

Copy this address list into every created dma_ops_domain.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

81cd07b9

iommu/amd: Allocate iova_domain for dma_ops_domain · 307d5851

Authored by Joerg Roedel on Jul 05, 2016

Use it later for allocating the IO virtual addresses.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

307d5851

iommu/amd: Select IOMMU_IOVA for AMD IOMMU · a72c4225

Authored by Joerg Roedel on Jul 05, 2016

Include the generic IOVA code to make use of it in the AMD
IOMMU driver too.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

a72c4225

iommu: Add apply_dm_region call-back to iommu-ops · 33b21a6b

Authored by Joerg Roedel on Jul 05, 2016

This new call-back will be used by the iommu driver to
reserve the given dm_region in its iova space before the
mapping is created.

The call-back is temporary until the dma-ops implementation
is part of the common iommu code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

33b21a6b

iommu/amd: Init unity mappings only for dma_ops domains · b548e786

Authored by Joerg Roedel on Jul 13, 2016

The default domain for a device might also be
identity-mapped. In this case the kernel would crash when
unity mappings are defined for the device. Fix that by
making sure the domain is a dma_ops domain.

Fixes: 0bb6e243 ('iommu/amd: Support IOMMU_DOMAIN_DMA type allocation')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>

b548e786

21 Jun, 2016 1 commit

iommu/amd: Remove create_workqueue · cf7513e7

Authored by Bhaktipriya Shridhar on Jun 18, 2016

alloc_workqueue replaces deprecated create_workqueue().

A dedicated workqueue has been used since the work item (viz.
&fault->work) is involved in IO page-fault handling.
WQ_MEM_RECLAIM has been set to guarantee forward progress under memory
pressure, which is a requirement here.
Since there are only a fixed number of work items, explicit concurrency
limit is unnecessary.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

cf7513e7

15 Jun, 2016 1 commit

iommu/amd: Set AMD iommu callbacks for platform bus driver · 0076cd3d

Authored by Wan Zongshun on May 10, 2016

More AMD drivers will later bind through ACPI to the platform bus, and
all of those devices need iommu support, for example the eMMC driver.

The latest AMD eMMC controller uses the sdhci-acpi.c driver, which
relies on the platform bus to match device and driver. The 'dev' of
struct platform_device is then passed as the map_sg parameter to the
iommu driver for DMA requests, so the iommu-ops are needed on the
platform bus.
Signed-off-by: Wan Zongshun <Vincent.Wan@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

0076cd3d

28 May, 2016 1 commit

remove lots of IS_ERR_VALUE abuses · 287980e4

Authored by Arnd Bergmann on May 27, 2016

Most users of IS_ERR_VALUE() in the kernel are wrong, as they
pass an 'int' into a function that takes an 'unsigned long'
argument. This happens to work because the type is sign-extended
on 64-bit architectures before it gets converted into an
unsigned type.

However, anything that passes an 'unsigned short' or 'unsigned int'
argument into IS_ERR_VALUE() is guaranteed to be broken, as are
8-bit integers and types that are wider than 'unsigned long'.

Andrzej Hajda has already fixed a lot of the worst abusers that
were causing actual bugs, but it would be nice to prevent any
users that are not passing 'unsigned long' arguments.

This patch changes all users of IS_ERR_VALUE() that I could find
on 32-bit ARM randconfig builds and x86 allmodconfig. For the
moment, this doesn't change the definition of IS_ERR_VALUE()
because there are probably still architecture specific users
elsewhere.

Almost all the warnings I got are for files that are better off
using 'if (err)' or 'if (err < 0)'.
The only legitimate user I could find that we get a warning for
is the (32-bit only) freescale fman driver, so I did not remove
the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
For 9pfs, I just worked around one user whose calling conventions
are so obscure that I did not dare change the behavior.

I was using this definition for testing:

 #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
       unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))

which ends up making all 16-bit or wider types work correctly with
the most plausible interpretation of what IS_ERR_VALUE() was supposed
to return according to its users, but also causes a compile-time
warning for any users that do not pass an 'unsigned long' argument.

I suggested this approach earlier this year, but back then we ended
up deciding to just fix the users that are obviously broken. After
the initial warning that caused me to get involved in the discussion
(fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
asked me to send the whole thing again.

[ Updated the 9p parts as per Al Viro - Linus ]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.org/lkml/2016/1/7/363
Link: https://lkml.org/lkml/2016/5/27/486
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

287980e4
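
The sign-extension point in the message above can be illustrated with a minimal sketch. It assumes a simplified stand-in for `IS_ERR_VALUE()` that converts its argument to `unsigned long`; the `demo_*` helpers are hypothetical names for illustration and are not part of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO 4095

/* Simplified stand-in for the kernel macro: the argument is treated as an
 * 'unsigned long', as the message describes. Not the kernel's exact text. */
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* An 'int' is sign-extended by the conversion, so a negative errno lands in
 * the top MAX_ERRNO values and the check happens to work. */
static bool demo_int_is_err(int err)
{
    return IS_ERR_VALUE(err);
}

/* An 'unsigned int' is zero-extended. Where 'unsigned long' is wider than
 * 'unsigned int', the converted value can never reach the error range. */
static bool demo_uint_is_err(unsigned int err)
{
    return IS_ERR_VALUE(err);
}

/* The plain check the message recommends for integer return codes. */
static bool demo_plain_is_err(int err)
{
    return err < 0;
}

/* Walks through the cases; the last assertion adapts to the platform. */
static void demo_is_err_example(void)
{
    bool long_is_wider = sizeof(unsigned long) > sizeof(unsigned int);

    assert(demo_int_is_err(-12));
    assert(!demo_int_is_err(0));
    assert(demo_plain_is_err(-12));
    /* Through 'unsigned int' the error is only seen when both types have
     * the same width, i.e. not on typical 64-bit targets. */
    assert(demo_uint_is_err((unsigned int)-12) == !long_is_wider);
}
```

Calling `demo_is_err_example()` on a 64-bit build exercises the broken `unsigned int` case, which is why `if (err < 0)` is the safer form for ordinary integer return values.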

10 May, 2016 1 commit

iommu/arm-smmu: Use per-domain page sizes. · d5466357

Authored by Robin Murphy on May 09, 2016

Now that we can accurately reflect the context format we choose for each
domain, do that instead of imposing the global lowest-common-denominator
restriction and potentially ending up with nothing. We currently have a
strict 1:1 correspondence between domains and context banks, so we don't
need to entertain the possibility of multiple formats _within_ a domain.
Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: split from original patch, added SMMUv3]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

d5466357

09 May, 2016 5 commits

iommu/amd: Remove statistics code · e85e8f69

Authored by Joerg Roedel on May 09, 2016

The statistics are not really used for anything and should
be replaced by generic and per-device statistic counters.
Remove the code for now.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

e85e8f69

iommu/dma: Finish optimising higher-order allocations · 3b6b7e19

Authored by Robin Murphy on Apr 13, 2016

Now that we know exactly which page sizes our caller wants to use in the
given domain, we can restrict higher-order allocation attempts to just
those sizes, if any, and avoid wasting any time or effort on other sizes
which offer no benefit. In the same vein, this also lets us accommodate
a minimum order greater than 0 for special cases.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

3b6b7e19

iommu: Allow selecting page sizes per domain · d16e0faa

Authored by Robin Murphy on Apr 07, 2016

Many IOMMUs support multiple page table formats, meaning that any given
domain may only support a subset of the hardware page sizes presented in
iommu_ops->pgsize_bitmap. There are also certain use-cases where the
creator of a domain may want to control which page sizes are used, for
example to force the use of hugepage mappings to reduce pagetable walk
depth.

To this end, add a per-domain pgsize_bitmap to represent the subset of
page sizes actually in use, to make it possible for domains with
different requirements to coexist.
Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: hijacked and rebased original patch with new commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

d16e0faa
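
As a rough illustration of how a per-domain page-size bitmap might be consulted, the sketch below picks the largest usable page size for a mapping request. `struct demo_domain` and `demo_pick_pgsize()` are hypothetical names; only the `pgsize_bitmap` field name comes from the message, and the selection logic is an assumption rather than the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical domain carrying its own subset of the hardware page sizes.
 * Bit n set means a page size of 2^n bytes may be used in this domain. */
struct demo_domain {
    uint64_t pgsize_bitmap;
};

/* Return the largest page size from the domain's bitmap that is no larger
 * than 'size' and at which 'addr' is aligned, or 0 if none qualifies. */
static uint64_t demo_pick_pgsize(const struct demo_domain *d,
                                 uint64_t addr, uint64_t size)
{
    uint64_t best = 0;

    for (unsigned int bit = 0; bit < 64; bit++) {
        uint64_t pg = UINT64_C(1) << bit;

        if (!(d->pgsize_bitmap & pg))
            continue;           /* size not enabled for this domain */
        if (pg > size)
            break;              /* larger sizes cannot fit either */
        if (addr & (pg - 1))
            continue;           /* address not aligned to this size */
        best = pg;
    }
    return best;
}

/* Two domains with different requirements coexisting. */
static void demo_pgsize_example(void)
{
    struct demo_domain mixed = {
        .pgsize_bitmap = (UINT64_C(1) << 12) | (UINT64_C(1) << 21)
    };
    struct demo_domain huge_only = { .pgsize_bitmap = UINT64_C(1) << 21 };

    assert(demo_pick_pgsize(&mixed, 0x200000, 0x400000) == 0x200000);
    assert(demo_pick_pgsize(&mixed, 0x201000, 0x400000) == 0x1000);
    /* A domain forced to 2 MiB pages cannot map a 4 KiB-aligned address. */
    assert(demo_pick_pgsize(&huge_only, 0x201000, 0x400000) == 0);
}
```

The `huge_only` domain shows the hugepage-forcing use case from the message: the same request that succeeds with 4 KiB pages in `mixed` finds no usable size there.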

iommu: of: enforce const-ness of struct iommu_ops · 53c92d79

Authored by Robin Murphy on Apr 07, 2016

As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

53c92d79

iommu/dma: Implement scatterlist segment merging · 809eac54

Authored by Robin Murphy on Apr 11, 2016

Stop wasting IOVA space by over-aligning scatterlist segments for a
theoretical worst-case segment boundary mask, and instead take the real
limits into account to merge consecutive segments wherever appropriate,
so our callers can benefit from getting back nicely simplified lists.

This also represents the last piece of functionality wanted by users of
the current arch/arm implementation, thus brings us a small step closer
to converting that over to the common code.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

809eac54
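
The merging idea described above can be sketched roughly as follows. This is a simplified model, not the kernel's scatterlist code: it assumes the input segments are laid out back to back in IOVA space, and `struct demo_seg`, `demo_merge_segments()`, `max_len` and `boundary_mask` are hypothetical names standing in for the device's real segment limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One output DMA segment in this model. */
struct demo_seg {
    uint64_t dma_addr;
    uint64_t len;
};

/* Merge consecutive input segments, given by their lengths and placed back
 * to back starting at iova_base. Two neighbours are merged only if the
 * result stays within max_len and does not cross a boundary window
 * (boundary_mask 0xffff means a 64 KiB window). 'out' needs room for n
 * entries. Returns the number of output segments. */
static size_t demo_merge_segments(const uint64_t *lens, size_t n,
                                  uint64_t iova_base, uint64_t max_len,
                                  uint64_t boundary_mask,
                                  struct demo_seg *out)
{
    size_t count = 0;
    uint64_t addr = iova_base;

    for (size_t i = 0; i < n; i++) {
        int merged = 0;

        if (count > 0) {
            struct demo_seg *cur = &out[count - 1];
            uint64_t new_len = cur->len + lens[i];
            uint64_t last = cur->dma_addr + new_len - 1;

            /* Same window for first and last byte, and within max_len. */
            if (new_len <= max_len &&
                (cur->dma_addr & ~boundary_mask) == (last & ~boundary_mask)) {
                cur->len = new_len;
                merged = 1;
            }
        }
        if (!merged) {
            out[count].dma_addr = addr;
            out[count].len = lens[i];
            count++;
        }
        addr += lens[i];
    }
    return count;
}

/* Three 4 KiB segments collapse into one 12 KiB segment. */
static void demo_merge_example(void)
{
    const uint64_t lens[] = { 0x1000, 0x1000, 0x1000 };
    struct demo_seg out[3];

    assert(demo_merge_segments(lens, 3, 0x10000, 0x10000, 0xffff, out) == 1);
    assert(out[0].dma_addr == 0x10000 && out[0].len == 0x3000);
}
```

With a tighter `max_len` or a base address just below a window boundary, the same input yields more output segments, which is the "real limits" behaviour the message contrasts with worst-case over-alignment.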

04 May, 2016 9 commits

iommu/arm-smmu: Clear cache lock bit of ACR · 3ca3712a

Authored by Peng Fan on May 03, 2016

According to the MMU-500r2 TRM, section 3.7.1 Auxiliary Control registers,
ACTLR can be modified only when the ACR.CACHE_LOCK bit is 0.

So before clearing ARM_MMU500_ACTLR_CPRE of each context bank, the
CACHE_LOCK bit of the ACR register needs to be cleared first.

Since the CACHE_LOCK bit is only present in MMU-500r2 onwards, the
major number of IDR7 needs to be checked.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

3ca3712a

iommu/arm-smmu: Support SMMUv1 64KB supplement · b7862e35

Authored by Robin Murphy on Apr 13, 2016

The 64KB Translation Granule Supplement to the SMMUv1 architecture
allows an SMMUv1 implementation to support 64KB pages for stage 2
translations, using a constrained VMSAv8 descriptor format limited
to 40-bit addresses. Now that we can freely mix and match context
formats, we can actually handle having 4KB pages via an AArch32
context but 64KB pages via an AArch64 context, so plumb it in.

It is assumed that any implementations will have hardware capabilities
matching the format constraints, thus obviating the need for excessive
sanity-checking; this is the case for MMU-401, the only ARM Ltd.
implementation.

CC: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

b7862e35

iommu/arm-smmu: Decouple context format from kernel config · 7602b871

Authored by Robin Murphy on Apr 28, 2016

The way the driver currently forces an AArch32 or AArch64 context format
based on the kernel config and SMMU architecture version is suboptimal,
in that it makes it very hard to support oddball mix-and-match cases
like the SMMUv1 64KB supplement, or situations where the reduced table
depth of an AArch32 short descriptor context may be desirable under an
AArch64 kernel. It also only happens to work on current implementations
which do support all the relevant formats.

Introduce an explicit notion of context format, so we can manage that
independently and get rid of the inflexible #ifdeffery.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

7602b871

iommu/arm-smmu: Tidy up 64-bit/atomic I/O accesses · f9a05f05

Authored by Robin Murphy on Apr 13, 2016

With {read,write}q_relaxed now able to fall back to the common
nonatomic-hi-lo helper, make use of that so that we don't have to
open-code our own. In the process, also convert the other remaining
split accesses, and repurpose the custom accessor to smooth out the
couple of troublesome instances where we really want to avoid
nonatomic writes (and a 64-bit access is unnecessary in the 32-bit
context formats we would use on a 32-bit CPU).

This paves the way for getting rid of some of the assumptions currently
baked into the driver which make it really awkward to use 32-bit context
formats with SMMUv2 under a 64-bit kernel.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

f9a05f05

iommu/arm-smmu: Work around MMU-500 prefetch errata · f0cfffc4

Authored by Robin Murphy on Apr 13, 2016

MMU-500 erratum #841119 is tickled by a particular set of circumstances
interacting with the next-page prefetcher. Since said prefetcher is
quite dumb and actually detrimental to performance in some cases (by
causing unwanted TLB evictions for non-sequential access patterns), we
lose very little by turning it off, and what we gain is a guarantee that
the erratum is never hit.

As a bonus, the same workaround will also prevent erratum #826419 once
v7 short descriptor support is implemented.

CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

f0cfffc4

iommu/arm-smmu: Convert ThunderX workaround to new method · e086d912

Authored by Robin Murphy on Apr 13, 2016

With a framework for implementation-specific functionality in place, the
currently-FDT-dependent ThunderX workaround gets to be the first user.
Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

e086d912

iommu/arm-smmu: Differentiate specific implementations · 67b65a3f

Authored by Robin Murphy on Apr 13, 2016

As the inevitable reality of implementation-specific errata workarounds
begins to accrue alongside our integration quirk handling, it's about
time the driver had a decent way of keeping track. Extend the per-SMMU
data so we can identify specific implementations in an efficient and
firmware-agnostic manner.
Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

67b65a3f

iommu/arm-smmu: Workaround for ThunderX erratum #27704 · 1bd37a68

Authored by Tirumalesh Chalamarla on Mar 04, 2016

Due to erratum #27704, the CN88xx SMMUv2 implementation supports only
shared ASID and VMID numberspaces.

This patch ensures that ASID and VMIDs are unique across all SMMU
instances on affected Cavium systems.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Akula Geethasowjanya <Geethasowjanya.Akula@caviumnetworks.com>
[will: commit message, comments and formatting]
Signed-off-by: Will Deacon <will.deacon@arm.com>

1bd37a68

iommu/arm-smmu: Add support for 16 bit VMID · 4e3e9b69

Authored by Tirumalesh Chalamarla on Feb 23, 2016

This patch adds support for 16-bit VMIDs on implementations of SMMUv2
that support it.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
[will: commit message and comments]
Signed-off-by: Will Deacon <will.deacon@arm.com>

4e3e9b69

22 Apr, 2016 2 commits

iommu/amd: Move get_device_id() and friends to beginning of file · fd6c50ee

Authored by Joerg Roedel on Apr 21, 2016

They will be needed there later.
Signed-off-by: Joerg Roedel <jroedel@suse.de>

fd6c50ee

iommu/amd: Don't use IS_ERR_VALUE to check integer values · 9ee35e4c

Authored by Joerg Roedel on Apr 21, 2016

Use the better 'var < 0' check.

Fixes: 7aba6cb9 ('iommu/amd: Make call-sites of get_device_id aware of its return value')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

9ee35e4c

21 Apr, 2016 10 commits

iommu/arm-smmu: Don't allocate resources for bypass domains · 9800699c

Authored by Robin Murphy on Apr 20, 2016

Until we get fully plumbed into of_iommu_configure, our default
IOMMU_DOMAIN_DMA domains just bypass translation. Since we achieve that
by leaving the stream table entries set to bypass instead of pointing at
a translation context, the context bank we allocate for the domain is
completely wasted. Context banks are typically a rather limited
resource, so don't hog ones we don't need.
Reported-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

9800699c

iommu/arm-smmu: Fix stream-match conflict with IOMMU_DOMAIN_DMA · 5f634956

Authored by Will Deacon on Apr 20, 2016

Commit cbf8277e ("iommu/arm-smmu: Treat IOMMU_DOMAIN_DMA as bypass
for now") ignores requests to attach a device to the default domain
since, without IOMMU-backed DMA ops available everywhere, the default
domain will just lead to unexpected transaction faults being reported.

Unfortunately, the way this was implemented on SMMUv2 causes a
regression with VFIO PCI device passthrough under KVM on AMD Seattle.
On this system, the host controller device is associated with both a
pci_dev *and* a platform_device, and can therefore end up with duplicate
SMR entries, resulting in a stream-match conflict at runtime.

This patch amends the original fix so that attaching to IOMMU_DOMAIN_DMA
is rejected even before configuring the SMRs. This restores the old
behaviour for now, but we'll need to look at handling host controllers
specially when we come to supporting the default domain fully.
Reported-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

5f634956

iommu/vt-d: Use per-cpu IOVA caching · 22e2f9fa

Authored by Omer Peleg on Apr 20, 2016

Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation')
introduced per-CPU IOVA caches to massively improve scalability. Use them.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

22e2f9fa

iommu/iova: introduce per-cpu caching to iova allocation · 9257b4a2

Authored by Omer Peleg on Apr 20, 2016

IOVA allocation has two problems that impede high-throughput I/O.
First, it can do a linear search over the allocated IOVA ranges.
Second, the rbtree spinlock that serializes IOVA allocations becomes
contended.

Address these problems by creating an API for caching allocated IOVA
ranges, so that the IOVA allocator isn't accessed frequently.  This
patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
without taking the rbtree spinlock.  The per-CPU caches are backed by
a global cache, to avoid invoking the (linear-time) IOVA allocator
without needing to make the per-CPU cache size excessive.  This design
is based on magazines, as described in "Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
available at https://www.usenix.org/legacy/event/usenix01/bonwick.html)

Adding caching on top of the existing rbtree allocator maintains the
property that IOVAs are densely packed in the IO virtual address space,
which is important for keeping IOMMU page table usage low.

To keep the cache size reasonable, we bound the IOVA space a CPU can
cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
ranges of size <= 128 KiB).  The shared global cache is bounded at
4 MiB of IOVA space.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

9257b4a2
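
The magazine scheme described above can be modelled with a short single-threaded sketch. It is not the kernel's implementation: locking and real per-CPU data are left out, the sizes are illustrative, and every `demo_*` / `DEMO_*` name is hypothetical. It only shows how a per-CPU magazine backed by a shared depot keeps most requests away from the slow allocator.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAG_SIZE  4   /* entries per magazine (illustrative value) */
#define DEMO_NR_CPUS   2
#define DEMO_DEPOT_MAX 2   /* full magazines the shared depot can hold */

struct demo_magazine {
    size_t count;
    unsigned long pfns[DEMO_MAG_SIZE];
};

static struct demo_magazine demo_cpu_mag[DEMO_NR_CPUS]; /* one per CPU */
static struct demo_magazine demo_depot[DEMO_DEPOT_MAX]; /* shared, full */
static size_t demo_depot_count;
static unsigned long demo_slow_calls;   /* times the slow path was taken */
static unsigned long demo_next_pfn = 0x1000;

/* Stand-in for the lock-protected rbtree allocator. */
static unsigned long demo_slow_alloc(void)
{
    demo_slow_calls++;
    return demo_next_pfn++;
}

/* Allocate from the CPU's magazine, refilling it from the depot if empty,
 * and fall back to the slow allocator only when both are empty. */
static unsigned long demo_cached_alloc(int cpu)
{
    struct demo_magazine *mag = &demo_cpu_mag[cpu];

    if (mag->count == 0 && demo_depot_count > 0)
        *mag = demo_depot[--demo_depot_count];
    if (mag->count > 0)
        return mag->pfns[--mag->count];
    return demo_slow_alloc();
}

/* Free into the CPU's magazine; a full magazine is handed to the depot. */
static void demo_cached_free(int cpu, unsigned long pfn)
{
    struct demo_magazine *mag = &demo_cpu_mag[cpu];

    if (mag->count == DEMO_MAG_SIZE) {
        if (demo_depot_count == DEMO_DEPOT_MAX)
            return; /* caches full: a real allocator would take it back */
        demo_depot[demo_depot_count++] = *mag;
        mag->count = 0;
    }
    mag->pfns[mag->count++] = pfn;
}

/* A free followed by an alloc on the same CPU never hits the slow path. */
static void demo_magazine_example(void)
{
    unsigned long a = demo_cached_alloc(0);

    assert(demo_slow_calls == 1);
    demo_cached_free(0, a);
    assert(demo_cached_alloc(0) == a);
    assert(demo_slow_calls == 1);
}
```

Bounding both the magazine size and the depot, as the message does with its MiB limits, keeps the amount of IOVA space parked in caches small while still absorbing most alloc/free traffic.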

iommu/vt-d: change intel-iommu to use IOVA frame numbers · 2aac6304

Authored by Omer Peleg on Apr 20, 2016

Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of
pointers to "struct iova". This avoids using the iova struct from the IOVA
red-black tree and the resulting explicit find_iova() on unmap.

This patch will allow us to cache IOVAs in the next patch, in order to
avoid rbtree operations for the majority of map/unmap operations.

Note: In eliminating the find_iova() operation, we have also eliminated
the sanity check previously done in the unmap flow. Arguably, this was
overhead that is better avoided in production code, but it could be
brought back as a debug option for driver development.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded
 the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

2aac6304

iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs · 0824c592

Authored by Omer Peleg on Apr 20, 2016

This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb()
for domains with no dev iotlb devices.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[gvdl@google.com: fixed locking issues]
Signed-off-by: Godfrey van der Linden <gvdl@google.com>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

0824c592

iommu/vt-d: only unmap mapped entries · 769530e4

Authored by Omer Peleg on Apr 20, 2016

Current unmap implementation unmaps the entire area covered by the IOVA
range, which is a power-of-2 aligned region. The corresponding map,
however, only maps those pages originally mapped by the user. This
discrepancy can lead to unmapping of already unmapped entries, which is
unneeded work.

With this patch, only mapped pages are unmapped. This is also a baseline
for a map/unmap implementation based on IOVAs and not iova structures,
which will allow caching.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

769530e4
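
A small arithmetic sketch of the discrepancy described above, under the assumption that an IOVA range covers the mapped page count rounded up to the next power of two. The `demo_*` helpers are hypothetical and only count pages; they do not touch page tables.

```c
#include <assert.h>

/* Round n up to the next power of two. Assumes 1 <= n and that the result
 * fits in an unsigned long. */
static unsigned long demo_roundup_pow_of_two(unsigned long n)
{
    unsigned long p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* Pages walked by an unmap that covers the whole power-of-2 IOVA range. */
static unsigned long demo_unmap_whole_range(unsigned long mapped_pages)
{
    return demo_roundup_pow_of_two(mapped_pages);
}

/* Pages walked when only the pages that were mapped are unmapped. */
static unsigned long demo_unmap_mapped_only(unsigned long mapped_pages)
{
    return mapped_pages;
}

/* Mapping 5 pages reserves 8; unmapping all 8 touches 3 unmapped entries. */
static void demo_unmap_example(void)
{
    assert(demo_unmap_whole_range(5) == 8);
    assert(demo_unmap_mapped_only(5) == 5);
    assert(demo_unmap_whole_range(5) - demo_unmap_mapped_only(5) == 3);
}
```

The difference disappears when the mapped page count is already a power of two, so the saving depends on the sizes drivers actually map.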

iommu/vt-d: correct flush_unmaps pfn usage · f5c0c08b

Authored by Omer Peleg on Apr 20, 2016

Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi()
dma addresses.  (x86_64 mm and dma have the same size for pages
at the moment, but this usage improves consistency.)
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

f5c0c08b

iommu/vt-d: per-cpu deferred invalidation queues · aa473240

Authored by Omer Peleg on Apr 20, 2016

The IOMMU's IOTLB invalidation is a costly process.  When iommu mode
is not set to "strict", it is done asynchronously. Current code
amortizes the cost of invalidating IOTLB entries by batching all the
invalidations in the system and performing a single global invalidation
instead. The code queues pending invalidations in a global queue that
is accessed under the global "async_umap_flush_lock" spinlock, which
can result in significant spinlock contention.

This patch splits this deferred queue into multiple per-cpu deferred
queues, and thus gets rid of the "async_umap_flush_lock" and its
contention.  To keep existing deferred invalidation behavior, it still
invalidates the pending invalidations of all CPUs whenever a CPU
reaches its watermark or a timeout occurs.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

aa473240

iommu/vt-d: refactoring of deferred flush entries · 314f1dc1

Authored by Omer Peleg on Apr 20, 2016

Currently, deferred flushes' info is striped between several lists in
the flush tables. Instead, move all information about a specific flush
to a single entry in this table.

This patch does not introduce any functional change.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

314f1dc1

20 Apr, 2016 1 commit

iommu/arm-smmu: Make use of phandle iterators in device-tree parsing · cb6c27bb

Authored by Joerg Roedel on Apr 04, 2016

Remove the usage of of_parse_phandle_with_args() and replace
it by the phandle-iterator implementation so that we can
parse out all of the potentially present 128 stream-ids.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>

cb6c27bb

15 Apr, 2016 1 commit

iommu/amd: Signedness bug in acpihid_device_group() · 2d8e1f03

Authored by Dan Carpenter on Apr 11, 2016

"devid" needs to be signed for the error handling to work.

Fixes: b097d11a ('iommu/amd: Manage iommu_group for ACPI HID devices')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

2d8e1f03
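
Why the variable has to be signed can be shown with a minimal sketch. It assumes, for illustration only, that the device ID was held in a 16-bit unsigned variable; `demo_get_device_id()` and the other `demo_*` helpers are hypothetical stand-ins and not the driver's functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lookup: returns a device ID, or a negative errno. */
static int demo_get_device_id(bool present)
{
    return present ? 0x10 : -ENODEV;
}

/* Storing the result in an unsigned 16-bit variable truncates a negative
 * errno into a large positive number. */
static int demo_store_unsigned(bool present)
{
    uint16_t devid = (uint16_t)demo_get_device_id(present);

    return devid;
}

/* Storing it in a signed int preserves the error code. */
static int demo_store_signed(bool present)
{
    int devid = demo_get_device_id(present);

    return devid;
}

/* The error check used by the caller. */
static bool demo_failed(int devid)
{
    return devid < 0;
}

/* The unsigned variant silently loses the failure. */
static void demo_signedness_example(void)
{
    assert(demo_failed(demo_store_signed(false)));
    assert(!demo_failed(demo_store_unsigned(false)));
    assert(demo_store_unsigned(true) == 0x10);
}
```

In the unsigned variant the failing lookup is indistinguishable from a valid (if odd) device ID, so the caller carries on with a bogus value instead of returning the error.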