- 07 Jul, 2016 1 commit
-
-
By Joerg Roedel
There is a race condition in the AMD IOMMU init code that causes requested unity mappings to be blocked by the IOMMU for a short period of time. This results in boot failures and IO_PAGE_FAULTs on some machines. Fix this by making sure the unity mappings are installed before all other DMA is blocked.
Fixes: aafd8ba0 ('iommu/amd: Implement add_device and remove_device')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 04 Jul, 2016 1 commit
-
-
By Aaron Campbell
Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum number of possible domains is 64K; indeed this is the maximum value that the cap_ndoms() macro will expand to. Since the value 65536 will not fit in a u16, the 'did' variable must be promoted to an int, otherwise the test for < 65536 will always be true and the loop will never end. The symptom, in my case, was a hung machine during suspend.
Fixes: 3bd4f911 ("iommu/vt-d: Fix overflow of iommu->domains array")
Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
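The type-promotion bug is easy to see in isolation. Below is a minimal, standalone C illustration (not the VT-d code itself; the guard counter exists only so the buggy loop terminates in the demo):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Buggy pattern: a 16-bit counter tops out at 65535, so the condition
	 * "did16 < 65536" is always true (compilers even warn about it) and
	 * did16 wraps back to 0 instead of letting the loop terminate. */
	uint16_t did16 = 0;
	long guard = 0;

	while (did16 < 65536) {
		did16++;
		if (++guard > 200000)	/* safety net for this demo only */
			break;
	}
	printf("u16 loop: still running after %ld iterations\n", guard);

	/* Fixed pattern: promote the counter to int so the upper bound is
	 * actually reachable and the loop ends. */
	int did;
	for (did = 0; did < 65536; did++)
		;
	printf("int loop: terminated cleanly at did=%d\n", did);

	return 0;
}
```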
-
- 27 Jun, 2016 3 commits
-
-
By Nicolas Iooss
Commit 2a0cb4e2 ("iommu/amd: Add new map for storing IVHD dev entry type HID") added a call to DUMP_printk in init_iommu_from_acpi() which used the value of devid before this variable was initialized.
Fixes: 2a0cb4e2 ('iommu/amd: Add new map for storing IVHD dev entry type HID')
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Jan Niehusmann
The valid range of 'did' in get_iommu_domain(*iommu, did) is 0..cap_ndoms(iommu->cap), so don't exceed that range in free_all_cpu_cached_iovas(). The user-visible impact of the out-of-bounds access is the machine hanging on suspend-to-ram. It is, in fact, a kernel panic, but due to already suspended devices, that's often not visible to the user.
Fixes: 22e2f9fa ("iommu/vt-d: Use per-cpu IOVA caching")
Signed-off-by: Jan Niehusmann <jan@gondor.com>
Tested-by: Marius Vlad <marius.c.vlad@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Chris Wilson
Between acquiring the this_cpu_ptr() and using it, ideally we don't want to be preempted and work on another CPU's private data. this_cpu_ptr() checks whether or not preemption is disabled, and get_cpu_ptr() provides a convenient wrapper for operating on the cpu ptr inside a preemption disabled critical section (which currently is provided by the spinlock).

[ 167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[ 167.997940] caller is debug_smp_processor_id+0x17/0x20
[ 167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G U 4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[ 167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[ 167.997951] 0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[ 167.997958] ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[ 167.997965] ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[ 167.997971] Call Trace:
[ 167.997977] [<ffffffff8140dca5>] dump_stack+0x67/0x92
[ 167.997981] [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[ 167.997985] [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[ 167.997990] [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[ 167.997994] [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[ 167.997998] [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[ 167.998002] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998009] [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[ 167.998013] [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[ 167.998017] [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[ 167.998022] [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[ 167.998025] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998028] [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[ 167.998032] [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 167.998035] [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[ 167.998039] [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[ 167.998042] [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[ 167.998045] [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[ 167.998049] [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[ 167.998052] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998058] [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[ 167.998061] [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[ 167.998064] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998067] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998071] [<ffffffff8109ddfa>] kthread+0xea/0x100
[ 167.998078] [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[ 167.998081] [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Fixes: 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
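The corrected access pattern looks roughly like the kernel-style fragment below. This is a sketch rather than the literal patch: the function name cache_insert() is made up, and the iova_rcache/iova_cpu_rcache structures are assumed to match the per-CPU cache added by the commit named in the Fixes tag.

```c
/* Sketch: touch this CPU's private cache without racing with migration.
 * get_cpu_ptr() disables preemption, so the per-CPU pointer stays valid
 * for the whole critical section; put_cpu_ptr() re-enables preemption. */
static void cache_insert(struct iova_rcache *rcache, unsigned long iova_pfn)
{
	struct iova_cpu_rcache *cpu_rcache;
	unsigned long flags;

	cpu_rcache = get_cpu_ptr(rcache->cpu_rcaches);	/* was this_cpu_ptr() */
	spin_lock_irqsave(&cpu_rcache->lock, flags);
	/* ... push iova_pfn into the CPU-local magazine ... */
	spin_unlock_irqrestore(&cpu_rcache->lock, flags);
	put_cpu_ptr(rcache->cpu_rcaches);
}
```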
-
- 17 Jun, 2016 1 commit
-
-
By Joerg Roedel
This seems to be required on some X58 chipsets on systems with more than one IOMMU. QI does not work until it is enabled on all IOMMUs in the system.
Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Fixes: 5f0a7f76 ('iommu/vt-d: Make root entry visible for hardware right after allocation')
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 15 Jun, 2016 1 commit
-
-
By John Keeping
rk_iommu_command() takes a struct rk_iommu and iterates over the slave MMUs, so this is doubly wrong in that we're passing in the wrong pointer and talking to MMUs that we shouldn't be.
Fixes: cd6438c5 ("iommu/rockchip: Reconstruct to support multi slaves")
Cc: stable@vger.kernel.org
Signed-off-by: John Keeping <john@metanate.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 13 Jun, 2016 1 commit
-
-
By Jean-Philippe Brucker
The map_sg callback is missing from arm_smmu_ops, but is required by iommu.h. Similarly to most other IOMMU drivers, connect it to default_iommu_map_sg.
Cc: <stable@vger.kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 28 May, 2016 1 commit
-
-
By Arnd Bergmann
Most users of IS_ERR_VALUE() in the kernel are wrong, as they pass an 'int' into a function that takes an 'unsigned long' argument. This happens to work because the type is sign-extended on 64-bit architectures before it gets converted into an unsigned type. However, anything that passes an 'unsigned short' or 'unsigned int' argument into IS_ERR_VALUE() is guaranteed to be broken, as are 8-bit integers and types that are wider than 'unsigned long'. Andrzej Hajda has already fixed a lot of the worst abusers that were causing actual bugs, but it would be nice to prevent any users that are not passing 'unsigned long' arguments. This patch changes all users of IS_ERR_VALUE() that I could find on 32-bit ARM randconfig builds and x86 allmodconfig. For the moment, this doesn't change the definition of IS_ERR_VALUE() because there are probably still architecture specific users elsewhere. Almost all the warnings I got are for files that are better off using 'if (err)' or 'if (err < 0)'. The only legitimate user I could find that we get a warning for is the (32-bit only) freescale fman driver, so I did not remove the IS_ERR_VALUE() there but changed the type to 'unsigned long'. For 9pfs, I just worked around one user whose calling conventions are so obscure that I did not dare change the behavior.

I was using this definition for testing:

 #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
        unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))

which ends up making all 16-bit or wider types work correctly with the most plausible interpretation of what IS_ERR_VALUE() was supposed to return according to its users, but also causes a compile-time warning for any users that do not pass an 'unsigned long' argument. I suggested this approach earlier this year, but back then we ended up deciding to just fix the users that are obviously broken. After the initial warning that caused me to get involved in the discussion (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus asked me to send the whole thing again.

[ Updated the 9p parts as per Al Viro - Linus ]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.org/lkml/2016/1/7/363
Link: https://lkml.org/lkml/2016/5/27/486
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
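A standalone illustration of the narrow-type failure mode follows. MAX_ERRNO and the simplified IS_ERR_VALUE() below are local stand-ins for the kernel definitions, reduced to the comparison that matters:

```c
#include <stdio.h>

/* Local stand-ins for the kernel definitions, for illustration only. */
#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)
#define ENOMEM_VAL	12	/* errno value used for the demo */

int main(void)
{
	int err_int = -ENOMEM_VAL;		/* sign-extends correctly */
	unsigned short err_u16 = -ENOMEM_VAL;	/* truncated to 65524, no sign extension */

	/* The 'int' case happens to work because of sign extension... */
	printf("int -ENOMEM:  IS_ERR_VALUE() = %d\n", IS_ERR_VALUE(err_int));

	/* ...but the narrow unsigned type silently loses the error. */
	printf("u16 -ENOMEM:  IS_ERR_VALUE() = %d\n", IS_ERR_VALUE(err_u16));

	/* The commit's recommendation: just test the signed value directly. */
	printf("plain 'err < 0' check:       = %d\n", err_int < 0);
	return 0;
}
```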
-
- 10 May, 2016 1 commit
-
-
By Robin Murphy
Now that we can accurately reflect the context format we choose for each domain, do that instead of imposing the global lowest-common-denominator restriction and potentially ending up with nothing. We currently have a strict 1:1 correspondence between domains and context banks, so we don't need to entertain the possibility of multiple formats _within_ a domain.
Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: split from original patch, added SMMUv3]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 09 May, 2016 5 commits
-
-
By Joerg Roedel
The statistics are not really used for anything and should be replaced by generic and per-device statistic counters. Remove the code for now.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Robin Murphy
Now that we know exactly which page sizes our caller wants to use in the given domain, we can restrict higher-order allocation attempts to just those sizes, if any, and avoid wasting any time or effort on other sizes which offer no benefit. In the same vein, this also lets us accommodate a minimum order greater than 0 for special cases.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Robin Murphy
Many IOMMUs support multiple page table formats, meaning that any given domain may only support a subset of the hardware page sizes presented in iommu_ops->pgsize_bitmap. There are also certain use-cases where the creator of a domain may want to control which page sizes are used, for example to force the use of hugepage mappings to reduce pagetable walk depth. To this end, add a per-domain pgsize_bitmap to represent the subset of page sizes actually in use, to make it possible for domains with different requirements to coexist.
Signed-off-by: Will Deacon <will.deacon@arm.com>
[rm: hijacked and rebased original patch with new commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
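To make the idea concrete, here is a small, self-contained sketch of how a mapping loop can consult such a per-domain bitmap to pick the largest usable page size at each step. The bitmap value and the helper name are illustrative, not taken from any particular driver:

```c
#include <stdio.h>
#include <stdint.h>

/* Pick the largest page size from a pgsize bitmap that is aligned to the
 * IOVA and no larger than the remaining region to map. */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova, uint64_t size)
{
	for (int bit = 63; bit >= 0; bit--) {
		uint64_t cand = 1ULL << bit;

		if (!(pgsize_bitmap & cand))
			continue;
		if (cand > size)		/* would map past the end */
			continue;
		if (iova & (cand - 1))		/* IOVA not aligned to this size */
			continue;
		return cand;
	}
	return 0;
}

int main(void)
{
	/* Hypothetical per-domain bitmap: 4K, 2M and 1G pages in use. */
	uint64_t domain_pgsize_bitmap = (1ULL << 12) | (1ULL << 21) | (1ULL << 30);

	printf("%#llx\n", (unsigned long long)
	       pick_pgsize(domain_pgsize_bitmap, 0x200000, 0x400000));	/* -> 2M */
	printf("%#llx\n", (unsigned long long)
	       pick_pgsize(domain_pgsize_bitmap, 0x201000, 0x400000));	/* -> 4K */
	return 0;
}
```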
-
By Robin Murphy
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout.
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Robin Murphy
Stop wasting IOVA space by over-aligning scatterlist segments for a theoretical worst-case segment boundary mask, and instead take the real limits into account to merge consecutive segments wherever appropriate, so our callers can benefit from getting back nicely simplified lists. This also represents the last piece of functionality wanted by users of the current arch/arm implementation, thus brings us a small step closer to converting that over to the common code.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 04 May, 2016 9 commits
-
-
By Peng Fan
According to the MMU-500r2 TRM, section 3.7.1 Auxiliary Control registers, you can modify ACTLR only when the ACR.CACHE_LOCK bit is 0. So before clearing ARM_MMU500_ACTLR_CPRE in each context bank, we need to clear the CACHE_LOCK bit of the ACR register first. Since the CACHE_LOCK bit is only present in MMU-500r2 onwards, we also need to check the major number in IDR7.
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Peng Fan <van.freenix@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Robin Murphy
The 64KB Translation Granule Supplement to the SMMUv1 architecture allows an SMMUv1 implementation to support 64KB pages for stage 2 translations, using a constrained VMSAv8 descriptor format limited to 40-bit addresses. Now that we can freely mix and match context formats, we can actually handle having 4KB pages via an AArch32 context but 64KB pages via an AArch64 context, so plumb it in. It is assumed that any implementations will have hardware capabilities matching the format constraints, thus obviating the need for excessive sanity-checking; this is the case for MMU-401, the only ARM Ltd. implementation.
CC: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Robin Murphy
The way the driver currently forces an AArch32 or AArch64 context format based on the kernel config and SMMU architecture version is suboptimal, in that it makes it very hard to support oddball mix-and-match cases like the SMMUv1 64KB supplement, or situations where the reduced table depth of an AArch32 short descriptor context may be desirable under an AArch64 kernel. It also only happens to work on current implementations which do support all the relevant formats. Introduce an explicit notion of context format, so we can manage that independently and get rid of the inflexible #ifdeffery.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Robin Murphy
With {read,write}q_relaxed now able to fall back to the common nonatomic-hi-lo helper, make use of that so that we don't have to open-code our own. In the process, also convert the other remaining split accesses, and repurpose the custom accessor to smooth out the couple of troublesome instances where we really want to avoid nonatomic writes (and a 64-bit access is unnecessary in the 32-bit context formats we would use on a 32-bit CPU). This paves the way for getting rid of some of the assumptions currently baked into the driver which make it really awkward to use 32-bit context formats with SMMUv2 under a 64-bit kernel.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
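The hi-lo fallback mentioned above amounts to composing one 64-bit value from two 32-bit reads. Below is a standalone sketch of that composition, using plain memory instead of MMIO, a read32() helper standing in for readl_relaxed(), and a little-endian host assumed for the demo:

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for readl_relaxed(); a real driver reads MMIO, this demo
 * just reads from ordinary memory. */
static uint32_t read32(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Nonatomic hi-lo composition of a 64-bit value from two 32-bit accesses.
 * The two halves are read at different times, which is why the driver keeps
 * a custom accessor for the few registers where tearing would be harmful. */
static uint64_t hi_lo_read64(const volatile void *addr)
{
	uint64_t high = read32((const volatile uint8_t *)addr + 4);
	uint64_t low  = read32(addr);

	return (high << 32) | low;
}

int main(void)
{
	uint64_t reg = 0x1122334455667788ULL;	/* pretend 64-bit register */

	printf("read back: %#llx\n", (unsigned long long)hi_lo_read64(&reg));
	return 0;
}
```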
-
By Robin Murphy
MMU-500 erratum #841119 is tickled by a particular set of circumstances interacting with the next-page prefetcher. Since said prefetcher is quite dumb and actually detrimental to performance in some cases (by causing unwanted TLB evictions for non-sequential access patterns), we lose very little by turning it off, and what we gain is a guarantee that the erratum is never hit. As a bonus, the same workaround will also prevent erratum #826419 once v7 short descriptor support is implemented.
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Robin Murphy
With a framework for implementation-specific functionality in place, the currently-FDT-dependent ThunderX workaround gets to be the first user.
Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Robin Murphy
As the inevitable reality of implementation-specific errata workarounds begins to accrue alongside our integration quirk handling, it's about time the driver had a decent way of keeping track. Extend the per-SMMU data so we can identify specific implementations in an efficient and firmware-agnostic manner.
Acked-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
By Tirumalesh Chalamarla
Due to erratum #27704, the CN88xx SMMUv2 implementation supports only shared ASID and VMID numberspaces. This patch ensures that ASIDs and VMIDs are unique across all SMMU instances on affected Cavium systems.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Akula Geethasowjanya <Geethasowjanya.Akula@caviumnetworks.com>
[will: commit message, comments and formatting]
Signed-off-by: Will Deacon <will.deacon@arm.com>
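One simple way to carve a shared numberspace into non-overlapping per-instance blocks is a global counter that each instance advances at probe time. The standalone sketch below shows the idea; the structure, function names and use of C11 atomics are illustrative, not the driver's code:

```c
#include <stdatomic.h>
#include <stdio.h>

/* Global counter shared by every SMMU instance; each instance reserves a
 * contiguous block of IDs at probe time so ASIDs/VMIDs never collide
 * across instances. */
static atomic_int global_context_count;

struct smmu_instance {
	int num_context_banks;
	int id_base;		/* first ASID/VMID this instance may use */
};

static void reserve_id_space(struct smmu_instance *smmu)
{
	smmu->id_base = atomic_fetch_add(&global_context_count,
					 smmu->num_context_banks);
}

int main(void)
{
	struct smmu_instance a = { .num_context_banks = 8 };
	struct smmu_instance b = { .num_context_banks = 8 };

	reserve_id_space(&a);
	reserve_id_space(&b);

	/* A context bank's ASID/VMID is its index offset by the base. */
	printf("SMMU A uses IDs %d..%d\n", a.id_base, a.id_base + a.num_context_banks - 1);
	printf("SMMU B uses IDs %d..%d\n", b.id_base, b.id_base + b.num_context_banks - 1);
	return 0;
}
```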
-
By Tirumalesh Chalamarla
This patch adds support for 16-bit VMIDs on implementations of SMMUv2 that support it.
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
[will: commit message and comments]
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 22 Apr, 2016 2 commits
-
-
By Joerg Roedel
They will be needed there later.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Joerg Roedel
Use the better 'var < 0' check.
Fixes: 7aba6cb9 ('iommu/amd: Make call-sites of get_device_id aware of its return value')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 21 Apr, 2016 10 commits
-
-
By Robin Murphy
Until we get fully plumbed into of_iommu_configure, our default IOMMU_DOMAIN_DMA domains just bypass translation. Since we achieve that by leaving the stream table entries set to bypass instead of pointing at a translation context, the context bank we allocate for the domain is completely wasted. Context banks are typically a rather limited resource, so don't hog ones we don't need.
Reported-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Will Deacon
Commit cbf8277e ("iommu/arm-smmu: Treat IOMMU_DOMAIN_DMA as bypass for now") ignores requests to attach a device to the default domain since, without IOMMU-backed DMA ops available everywhere, the default domain will just lead to unexpected transaction faults being reported. Unfortunately, the way this was implemented on SMMUv2 causes a regression with VFIO PCI device passthrough under KVM on AMD Seattle. On this system, the host controller device is associated with both a pci_dev *and* a platform_device, and can therefore end up with duplicate SMR entries, resulting in a stream-match conflict at runtime. This patch amends the original fix so that attaching to IOMMU_DOMAIN_DMA is rejected even before configuring the SMRs. This restores the old behaviour for now, but we'll need to look at handling host controllers specially when we come to supporting the default domain fully.
Reported-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Eric Auger <eric.auger@linaro.org>
Tested-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Omer Peleg
Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation') introduced per-CPU IOVA caches to massively improve scalability. Use them.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
IOVA allocation has two problems that impede high-throughput I/O. First, it can do a linear search over the allocated IOVA ranges. Second, the rbtree spinlock that serializes IOVA allocations becomes contended. Address these problems by creating an API for caching allocated IOVA ranges, so that the IOVA allocator isn't accessed frequently. This patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs without taking the rbtree spinlock. The per-CPU caches are backed by a global cache, to avoid invoking the (linear-time) IOVA allocator without needing to make the per-CPU cache size excessive. This design is based on magazines, as described in "Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources" (currently available at https://www.usenix.org/legacy/event/usenix01/bonwick.html). Adding caching on top of the existing rbtree allocator maintains the property that IOVAs are densely packed in the IO virtual address space, which is important for keeping IOMMU page table usage low. To keep the cache size reasonable, we bound the IOVA space a CPU can cache by 32 MiB (we cache a bounded number of IOVA ranges, and only ranges of size <= 128 KiB). The shared global cache is bounded at 4 MiB of IOVA space.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
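A greatly simplified, single-threaded sketch of the magazine idea follows. The real implementation is per-CPU, lock-protected, bounded as described above, and backed by a depot of full magazines; here a plain fixed-size stack and a stand-in slow path are enough to show the fast/slow split:

```c
#include <stdbool.h>
#include <stdio.h>

#define MAG_SIZE 8	/* entries per magazine; the real bound is larger */

/* One "magazine": a small fixed-size stack of cached values (fake IOVA
 * pfns here).  When it is empty or full, the owner exchanges a whole
 * magazine with a shared depot instead of hitting the slow allocator. */
struct magazine {
	unsigned long pfns[MAG_SIZE];
	int count;
};

static bool mag_put(struct magazine *mag, unsigned long pfn)
{
	if (mag->count == MAG_SIZE)
		return false;		/* full: caller swaps with the depot */
	mag->pfns[mag->count++] = pfn;
	return true;
}

static bool mag_get(struct magazine *mag, unsigned long *pfn)
{
	if (mag->count == 0)
		return false;		/* empty: caller refills from the depot */
	*pfn = mag->pfns[--mag->count];
	return true;
}

/* Stand-in for the slow path (the rbtree allocator in the real code). */
static unsigned long next_pfn = 0x1000;
static unsigned long slow_alloc(void) { return next_pfn++; }

int main(void)
{
	struct magazine cpu_cache = { .count = 0 };
	unsigned long pfn;

	/* Fast path misses while the cache is cold... */
	if (!mag_get(&cpu_cache, &pfn))
		pfn = slow_alloc();
	printf("allocated pfn %#lx (slow path)\n", pfn);

	/* ...the free goes into the local magazine... */
	mag_put(&cpu_cache, pfn);

	/* ...and the next allocation is satisfied without the allocator lock. */
	if (mag_get(&cpu_cache, &pfn))
		printf("allocated pfn %#lx (cached)\n", pfn);
	return 0;
}
```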
-
By Omer Peleg
Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of pointers to "struct iova". This avoids using the iova struct from the IOVA red-black tree and the resulting explicit find_iova() on unmap. This patch will allow us to cache IOVAs in the next patch, in order to avoid rbtree operations for the majority of map/unmap operations. Note: In eliminating the find_iova() operation, we have also eliminated the sanity check previously done in the unmap flow. Arguably, this was overhead that is better avoided in production code, but it could be brought back as a debug option for driver development.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb() for domains with no dev iotlb devices.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[gvdl@google.com: fixed locking issues]
Signed-off-by: Godfrey van der Linden <gvdl@google.com>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
Current unmap implementation unmaps the entire area covered by the IOVA range, which is a power-of-2 aligned region. The corresponding map, however, only maps those pages originally mapped by the user. This discrepancy can lead to unmapping of already unmapped entries, which is unneeded work. With this patch, only mapped pages are unmapped. This is also a baseline for a map/unmap implementation based on IOVAs and not iova structures, which will allow caching.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi() dma addresses. (x86_64 mm and dma have the same size for pages at the moment, but this usage improves consistency.)
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
The IOMMU's IOTLB invalidation is a costly process. When iommu mode is not set to "strict", it is done asynchronously. Current code amortizes the cost of invalidating IOTLB entries by batching all the invalidations in the system and performing a single global invalidation instead. The code queues pending invalidations in a global queue that is accessed under the global "async_umap_flush_lock" spinlock, which can result in significant spinlock contention. This patch splits this deferred queue into multiple per-cpu deferred queues, and thus gets rid of the "async_umap_flush_lock" and its contention. To keep existing deferred invalidation behavior, it still invalidates the pending invalidations of all CPUs whenever a CPU reaches its watermark or a timeout occurs.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
By Omer Peleg
Currently, deferred flushes' info is striped between several lists in the flush tables. Instead, move all information about a specific flush to a single entry in this table. This patch does not introduce any functional change.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
-
- 20 Apr, 2016 1 commit
-
-
By Joerg Roedel
Remove the usage of of_parse_phandle_with_args() and replace it by the phandle-iterator implementation so that we can parse out all of the potentially present 128 stream-ids.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
-
- 15 Apr, 2016 1 commit
-
-
By Dan Carpenter
"devid" needs to be signed for the error handling to work.
Fixes: b097d11a ('iommu/amd: Manage iommu_group for ACPI HID devices')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 13 Apr, 2016 1 commit
-
-
By Borislav Petkov
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: iommu@lists.linux-foundation.org
Cc: linux-pm@vger.kernel.org
Cc: oprofile-list@lists.sf.net
Link: http://lkml.kernel.org/r/1459801503-15600-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 12 Apr, 2016 1 commit
-
-
By Jacek Lawrynowicz
Solve IOMMU support issues with PCIe non-transparent bridges that use Requester ID look-up tables (RID-LUT), e.g., the PEX8733. The NTB connects devices in two independent PCI domains. Devices separated by the NTB are not able to discover each other. A PCI packet being forwarded from one domain to another has to have its RID modified so it appears on the correct bus and completions are forwarded back to the original domain through the NTB. The RID is translated using a preprogrammed table (LUT) and the PCI packet propagates upstream away from the NTB. If the destination system has IOMMU enabled, the packet will be discarded because the new RID is unknown to the IOMMU. Adding a DMA alias for the new RID allows the IOMMU to properly recognize the packet. Each device behind the NTB has a unique RID assigned in the RID-LUT. The current DMA alias implementation supports only a single alias, so it's not possible to support multiple devices behind the NTB when IOMMU is enabled. Enable all possible aliases on a given bus (256) that are stored in a bitset. Alias devfn is directly translated to a bit number. The bitset is not allocated for devices that have no need for DMA aliases. More details can be found in the following article:
http://www.plxtech.com/files/pdf/technical/expresslane/RTC_Enabling%20MulitHostSystemDesigns.pdf
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
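Conceptually, the per-bus alias set is just a 256-bit map indexed by devfn. The standalone sketch below uses illustrative names; the kernel uses its own bitmap helpers and, as noted above, allocates the mask lazily only for devices that need it:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEVFN_COUNT 256	/* device 0-31, function 0-7 on the alias bus */

/* One bit per possible devfn on the bus. */
struct dma_alias_mask {
	uint8_t bits[DEVFN_COUNT / 8];
};

/* devfn encoding as used by PCI: (device << 3) | function. */
static unsigned int make_devfn(unsigned int dev, unsigned int fn)
{
	return (dev << 3) | fn;
}

static void add_alias(struct dma_alias_mask *mask, unsigned int devfn)
{
	mask->bits[devfn / 8] |= 1u << (devfn % 8);
}

static bool has_alias(const struct dma_alias_mask *mask, unsigned int devfn)
{
	return mask->bits[devfn / 8] & (1u << (devfn % 8));
}

int main(void)
{
	struct dma_alias_mask *mask = calloc(1, sizeof(*mask));

	if (!mask)
		return 1;

	/* Two devices behind the NTB get RIDs translated to these devfns. */
	add_alias(mask, make_devfn(0x10, 0));
	add_alias(mask, make_devfn(0x11, 2));

	printf("devfn 10.0 aliased: %d\n", has_alias(mask, make_devfn(0x10, 0)));
	printf("devfn 12.0 aliased: %d\n", has_alias(mask, make_devfn(0x12, 0)));
	free(mask);
	return 0;
}
```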
-