- 05 Nov 2019, 7 commits
-
-
Submitted by Robin Murphy

The nature of the LPAE format means that data->pg_shift is always redundant with data->bits_per_level, since they represent the size of a page and the number of PTEs per page respectively, and the size of a PTE is constant. Thus it works out more efficient to only store the latter, and derive the former via a trivial addition where necessary.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Reworked granule check in iopte_to_paddr()]
Signed-off-by: Will Deacon <will@kernel.org>
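A minimal sketch of the idea, using the field names above (the macro names and the 8-byte PTE constant are illustrative, not the driver's exact code):

    /* log2(sizeof(arm_lpae_iopte)): a PTE is a fixed 8 bytes */
    #define ARM_LPAE_PTE_SHIFT      3

    /* derived on demand, instead of stored in data->pg_shift */
    #define ARM_LPAE_PG_SHIFT(d)    ((d)->bits_per_level + ARM_LPAE_PTE_SHIFT)
    #define ARM_LPAE_GRANULE(d)     (1UL << ARM_LPAE_PG_SHIFT(d))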
-
Submitted by Robin Murphy

We use data->pgd_size directly for the one-off allocation and freeing of the top-level table, but otherwise it only serves for ARM_LPAE_PGD_IDX() to repeatedly re-calculate the effective number of top-level address bits it represents. Flip this around so we store the form we most commonly need, and derive the lesser-used one instead. This cuts a whole bunch of code out of the map/unmap/iova_to_phys fast-paths.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Beyond a couple of allocation-time calculations, data->levels is only ever used to derive the start level. Storing the start level directly leads to a small reduction in object code, which should help eke out a little more efficiency, and slightly more readable source to boot.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

We're merely checking that the relevant upper bits of each address are all zero, so there are cheaper ways to achieve that.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
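A sketch of the cheaper form, assuming the io-pgtable cfg fields ias/oas hold the input and output address sizes in bits:

    /* Shift out the valid bits; anything left over is an out-of-range bit. */
    if (WARN_ON(iova >> data->iop.cfg.ias || paddr >> data->iop.cfg.oas))
            return -ERANGE;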
-
Submitted by Robin Murphy

It makes little sense to only validate the requested size after we think we've found a matching block size - making the check up-front is simple, and far more logical than waiting to walk off the bottom of the table to infer that we must have been passed a bogus size to start with. We're missing an equivalent check on the unmap path, so add that as well for consistency.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
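A sketch of such an up-front check, assuming the usual io-pgtable cfg: a valid request is exactly one of the supported page sizes.

    /* Reject a zero size or anything that isn't a single supported page size. */
    if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
            return -EINVAL; /* on the unmap path, return 0 bytes unmapped instead */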
-
Submitted by Robin Murphy

The selftests run as an initcall, but the annotation of the various callbacks and data seems to be somewhat arbitrary. Add it consistently for everything related to the selftests.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Vivek Gautam

Add a reset hook for sdm845-based platforms to turn off the wait-for-safe sequence.

Understanding how the wait-for-safe logic affects USB and UFS performance on MTP845 and DB845 boards:

Qcom's implementation of arm,mmu-500 adds WAIT-FOR-SAFE logic to address under-performance issues in real-time clients, such as Display and Camera. On receiving an invalidation request, the SMMU forwards a SAFE request to these clients and waits for a SAFE ack signal from them. The SAFE signal from such clients is used to qualify the start of invalidation. This logic is controlled by chicken bits, one each for MDP (display), IFE0, and IFE1 (camera), that can be accessed only from secure software on sdm845.

This configuration, however, degrades the performance of non-real-time clients such as USB and UFS. This happens because, with the wait-for-safe logic enabled, the hardware tries to throttle non-real-time clients while waiting for SAFE ack signals from real-time clients.

On MTP845 and DB845 devices, with the wait-for-safe logic enabled by the bootloaders, we see degraded performance of USB and UFS when the kernel enables SMMU stage-1 translations for these clients. Turning off this wait-for-safe logic from the kernel gets us back the performance of USB and UFS devices, until we revisit this when we start seeing performance issues on display/camera on upstream-supported SDM845 platforms.

The bootloaders on these boards implement secure monitor callbacks to handle a specific command - QCOM_SCM_SVC_SMMU_PROGRAM - with which the logic can be toggled. There are other boards, such as cheza, whose bootloaders don't enable this logic. Such boards don't implement callbacks to handle the specific SCM call, so disabling this logic on them will be a no-op.

This change is inspired by the downstream change from Patrick Daly to address performance issues with display and camera by handling this wait-for-safe within separate io-pgtable ops to do TLB maintenance. So a big thanks to him for the change and for all the offline discussions.

Without this change the UFS reads are pretty slow:
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=10 conv=sync
10+0 records in
10+0 records out
10485760 bytes (10.0MB) copied, 22.394903 seconds, 457.2KB/s
real 0m 22.39s
user 0m 0.00s
sys 0m 0.01s

With this change they are back to rock!
$ time dd if=/dev/sda of=/dev/zero bs=1048576 count=300 conv=sync
300+0 records in
300+0 records out
314572800 bytes (300.0MB) copied, 1.030541 seconds, 291.1MB/s
real 0m 1.03s
user 0m 0.00s
sys 0m 0.54s

Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
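A sketch of such a reset hook, assuming the qcom_scm helper introduced alongside this series; the error handling shown here is illustrative:

    static int qcom_sdm845_smmu500_reset(struct arm_smmu_device *smmu)
    {
            int ret;

            /*
             * Ask secure firmware to turn off the wait-for-safe sequence.
             * On boards like cheza, whose firmware doesn't implement the
             * QCOM_SCM_SVC_SMMU_PROGRAM handler, the call simply fails
             * and disabling the logic is a no-op.
             */
            ret = qcom_scm_qsmmu500_wait_safe_toggle(0);
            if (ret)
                    dev_warn(smmu->dev, "Failed to turn off SAFE logic\n");

            return ret;
    }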
-
- 02 Nov 2019, 1 commit
-
-
Submitted by Rob Clark

When games, a browser, or anything else using a lot of GPU buffers exits, there can be many hundreds or thousands of buffers to unmap and free. If the GPU is otherwise suspended, this can cause arm-smmu to resume/suspend for each buffer, resulting in 5-10 seconds' worth of reprogramming the context bank (arm_smmu_write_context_bank()/arm_smmu_write_s2cr()/etc). To the user it would appear that the system just locked up. A simple solution is to use pm_runtime_put_autosuspend() instead, so we don't immediately suspend the SMMU device.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
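A sketch of the change, following the driver's existing rpm-helper pattern; the 20ms autosuspend delay is illustrative:

    /* On the put path, don't power the SMMU down immediately. */
    static inline void arm_smmu_rpm_put(struct arm_smmu_device *smmu)
    {
            if (pm_runtime_enabled(smmu->dev))
                    pm_runtime_put_autosuspend(smmu->dev);
    }

    /* At probe time: keep the SMMU resumed briefly after the last access. */
    pm_runtime_set_autosuspend_delay(smmu->dev, 20);
    pm_runtime_use_autosuspend(smmu->dev);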
-
- 01 Oct 2019, 10 commits
-
-
Submitted by Christophe JAILLET

'iommu_group_get_for_dev()' never returns NULL, so this test can be removed.

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Will Deacon <will@kernel.org>
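The resulting pattern, sketched: the return value is either a valid group or an ERR_PTR(), so only the IS_ERR() test is needed.

    struct iommu_group *group;

    group = iommu_group_get_for_dev(dev);
    if (IS_ERR(group))              /* never NULL, so no !group test */
            return PTR_ERR(group);

    iommu_group_put(group);
    return 0;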
-
Submitted by Christophe JAILLET

The memory used by '__init' functions can be freed once the initialization phase has been performed. Mark some 'static const' arrays defined and used within some '__init' functions as '__initconst', so that the corresponding data can also be discarded. Without '__initconst', the data are put in the .rodata section. With the qualifier, they are put in the .init.rodata section.

With gcc 8.3.0, the following changes have been measured:

Without '__initconst':
   section        size
   .rodata        00000720
   .init.rodata   00000018

With '__initconst':
   section        size
   .rodata        00000660
   .init.rodata   00000058

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Will Deacon <will@kernel.org>
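A sketch of the pattern with hypothetical names: a const array used only during init gains '__initconst' and moves to .init.rodata, to be freed after boot.

    static int __init detect_quirky_board(void)
    {
            /* discarded along with the rest of the __init data after boot */
            static const char * const quirky_boards[] __initconst = {
                    "vendor,board-a",       /* hypothetical compatibles */
                    "vendor,board-b",
            };
            int i;

            for (i = 0; i < ARRAY_SIZE(quirky_boards); i++)
                    if (of_machine_is_compatible(quirky_boards[i]))
                            return 1;

            return 0;
    }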
-
Submitted by Robin Murphy

Although CONFIG_ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT is a welcome tool for smoking out inadequate firmware, the failure mode is non-obvious and can be confusing for end users. Add some special-case reporting of Unidentified Stream Faults to help clarify this particular symptom. Since we're adding yet another print to the mix, also break out an explicit ratelimit state to make sure everything stays together (and reduce the static storage footprint a little).

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Now it's just an empty wrapper.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

With the .tlb_sync interface no longer exposed directly to io-pgtable, strip away the remains of that abstraction layer. Retain the callback in spirit, though, by transforming it into an implementation override for the low-level sync routine itself, for which we will have at least one user.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Now that the "leaf" flag is no longer part of an external interface, there's no need to use it to infer a register offset at runtime when we can just as easily encode the offset directly in its place.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Fill in 'native' iommu_flush_ops callbacks for all the arm_smmu_flush_ops variants, and clear up the remains of the previous .tlb_inv_range abstraction.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

In principle, Midgard GPUs supporting smaller VA sizes should only require 3-level pagetables, since level 0 only resolves bits 48:40 of the address. However, the kbase driver does not appear to have any notion of a variable start level, and empirically T720 and T820 rapidly blow up with translation faults unless given a full 4-level table, despite only supporting a 33-bit VA size. The 'real' IAS value is still valuable in terms of validating addresses on map/unmap, so tweak the allocator to allow smaller values while still forcing the resultant tables to the full 4 levels. As far as I can test, this should make all known Midgard variants happy.

Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR, the actual attribute values differ, so although it currently appears to work to some degree, we probably shouldn't be using our standard stage 1 MAIR for that. Instead, generate a reasonable MEMATTR with attribute values borrowed from the kbase driver; at this point we'll be overriding or ignoring pretty much all of the LPAE config, so just implement these Mali details in a dedicated allocator instead of pretending to subclass the standard VMSA format.

Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Liu Xiang

When alloc_io_pgtable_ops() fails, the context bitmap that was just allocated by __arm_smmu_alloc_bitmap() should be freed to release the resource.

Signed-off-by: Liu Xiang <liuxiang_1999@126.com>
Signed-off-by: Will Deacon <will@kernel.org>
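The shape of the fix, sketched against the driver's naming: unwind the context-bank allocation on the pagetable error path.

    pgtbl_ops = alloc_io_pgtable_ops(fmt, &pgtbl_cfg, smmu_domain);
    if (!pgtbl_ops) {
            ret = -ENOMEM;
            goto out_clear_smmu;    /* new: release the context bank */
    }

    /* ... */

    out_clear_smmu:
            /* release the context bank claimed earlier in this function */
            __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx);
            smmu_domain->smmu = NULL;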
-
- 28 Sep 2019, 6 commits
-
-
Submitted by Joerg Roedel

Traversing this list requires protection_domain->lock to be taken to avoid nasty races with the attach/detach code. Make sure the lock is held on all code-paths traversing this list.

Reported-by: Filippo Sironi <sironi@amazon.de>
Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
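Sketched with the amd_iommu naming (the per-device work is a hypothetical placeholder): every walk of dev_list now happens under domain->lock.

    struct iommu_dev_data *dev_data;
    unsigned long flags;

    spin_lock_irqsave(&domain->lock, flags);
    list_for_each_entry(dev_data, &domain->dev_list, list)
            update_device(dev_data);        /* hypothetical per-device update */
    spin_unlock_irqrestore(&domain->lock, flags);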
-
Submitted by Joerg Roedel

Make sure that attaching and detaching a device can't race against each other, and protect the iommu_dev_data with a spin_lock in these code paths.

Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel

Check early in attach_device whether the device is already attached to a domain. This also simplifies the code path so that __attach_device() can be removed.

Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel

The code-paths from which __attach_device() and __detach_device() are called also access and modify domain state, so take the domain lock there too. This allows us to get rid of the __detach_device() function.

Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel

The lock is not necessary because the device table does not contain shared state that needs protection. Locking is only needed on an individual entry basis, and that needs to happen on the iommu_dev_data level.

Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Joerg Roedel

This struct member was used to track whether a domain change requires updates to the device-table and IOMMU cache flushes. The problem is that access to this field is racy since locking in the common mapping code-paths has been eliminated. Move the updated field to the stack to get rid of all potential races and remove the field from the struct.

Fixes: 92d420ec ("iommu/amd: Relax locking in dma_ops path")
Reviewed-by: Filippo Sironi <sironi@amazon.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 24 Sep 2019, 5 commits
-
-
Submitted by Filippo Sironi

To make sure the domain TLB flush completes before the function returns, explicitly wait for its completion.

Signed-off-by: Filippo Sironi <sironi@amazon.de>
Fixes: 42a49f96 ("amd-iommu: flush domain tlb when attaching a new device")
[joro: Added commit message and fixes tag]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
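Sketched with the amd_iommu helper names of that era: the completion wait is now explicit after the flush is queued.

    /* Queue the domain TLB flush (including PDEs) to the IOMMUs ... */
    domain_flush_tlb_pde(domain);

    /* ... and wait until the flush commands have actually completed. */
    domain_flush_complete(domain);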
-
Submitted by Andrei Dulea

When replacing a large mapping created with page-mode 7 (i.e. a non-default page size), tear down the entire series of replicated PTEs. Besides leaving the old mapping accessible, this issue can also bite on the fetch_pte() code path, which may return a PDE entry of the newly re-mapped range. While at it, make sure that we flush the TLB in case alloc_pte() fails and returns NULL at a lower level.

Fixes: 6d568ef9 ("iommu/amd: Allow downgrading page-sizes in alloc_pte()")
Signed-off-by: Andrei Dulea <adulea@amazon.de>
-
Submitted by Andrei Dulea

Given an arbitrary PTE that is part of a large mapping, this function returns the first PTE of the series (and optionally the mapped size and number of PTEs). It will be re-used in a subsequent patch to replace an existing L7 mapping.

Fixes: 6d568ef9 ("iommu/amd: Allow downgrading page-sizes in alloc_pte()")
Signed-off-by: Andrei Dulea <adulea@amazon.de>
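A sketch of how such a helper can work, assuming the amd_iommu PTE encoding: the replicated PTEs of a page-mode 7 mapping are naturally aligned as a group, so masking the pointer yields the first one. The macro names match the driver; the details here are illustrative.

    static u64 *first_pte_l7(u64 *pte, unsigned long *page_size,
                             unsigned long *count)
    {
            unsigned long pg_size, cnt;
            u64 *fpte;

            pg_size = PTE_PAGE_SIZE(*pte);          /* size encoded in the PTE */
            cnt     = PAGE_SIZE_PTE_COUNT(pg_size); /* replicated PTE count */
            fpte    = (u64 *)(((unsigned long)pte) & ~((cnt << 3) - 1));

            if (page_size)
                    *page_size = pg_size;
            if (count)
                    *count = cnt;

            return fpte;
    }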
-
Submitted by Andrei Dulea

Downgrading an existing large mapping to a mapping using smaller page-sizes works only for mappings created with page-mode 7 (i.e. a non-default page size). Treat large mappings created with page-mode 0 (i.e. the default page size) like a non-present mapping and allow them to be overwritten in alloc_pte(). While at it, make sure that we flush the TLB only if we change an existing mapping, otherwise we might end up acting on garbage PTEs.

Fixes: 6d568ef9 ("iommu/amd: Allow downgrading page-sizes in alloc_pte()")
Signed-off-by: Andrei Dulea <adulea@amazon.de>
-
Submitted by Andrei Dulea

Take into account the gathered freelist in free_sub_pt(), otherwise we end up leaking all those pages.

Fixes: 409afa44 ("iommu/amd: Introduce free_sub_pt() function")
Signed-off-by: Andrei Dulea <adulea@amazon.de>
-
- 14 Sep 2019, 1 commit
-
-
Submitted by Uwe Kleine-König

Currently of_for_each_phandle() ignores the cell_count parameter when a cells_name is given. I intend to change that and let the iterator fall back to a non-negative cell_count if the cells_name property is missing in the referenced node. To avoid changing how existing of_for_each_phandle() users iterate, fix them to pass cell_count = -1 when a cells_name is also given, which yields the expected behaviour both with and without my change.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
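Sketched for a hypothetical "iommus" consumer: callers that pass a cells_name now also pass cell_count = -1 to keep today's behaviour.

    struct of_phandle_iterator it;
    int err;

    /* cell_count == -1: rely purely on the "#iommu-cells" property */
    of_for_each_phandle(&it, err, np, "iommus", "#iommu-cells", -1) {
            /* it.node describes the currently referenced node */
    }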
-
- 11 Sep 2019, 6 commits
-
-
Submitted by Chris Wilson

Despite the widespread and complete failure of Broadwell integrated graphics when DMAR is enabled, known over the years, we have never been able to root-cause the issue. Instead, we let the failure undermine our confidence in the iommu system itself when we should be pushing for it to be always enabled. Quirk away Broadwell and remove the rotten apple.

References: https://bugs.freedesktop.org/show_bug.cgi?id=89360
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Kyung Min Park

Intel VT-d specification revision 3 added support for Scalable Mode Translation for DMA remapping. Add the Scalable Mode fault reasons so that detailed fault reasons are shown when a translation fault happens.

Link: https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Lu Baolu

The Intel VT-d hardware uses paging for DMA remapping, so the minimum mapped window is one page. Device drivers may map buffers that don't fill whole IOMMU pages, which allows the device to access possibly unrelated memory; a malicious device could exploit this to perform DMA attacks. To address this, the Intel IOMMU driver will use bounce pages for those buffers which don't fill whole IOMMU pages.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Lu Baolu

This adds trace support for the Intel IOMMU driver. It also declares some events which can be used to trace when an IOVA is being mapped or unmapped in a domain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Lu Baolu

The bounce page implementation depends on swiotlb. Hence, don't switch off swiotlb if the system has untrusted devices, or if untrusted devices could potentially be hot-added.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Submitted by Lu Baolu

This adds a helper to check whether a device needs to use the bounce buffer. It also provides a boot-time option to disable the bounce buffer. Users can use this to prevent the iommu driver from using the bounce buffer, for a performance gain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Xu Pengfei <pengfei.xu@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 06 Sep 2019, 3 commits
-
-
Submitted by Arnd Bergmann

The runtime_pm functions are unused when CONFIG_PM is disabled:

drivers/iommu/omap-iommu.c:1022:12: error: unused function 'omap_iommu_runtime_suspend' [-Werror,-Wunused-function]
static int omap_iommu_runtime_suspend(struct device *dev)
drivers/iommu/omap-iommu.c:1064:12: error: unused function 'omap_iommu_runtime_resume' [-Werror,-Wunused-function]
static int omap_iommu_runtime_resume(struct device *dev)

Mark them as __maybe_unused to let gcc silently drop them instead of warning.

Fixes: db8918f6 ("iommu/omap: streamline enable/disable through runtime pm callbacks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
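The resulting shape, sketched (function bodies elided to placeholders): the definitions stay, and the attribute silences the warning when the SET_RUNTIME_PM_OPS() references compile away.

    static int __maybe_unused omap_iommu_runtime_suspend(struct device *dev)
    {
            /* ... power the IOMMU down ... */
            return 0;
    }

    static int __maybe_unused omap_iommu_runtime_resume(struct device *dev)
    {
            /* ... power the IOMMU back up ... */
            return 0;
    }

    static const struct dev_pm_ops omap_iommu_pm_ops = {
            SET_RUNTIME_PM_OPS(omap_iommu_runtime_suspend,
                               omap_iommu_runtime_resume, NULL)
    };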
-
Submitted by Joerg Roedel

After the conversion to lock-less dma-api calls, the increase_address_space() function can be called without any locking. Multiple CPUs could potentially race to increase the address space, leading to invalid domain->mode settings and invalid page-tables. This has been happening in the wild under high IO load and memory pressure. Fix the race by locking this operation. The function is called infrequently, so this does not re-introduce a performance regression in the dma-api path.

Reported-by: Qian Cai <cai@lca.pw>
Fixes: 256e4621 ('iommu/amd: Make use of the generic IOVA allocator')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
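A sketch of the locked version, following the amd_iommu structures of that era; a racing CPU that finds the table already grown simply backs out under the lock.

    static bool increase_address_space(struct protection_domain *domain,
                                       gfp_t gfp)
    {
            unsigned long flags;
            bool ret = false;
            u64 *pte;

            spin_lock_irqsave(&domain->lock, flags);

            /* Re-check under the lock: another CPU may have grown the table. */
            if (WARN_ON_ONCE(domain->mode == PAGE_MODE_6_LEVEL))
                    goto out;

            pte = (u64 *)get_zeroed_page(gfp);
            if (!pte)
                    goto out;

            *pte = PM_LEVEL_PDE(domain->mode,
                                iommu_virt_to_phys(domain->pt_root));
            domain->pt_root  = pte;
            domain->mode    += 1;
            ret = true;

    out:
            spin_unlock_irqrestore(&domain->lock, flags);
            return ret;
    }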
-
Submitted by Stuart Hayes

When devices are attached to the amd_iommu in a kdump kernel, the old device table entries (DTEs), which were copied from the crashed kernel, will be overwritten with a new domain number. When the new DTE is written, the IOMMU is told to flush the DTE from its internal cache, but it is not told to flush the translation cache entries for the old domain number. Without this patch, AMD systems using the tg3 network driver fail when kdump tries to save the vmcore to a network system, showing network timeouts and (sometimes) IOMMU errors in the kernel log. This patch flushes IOMMU translation cache entries for the old domain when a DTE gets overwritten with a new domain number.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Fixes: 3ac3e5ee ('iommu/amd: Copy old trans table from old kernel')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 05 Sep 2019, 1 commit
-
-
Submitted by Hai Nguyen Pham

According to the Hardware Manual Errata for Rev. 1.50 of April 10, 2019, cache snoop transactions for page table walk requests are not supported on R-Car Gen3. Hence, this patch removes setting these fields in the IMTTBCR register, since doing so has no effect, and adds comments to the register bit definitions to make it clear that they apply to R-Car Gen2 only.

Signed-off-by: Hai Nguyen Pham <hai.pham.ud@renesas.com>
[geert: Reword, add comments]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-