- 16 July 2021, 3 commits
-
-
Submitted by Kunkun Jiang

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

If block (largepage) mappings were split while dirty log tracking
started, we need to recover them when tracking stops, for better DMA
performance. This recovers the block mappings and unmaps the span of
page mappings that replaced them. The BBML1 or BBML2 feature is
required.

Merging pages is designed to be used only by dirty log tracking, which
does not run concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Submitted by Kunkun Jiang

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

A block (largepage) mapping is not a proper granule for dirty log
tracking. Take an extreme example: if DMA writes one byte, under a 1G
mapping the dirty amount reported is 1G, but under a 4K mapping it is
just 4K.

This splits a block descriptor into a span of page descriptors. The
BBML1 or BBML2 feature is required.

Splitting a block is designed to be used only by dirty log tracking,
which does not run concurrently with other pgtable ops that access the
underlying page table, so no race condition exists.

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
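A minimal sketch of the split described above, in kernel-style C. The
helper names (split_block_pte(), alloc_pgtable_page()) are hypothetical
assumptions, not the patch's actual code, which differs in detail:

    typedef u64 arm_lpae_iopte;

    #define ARM_LPAE_PTE_TYPE_PAGE   3   /* leaf descriptor, last level */
    #define ARM_LPAE_PTE_TYPE_TABLE  3   /* table descriptor, upper levels */

    /* hypothetical helper: allocate one zeroed next-level table */
    extern arm_lpae_iopte *alloc_pgtable_page(void);

    /* Replace one live block entry with a next-level table of page
     * entries carrying the same output addresses and attributes.
     * BBML1/BBML2 allow this without a full break-before-make sequence. */
    static int split_block_pte(arm_lpae_iopte *blk_ptep, phys_addr_t blk_paddr,
                               size_t blk_size, size_t pg_size,
                               arm_lpae_iopte attrs)
    {
            arm_lpae_iopte *tablep = alloc_pgtable_page();
            size_t i, entries = blk_size / pg_size;

            if (!tablep)
                    return -ENOMEM;

            /* populate every slot so the whole span stays mapped */
            for (i = 0; i < entries; i++)
                    tablep[i] = (blk_paddr + i * pg_size) | attrs |
                                ARM_LPAE_PTE_TYPE_PAGE;

            WRITE_ONCE(*blk_ptep, __pa(tablep) | ARM_LPAE_PTE_TYPE_TABLE);
            return 0;
    }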
-
Submitted by Kunkun Jiang

virt inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3ZUKK
CVE: NA

------------------------------

These features are essential to support dirty log tracking for an SMMU
with io-pgtable mapping.

The dirty state information is encoded using the access permission bits
AP[2] (stage 1) or S2AP[1] (stage 2) in conjunction with the DBM (Dirty
Bit Modifier) bit, where DBM means writable and AP[2]/S2AP[1] means
dirty. When ARM_HD is present, we set the DBM bit for stage 1 mappings.

As SMMU nested mode is not upstreamed for now, we just aim to support
dirty log tracking for stage 1 with io-pgtable mapping (which means SVA
is not supported).

Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
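For stage 1, the encoding above means a descriptor is dirty when DBM is
set and hardware has cleared the read-only bit AP[2] on a write. A
sketch of that test, using the standard LPAE bit positions (bit 51 for
DBM, bit 7 for AP[2]); the helper name is an assumption:

    #include <linux/types.h>

    typedef u64 arm_lpae_iopte;

    #define ARM_LPAE_PTE_DBM       (((arm_lpae_iopte)1) << 51)  /* Dirty Bit Modifier */
    #define ARM_LPAE_PTE_AP_RDONLY (((arm_lpae_iopte)1) << 7)   /* AP[2]: read-only */

    /* Writable-clean is DBM=1, AP[2]=1. On the first write the SMMU
     * clears AP[2], so "DBM set and AP[2] clear" identifies dirty. */
    static inline bool iopte_s1_is_dirty(arm_lpae_iopte pte)
    {
            return (pte & (ARM_LPAE_PTE_DBM | ARM_LPAE_PTE_AP_RDONLY)) ==
                   ARM_LPAE_PTE_DBM;
    }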
-
- 03 June 2021, 1 commit
-
-
Submitted by Robin Murphy

mainline inclusion
from mainline-5.11-rc1
commit fefe8527
category: feature
bugzilla: 51855
CVE: NA

---------------------------------------------

The only user of tlb_flush_leaf is a particularly hairy corner of the
Arm short-descriptor code, which wants a synchronous invalidation to
minimise the races inherent in trying to split a large page mapping.
This is already far enough into "here be dragons" territory that no
sensible caller should ever hit it, and thus it really doesn't need
optimising. Although using tlb_flush_walk there may technically be more
heavyweight than needed, it does the job and saves everyone else having
to carry around useless baggage.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/9844ab0c5cb3da8b2f89c6c2da16941910702b41.1606324115.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
- 19 February 2021, 1 commit
-
-
Submitted by Robin Murphy

stable inclusion
from stable-5.10.14
commit b584862004020b3a555c48b549ed445d0a27e7e5
bugzilla: 48051

--------------------------------

commit 728da60d upstream.

Midgard GPUs have ACE-Lite master interfaces which allow systems to
integrate them in an I/O-coherent manner. It seems that from the GPU's
viewpoint, the rest of the system is its outer shareable domain, and so
even when snoop signals are wired up, they are only emitted for outer
shareable accesses. As such, setting the TTBR_SHARE_OUTER bit does
indeed get coherent pagetable walks working nicely for the coherent
T620 in the Arm Juno SoC.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8df778355378127ea7eccc9521d6427e3e48d4f2.1600780574.git.robin.murphy@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
-
- 29 September 2020, 1 commit
-
-
Submitted by Jean-Philippe Brucker

Extract some of the most generic TCR defines, so they can be reused by
the page table sharing code.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200918101852.582559-6-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
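The kind of defines in question (an illustrative subset using the
standard VMSA TCR field encodings; the exact set moved by the patch may
differ):

    /* translation granule for TG0 */
    #define ARM_LPAE_TCR_TG0_4K    0
    #define ARM_LPAE_TCR_TG0_64K   1
    #define ARM_LPAE_TCR_TG0_16K   2

    /* shareability for SH0 */
    #define ARM_LPAE_TCR_SH_NS     0   /* non-shareable */
    #define ARM_LPAE_TCR_SH_OS     2   /* outer shareable */
    #define ARM_LPAE_TCR_SH_IS     3   /* inner shareable */

    /* cacheability for IRGN0/ORGN0 */
    #define ARM_LPAE_TCR_RGN_NC    0   /* non-cacheable */
    #define ARM_LPAE_TCR_RGN_WBWA  1   /* write-back write-allocate */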
-
- 22 September 2020, 1 commit
-
-
Submitted by Robin Murphy

Checking for a nonzero dma_pfn_offset was a quick shortcut to validate
whether the DMA == phys assumption could hold at all. Checking for a
non-NULL dma_range_map is not quite equivalent, since a map may be
present to describe a limited DMA window even without an offset, and
thus this check can now yield false positives. However, it only ever
served to short-circuit going all the way through to
__arm_lpae_alloc_pages(), failing the canonical test there, and having
a bit more to clean up. As such, we can simply remove it without loss
of correctness.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 18 September 2020, 1 commit
-
-
Submitted by Jim Quinlan

The new field 'dma_range_map' in struct device is used to facilitate
the use of single or multiple offsets between mapping regions of cpu
addrs and dma addrs. It subsumes the role of "dev->dma_pfn_offset",
which was only capable of holding a single uniform offset and had no
region bounds checking.

The function of_dma_get_range() has been modified so that it takes a
single argument -- the device node -- and returns a map, NULL, or an
error code. The map is an array that holds the information regarding
the DMA regions. Each range entry contains the address offset, the
cpu_start address, the dma_start address, and the size of the region.

of_dma_configure() is the typical manner to set range offsets, but
there are a number of ad hoc assignments to "dev->dma_pfn_offset" in
the kernel driver code. These cases now invoke the function
dma_direct_set_offset(dev, cpu_addr, dma_addr, size).

Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
[hch: various interface cleanups]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
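A sketch of how a driver that previously hard-coded dev->dma_pfn_offset
would describe the same single-offset window with the new interface.
The function and struct exist in the kernel around this change, but the
addresses below are made-up example values:

    #include <linux/dma-direct.h>
    #include <linux/sizes.h>

    /* Each dma_range_map entry is one region:
     *   struct bus_dma_region { cpu_start, dma_start, size, offset };
     * A single call covers the common one-window case. */
    static int example_setup_dma_window(struct device *dev)
    {
            /* CPU physical 0x80000000 appears to the device at bus
             * address 0, for a 1 GiB window (example values) */
            return dma_direct_set_offset(dev, 0x80000000, 0x00000000, SZ_1G);
    }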
-
- 24 July 2020, 1 commit
-
-
Submitted by Baolin Wang

Until now the ARM page tables have always been allocated with
GFP_ATOMIC, but the iommu_ops->map() function gained a gfp_t parameter
in commit 781ca2de ("iommu: Add gfp parameter to iommu_ops::map").
Thus io_pgtable_ops->map() should use the gfp parameter passed down
from iommu_ops->map() to allocate pages, which avoids wasting the
memory allocator's atomic pools in non-atomic contexts.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/3093df4cb95497aaf713fca623ce4ecebb197c2e.1591930156.git.baolin.wang@linux.alibaba.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
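The resulting callback shape (a sketch of the io_pgtable_ops map hook
after this change; the other members are elided):

    #include <linux/types.h>

    struct io_pgtable_ops {
            /* gfp now flows through from iommu_ops->map() instead of
             * the allocator unconditionally using GFP_ATOMIC */
            int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
                       phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
            /* ... unmap(), iova_to_phys(), etc. ... */
    };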
-
- 09 July 2020, 1 commit
-
-
Submitted by Will Deacon

The IOMMU_SYS_CACHE_ONLY flag was never exposed via the DMA API and has
no in-tree users. Remove it.

Cc: Robin Murphy <robin.murphy@arm.com>
Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 03 March 2020, 1 commit
-
-
Submitted by Robin Murphy

Since we only support the TTBR1 quirk for AArch64 contexts, and
consequently only for 64-bit builds, the sign-extension aspect of the
"are all bits above IAS consistent?" check should implicitly only apply
to 64-bit IOVAs. Change the type of the cast to ensure that 32-bit
longs don't inadvertently get sign-extended, and thus considered
invalid, if they happen to be above 2GB in the TTB0 region.

Reported-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Fixes: db690301 ("iommu/io-pgtable-arm: Prepare for TTBR1 usage")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
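An illustration of the 32-bit pitfall being fixed (values made up; the
exact cast in the patch may be spelled differently):

    #include <linux/types.h>

    /* On a 32-bit build, an IOVA in the upper 2 GiB of the TTBR0
     * region has its top bit set within the 32-bit unsigned long. */
    static void iova_cast_example(unsigned long iova) /* e.g. 0x80000000UL */
    {
            /* (long) makes it a negative 32-bit value, so widening to
             * s64 sign-extends to 0xffffffff80000000 and the "bits
             * above IAS" check wrongly rejects a valid address */
            s64 bad = (long)iova;

            /* casting the unsigned value straight to s64 zero-extends
             * to 0x0000000080000000, so the check passes as intended */
            s64 good = (s64)iova;

            (void)bad;
            (void)good;
    }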
-
- 10 January 2020, 7 commits
-
-
Submitted by Robin Murphy

Now that we can correctly extract top-level indices without relying on
the remaining upper bits being zero, the only remaining impediments to
using a given table for TTBR1 are the address validation on map/unmap
and the awkward TCR translation granule format. Add a quirk so that we
can do the right thing at those points.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling")
reworked the way in which the TCR register value is returned from the
io-pgtable code when targeting the Arm long-descriptor format, in
preparation for allowing page-tables to target TTBR1. As it turns out,
the new interface is a lot nicer to use, so do the same conversion for
the VTCR register even though there is only a single base register for
stage-2 translation.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus, since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
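After this change, the stage 1 config carries the decomposed fields
rather than a raw register value. A sketch of the resulting shape
(field names as best recalled from the upstream patch; treat as an
approximation):

    /* io_pgtable_cfg's stage 1 portion after the rework: just the
     * sizes and walk attributes, which the IOMMU driver packs into
     * its own register format */
    struct {
            u64     ttbr;
            struct {
                    u32     ips:3;   /* intermediate PA size */
                    u32     tg:2;    /* translation granule */
                    u32     sh:2;    /* shareability */
                    u32     orgn:2;  /* outer cacheability */
                    u32     irgn:2;  /* inner cacheability */
                    u32     tsz:6;   /* region size offset */
            } tcr;
            u64     mair;
    } arm_lpae_s1_cfg;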
-
Submitted by Will Deacon

ARM_64_LPAE_S2_TCR_RES1 is intended to map to bit 31 of the VTCR
register, which is required to be set to 1 by the architecture.
Unfortunately, we accidentally treat this as a signed quantity, which
means we also set the upper 32 bits of the VTCR to one, and they are
required to be zero.

Treat ARM_64_LPAE_S2_TCR_RES1 as unsigned to avoid the unwanted
sign-extension up to 64 bits.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
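The signed-versus-unsigned distinction in miniature (illustrative; the
patch's exact spelling may differ):

    /* (1 << 31) is a negative int; widening it into the 64-bit VTCR
     * value sign-extends, setting bits 63:32 that must be zero */
    #define ARM_64_LPAE_S2_TCR_RES1_SIGNED  (1 << 31)

    /* (1U << 31) is unsigned; widening zero-extends, so only bit 31
     * is set */
    #define ARM_64_LPAE_S2_TCR_RES1         (1U << 31)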
-
Submitted by Robin Murphy

By VMSA rules, using the Normal Non-Cacheable type with a shareability
attribute of anything other than Outer Shareable is liable to lead into
unpredictable territory:

| Overlaying the shareability attribute (B3-1377, ARM DDI 0406C.c)
|
| A memory region with a resultant memory type attribute of Normal, and
| a resultant cacheability attribute of Inner Non-cacheable, Outer
| Non-cacheable, must have a resultant shareability attribute of Outer
| Shareable, otherwise shareability is UNPREDICTABLE

Although the SMMU architectures seem to give some slightly stronger
guarantees of Non-Cacheable output types becoming implicitly Outer
Shareable in most cases, we may as well be explicit and not take any
chances.

It's also weird that LPAE attribute handling is currently split between
prot_to_pte() and init_pte(), given that it can all be statically
determined up-front. Thus, collect *all* the LPAE attributes into
prot_to_pte() in order to logically pick the shareability based on the
incoming IOMMU API prot value, and tweak the short-descriptor code to
stop setting TTBR0.NOS for Non-Cacheable walks.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

Commit 9e6ea59f ("iommu/io-pgtable: Support non-coherent page tables")
added support for non-coherent page-table walks to the Arm IOMMU
page-table backends. Unfortunately, it left the stage-2 allocator
unchanged, so let's hook that up in the same way.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly
desirable or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users
to support split address spaces by simply allocating a table and
dealing with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Drop change to ttbr value]
Signed-off-by: Will Deacon <will@kernel.org>
-
- 07 November 2019, 1 commit
-
-
Submitted by Will Deacon

The 'IOMMU_QCOM_SYS_CACHE' IOMMU protection flag is exposed to all
users of the IOMMU API. Despite its name, the idea behind it isn't
especially tied to Qualcomm implementations and could conceivably be
used by other systems. Rename it to 'IOMMU_SYS_CACHE_ONLY' and update
the comment to describe a bit better the idea behind it.

Cc: Robin Murphy <robin.murphy@arm.com>
Cc: "Isaac J. Manjarres" <isaacm@codeaurora.org>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 05 November 2019, 7 commits
-
-
Submitted by Robin Murphy

Between VMSAv8-64 and the various 32-bit formats, there is either one
64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PRRR registers. As
such, keeping two 64-bit values in io_pgtable_cfg has always been
overkill.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

The nature of the LPAE format means that data->pg_shift is always
redundant with data->bits_per_level, since they represent the size of a
page and the number of PTEs per page respectively, and the size of a
PTE is constant. Thus it works out more efficient to only store the
latter, and derive the former via a trivial addition where necessary.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Reworked granule check in iopte_to_paddr()]
Signed-off-by: Will Deacon <will@kernel.org>
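The derivation in question, roughly as it appears after the patch (the
PTE type is the kernel's 8-byte arm_lpae_iopte):

    typedef u64 arm_lpae_iopte;

    /* page size = PTEs per page x PTE size, so with an 8-byte PTE:
     *   pg_shift = bits_per_level + ilog2(sizeof(arm_lpae_iopte))
     * which the macro expresses as a shift of the PTE size */
    #define ARM_LPAE_GRANULE(d) \
            (sizeof(arm_lpae_iopte) << (d)->bits_per_level)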
-
Submitted by Robin Murphy

We use data->pgd_size directly for the one-off allocation and freeing
of the top-level table, but otherwise it serves for ARM_LPAE_PGD_IDX()
to repeatedly re-calculate the effective number of top-level address
bits it represents. Flip this around so we store the form we most
commonly need, and derive the lesser-used one instead. This cuts a
whole bunch of code out of the map/unmap/iova_to_phys fast-paths.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Beyond a couple of allocation-time calculations, data->levels is only
ever used to derive the start level. Storing the start level directly
leads to a small reduction in object code, which should help eke out a
little more efficiency, and slightly more readable source to boot.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

We're merely checking that the relevant upper bits of each address are
all zero, so there are cheaper ways to achieve that.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
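Roughly the transformation involved (a sketch, not the patch's exact
diff):

    /* before: materialise the limit, then compare */
    if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
            return -ERANGE;

    /* after: shift the bits above IAS down and test for zero -
     * same meaning, cheaper code */
    if (WARN_ON(iova >> data->iop.cfg.ias))
            return -ERANGE;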
-
Submitted by Robin Murphy

It makes little sense to only validate the requested size after we
think we've found a matching block size - making the check up-front is
simple, and far more logical than waiting to walk off the bottom of the
table to infer that we must have been passed a bogus size to start
with. We're missing an equivalent check on the unmap path, so add that
as well for consistency.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

The selftests run as an initcall, but the annotation of the various
callbacks and data seems to be somewhat arbitrary. Add it consistently
for everything related to the selftests.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 01 October 2019, 3 commits
-
-
Submitted by Christophe JAILLET

The memory used by '__init' functions can be freed once the
initialization phase has been performed. Mark some 'static const'
arrays defined and used within some '__init' functions as
'__initconst', so that the corresponding data can also be discarded.

Without '__initconst', the data are put in the .rodata section. With
the qualifier, they are put in the .init.rodata section.

With gcc 8.3.0, the following changes have been measured:

Without '__initconst':
   section         size
   .rodata         00000720
   .init.rodata    00000018

With '__initconst':
   section         size
   .rodata         00000660
   .init.rodata    00000058

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Will Deacon <will@kernel.org>
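What the annotation looks like in practice (an illustrative example,
not the patch's actual arrays):

    #include <linux/init.h>
    #include <linux/sizes.h>

    /* used only during boot: __initconst moves the data from .rodata
     * to .init.rodata, which is freed along with the __init code */
    static const unsigned int example_granules[] __initconst = {
            SZ_4K, SZ_16K, SZ_64K,
    };

    static int __init example_selftest(void)
    {
            return example_granules[0] ? 0 : -EINVAL;
    }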
-
Submitted by Robin Murphy

In principle, Midgard GPUs supporting smaller VA sizes should only
require 3-level pagetables, since level 0 only resolves bits 48:40 of
the address. However, the kbase driver does not appear to have any
notion of a variable start level, and empirically T720 and T820 rapidly
blow up with translation faults unless given a full 4-level table,
despite only supporting a 33-bit VA size.

The 'real' IAS value is still valuable in terms of validating addresses
on map/unmap, so tweak the allocator to allow smaller values while
still forcing the resultant tables to the full 4 levels. As far as I
can test, this should make all known Midgard variants happy.

Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Robin Murphy

Whilst Midgard's MEMATTR follows a similar principle to the VMSA MAIR,
the actual attribute values differ, so although it currently appears to
work to some degree, we probably shouldn't be using our standard stage
1 MAIR for that. Instead, generate a reasonable MEMATTR with attribute
values borrowed from the kbase driver; at this point we'll be
overriding or ignoring pretty much all of the LPAE config, so just
implement these Mali details in a dedicated allocator instead of
pretending to subclass the standard VMSA format.

Fixes: d08d42de ("iommu: io-pgtable: Add ARM Mali midgard MMU page table format")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 30 July 2019, 5 commits
-
-
Submitted by Will Deacon

With all the pieces in place, we can finally propagate the
iommu_iotlb_gather structure from the call to unmap() down to the IOMMU
drivers' implementation of ->tlb_add_page(). Currently everybody
ignores it, but the machinery is now there to defer invalidation.

Signed-off-by: Will Deacon <will@kernel.org>
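The structure being threaded through (roughly as introduced in this
series; treat the exact fields as an approximation):

    /* accumulates the IOVA range and page size of pending unmaps so
     * a driver can issue one range invalidation at sync time */
    struct iommu_iotlb_gather {
            unsigned long   start;
            unsigned long   end;
            size_t          pgsize;
    };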
-
Submitted by Will Deacon

Update the io-pgtable ->unmap() function to take an iommu_iotlb_gather
pointer as an argument, and update the callers as appropriate.

Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

The ->tlb_sync() callback is no longer used, so it can be removed.

Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

The ->tlb_add_flush() callback in the io-pgtable API now looks a bit
silly:

  - It takes a size and a granule, which are always the same
  - It takes a 'bool leaf', which is always true
  - It only ever flushes a single page

With that in mind, replace it with an optional ->tlb_add_page()
callback that drops the useless parameters.

Signed-off-by: Will Deacon <will@kernel.org>
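The replacement callback's shape (a sketch consistent with the
surrounding series; the other flush_ops members are elided):

    struct iommu_flush_ops {
            /* ... tlb_flush_all(), tlb_flush_walk(), etc. ... */

            /* queue invalidation of one leaf entry; drivers may defer
             * the actual flush until the gather is synced */
            void (*tlb_add_page)(struct iommu_iotlb_gather *gather,
                                 unsigned long iova, size_t granule,
                                 void *cookie);
    };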
-
Submitted by Will Deacon

Now that all IOMMU drivers using the io-pgtable API implement the
->tlb_flush_walk() and ->tlb_flush_leaf() callbacks, we can use them in
the io-pgtable code instead of ->tlb_add_flush() immediately followed
by ->tlb_sync().

Signed-off-by: Will Deacon <will@kernel.org>
-
- 24 July 2019, 2 commits
-
-
Submitted by Will Deacon

In preparation for TLB flush gathering in the IOMMU API, rename the
iommu_gather_ops structure in io-pgtable to iommu_flush_ops, which
better describes its purpose and avoids the potential for confusion
between different levels of the API.

$ find linux/ -type f -name '*.[ch]' | xargs sed -i 's/gather_ops/flush_ops/g'

Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

Commit b6b65ca2 ("iommu/io-pgtable-arm: Add support for non-strict
mode") added an unconditional call to io_pgtable_tlb_sync() immediately
after the case where we replace a block entry with a table entry during
an unmap() call. This is redundant, since the IOMMU API will call
iommu_tlb_sync() on this path and the patch in question mentions this:

| To save having to reason about it too much, make sure the invalidation
| in arm_lpae_split_blk_unmap() just performs its own unconditional sync
| to minimise the window in which we're technically violating the break-
| before-make requirement on a live mapping. This might work out redundant
| with an outer-level sync for strict unmaps, but we'll never be splitting
| blocks on a DMA fastpath anyway.

However, this sync gets in the way of deferred TLB invalidation for
leaf entries and is at best a questionable, unproven hack. Remove it.

Signed-off-by: Will Deacon <will@kernel.org>
-
- 25 June 2019, 2 commits
-
-
Submitted by Bjorn Andersson

Describe the memory related to page table walks as non-cacheable for
iommu instances that are not DMA coherent.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[will: Use cfg->coherent_walk, fix arm-v7s, ensure outer-shareable for NC]
Signed-off-by: Will Deacon <will@kernel.org>
-
Submitted by Will Deacon

IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just
an indication of whether or not the page-table walker for the IOMMU is
coherent with the CPU caches. Since cache coherency is more than just a
quirk, replace the flag with its own field in the io_pgtable_cfg
structure.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
-
- 19 June 2019, 2 commits
-
-
Submitted by Thomas Gleixner

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation this program is distributed
  in the hope that it will be useful but without any warranty without
  even the implied warranty of merchantability or fitness for a
  particular purpose see the gnu general public license for more
  details you should have received a copy of the gnu general public
  license along with this program if not see http www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Submitted by Vivek Gautam

A few Qualcomm platforms, such as sdm845, have an additional outer
cache called the System cache, aka last-level cache (LLC), that allows
non-coherent devices to upgrade to using caching. This cache sits right
before the DDR and is tightly coupled with the memory controller. The
clients using this cache request their slices from it, make them
active, and can then start using it.

There is a fundamental assumption that non-coherent devices can't
access caches. This change adds an exception where they *can* use some
level of cache despite still being non-coherent overall. The coherent
devices that use cacheable memory, and the CPU, make use of this system
cache by default.

Looking at memory types, we have the following:

a) Normal uncached: MAIR 0x44, inner non-cacheable, outer
   non-cacheable;
b) Normal cached: MAIR 0xff, inner read write-back non-transient,
   outer read write-back non-transient; attribute setting for
   coherent I/O devices.

And, for non-coherent I/O devices that can allocate in the system
cache, another type gets added:

c) Normal sys-cached: MAIR 0xf4, inner non-cacheable, outer read
   write-back non-transient.

Coherent I/O devices use the system cache by marking the memory as
normal cached. Non-coherent I/O devices should mark the memory as
normal sys-cached in page tables to use the system cache.

Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
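The three MAIR attribute bytes described above, expressed as defines (a
sketch mirroring the commit's description; the names are illustrative):

    /* MAIR attribute encodings from the description above */
    #define MAIR_ATTR_NORMAL_NC   0x44  /* inner & outer non-cacheable */
    #define MAIR_ATTR_NORMAL_WB   0xff  /* inner & outer write-back,
                                         * read/write-allocate */
    #define MAIR_ATTR_SYS_CACHED  0xf4  /* inner non-cacheable, outer
                                         * write-back: allocates only in
                                         * the system cache (LLC) */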
-