1. 13 April 2019, 1 commit
  2. 11 February 2019, 1 commit
    • iommu: Allow io-pgtable to be used outside of drivers/iommu/ · b77cf11f
      Rob Herring authored
      Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops
      and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to
      use the page table library. Specifically, some ARM Mali GPUs use the
      ARM page table formats.
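
      A minimal sketch of how a driver outside drivers/iommu/ might now use
      the library (the my_tlb_* callbacks, dev and cookie are illustrative
      placeholders, not taken from any real driver):

            #include <linux/io-pgtable.h>

            static const struct iommu_gather_ops my_tlb_ops = {
                    .tlb_flush_all = my_tlb_flush_all, /* hypothetical */
                    .tlb_add_flush = my_tlb_add_flush,
                    .tlb_sync      = my_tlb_sync,
            };

            struct io_pgtable_cfg cfg = {
                    .pgsize_bitmap = SZ_4K | SZ_2M | SZ_1G,
                    .ias           = 48,    /* input (IOVA) bits */
                    .oas           = 40,    /* output (PA) bits */
                    .tlb           = &my_tlb_ops,
                    .iommu_dev     = dev,
            };
            struct io_pgtable_ops *ops;

            ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
            if (!ops)
                    return -ENOMEM;
            /* ... ops->map(), ops->unmap(), ops->iova_to_phys() ... */
            free_io_pgtable_ops(ops);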
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: iommu@lists.linux-foundation.org
      Cc: linux-mediatek@lists.infradead.org
      Cc: linux-arm-msm@vger.kernel.org
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  3. 01 October 2018, 2 commits
    • iommu/io-pgtable-arm: Add support for non-strict mode · b6b65ca2
      Zhen Lei authored
      Non-strict mode is simply a case of skipping 'regular' leaf TLBIs, since
      the sync is already factored out into ops->iotlb_sync at the core API
      level. Non-leaf invalidations where we change the page table structure
      itself still have to be issued synchronously in order to maintain walk
      caches correctly.
      
      To save having to reason about it too much, make sure the invalidation
      in arm_lpae_split_blk_unmap() just performs its own unconditional sync
      to minimise the window in which we're technically violating the break-
      before-make requirement on a live mapping. This might work out redundant
      with an outer-level sync for strict unmaps, but we'll never be splitting
      blocks on a DMA fastpath anyway.
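
      The gist as a sketch (quirk name per this patch; the non-leaf case,
      which must stay synchronous, is elided):

            /* Unmap path, leaf case: the TLBI may be deferred when the
             * domain opts in to non-strict mode */
            if (!(iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT))
                    io_pgtable_tlb_add_flush(iop, iova, size, size, true);
            /* otherwise, ops->iotlb_sync() sweeps it up later */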
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [rm: tweak comment, commit message, split_blk_unmap logic and barriers]
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Fix race handling in split_blk_unmap() · 85c7a0f1
      Robin Murphy authored
      In removing the pagetable-wide lock, we gained the possibility of the
      vanishingly unlikely case where we have a race between two concurrent
      unmappers splitting the same block entry. The logic to handle this is
      fairly straightforward - whoever loses the race frees their partial
      next-level table and instead dereferences the winner's newly-installed
      entry in order to fall back to a regular unmap, which intentionally
      echoes the pre-existing case of recursively splitting a 1GB block down
      to 4KB pages by installing a full table of 2MB blocks first.
      
      Unfortunately, the chump who implemented that logic failed to update the
      condition check for that fallback, meaning that if said race occurs at
      the last level (where the loser's unmap_idx is valid) then the unmap
      won't actually happen. Fix that to properly account for both the race
      and recursive cases.
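
      The fixed control flow, condensed (a sketch rather than the verbatim
      patch):

            pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg);
            if (pte != blk_pte) {
                    /* Lost the race: free our table, then fall back to a
                     * regular unmap through the winner's entry */
                    __arm_lpae_free_pages(tablep, tablesz, cfg);
                    tablep = iopte_deref(pte, data);
            } else if (unmap_idx >= 0) {
                    /* Our table went in and already omits the target:
                     * just invalidate and we're done */
                    io_pgtable_tlb_add_flush(&data->iop, iova, size,
                                             size, true);
                    return size;
            }

            /* Race-loser *and* recursive-split cases both end up here */
            return __arm_lpae_unmap(data, iova, size, lvl, tablep);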
      
      Fixes: 2c3d273e ("iommu/io-pgtable-arm: Support lockless operation")
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: re-jig control flow to avoid duplicate cmpxchg test]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 26 July 2018, 1 commit
  5. 29 May 2018, 1 commit
    • iommu/io-pgtable-arm: Make allocations NUMA-aware · 4b123757
      Robin Murphy authored
      We would generally expect pagetables to be read by the IOMMU more than
      written by the CPU, so in NUMA systems it makes sense to locate them
      close to the former and avoid cross-node pagetable walks if at all
      possible. As it turns out, we already have a handle on the IOMMU device
      for the sake of coherency management, so it's trivial to grab the
      appropriate NUMA node when allocating new pagetable pages.
      
      Note that we drop the semantics of alloc_pages_exact(), but that's fine
      since they have never been necessary: the only time we're allocating
      more than one page is for stage 2 top-level concatenation, but since
      that is based on the number of IPA bits, the size is always some exact
      power of two anyway.
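
      The allocation change amounts to roughly this (sketch):

            /* Place pagetable memory on the IOMMU's own NUMA node */
            int order = get_order(size);
            struct page *p;

            p = alloc_pages_node(dev ? dev_to_node(dev) : NUMA_NO_NODE,
                                 gfp | __GFP_ZERO, order);
            if (!p)
                    return NULL;
            pages = page_address(p);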
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  6. 03 May 2018, 1 commit
  7. 29 March 2018, 1 commit
  8. 27 March 2018, 1 commit
  9. 14 February 2018, 1 commit
  10. 02 October 2017, 1 commit
  11. 20 July 2017, 1 commit
  12. 24 June 2017, 4 commits
    • iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table · 77f34458
      Will Deacon authored
      When writing a new table entry, we must ensure that the contents of the
      table is made visible to the SMMU page table walker before the updated
      table entry itself.
      
      This is currently achieved using wmb(), which expands to an expensive and
      unnecessary DSB instruction. Ideally, we'd just use cmpxchg64_release when
      writing the table entry, but this doesn't have memory ordering semantics
      on !SMP systems.
      
      Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this
      does more than we require (since it targets the outer-shareable domain),
      but it's likely to be significantly faster than the DSB approach.
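
      The publishing pattern, in sketch form:

            /* Make the new table's contents visible to the walker... */
            dma_wmb();      /* DMB OSHST, rather than the DSB of wmb() */

            /* ...before publishing the entry that points to it */
            *ptep = new_table_pte;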
      Reported-by: Linu Cherian <linu.cherian@cavium.com>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Support lockless operation · 2c3d273e
      Robin Murphy authored
      For parallel I/O with multiple concurrent threads servicing the same
      device (or devices, if several share a domain), serialising page table
      updates becomes a massive bottleneck. On reflection, though, we don't
      strictly need to do that - for valid IOMMU API usage, there are in fact
      only two races that we need to guard against: multiple map requests for
      different blocks within the same region, when the intermediate-level
      table for that region does not yet exist; and multiple unmaps of
      different parts of the same block entry. Both of those are fairly easily
      solved by using a cmpxchg to install the new table, such that if we then
      find that someone else's table got there first, we can simply free ours
      and continue.
      
      Make the requisite changes such that we can withstand being called
      without the caller maintaining a lock. In theory, this opens up a few
      corners in which wildly misbehaving callers making nonsensical
      overlapping requests might lead to crashes instead of just unpredictable
      results, but correct code really does not deserve to pay a significant
      performance cost for the sake of masking bugs in theoretical broken code.
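
      The racy table installation resolves roughly like this (sketch):

            /* Atomically try to install our freshly-allocated table */
            old = cmpxchg64_relaxed(ptep, 0ULL, new_table_pte);
            if (old) {
                    /* Someone else's table got there first: free ours
                     * and continue with theirs */
                    __arm_lpae_free_pages(new_table, tablesz, cfg);
                    new_table = iopte_deref(old, data);
            }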
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable: Introduce explicit coherency · 81b3c252
      Robin Murphy authored
      Once we remove the serialising spinlock, a potential race opens up for
      non-coherent IOMMUs whereby a caller of .map() can be sure that cache
      maintenance has been performed on their new PTE, but will have no
      guarantee that such maintenance for table entries above it has actually
      completed (e.g. if another CPU took an interrupt immediately after
      writing the table entry, but before initiating the DMA sync).
      
      Handling this race safely will add some potentially non-trivial overhead
      to installing a table entry, which we would much rather avoid on
      coherent systems where it will be unnecessary, and where we are striving
      to minimise latency by removing the locking in the first place.
      
      To that end, let's introduce an explicit notion of cache-coherency to
      io-pgtable, such that we will be able to avoid penalising IOMMUs which
      know enough to know when they are coherent.
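
      Format code can then skip cache maintenance entirely for coherent
      walkers; a sketch using the quirk this patch introduces:

            static void __arm_lpae_set_pte(arm_lpae_iopte *ptep,
                                           arm_lpae_iopte pte,
                                           struct io_pgtable_cfg *cfg)
            {
                    *ptep = pte;

                    /* Only non-coherent walkers need the DMA sync */
                    if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
                            dma_sync_single_for_device(cfg->iommu_dev,
                                            __arm_lpae_dma_addr(ptep),
                                            sizeof(pte), DMA_TO_DEVICE);
            }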
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Improve split_blk_unmap · fb3a9579
      Robin Murphy authored
      The current split_blk_unmap implementation suffers from some inscrutable
      pointer trickery for creating the tables to replace the block entry, but
      more than that it also suffers from hideous inefficiency. For example,
      the most pathological case of unmapping a level 3 page from a level 1
      block will allocate 513 lower-level tables to remap the entire block at
      page granularity, when only 2 are actually needed (the rest can be
      covered by level 2 block entries).
      
      Also, we would like to be able to relax the spinlock requirement in
      future, for which the roll-back-and-try-again logic for race resolution
      would be pretty hideous under the current paradigm.
      
      Both issues can be resolved most neatly by turning things sideways:
      instead of repeatedly recursing into __arm_lpae_map() to build up an
      entire new sub-table depth-first, we can directly replace the block
      entry with a next-level table of block/page entries, then repeat by
      unmapping at the next level if necessary. With a little refactoring of
      some helper functions, the code ends up not much bigger than before, but
      considerably easier to follow and to adapt in future.
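
      In outline, the new approach looks like this (heavily simplified
      sketch):

            /* Build one next-level table covering the old block, leaving
             * a hole at the region being unmapped (or remapping it at a
             * finer grain) */
            for (i = 0; i < tablesz / sizeof(pte); i++) {
                    if (i == unmap_idx)
                            continue;
                    __arm_lpae_init_pte(data, blk_paddr + i * split_sz,
                                        prot, lvl, &tablep[i]);
            }

            /* Swap the table in for the block entry... */
            *ptep = table_pte;

            /* ...then recurse down a level if the target is smaller
             * than the new entries */
            if (unmap_idx < 0)
                    return __arm_lpae_unmap(data, iova, size, lvl, tablep);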
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  13. 06 April 2017, 1 commit
    • iommu/io-pgtable-arm: Avoid shift overflow in block size · 022f4e4f
      Robin Murphy authored
      The recursive nature of __arm_lpae_{map,unmap}() means that
      ARM_LPAE_BLOCK_SIZE() is evaluated for every level, including those
      where block mappings aren't possible. This in itself is harmless enough,
      as we will only ever be called with valid sizes from the pgsize_bitmap,
      and thus always recurse down past any imaginary block sizes. The only
      problem is that most of those imaginary sizes overflow the type used for
      the calculation, and thus trigger warnings under UBsan:
      
      [   63.020939] ================================================================================
      [   63.021284] UBSAN: Undefined behaviour in drivers/iommu/io-pgtable-arm.c:312:22
      [   63.021602] shift exponent 39 is too large for 32-bit type 'int'
      [   63.021909] CPU: 0 PID: 1119 Comm: lkvm Not tainted 4.7.0-rc3+ #819
      [   63.022163] Hardware name: FVP Base (DT)
      [   63.022345] Call trace:
      [   63.022629] [<ffffff900808f258>] dump_backtrace+0x0/0x3a8
      [   63.022975] [<ffffff900808f614>] show_stack+0x14/0x20
      [   63.023294] [<ffffff90086bc9dc>] dump_stack+0x104/0x148
      [   63.023609] [<ffffff9008713ce8>] ubsan_epilogue+0x18/0x68
      [   63.023956] [<ffffff9008714410>] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
      [   63.024365] [<ffffff900890fcb0>] __arm_lpae_map+0x720/0xae0
      [   63.024732] [<ffffff9008910170>] arm_lpae_map+0x100/0x190
      [   63.025049] [<ffffff90089183d8>] arm_smmu_map+0x78/0xc8
      [   63.025390] [<ffffff9008906c18>] iommu_map+0x130/0x230
      [   63.025763] [<ffffff9008bf7564>] vfio_iommu_type1_attach_group+0x4bc/0xa00
      [   63.026156] [<ffffff9008bf3c78>] vfio_fops_unl_ioctl+0x320/0x580
      [   63.026515] [<ffffff9008377420>] do_vfs_ioctl+0x140/0xd28
      [   63.026858] [<ffffff9008378094>] SyS_ioctl+0x8c/0xa0
      [   63.027179] [<ffffff9008086e70>] el0_svc_naked+0x24/0x28
      [   63.027412] ================================================================================
      
      Perform the shift in a 64-bit type to prevent the theoretical overflow
      and keep the peace. As it turns out, this generates identical code for
      32-bit ARM, and marginally shorter AArch64 code, so it's good all round.
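
      The fix is just a type change in the macro, along these lines
      (sketch):

            /* Before: a plain int shift, so exponents >= 32 overflow */
            #define ARM_LPAE_BLOCK_SIZE(l, d) \
                    (1 << ARM_LPAE_LVL_SHIFT(l, d))

            /* After: perform the shift in a 64-bit type */
            #define ARM_LPAE_BLOCK_SIZE(l, d) \
                    (1ULL << ARM_LPAE_LVL_SHIFT(l, d))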
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  14. 11 March 2017, 1 commit
  15. 19 January 2017, 1 commit
  16. 29 November 2016, 2 commits
  17. 01 July 2016, 1 commit
    • iommu/io-pgtable-arm: Fix iova_to_phys for block entries · 7c6d90e2
      Will Deacon authored
      The implementation of iova_to_phys for the long-descriptor ARM
      io-pgtable code always masks with the granule size when inserting the
      low virtual address bits into the physical address determined from the
      page tables. In cases where the leaf entry is found before the final
      level of table (i.e. due to a block mapping), this results in rounding
      down to the bottom page of the block mapping. Consequently, the physical
      address range batching in vfio_unmap_unpin() is defeated and we end
      up taking the long way home.
      
      This patch fixes the problem by masking the virtual address with the
      appropriate mask for the level at which the leaf descriptor is located.
      The short-descriptor code already gets this right, so no change is
      needed there.
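
      The essence of the fix, as a sketch:

            found_translation:
                    /* Mask with the block size of the level at which the
                     * leaf was found, not the granule size */
                    iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1);
                    return ((phys_addr_t)iopte_to_pfn(pte, data)
                                    << data->pg_shift) | iova;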
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Tested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  18. 07 April 2016, 1 commit
  19. 17 February 2016, 2 commits
  20. 29 January 2016, 1 commit
    • iommu/io-pgtable-arm: Fix io-pgtable-arm build failure · 8f6aff98
      Lada Trimasova authored
      Trying to build a kernel for ARC with both options CONFIG_COMPILE_TEST
      and CONFIG_IOMMU_IO_PGTABLE_LPAE enabled (e.g. as a result of "make
      allyesconfig") results in the following build failure:
      
       | CC drivers/iommu/io-pgtable-arm.o
       | linux/drivers/iommu/io-pgtable-arm.c: In
       | function ‘__arm_lpae_alloc_pages’:
       | linux/drivers/iommu/io-pgtable-arm.c:221:3:
       | error: implicit declaration of function ‘dma_map_single’
       | [-Werror=implicit-function-declaration]
       | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
       | ^
       | linux/drivers/iommu/io-pgtable-arm.c:221:42:
       | error: ‘DMA_TO_DEVICE’ undeclared (first use in this function)
       | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
       | ^
      
      Since IOMMU_IO_PGTABLE_LPAE depends on DMA API, io-pgtable-arm.c should
      include linux/dma-mapping.h. This fixes the reported failure.
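
      That is, the missing one-line include:

            #include <linux/dma-mapping.h> /* dma_map_single(), DMA_TO_DEVICE */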
      
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  21. 17 December 2015, 4 commits
  22. 23 September 2015, 1 commit
    • iommu/io-pgtable-arm: Don't use dma_to_phys() · ffcb6d16
      Robin Murphy authored
      In checking whether DMA addresses differ from physical addresses, using
      dma_to_phys() is actually the wrong thing to do, since it may hide any
      DMA offset, which is precisely one of the things we are checking for.
      Simply casting between the two address types, whilst ugly, is in fact
      the appropriate course of action. Further care (and ugliness) is also
      necessary in the comparison to avoid truncation if phys_addr_t and
      dma_addr_t differ in size.
      
      We can also reject any device with a fixed DMA offset up-front at page
      table creation, leaving the allocation-time check for the more subtle
      cases like bounce buffering due to an incorrect DMA mask.
      
      Furthermore, we can then fix the hackish Kconfig dependency so that
      architectures without a dma_to_phys() implementation may still
      COMPILE_TEST (or even use!) the code. The true dependency is on the
      DMA API, so use the appropriate symbol for that.
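
      The allocation-time check then looks something like this (sketch; the
      real patch adds further casting to guard against phys_addr_t and
      dma_addr_t differing in size):

            dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
            if (dma_mapping_error(dev, dma))
                    goto out_free;
            /* The walker needs true physical addresses: any translation
             * (or truncation) by the DMA layer bodes very badly */
            if (dma != virt_to_phys(pages))
                    goto out_unmap;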
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: folded in selftest fix from Yong Wu]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  23. 18 August 2015, 1 commit
    • iommu/io-pgtable-arm: Unmap and free table when overwriting with block · cf27ec93
      Will Deacon authored
      When installing a block mapping, we unconditionally overwrite a non-leaf
      PTE if we find one. However, this can cause a problem if the following
      sequence of events occur:
      
        (1) iommu_map called for a 4k (i.e. PAGE_SIZE) mapping at some address
            - We initialise the page table all the way down to a leaf entry
            - No TLB maintenance is required, because we're going from invalid
              to valid.
      
        (2) iommu_unmap is called on the mapping installed in (1)
            - We walk the page table to the final (leaf) entry and zero it
            - We only changed a valid leaf entry, so we invalidate leaf-only
      
        (3) iommu_map is called on the same address as (1), but this time for
            a 2MB (i.e. BLOCK_SIZE) mapping
            - We walk the page table down to the penultimate level, where we
              find a table entry
            - We overwrite the table entry with a block mapping and return
              without any TLB maintenance and without freeing the memory used
              by the now-orphaned table.
      
      This last step can lead to a walk-cache caching the overwritten table
      entry, causing unexpected faults when the new mapping is accessed by a
      device. One way to fix this would be to collapse the page table when
      freeing the last page at a given level, but this would require expensive
      iteration on every map call. Instead, this patch detects the case when
      we are overwriting a table entry and explicitly unmaps the table first,
      which takes care of both freeing and TLB invalidation.
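
      Condensed, the fix in the map path looks roughly like this (sketch;
      tblp points back to the start of the table containing ptep):

            if (iopte_leaf(pte, lvl)) {
                    /* Overwriting a live block requires an unmap first:
                     * that's a caller bug */
                    WARN_ON(1);
                    return -EEXIST;
            } else if (pte) {
                    /* A table lives here: unmap it, which both frees the
                     * sub-tables and performs the TLB invalidation */
                    size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);

                    if (WARN_ON(__arm_lpae_unmap(data, iova, sz,
                                                 lvl, tblp) != sz))
                            return -EINVAL;
            }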
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Brian Starkey <brian.starkey@arm.com>
      Tested-by: Brian Starkey <brian.starkey@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  24. 06 August 2015, 3 commits
    • iommu/io-pgtable: Remove flush_pgtable callback · f5b83190
      Robin Murphy authored
      With the users fully converted to DMA API operations, it's dead, Jim.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Centralise sync points · 87a91b15
      Robin Murphy authored
      With all current users now opted in to DMA API operations, make the
      iommu_dev pointer mandatory, rendering the flush_pgtable callback
      redundant for cache maintenance. However, since the DMA calls could be
      nops in the case of a coherent IOMMU, we still need to ensure the page
      table updates are fully synchronised against a subsequent page table
      walk. In the unmap path, the TLB sync will usually need to do this
      anyway, so just cement that requirement; in the map path which may
      consist solely of cacheable memory writes (in the coherent case),
      insert an appropriate barrier at the end of the operation, and obviate
      the need to call flush_pgtable on every individual update for
      synchronisation.
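
      On the map path this boils down to a single trailing barrier
      (sketch):

            /* End of arm_lpae_map(): the updates may have been nothing
             * but cacheable stores on a coherent IOMMU, so make sure all
             * PTE writes are observable before anything can kick off a
             * walk for the new iova */
            wmb();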
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: slight clarification to tlb_sync comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Allow appropriate DMA API use · f8d54961
      Robin Murphy authored
      Currently, users of the LPAE page table code are (ab)using dma_map_page()
      as a means to flush page table updates for non-coherent IOMMUs. Since
      from the CPU's point of view, creating IOMMU page tables *is* passing
      DMA buffers to a device (the IOMMU's page table walker), there's little
      reason not to use the DMA API correctly.
      
      Allow IOMMU drivers to opt into DMA API operations for page table
      allocation and updates by providing their appropriate device pointer.
      The expectation is that an LPAE IOMMU should have a full view of system
      memory, so use streaming mappings to avoid unnecessary pressure on
      ZONE_DMA, and treat any DMA translation as a warning sign.
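
      PTE updates are then pushed out to the walker with an ordinary
      streaming-DMA sync; a sketch of the pattern:

            *ptep = pte;
            dma_sync_single_for_device(cfg->iommu_dev,
                                       __arm_lpae_dma_addr(ptep),
                                       sizeof(pte), DMA_TO_DEVICE);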
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  25. 27 March 2015, 1 commit
    • iommu/io-pgtable-arm: avoid speculative walks through TTBR1 · 63979b8d
      Will Deacon authored
      Although we set TCR.T1SZ to 0, the input address range covered by TTBR1
      is actually calculated using T0SZ in this case on the ARM SMMU. This
      could theoretically lead to speculative table walks through physical
      address zero, leading to all sorts of fun and games if we have MMIO
      regions down there.
      
      This patch avoids the issue by setting EPD1 to disable walks through
      the unused TTBR1 register.
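
      The change is a single TCR bit (sketch; EPD1 is bit 23 of the TCR in
      the VMSA):

            #define ARM_LPAE_TCR_EPD1       (1 << 23)

            /* Disable table walks via the unused TTBR1 */
            reg |= ARM_LPAE_TCR_EPD1;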
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  26. 25 February 2015, 1 commit
  27. 19 January 2015, 3 commits