1. 13 April 2019, 1 commit
  2. 11 February 2019, 1 commit
    • iommu: Allow io-pgtable to be used outside of drivers/iommu/ · b77cf11f
      Rob Herring authored
      Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops
      and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to
      use the page table library. Specifically, some ARM Mali GPUs use the
      ARM page table formats.
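
      A minimal sketch of how a driver outside drivers/iommu/ might now use
      the library (the my_tlb_* callbacks, dev and cookie are illustrative
      placeholders, not taken from any real driver):

            #include <linux/io-pgtable.h>

            static const struct iommu_gather_ops my_tlb_ops = {
                    .tlb_flush_all = my_tlb_flush_all, /* hypothetical */
                    .tlb_add_flush = my_tlb_add_flush,
                    .tlb_sync      = my_tlb_sync,
            };

            struct io_pgtable_cfg cfg = {
                    .pgsize_bitmap = SZ_4K | SZ_2M | SZ_1G,
                    .ias           = 48,    /* input (IOVA) bits */
                    .oas           = 40,    /* output (PA) bits */
                    .tlb           = &my_tlb_ops,
                    .iommu_dev     = dev,
            };
            struct io_pgtable_ops *ops;

            ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
            if (!ops)
                    return -ENOMEM;
            /* ... ops->map(), ops->unmap(), ops->iova_to_phys() ... */
            free_io_pgtable_ops(ops);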
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: iommu@lists.linux-foundation.org
      Cc: linux-mediatek@lists.infradead.org
      Cc: linux-arm-msm@vger.kernel.org
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  3. 01 October 2018, 2 commits
    • iommu/io-pgtable-arm: Add support for non-strict mode · b6b65ca2
      Zhen Lei authored
      Non-strict mode is simply a case of skipping 'regular' leaf TLBIs, since
      the sync is already factored out into ops->iotlb_sync at the core API
      level. Non-leaf invalidations where we change the page table structure
      itself still have to be issued synchronously in order to maintain walk
      caches correctly.
      
      To save having to reason about it too much, make sure the invalidation
      in arm_lpae_split_blk_unmap() just performs its own unconditional sync
      to minimise the window in which we're technically violating the break-
      before-make requirement on a live mapping. This might work out redundant
      with an outer-level sync for strict unmaps, but we'll never be splitting
      blocks on a DMA fastpath anyway.
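
      The gist as a sketch (quirk name per this patch; the non-leaf case,
      which must stay synchronous, is elided):

            /* Unmap path, leaf case: the TLBI may be deferred when the
             * domain opts in to non-strict mode */
            if (!(iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT))
                    io_pgtable_tlb_add_flush(iop, iova, size, size, true);
            /* otherwise, ops->iotlb_sync() sweeps it up later */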
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [rm: tweak comment, commit message, split_blk_unmap logic and barriers]
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Fix race handling in split_blk_unmap() · 85c7a0f1
      Robin Murphy authored
      In removing the pagetable-wide lock, we gained the possibility of the
      vanishingly unlikely case where we have a race between two concurrent
      unmappers splitting the same block entry. The logic to handle this is
      fairly straightforward - whoever loses the race frees their partial
      next-level table and instead dereferences the winner's newly-installed
      entry in order to fall back to a regular unmap, which intentionally
      echoes the pre-existing case of recursively splitting a 1GB block down
      to 4KB pages by installing a full table of 2MB blocks first.
      
      Unfortunately, the chump who implemented that logic failed to update the
      condition check for that fallback, meaning that if said race occurs at
      the last level (where the loser's unmap_idx is valid) then the unmap
      won't actually happen. Fix that to properly account for both the race
      and recursive cases.
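
      The fixed control flow, condensed (a sketch rather than the verbatim
      patch):

            pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg);
            if (pte != blk_pte) {
                    /* Lost the race: free our table, then fall back to a
                     * regular unmap through the winner's entry */
                    __arm_lpae_free_pages(tablep, tablesz, cfg);
                    tablep = iopte_deref(pte, data);
            } else if (unmap_idx >= 0) {
                    /* Our table went in and already omits the target:
                     * just invalidate and we're done */
                    io_pgtable_tlb_add_flush(&data->iop, iova, size,
                                             size, true);
                    return size;
            }

            /* Race-loser *and* recursive-split cases both end up here */
            return __arm_lpae_unmap(data, iova, size, lvl, tablep);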
      
      Fixes: 2c3d273e ("iommu/io-pgtable-arm: Support lockless operation")
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: re-jig control flow to avoid duplicate cmpxchg test]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 26 July 2018, 1 commit
  5. 29 May 2018, 1 commit
    • iommu/io-pgtable-arm: Make allocations NUMA-aware · 4b123757
      Robin Murphy authored
      We would generally expect pagetables to be read by the IOMMU more than
      written by the CPU, so in NUMA systems it makes sense to locate them
      close to the former and avoid cross-node pagetable walks if at all
      possible. As it turns out, we already have a handle on the IOMMU device
      for the sake of coherency management, so it's trivial to grab the
      appropriate NUMA node when allocating new pagetable pages.
      
      Note that we drop the semantics of alloc_pages_exact(), but that's fine
      since they have never been necessary: the only time we're allocating
      more than one page is for stage 2 top-level concatenation, but since
      that is based on the number of IPA bits, the size is always some exact
      power of two anyway.
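
      The allocation change amounts to roughly this (sketch):

            /* Place pagetable memory on the IOMMU's own NUMA node */
            int order = get_order(size);
            struct page *p;

            p = alloc_pages_node(dev ? dev_to_node(dev) : NUMA_NO_NODE,
                                 gfp | __GFP_ZERO, order);
            if (!p)
                    return NULL;
            pages = page_address(p);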
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  6. 03 May 2018, 1 commit
  7. 29 March 2018, 1 commit
  8. 27 March 2018, 1 commit
  9. 14 February 2018, 1 commit
  10. 02 October 2017, 1 commit
  11. 20 July 2017, 1 commit
  12. 24 June 2017, 4 commits
    • iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table · 77f34458
      Will Deacon authored
      When writing a new table entry, we must ensure that the contents of the
      table is made visible to the SMMU page table walker before the updated
      table entry itself.
      
      This is currently achieved using wmb(), which expands to an expensive and
      unnecessary DSB instruction. Ideally, we'd just use cmpxchg64_release when
      writing the table entry, but this doesn't have memory ordering semantics
      on !SMP systems.
      
      Instead, use dma_wmb(), which emits DMB OSHST. Strictly speaking, this
      does more than we require (since it targets the outer-shareable domain),
      but it's likely to be significantly faster than the DSB approach.
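
      The publishing pattern, in sketch form:

            /* Make the new table's contents visible to the walker... */
            dma_wmb();      /* DMB OSHST, rather than the DSB of wmb() */

            /* ...before publishing the entry that points to it */
            *ptep = new_table_pte;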
      Reported-by: Linu Cherian <linu.cherian@cavium.com>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Support lockless operation · 2c3d273e
      Robin Murphy authored
      For parallel I/O with multiple concurrent threads servicing the same
      device (or devices, if several share a domain), serialising page table
      updates becomes a massive bottleneck. On reflection, though, we don't
      strictly need to do that - for valid IOMMU API usage, there are in fact
      only two races that we need to guard against: multiple map requests for
      different blocks within the same region, when the intermediate-level
      table for that region does not yet exist; and multiple unmaps of
      different parts of the same block entry. Both of those are fairly easily
      solved by using a cmpxchg to install the new table, such that if we then
      find that someone else's table got there first, we can simply free ours
      and continue.
      
      Make the requisite changes such that we can withstand being called
      without the caller maintaining a lock. In theory, this opens up a few
      corners in which wildly misbehaving callers making nonsensical
      overlapping requests might lead to crashes instead of just unpredictable
      results, but correct code really does not deserve to pay a significant
      performance cost for the sake of masking bugs in theoretical broken code.
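
      The racy table installation resolves roughly like this (sketch):

            /* Atomically try to install our freshly-allocated table */
            old = cmpxchg64_relaxed(ptep, 0ULL, new_table_pte);
            if (old) {
                    /* Someone else's table got there first: free ours
                     * and continue with theirs */
                    __arm_lpae_free_pages(new_table, tablesz, cfg);
                    new_table = iopte_deref(old, data);
            }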
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable: Introduce explicit coherency · 81b3c252
      Robin Murphy authored
      Once we remove the serialising spinlock, a potential race opens up for
      non-coherent IOMMUs whereby a caller of .map() can be sure that cache
      maintenance has been performed on their new PTE, but will have no
      guarantee that such maintenance for table entries above it has actually
      completed (e.g. if another CPU took an interrupt immediately after
      writing the table entry, but before initiating the DMA sync).
      
      Handling this race safely will add some potentially non-trivial overhead
      to installing a table entry, which we would much rather avoid on
      coherent systems where it will be unnecessary, and where we are striving
      to minimise latency by removing the locking in the first place.
      
      To that end, let's introduce an explicit notion of cache-coherency to
      io-pgtable, such that we will be able to avoid penalising IOMMUs which
      know enough to know when they are coherent.
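
      Format code can then skip cache maintenance entirely for coherent
      walkers; a sketch using the quirk this patch introduces:

            static void __arm_lpae_set_pte(arm_lpae_iopte *ptep,
                                           arm_lpae_iopte pte,
                                           struct io_pgtable_cfg *cfg)
            {
                    *ptep = pte;

                    /* Only non-coherent walkers need the DMA sync */
                    if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
                            dma_sync_single_for_device(cfg->iommu_dev,
                                            __arm_lpae_dma_addr(ptep),
                                            sizeof(pte), DMA_TO_DEVICE);
            }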
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Improve split_blk_unmap · fb3a9579
      Robin Murphy authored
      The current split_blk_unmap implementation suffers from some inscrutable
      pointer trickery for creating the tables to replace the block entry, but
      more than that it also suffers from hideous inefficiency. For example,
      the most pathological case of unmapping a level 3 page from a level 1
      block will allocate 513 lower-level tables to remap the entire block at
      page granularity, when only 2 are actually needed (the rest can be
      covered by level 2 block entries).
      
      Also, we would like to be able to relax the spinlock requirement in
      future, for which the roll-back-and-try-again logic for race resolution
      would be pretty hideous under the current paradigm.
      
      Both issues can be resolved most neatly by turning things sideways:
      instead of repeatedly recursing into __arm_lpae_map() to build up an
      entire new sub-table depth-first, we can directly replace the block
      entry with a next-level table of block/page entries, then repeat by
      unmapping at the next level if necessary. With a little refactoring of
      some helper functions, the code ends up not much bigger than before, but
      considerably easier to follow and to adapt in future.
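
      In outline, the new approach looks like this (heavily simplified
      sketch):

            /* Build one next-level table covering the old block, leaving
             * a hole at the region being unmapped (or remapping it at a
             * finer grain) */
            for (i = 0; i < tablesz / sizeof(pte); i++) {
                    if (i == unmap_idx)
                            continue;
                    __arm_lpae_init_pte(data, blk_paddr + i * split_sz,
                                        prot, lvl, &tablep[i]);
            }

            /* Swap the table in for the block entry... */
            *ptep = table_pte;

            /* ...then recurse down a level if the target is smaller
             * than the new entries */
            if (unmap_idx < 0)
                    return __arm_lpae_unmap(data, iova, size, lvl, tablep);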
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  13. 06 April 2017, 1 commit
    • iommu/io-pgtable-arm: Avoid shift overflow in block size · 022f4e4f
      Robin Murphy authored
      The recursive nature of __arm_lpae_{map,unmap}() means that
      ARM_LPAE_BLOCK_SIZE() is evaluated for every level, including those
      where block mappings aren't possible. This in itself is harmless enough,
      as we will only ever be called with valid sizes from the pgsize_bitmap,
      and thus always recurse down past any imaginary block sizes. The only
      problem is that most of those imaginary sizes overflow the type used for
      the calculation, and thus trigger warnings under UBsan:
      
      [   63.020939] ================================================================================
      [   63.021284] UBSAN: Undefined behaviour in drivers/iommu/io-pgtable-arm.c:312:22
      [   63.021602] shift exponent 39 is too large for 32-bit type 'int'
      [   63.021909] CPU: 0 PID: 1119 Comm: lkvm Not tainted 4.7.0-rc3+ #819
      [   63.022163] Hardware name: FVP Base (DT)
      [   63.022345] Call trace:
      [   63.022629] [<ffffff900808f258>] dump_backtrace+0x0/0x3a8
      [   63.022975] [<ffffff900808f614>] show_stack+0x14/0x20
      [   63.023294] [<ffffff90086bc9dc>] dump_stack+0x104/0x148
      [   63.023609] [<ffffff9008713ce8>] ubsan_epilogue+0x18/0x68
      [   63.023956] [<ffffff9008714410>] __ubsan_handle_shift_out_of_bounds+0x18c/0x1bc
      [   63.024365] [<ffffff900890fcb0>] __arm_lpae_map+0x720/0xae0
      [   63.024732] [<ffffff9008910170>] arm_lpae_map+0x100/0x190
      [   63.025049] [<ffffff90089183d8>] arm_smmu_map+0x78/0xc8
      [   63.025390] [<ffffff9008906c18>] iommu_map+0x130/0x230
      [   63.025763] [<ffffff9008bf7564>] vfio_iommu_type1_attach_group+0x4bc/0xa00
      [   63.026156] [<ffffff9008bf3c78>] vfio_fops_unl_ioctl+0x320/0x580
      [   63.026515] [<ffffff9008377420>] do_vfs_ioctl+0x140/0xd28
      [   63.026858] [<ffffff9008378094>] SyS_ioctl+0x8c/0xa0
      [   63.027179] [<ffffff9008086e70>] el0_svc_naked+0x24/0x28
      [   63.027412] ================================================================================
      
      Perform the shift in a 64-bit type to prevent the theoretical overflow
      and keep the peace. As it turns out, this generates identical code for
      32-bit ARM, and marginally shorter AArch64 code, so it's good all round.
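
      The fix is just a type change in the macro, along these lines
      (sketch):

            /* Before: a plain int shift, so exponents >= 32 overflow */
            #define ARM_LPAE_BLOCK_SIZE(l, d) \
                    (1 << ARM_LPAE_LVL_SHIFT(l, d))

            /* After: perform the shift in a 64-bit type */
            #define ARM_LPAE_BLOCK_SIZE(l, d) \
                    (1ULL << ARM_LPAE_LVL_SHIFT(l, d))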
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  14. 11 March 2017, 1 commit
  15. 19 January 2017, 1 commit
  16. 29 November 2016, 2 commits
  17. 01 July 2016, 1 commit
    • iommu/io-pgtable-arm: Fix iova_to_phys for block entries · 7c6d90e2
      Will Deacon authored
      The implementation of iova_to_phys for the long-descriptor ARM
      io-pgtable code always masks with the granule size when inserting the
      low virtual address bits into the physical address determined from the
      page tables. In cases where the leaf entry is found before the final
      level of table (i.e. due to a block mapping), this results in rounding
      down to the bottom page of the block mapping. Consequently, the physical
      address range batching in vfio_unmap_unpin() is defeated and we end
      up taking the long way home.
      
      This patch fixes the problem by masking the virtual address with the
      appropriate mask for the level at which the leaf descriptor is located.
      The short-descriptor code already gets this right, so no change is
      needed there.
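
      The essence of the fix, as a sketch:

            found_translation:
                    /* Mask with the block size of the level at which the
                     * leaf was found, not the granule size */
                    iova &= (ARM_LPAE_BLOCK_SIZE(lvl, data) - 1);
                    return ((phys_addr_t)iopte_to_pfn(pte, data)
                                    << data->pg_shift) | iova;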
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Tested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  18. 07 April 2016, 1 commit
  19. 17 February 2016, 2 commits
  20. 29 January 2016, 1 commit
    • iommu/io-pgtable-arm: Fix io-pgtable-arm build failure · 8f6aff98
      Lada Trimasova authored
      Trying to build a kernel for ARC with both options CONFIG_COMPILE_TEST
      and CONFIG_IOMMU_IO_PGTABLE_LPAE enabled (e.g. as a result of "make
      allyesconfig") results in the following build failure:
      
       | CC drivers/iommu/io-pgtable-arm.o
       | linux/drivers/iommu/io-pgtable-arm.c: In
       | function ‘__arm_lpae_alloc_pages’:
       | linux/drivers/iommu/io-pgtable-arm.c:221:3:
       | error: implicit declaration of function ‘dma_map_single’
       | [-Werror=implicit-function-declaration]
       | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
       | ^
       | linux/drivers/iommu/io-pgtable-arm.c:221:42:
       | error: ‘DMA_TO_DEVICE’ undeclared (first use in this function)
       | dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
       | ^
      
      Since IOMMU_IO_PGTABLE_LPAE depends on DMA API, io-pgtable-arm.c should
      include linux/dma-mapping.h. This fixes the reported failure.
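
      That is, the missing one-line include:

            #include <linux/dma-mapping.h> /* dma_map_single(), DMA_TO_DEVICE */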
      
      Cc: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Lada Trimasova <ltrimas@synopsys.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  21. 17 December 2015, 4 commits
  22. 23 September 2015, 1 commit
    • iommu/io-pgtable-arm: Don't use dma_to_phys() · ffcb6d16
      Robin Murphy authored
      In checking whether DMA addresses differ from physical addresses, using
      dma_to_phys() is actually the wrong thing to do, since it may hide any
      DMA offset, which is precisely one of the things we are checking for.
      Simply casting between the two address types, whilst ugly, is in fact
      the appropriate course of action. Further care (and ugliness) is also
      necessary in the comparison to avoid truncation if phys_addr_t and
      dma_addr_t differ in size.
      
      We can also reject any device with a fixed DMA offset up-front at page
      table creation, leaving the allocation-time check for the more subtle
      cases like bounce buffering due to an incorrect DMA mask.
      
      Furthermore, we can then fix the hackish Kconfig dependency so that
      architectures without a dma_to_phys() implementation may still
      COMPILE_TEST (or even use!) the code. The true dependency is on the
      DMA API, so use the appropriate symbol for that.
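
      The allocation-time check then looks something like this (sketch; the
      real patch adds further casting to guard against phys_addr_t and
      dma_addr_t differing in size):

            dma = dma_map_single(dev, pages, size, DMA_TO_DEVICE);
            if (dma_mapping_error(dev, dma))
                    goto out_free;
            /* The walker needs true physical addresses: any translation
             * (or truncation) by the DMA layer bodes very badly */
            if (dma != virt_to_phys(pages))
                    goto out_unmap;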
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: folded in selftest fix from Yong Wu]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  23. 18 August 2015, 1 commit
    • iommu/io-pgtable-arm: Unmap and free table when overwriting with block · cf27ec93
      Will Deacon authored
      When installing a block mapping, we unconditionally overwrite a non-leaf
      PTE if we find one. However, this can cause a problem if the following
      sequence of events occur:
      
        (1) iommu_map called for a 4k (i.e. PAGE_SIZE) mapping at some address
            - We initialise the page table all the way down to a leaf entry
            - No TLB maintenance is required, because we're going from invalid
              to valid.
      
        (2) iommu_unmap is called on the mapping installed in (1)
            - We walk the page table to the final (leaf) entry and zero it
            - We only changed a valid leaf entry, so we invalidate leaf-only
      
        (3) iommu_map is called on the same address as (1), but this time for
            a 2MB (i.e. BLOCK_SIZE) mapping
            - We walk the page table down to the penultimate level, where we
              find a table entry
            - We overwrite the table entry with a block mapping and return
              without any TLB maintenance and without freeing the memory used
              by the now-orphaned table.
      
      This last step can lead to a walk-cache caching the overwritten table
      entry, causing unexpected faults when the new mapping is accessed by a
      device. One way to fix this would be to collapse the page table when
      freeing the last page at a given level, but this would require expensive
      iteration on every map call. Instead, this patch detects the case when
      we are overwriting a table entry and explicitly unmaps the table first,
      which takes care of both freeing and TLB invalidation.
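
      Condensed, the fix in the map path looks roughly like this (sketch;
      tblp points back to the start of the table containing ptep):

            if (iopte_leaf(pte, lvl)) {
                    /* Overwriting a live block requires an unmap first:
                     * that's a caller bug */
                    WARN_ON(1);
                    return -EEXIST;
            } else if (pte) {
                    /* A table lives here: unmap it, which both frees the
                     * sub-tables and performs the TLB invalidation */
                    size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);

                    if (WARN_ON(__arm_lpae_unmap(data, iova, sz,
                                                 lvl, tblp) != sz))
                            return -EINVAL;
            }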
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Brian Starkey <brian.starkey@arm.com>
      Tested-by: Brian Starkey <brian.starkey@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
  24. 06 August 2015, 3 commits
    • iommu/io-pgtable: Remove flush_pgtable callback · f5b83190
      Robin Murphy authored
      With the users fully converted to DMA API operations, it's dead, Jim.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Centralise sync points · 87a91b15
      Robin Murphy authored
      With all current users now opted in to DMA API operations, make the
      iommu_dev pointer mandatory, rendering the flush_pgtable callback
      redundant for cache maintenance. However, since the DMA calls could be
      nops in the case of a coherent IOMMU, we still need to ensure the page
      table updates are fully synchronised against a subsequent page table
      walk. In the unmap path, the TLB sync will usually need to do this
      anyway, so just cement that requirement; in the map path which may
      consist solely of cacheable memory writes (in the coherent case),
      insert an appropriate barrier at the end of the operation, and obviate
      the need to call flush_pgtable on every individual update for
      synchronisation.
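
      On the map path this boils down to a single trailing barrier
      (sketch):

            /* End of arm_lpae_map(): the updates may have been nothing
             * but cacheable stores on a coherent IOMMU, so make sure all
             * PTE writes are observable before anything can kick off a
             * walk for the new iova */
            wmb();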
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      [will: slight clarification to tlb_sync comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • iommu/io-pgtable-arm: Allow appropriate DMA API use · f8d54961
      Robin Murphy authored
      Currently, users of the LPAE page table code are (ab)using dma_map_page()
      as a means to flush page table updates for non-coherent IOMMUs. Since
      from the CPU's point of view, creating IOMMU page tables *is* passing
      DMA buffers to a device (the IOMMU's page table walker), there's little
      reason not to use the DMA API correctly.
      
      Allow IOMMU drivers to opt into DMA API operations for page table
      allocation and updates by providing their appropriate device pointer.
      The expectation is that an LPAE IOMMU should have a full view of system
      memory, so use streaming mappings to avoid unnecessary pressure on
      ZONE_DMA, and treat any DMA translation as a warning sign.
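
      PTE updates are then pushed out to the walker with an ordinary
      streaming-DMA sync; a sketch of the pattern:

            *ptep = pte;
            dma_sync_single_for_device(cfg->iommu_dev,
                                       __arm_lpae_dma_addr(ptep),
                                       sizeof(pte), DMA_TO_DEVICE);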
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  25. 27 March 2015, 1 commit
    • iommu/io-pgtable-arm: avoid speculative walks through TTBR1 · 63979b8d
      Will Deacon authored
      Although we set TCR.T1SZ to 0, the input address range covered by TTBR1
      is actually calculated using T0SZ in this case on the ARM SMMU. This
      could theoretically lead to speculative table walks through physical
      address zero, leading to all sorts of fun and games if we have MMIO
      regions down there.
      
      This patch avoids the issue by setting EPD1 to disable walks through
      the unused TTBR1 register.
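
      The change is a single TCR bit (sketch; EPD1 is bit 23 of the TCR in
      the VMSA):

            #define ARM_LPAE_TCR_EPD1       (1 << 23)

            /* Disable table walks via the unused TTBR1 */
            reg |= ARM_LPAE_TCR_EPD1;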
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  26. 25 February 2015, 1 commit
  27. 19 January 2015, 3 commits