1. 24 Jun, 2017 9 commits
    • R
      iommu/io-pgtable: Introduce explicit coherency · 81b3c252
      Robin Murphy committed
      Once we remove the serialising spinlock, a potential race opens up for
      non-coherent IOMMUs whereby a caller of .map() can be sure that cache
      maintenance has been performed on their new PTE, but will have no
      guarantee that such maintenance for table entries above it has actually
      completed (e.g. if another CPU took an interrupt immediately after
      writing the table entry, but before initiating the DMA sync).
      
      Handling this race safely will add some potentially non-trivial overhead
      to installing a table entry, which we would much rather avoid on
      coherent systems where it will be unnecessary, and where we are striving
      to minimise latency by removing the locking in the first place.
      
      To that end, let's introduce an explicit notion of cache-coherency to
      io-pgtable, such that we will be able to avoid penalising IOMMUs which
      know enough to know when they are coherent.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      81b3c252
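The idea of an explicit coherency flag can be sketched as follows. This is a minimal illustration, not the kernel's io-pgtable code: `struct pgtable_cfg`, `coherent_walk`, `sync_pte()` and `set_pte()` are hypothetical names, and the cache maintenance is simulated with a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's io-pgtable definitions. */
typedef uint64_t pte_t;

struct pgtable_cfg {
	bool coherent_walk;       /* assumed flag: table walks snoop CPU caches */
	unsigned long sync_count; /* counts simulated cache-maintenance calls */
};

/* Stand-in for the DMA sync (cache clean) of a single table entry. */
static void sync_pte(struct pgtable_cfg *cfg, pte_t *ptep)
{
	(void)ptep;
	cfg->sync_count++;
}

/* Write an entry, paying for cache maintenance only when it is needed. */
static void set_pte(struct pgtable_cfg *cfg, pte_t *ptep, pte_t val)
{
	*ptep = val;
	if (!cfg->coherent_walk)
		sync_pte(cfg, ptep);
}
```

With the flag set, installing an entry is a plain store; any extra synchronisation for the race described above would only be added to the non-coherent path.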
    • R
      iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap · b9f1ef30
      Robin Murphy committed
      Whilst the short-descriptor format's split_blk_unmap implementation has
      no need to be recursive, it followed the pattern of the LPAE version
      anyway for the sake of consistency. With the latter now reworked for
      both efficiency and future scalability improvements, tweak the former
      similarly, not least to make it less obtuse.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b9f1ef30
    • R
      iommu/io-pgtable-arm: Improve split_blk_unmap · fb3a9579
      Robin Murphy committed
      The current split_blk_unmap implementation suffers from some inscrutable
      pointer trickery for creating the tables to replace the block entry, but
      more than that it also suffers from hideous inefficiency. For example,
      the most pathological case of unmapping a level 3 page from a level 1
      block will allocate 513 lower-level tables to remap the entire block at
      page granularity, when only 2 are actually needed (the rest can be
      covered by level 2 block entries).
      
      Also, we would like to be able to relax the spinlock requirement in
      future, for which the roll-back-and-try-again logic for race resolution
      would be pretty hideous under the current paradigm.
      
      Both issues can be resolved most neatly by turning things sideways:
      instead of repeatedly recursing into __arm_lpae_map() to build up an
      entire new sub-table depth-first, we can directly replace the block
      entry with a next-level table of block/page entries, then repeat by
      unmapping at the next level if necessary. With a little refactoring of
      some helper functions, the code ends up not much bigger than before, but
      considerably easier to follow and to adapt in future.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fb3a9579
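The 513-versus-2 figure quoted above can be reproduced with a little arithmetic. The sketch below assumes 512 entries per table (consistent with 513 = 1 + 512, as with a 4 KiB granule); the function names are hypothetical and only count allocations, they do not build any tables.

```c
/* Assumption: 512 entries per table (e.g. 4 KiB granule, 8-byte entries). */
#define PTES_PER_TABLE 512UL

/*
 * Tables allocated when a block at block_lvl is remapped entirely at
 * page_lvl granularity: one table at the first level below the block,
 * then PTES_PER_TABLE times as many at each further level.
 */
static unsigned long tables_full_remap(int block_lvl, int page_lvl)
{
	unsigned long total = 0, at_level = 1;

	for (int lvl = block_lvl + 1; lvl <= page_lvl; lvl++) {
		total += at_level;
		at_level *= PTES_PER_TABLE;
	}
	return total;
}

/*
 * Tables allocated when each step replaces a single block entry with one
 * next-level table of block/page entries, then repeats one level down.
 */
static unsigned long tables_sideways(int block_lvl, int page_lvl)
{
	return (unsigned long)(page_lvl - block_lvl);
}
```

For a level 3 page unmapped from a level 1 block, `tables_full_remap(1, 3)` gives 513 and `tables_sideways(1, 3)` gives 2; the other 511 entries of the new level 2 table stay as block entries.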
    • R
      iommu/io-pgtable-arm-v7s: Check table PTEs more precisely · 9db829d2
      Robin Murphy committed
      Whilst we don't support the PXN bit at all, so should never encounter a
      level 1 section or supersection PTE with it set, it would still be wise
      to check both table type bits to resolve any theoretical ambiguity.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      9db829d2
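A sketch of why both type bits matter, assuming the usual short-descriptor level 1 encoding in bits [1:0]: 0b01 for a table, 0b10 for a section or supersection, with bit 0 acting as PXN when bit 1 is set. The macro and function names are hypothetical, and the "loose" variant is only an illustration of a bit-0-only test, not a quote of the previous kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; encoding assumed from the short-descriptor format. */
#define L1_TYPE_MASK  0x3u
#define L1_TYPE_TABLE 0x1u

/* Tests bit 0 only, so a section with PXN set (0b11) also matches. */
static bool l1_is_table_loose(uint32_t pte)
{
	return (pte & L1_TYPE_TABLE) != 0;
}

/* Compares both type bits, so only 0b01 is treated as a table. */
static bool l1_is_table_precise(uint32_t pte)
{
	return (pte & L1_TYPE_MASK) == L1_TYPE_TABLE;
}
```

The two checks disagree only for entries ending in 0b11, which is exactly the PXN case the commit message calls theoretical.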
    • A
      iommu: arm-smmu: Handle return of iommu_device_register. · 5c2d0218
      Arvind Yadav committed
      iommu_device_register returns an error code and, although it currently
      never fails, we should check its return value anyway.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      [will: adjusted to follow arm-smmu.c]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      5c2d0218
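The pattern being added is simply "check and propagate". The sketch below uses a stub in place of the real registration call so that it compiles on its own; `struct fake_iommu`, `fake_iommu_device_register()` and `fake_probe()` are hypothetical and do not reflect the driver's actual probe path.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's device structure. */
struct fake_iommu {
	int registered;
};

/* Stub for the registration call; fails on demand so the error path runs. */
static int fake_iommu_device_register(struct fake_iommu *iommu, int simulate_error)
{
	if (simulate_error)
		return simulate_error;
	iommu->registered = 1;
	return 0;
}

/* Probe-style caller that checks the return value instead of ignoring it. */
static int fake_probe(struct fake_iommu *iommu, int simulate_error)
{
	int ret = fake_iommu_device_register(iommu, simulate_error);

	if (ret) {
		fprintf(stderr, "failed to register iommu: %d\n", ret);
		return ret;
	}
	return 0;
}
```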
    • A
      iommu: arm-smmu-v3: make of_device_ids const · ebdd13c9
      Arvind Yadav committed
      of_device_ids are not supposed to change at runtime. All functions
      working with of_device_ids provided by <linux/of.h> work with const
      of_device_ids. So mark the non-const structs as const.
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      ebdd13c9
    • R
      iommu/arm-smmu: Plumb in new ACPI identifiers · 84c24379
      Robin Murphy committed
      Revision C of IORT now allows us to identify ARM MMU-401 and the Cavium
      ThunderX implementation. Wire them up so that we can probe these models
      once firmware starts using the new codes in place of generic ones, and
      so that the appropriate features and quirks get enabled when we do.
      
      For the sake of backports and mitigating synchronisation problems with
      the ACPICA headers, we'll carry a backup copy of the new definitions
      locally for the short term to make life simpler.
      
      CC: stable@vger.kernel.org # 4.10
      Acked-by: Robert Richter <rrichter@cavium.com>
      Tested-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      84c24379
    • A
      iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops. · 60ab7a75
      Arvind Yadav committed
      File size before:
         text	   data	    bss	    dec	    hex	filename
         6146	     56	      9	   6211	   1843	drivers/iommu/io-pgtable-arm-v7s.o
      
      File size after adding 'const':
         text	   data	    bss	    dec	    hex	filename
         6170	     24	      9	   6203	   183b	drivers/iommu/io-pgtable-arm-v7s.o
      Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      60ab7a75
    • S
      iommu/arm-smmu-v3: Increase CMDQ drain timeout value · b847de4e
      Sunil Goutham committed
      Waiting for a CMD_SYNC to be processed involves waiting for the command
      queue to drain, which can take an awful lot longer than waiting for a
      single entry to become available. Consequently, the common timeout value
      of 100us has been observed to be too short on some platforms when a
      CMD_SYNC is issued into a queue full of TLBI commands.
      
      This patch resolves the issue by using a different (1s) timeout when
      waiting for the CMDQ to drain and using a simple back-off mechanism
      when polling the cons pointer in the absence of WFE support.
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      [will: rewrote commit message and cosmetic changes]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b847de4e
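A polling loop with a longer drain timeout and a back-off might look like the sketch below. It runs against a simulated clock rather than real hardware; `struct sim`, `queue_empty()`, `sim_udelay()` and `poll_until_drained()` are hypothetical, and doubling the delay on each iteration is an assumption about what "simple back-off" means here. Only the 100us and 1s figures come from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRY_TIMEOUT_US 100u     /* common timeout quoted in the commit */
#define DRAIN_TIMEOUT_US 1000000u /* 1s drain timeout quoted in the commit */

/* Simulated environment: a clock plus the time at which the queue drains. */
struct sim {
	uint64_t now_us;
	uint64_t drained_at_us;
	unsigned int polls;
};

/* Stand-in for reading the cons pointer and comparing it with prod. */
static bool queue_empty(struct sim *s)
{
	s->polls++;
	return s->now_us >= s->drained_at_us;
}

/* Stand-in for udelay(): advances the simulated clock only. */
static void sim_udelay(struct sim *s, uint64_t us)
{
	s->now_us += us;
}

/* Poll until the queue is empty, doubling the delay between polls. */
static int poll_until_drained(struct sim *s, uint64_t timeout_us)
{
	uint64_t deadline = s->now_us + timeout_us;
	uint64_t delay_us = 1;

	while (!queue_empty(s)) {
		if (s->now_us >= deadline)
			return -ETIMEDOUT;
		sim_udelay(s, delay_us);
		delay_us *= 2; /* assumed back-off policy */
	}
	return 0;
}
```

In this simulation, a queue that needs 500us to drain times out under `ENTRY_TIMEOUT_US` but completes under `DRAIN_TIMEOUT_US` after only a handful of polls, because the gaps between polls grow geometrically.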
  2. 12 May, 2017 12 commits
  3. 11 May, 2017 12 commits
  4. 10 May, 2017 6 commits
  5. 09 May, 2017 1 commit
    • M
      qede: Split PF/VF ndos. · be47c555
      Mintz, Yuval committed
      PFs and VFs share the same structure of NDOs today,
      and the VFs explicitly fail the ndo_xdp() callback stating
      it doesn't support XDP.
      
      This results in lots of:
      
        [qede_xdp:1032(enp131s2)]VFs don't support XDP
        ------------[ cut here ]------------
        WARNING: CPU: 4 PID: 1426 at net/core/rtnetlink.c:1637 rtnl_dump_ifinfo+0x354/0x3c0
        ...
        Call Trace:
          ? __alloc_skb+0x9b/0x1d0
          netlink_dump+0x122/0x290
          netlink_recvmsg+0x27d/0x430
          sock_recvmsg+0x3d/0x50
        ...
      
      As every dump request for the VF interface info would fail due to
      rtnl_xdp_fill() returning an error code.
      
      To resolve this, introduce a subset of the NDOs meant for the VF
      in a separate structure and register that one instead for VFs,
      and omit the ndo_xdp initialization.
      
      Fixes: 40b8c454 ("qede: Prevent VFs from using XDP")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      be47c555
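The split described above amounts to keeping two operation tables and leaving the XDP callback unset in the VF one. The following is a self-contained sketch with hypothetical types (`struct fake_netdev`, `struct fake_netdev_ops`) standing in for the real net_device structures; it is not the qede code.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_netdev;

/* Hypothetical, heavily reduced stand-in for a table of NDO callbacks. */
struct fake_netdev_ops {
	int (*ndo_open)(struct fake_netdev *dev);
	int (*ndo_xdp)(struct fake_netdev *dev);
};

struct fake_netdev {
	const struct fake_netdev_ops *ops;
	bool is_vf;
};

static int fake_open(struct fake_netdev *dev) { (void)dev; return 0; }
static int fake_xdp(struct fake_netdev *dev)  { (void)dev; return 0; }

/* PFs get the full set of callbacks. */
static const struct fake_netdev_ops fake_pf_ops = {
	.ndo_open = fake_open,
	.ndo_xdp  = fake_xdp,
};

/* VFs get a subset: ndo_xdp is omitted, so it stays NULL. */
static const struct fake_netdev_ops fake_vf_ops = {
	.ndo_open = fake_open,
};

/* Pick the table at setup time instead of failing the callback at runtime. */
static void fake_set_ops(struct fake_netdev *dev)
{
	dev->ops = dev->is_vf ? &fake_vf_ops : &fake_pf_ops;
}

/* A caller can then skip XDP handling when no callback is present. */
static bool fake_has_xdp(const struct fake_netdev *dev)
{
	return dev->ops->ndo_xdp != NULL;
}
```

With the callback absent rather than failing, code that dumps interface info has nothing to call for a VF and so has no error to report.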