1. 16 9月, 2016 14 次提交
    • R
      iommu/arm-smmu: Keep track of S2CR state · 8e8b203e
      Robin Murphy 提交于
      Making S2CRs first-class citizens within the driver with a high-level
      representation of their state offers a neat solution to a few problems:
      
      Firstly, the information about which context a device's stream IDs are
      associated with is already present by necessity in the S2CR. With that
      state easily accessible we can refer directly to it and obviate the need
      to track an IOMMU domain in each device's archdata (its earlier purpose
      of enforcing correct attachment of multi-device groups now being handled
      by the IOMMU core itself).
      
      Secondly, the core API now deprecates explicit domain detach and expects
      domain attach to move devices smoothly from one domain to another; for
      SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
      to the device. By giving the driver a suitable abstraction of those
      S2CRs to work with, we can massively reduce the overhead of the current
      heavy-handed "detach, free resources, reallocate resources, attach"
      approach.
      
      Thirdly, making the software state hardware-shaped and attached to the
      SMMU instance once again makes suspend/resume of this register group
      that much simpler to implement in future.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8e8b203e
    • R
      iommu/arm-smmu: Consolidate stream map entry state · 1f3d5ca4
      Robin Murphy 提交于
      In order to consider SMR masking, we really want to be able to validate
      ID/mask pairs against existing SMR contents to prevent stream match
      conflicts, which at best would cause transactions to fault unexpectedly,
      and at worst lead to silent unpredictable behaviour. With our SMMU
      instance data holding only an allocator bitmap, and the SMR values
      themselves scattered across master configs hanging off devices which we
      may have no way of finding, there's essentially no way short of digging
      everything back out of the hardware. Similarly, the thought of power
      management ops to support suspend/resume faces the exact same problem.
      
      By massaging the software state into a closer shape to the underlying
      hardware, everything comes together quite nicely; the allocator and the
      high-level view of the data become a single centralised state which we
      can easily keep track of, and to which any updates can be validated in
      full before being synchronised to the hardware itself.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      1f3d5ca4
    • R
      iommu/arm-smmu: Handle stream IDs more dynamically · 21174240
      Robin Murphy 提交于
      Rather than assuming fixed worst-case values for stream IDs and SMR
      masks, keep track of whatever implemented bits the hardware actually
      reports. This also obviates the slightly questionable validation of SMR
      fields in isolation - rather than aborting the whole SMMU probe for a
      hardware configuration which is still architecturally valid, we can
      simply refuse masters later if they try to claim an unrepresentable ID
      or mask (which almost certainly implies a DT error anyway).
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      21174240
    • R
      iommu/arm-smmu: Set PRIVCFG in stage 1 STEs · 95fa99aa
      Robin Murphy 提交于
      Implement the SMMUv3 equivalent of d346180e ("iommu/arm-smmu: Treat
      all device transactions as unprivileged"), so that once again those
      pesky DMA controllers with their privileged instruction fetches don't
      unexpectedly fault in stage 1 domains due to VMSAv8 rules.
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      95fa99aa
    • R
      iommu/arm-smmu: Support non-PCI devices with SMMUv3 · 08d4ca2a
      Robin Murphy 提交于
      With the device <-> stream ID relationship suitably abstracted and
      of_xlate() hooked up, the PCI dependency now looks, and is, entirely
      arbitrary. Any bus using the of_dma_configure() mechanism will work,
      so extend support to the platform and AMBA buses which do just that.
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Tested-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      08d4ca2a
    • R
      iommu/arm-smmu: Implement of_xlate() for SMMUv3 · 8f785154
      Robin Murphy 提交于
      Now that we can properly describe the mapping between PCI RIDs and
      stream IDs via "iommu-map", and have it fed it to the driver
      automatically via of_xlate(), rework the SMMUv3 driver to benefit from
      that, and get rid of the current misuse of the "iommus" binding.
      
      Since having of_xlate wired up means that masters will now be given the
      appropriate DMA ops, we also need to make sure that default domains work
      properly. This necessitates dispensing with the "whole group at a time"
      notion for attaching to a domain, as devices which share a group get
      attached to the group's default domain one by one as they are initially
      probed.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8f785154
    • R
      iommu/arm-smmu: Fall back to global bypass · dc87a98d
      Robin Murphy 提交于
      Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
      other than allocating and filling in the entire stream table with bypass
      entries, which for some configurations would waste *gigabytes* of RAM.
      Otherwise, all transactions on unknown stream IDs will simply be aborted
      with a C_BAD_STREAMID event.
      
      Rather than render the system unusable in the case of an invalid DT,
      avoid enabling the SMMU altogether such that everything bypasses
      (though letting the explicit disable_bypass option take precedence).
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      dc87a98d
    • R
      iommu: Introduce iommu_fwspec · 57f98d2f
      Robin Murphy 提交于
      Introduce a common structure to hold the per-device firmware data that
      most IOMMU drivers need to keep track of. This enables us to configure
      much of that data from common firmware code, and consolidate a lot of
      the equivalent implementations, device look-up tables, etc. which are
      currently strewn across IOMMU drivers.
      
      This will also be enable us to address the outstanding "multiple IOMMUs
      on the platform bus" problem by tweaking IOMMU API calls to prefer
      dev->fwspec->ops before falling back to dev->bus->iommu_ops, and thus
      gracefully handle those troublesome systems which we currently cannot.
      
      As the first user, hook up the OF IOMMU configuration mechanism. The
      driver-defined nature of DT cells means that we still need the drivers
      to translate and add the IDs themselves, but future users such as the
      much less free-form ACPI IORT will be much simpler and self-contained.
      
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Suggested-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      57f98d2f
    • R
      iommu/of: Handle iommu-map property for PCI · b996444c
      Robin Murphy 提交于
      Now that we have a way to pick up the RID translation and target IOMMU,
      hook up of_iommu_configure() to bring PCI devices into the of_xlate
      mechanism and allow them IOMMU-backed DMA ops without the need for
      driver-specific handling.
      Reviewed-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      b996444c
    • W
      iommu/arm-smmu: Disable interrupts whilst holding the cmdq lock · 8ded2909
      Will Deacon 提交于
      The cmdq lock is taken whenever we issue commands into the command queue,
      which can occur in IRQ context (as a result of unmap) or in process
      context (as a result of a threaded IRQ handler or device probe).
      
      This can lead to a theoretical deadlock if the interrupt handler
      performing the unmap hits whilst the lock is taken, so explicitly use
      the {irqsave,irqrestore} spin_lock accessors for the cmdq lock.
      Tested-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      8ded2909
    • J
      iommu/arm-smmu: Fix polling of command queue · bcfced15
      Jean-Philippe Brucker 提交于
      When the SMMUv3 driver attempts to send a command, it adds an entry to the
      command queue. This is a circular buffer, where both the producer and
      consumer have a wrap bit. When producer.index == consumer.index and
      producer.wrap == consumer.wrap, the list is empty. When producer.index ==
      consumer.index and producer.wrap != consumer.wrap, the list is full.
      
      If the list is full when the driver needs to add a command, it waits for
      the SMMU to consume one command, and advance the consumer pointer. The
      problem is that we currently rely on "X before Y" operation to know if
      entries have been consumed, which is a bit fiddly since it only makes
      sense when the distance between X and Y is less than or equal to the size
      of the queue. At the moment when the list is full, we use "Consumer before
      Producer + 1", which is out of range and returns a value opposite to what
      we expect: when the queue transitions to not full, we stay in the polling
      loop and time out, printing an error.
      
      Given that the actual bug was difficult to determine, simplify the polling
      logic by relying exclusively on queue_full and queue_empty, that don't
      have this range constraint. Polling the queue is now straightforward:
      
      * When we want to add a command and the list is full, wait until it isn't
        full and retry.
      * After adding a sync, wait for the list to be empty before returning.
      Suggested-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      bcfced15
    • R
      iommu/arm-smmu: Support v7s context format · 6070529b
      Robin Murphy 提交于
      Fill in the last bits of machinery required to drive a stage 1 context
      bank in v7 short descriptor format. By default we'll prefer to use it
      only when the CPUs are also using the same format, such that we're
      guaranteed that everything will be strictly 32-bit.
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      6070529b
    • J
      iommu/arm-smmu: Fix event queues synchronization · b4163fb3
      Jean-Philippe Brucker 提交于
      SMMUv3 only sends interrupts for event queues (EVTQ and PRIQ) when they
      transition from empty to non-empty. At the moment, if the SMMU adds new
      items to a queue before the event thread finished consuming a previous
      batch, the driver ignores any new item. The queue is then stuck in
      non-empty state and all subsequent events will be lost.
      
      As an example, consider the following flow, where (P, C) is the SMMU view
      of producer/consumer indices, and (p, c) the driver view.
      
      						P C | p c
        1. SMMU appends a PPR to the PRI queue,	1 0 | 0 0
                sends an MSI
        2. PRIQ handler is called.			1 0 | 1 0
        3. SMMU appends a PPR to the PRI queue.	2 0 | 1 0
        4. PRIQ thread removes the first element.	2 1 | 1 1
      
        5. PRIQ thread believes that the queue is empty, goes into idle
           indefinitely.
      
      To avoid this, always synchronize the producer index and drain the queue
      once before leaving an event handler. In order to prevent races on the
      local producer index, move all event queue handling into the threads.
      Signed-off-by: NJean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      b4163fb3
    • P
      iommu/arm-smmu: Drop devm_free_irq when driver detach · e2d42311
      Peng Fan 提交于
      There is no need to call devm_free_irq when driver detach.
      devres_release_all which is called after 'drv->remove' will
      release all managed resources.
      Signed-off-by: NPeng Fan <van.freenix@gmail.com>
      Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e2d42311
  2. 19 8月, 2016 4 次提交
    • W
      iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass · 5bc0a116
      Will Deacon 提交于
      The disable_bypass cmdline option changes the SMMUv3 driver to put down
      faulting stream table entries by default, as opposed to bypassing
      transactions from unconfigured devices.
      
      In this mode of operation, it is entirely expected to see aborting
      entries in the stream table if and when we come to installing a valid
      translation, so don't trigger a BUG() as a result of misdiagnosing these
      entries as stream table corruption.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 48ec83bc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
      Tested-by: NRobin Murphy <robin.murphy@arm.com>
      Reported-by: NRobin Murphy <robin.murphy@arm.com>
      Reviewed-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      5bc0a116
    • W
      iommu/arm-smmu: Disable stalling faults for all endpoints · 3714ce1d
      Will Deacon 提交于
      Enabling stalling faults can result in hardware deadlock on poorly
      designed systems, particularly those with a PCI root complex upstream of
      the SMMU.
      
      Although it's not really Linux's job to save hardware integrators from
      their own misfortune, it *is* our job to stop userspace (e.g. VFIO
      clients) from hosing the system for everybody else, even if they might
      already be required to have elevated privileges.
      
      Given that the fault handling code currently executes entirely in IRQ
      context, there is nothing that can sensibly be done to recover from
      things like page faults anyway, so let's rip this code out for now and
      avoid the potential for deadlock.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 48ec83bc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
      Reported-by: NMatt Evans <matt.evans@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      3714ce1d
    • W
      iommu/arm-smmu: Fix CMDQ error handling · aea2037e
      Will Deacon 提交于
      In the unlikely event of a global command queue error, the ARM SMMUv3
      driver attempts to convert the problematic command into a CMD_SYNC and
      resume the command queue. Unfortunately, this code is pretty badly
      broken:
      
        1. It uses the index into the error string table as the CMDQ index,
           so we probably read the wrong entry out of the queue
      
        2. The arguments to queue_write are the wrong way round, so we end up
           writing from the queue onto the stack.
      
      These happily cancel out, so the kernel is likely to stay alive, but
      the command queue will probably fault again when we resume.
      
      This patch fixes the error handling code to use the correct queue index
      and write back the CMD_SYNC to the faulting entry.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 48ec83bc ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices")
      Reported-by: NDiwakar Subraveti <Diwakar.Subraveti@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      aea2037e
    • R
      iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks · e633fc7a
      Robin Murphy 提交于
      Due to the attribute bits being all over the place in the different
      types of short-descriptor PTEs, when remapping an existing entry, e.g.
      splitting a section into pages, we take the approach of decomposing
      the PTE attributes back to the IOMMU API flags to start from scratch.
      
      On inspection, though, the existing code seems to have got the read-only
      bit backwards and ignored the XN bit. How embarrassing...
      
      Fortunately the primary user so far, the Mediatek IOMMU, both never
      splits blocks (because it only serves non-overlapping DMA API calls) and
      also ignores permissions anyway, but let's put things right before any
      future users trip up.
      
      Cc: <stable@vger.kernel.org>
      Fixes: e5fc9753 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
      Signed-off-by: NRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      e633fc7a
  3. 10 8月, 2016 1 次提交
  4. 09 8月, 2016 2 次提交
  5. 04 8月, 2016 1 次提交
    • K
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski 提交于
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safeness and checking for const correctness because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.comSigned-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: NVineet Gupta <vgupta@synopsys.com>
      Acked-by: NRobin Murphy <robin.murphy@arm.com>
      Acked-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: NBjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      00085f1e
  6. 28 7月, 2016 1 次提交
    • L
      Add braces to avoid "ambiguous ‘else’" compiler warnings · 194dc870
      Linus Torvalds 提交于
      Some of our "for_each_xyz()" macro constructs make gcc unhappy about
      lack of braces around if-statements inside or outside the loop, because
      the loop construct itself has a "if-then-else" statement inside of it.
      
      The resulting warnings look something like this:
      
        drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
        drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
           if (ctx != dev_priv->kernel_context)
              ^
      
      even if the code itself is fine.
      
      Since the warning is fairly easy to avoid by adding a braces around the
      if-statement near the for_each_xyz() construct, do so, rather than
      disabling the otherwise potentially useful warning.
      
      (The if-then-else statements used in the "for_each_xyz()" constructs are
      designed to be inherently safe even with no braces, but in this case
      it's quite understandable that gcc isn't really able to tell that).
      
      This finally leaves the standard "allmodconfig" build with just a
      handful of remaining warnings, so new and valid warnings hopefully will
      stand out.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      194dc870
  7. 27 7月, 2016 1 次提交
  8. 26 7月, 2016 2 次提交
  9. 14 7月, 2016 6 次提交
  10. 13 7月, 2016 8 次提交