1. 08 Nov, 2016 1 commit
    • R
      iommu/arm-smmu: Work around ARM DMA configuration · fba4f8e5
      Robin Murphy authored
      The 32-bit ARM DMA configuration code predates the IOMMU core's default
      domain functionality, and instead relies on allocating its own domains
      and attaching any devices using the generic IOMMU binding to them.
      Unfortunately, it does this relatively early on in the creation of the
      device, before we've seen our add_device callback, which leads us to
      attempt to operate on a half-configured master.
      
      To avoid a crash, check for this situation on attach, but refuse to
      play, as there's nothing we can do. This at least allows VFIO to keep
      working for people who update their 32-bit DTs to the generic binding,
      albeit with a few (innocuous) warnings from the DMA layer on boot.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      fba4f8e5
  2. 22 Sep, 2016 1 commit
  3. 20 Sep, 2016 1 commit
  4. 19 Sep, 2016 3 commits
  5. 16 Sep, 2016 24 commits
    • R
      iommu/io-pgtable-arm: Check for v7s-incapable systems · 82db33dc
      Robin Murphy authored
      On machines with no 32-bit addressable RAM whatsoever, we shouldn't
      even touch the v7s format as it's never going to work.
      
      Fixes: e5fc9753 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
      Reported-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      82db33dc
    • R
      iommu/dma: Avoid PCI host bridge windows · fade1ec0
      Robin Murphy authored
      With our DMA ops enabled for PCI devices, we should avoid allocating
      IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
      lead to faults, corruption or other badness. To be safe, punch out holes
      for all of the relevant host bridge's windows when initialising a DMA
      domain for a PCI device.
      
      CC: Marek Szyprowski <m.szyprowski@samsung.com>
      CC: Inki Dae <inki.dae@samsung.com>
      Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      fade1ec0
    • R
      iommu/dma: Add support for mapping MSIs · 44bb7e24
      Robin Murphy authored
      When an MSI doorbell is located downstream of an IOMMU, attaching
      devices to a DMA ops domain and switching on translation leads to a rude
      shock when their attempt to write to the physical address returned by
      the irqchip driver faults (or worse, writes into some already-mapped
      buffer) and no interrupt is forthcoming.
      
      Address this by adding a hook for relevant irqchip drivers to call from
      their compose_msi_msg() callback, to swizzle the physical address with
      an appropriately-mapped IOVA for any device attached to one of our DMA
      ops domains.
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      44bb7e24
    • R
      iommu/arm-smmu: Set domain geometry · 455eb7d3
      Robin Murphy authored
      For non-aperture-based IOMMUs, the domain geometry seems to have become
      the de-facto way of indicating the input address space size. That is
      quite a useful thing from the users' perspective, so let's do the same.
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      455eb7d3
    • R
      iommu/arm-smmu: Wire up generic configuration support · 021bb842
      Robin Murphy authored
      With everything else now in place, fill in an of_xlate callback and the
      appropriate registration to plumb into the generic configuration
      machinery, and watch everything just work.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      021bb842
    • R
      iommu/arm-smmu: Convert to iommu_fwspec · adfec2e7
      Robin Murphy authored
      In the final step of preparation for full generic configuration support,
      swap our fixed-size master_cfg for the generic iommu_fwspec. For the
      legacy DT bindings, the driver simply gets to act as its own 'firmware'.
      Farewell, arbitrary MAX_MASTER_STREAMIDS!
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      adfec2e7
    • R
      iommu/arm-smmu: Intelligent SMR allocation · 588888a7
      Robin Murphy authored
      Stream Match Registers are one of the more awkward parts of the SMMUv2
      architecture; there are typically never enough to assign one to each
      stream ID in the system, and configuring them such that a single ID
      matches multiple entries is catastrophically bad - at best, every
      transaction raises a global fault; at worst, they go *somewhere*.
      
      To address the former issue, we can mask ID bits such that a single
      register may be used to match multiple IDs belonging to the same device
      or group, but doing so also heightens the risk of the latter problem
      (which can be nasty to debug).
      
      Tackle both problems at once by replacing the simple bitmap allocator
      with something much cleverer. Now that we have convenient in-memory
      representations of the stream mapping table, it becomes straightforward
      to properly validate new SMR entries against the current state, opening
      the door to arbitrary masking and SMR sharing.
      
      Another feature which falls out of this is that with IDs shared by
      separate devices being automatically accounted for, simply associating a
      group pointer with the S2CR offers appropriate group allocation almost
      for free, so hook that up in the process.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      588888a7
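The validation of a new SMR against the current state, as described in the commit message above, can be sketched roughly as follows. This is a minimal userspace illustration, not the driver's code: `struct smr_entry`, `smr_covers`, `smr_overlaps` and `smr_find` are hypothetical names, and the sketch assumes 16-bit ID and mask fields where a set mask bit means "ignore this ID bit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of one Stream Match Register. */
struct smr_entry {
    uint16_t id;    /* stream ID bits to compare */
    uint16_t mask;  /* bits set here are ignored when matching */
    bool valid;
};

/* True if every stream ID matched by (id, mask) is also matched by e. */
static bool smr_covers(const struct smr_entry *e, uint16_t id, uint16_t mask)
{
    return (mask & e->mask) == mask && !((id ^ e->id) & ~e->mask);
}

/* True if at least one stream ID would match both e and (id, mask). */
static bool smr_overlaps(const struct smr_entry *e, uint16_t id, uint16_t mask)
{
    return !((id ^ e->id) & ~(e->mask | mask));
}

/*
 * Returns the index of an existing entry that can be shared, or the index
 * of the first free slot, or -1 if the new pair would conflict with an
 * existing entry or the table is full.
 */
static int smr_find(const struct smr_entry *tbl, int n,
                    uint16_t id, uint16_t mask)
{
    int free_idx = -1;

    assert(tbl != NULL);
    for (int i = 0; i < n; i++) {
        if (!tbl[i].valid) {
            if (free_idx < 0)
                free_idx = i;
            continue;
        }
        if (smr_covers(&tbl[i], id, mask))
            return i;       /* fully contained: share the entry */
        if (smr_overlaps(&tbl[i], id, mask))
            return -1;      /* partial overlap: multiple-match hazard */
    }
    return free_idx;
}
```

With an existing entry of ID 0x100 and mask 0x00f, a request for (0x104, 0x003) is entirely contained and can share that entry, whereas (0x100, 0x0ff) only partially overlaps it and has to be rejected.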
    • R
      iommu/arm-smmu: Add a stream map entry iterator · d3097e39
      Robin Murphy authored
      We iterate over the SMEs associated with a master config quite a lot in
      various places, and are about to do so even more. Let's wrap the idiom
      in a handy iterator macro before the repetition gets out of hand.
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d3097e39
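An iterator macro of the kind this commit describes might look like the sketch below. The names `struct master_cfg`, `MAX_IDS`, `INVALID_SMENDX`, `for_each_cfg_sme` and `count_allocated` are illustrative assumptions for a standalone example and are not claimed to match the driver.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS         8
#define INVALID_SMENDX  (-1)

/* Hypothetical per-master config: indices into the stream map table. */
struct master_cfg {
    size_t num_ids;
    int smendx[MAX_IDS];
};

/*
 * Visit each stream map entry index of cfg. The bounds check runs first,
 * so smendx[] is only read for a valid i; the comma expression then loads
 * idx and always evaluates to true.
 */
#define for_each_cfg_sme(cfg, i, idx) \
    for (i = 0; i < (cfg)->num_ids && ((idx) = (cfg)->smendx[i], 1); ++i)

/* Example user: count the entries that currently have an SME allocated. */
static size_t count_allocated(const struct master_cfg *cfg)
{
    size_t i, n = 0;
    int idx;

    assert(cfg->num_ids <= MAX_IDS);
    for_each_cfg_sme(cfg, i, idx) {
        if (idx != INVALID_SMENDX)
            n++;
    }
    return n;
}
```

Wrapping the loop header in a macro keeps every call site down to one line and puts the bounds check in a single place.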
    • R
      iommu/arm-smmu: Streamline SMMU data lookups · d6fc5d97
      Robin Murphy authored
      Simplify things somewhat by stashing our arm_smmu_device instance in
      drvdata, so that it's readily available to our driver model callbacks.
      Then we can excise the private list entirely, since the driver core
      already has a perfectly good list of SMMU devices we can use in the one
      instance we actually need to. Finally, make a further modest code saving
      with the relatively new of_device_get_match_data() helper.
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      d6fc5d97
    • R
      iommu/arm-smmu: Refactor mmu-masters handling · f80cd885
      Robin Murphy authored
      To be able to support the generic bindings and handle of_xlate() calls,
      we need to be able to associate SMMUs and stream IDs directly with
      devices *before* allocating IOMMU groups. Furthermore, to support real
      default domains with multi-device groups we also have to handle domain
      attach on a per-device basis, as the "whole group at a time" assumption
      fails to properly handle subsequent devices added to a group after the
      first has already triggered default domain creation and attachment.
      
      To that end, use the now-vacant dev->archdata.iommu field for easy
      config and SMMU instance lookup, and unify config management by chopping
      down the platform-device-specific tree and probing the "mmu-masters"
      property on-demand instead. This may add a bit of one-off overhead to
      initially adding a new device, but we're about to deprecate that binding
      in favour of the inherently-more-efficient generic ones anyway.
      
      For the sake of simplicity, this patch does temporarily regress the case
      of aliasing PCI devices by losing the duplicate stream ID detection that
      the previous per-group config had. Stay tuned, because we'll be back to
      fix that in a better and more general way momentarily...
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      f80cd885
    • R
      iommu/arm-smmu: Keep track of S2CR state · 8e8b203e
      Robin Murphy authored
      Making S2CRs first-class citizens within the driver with a high-level
      representation of their state offers a neat solution to a few problems:
      
      Firstly, the information about which context a device's stream IDs are
      associated with is already present by necessity in the S2CR. With that
      state easily accessible we can refer directly to it and obviate the need
      to track an IOMMU domain in each device's archdata (its earlier purpose
      of enforcing correct attachment of multi-device groups now being handled
      by the IOMMU core itself).
      
      Secondly, the core API now deprecates explicit domain detach and expects
      domain attach to move devices smoothly from one domain to another; for
      SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
      to the device. By giving the driver a suitable abstraction of those
      S2CRs to work with, we can massively reduce the overhead of the current
      heavy-handed "detach, free resources, reallocate resources, attach"
      approach.
      
      Thirdly, making the software state hardware-shaped and attached to the
      SMMU instance once again makes suspend/resume of this register group
      that much simpler to implement in future.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8e8b203e
    • R
      iommu/arm-smmu: Consolidate stream map entry state · 1f3d5ca4
      Robin Murphy authored
      In order to consider SMR masking, we really want to be able to validate
      ID/mask pairs against existing SMR contents to prevent stream match
      conflicts, which at best would cause transactions to fault unexpectedly,
      and at worst lead to silent unpredictable behaviour. With our SMMU
      instance data holding only an allocator bitmap, and the SMR values
      themselves scattered across master configs hanging off devices which we
      may have no way of finding, there's essentially no way short of digging
      everything back out of the hardware. Similarly, the thought of power
      management ops to support suspend/resume faces the exact same problem.
      
      By massaging the software state into a closer shape to the underlying
      hardware, everything comes together quite nicely; the allocator and the
      high-level view of the data become a single centralised state which we
      can easily keep track of, and to which any updates can be validated in
      full before being synchronised to the hardware itself.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      1f3d5ca4
    • R
      iommu/arm-smmu: Handle stream IDs more dynamically · 21174240
      Robin Murphy authored
      Rather than assuming fixed worst-case values for stream IDs and SMR
      masks, keep track of whatever implemented bits the hardware actually
      reports. This also obviates the slightly questionable validation of SMR
      fields in isolation - rather than aborting the whole SMMU probe for a
      hardware configuration which is still architecturally valid, we can
      simply refuse masters later if they try to claim an unrepresentable ID
      or mask (which almost certainly implies a DT error anyway).
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      21174240
    • R
      iommu/arm-smmu: Set PRIVCFG in stage 1 STEs · 95fa99aa
      Robin Murphy authored
      Implement the SMMUv3 equivalent of d346180e ("iommu/arm-smmu: Treat
      all device transactions as unprivileged"), so that once again those
      pesky DMA controllers with their privileged instruction fetches don't
      unexpectedly fault in stage 1 domains due to VMSAv8 rules.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      95fa99aa
    • R
      iommu/arm-smmu: Support non-PCI devices with SMMUv3 · 08d4ca2a
      Robin Murphy authored
      With the device <-> stream ID relationship suitably abstracted and
      of_xlate() hooked up, the PCI dependency now looks, and is, entirely
      arbitrary. Any bus using the of_dma_configure() mechanism will work,
      so extend support to the platform and AMBA buses which do just that.
      Acked-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      08d4ca2a
    • R
      iommu/arm-smmu: Implement of_xlate() for SMMUv3 · 8f785154
      Robin Murphy authored
      Now that we can properly describe the mapping between PCI RIDs and
      stream IDs via "iommu-map", and have it fed to the driver
      automatically via of_xlate(), rework the SMMUv3 driver to benefit from
      that, and get rid of the current misuse of the "iommus" binding.
      
      Since having of_xlate wired up means that masters will now be given the
      appropriate DMA ops, we also need to make sure that default domains work
      properly. This necessitates dispensing with the "whole group at a time"
      notion for attaching to a domain, as devices which share a group get
      attached to the group's default domain one by one as they are initially
      probed.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8f785154
    • R
      iommu/arm-smmu: Fall back to global bypass · dc87a98d
      Robin Murphy authored
      Unlike SMMUv2, SMMUv3 has no easy way to bypass unknown stream IDs,
      other than allocating and filling in the entire stream table with bypass
      entries, which for some configurations would waste *gigabytes* of RAM.
      Otherwise, all transactions on unknown stream IDs will simply be aborted
      with a C_BAD_STREAMID event.
      
      Rather than render the system unusable in the case of an invalid DT,
      avoid enabling the SMMU altogether such that everything bypasses
      (though letting the explicit disable_bypass option take precedence).
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      dc87a98d
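To put the "gigabytes" figure in the commit message above into perspective, here is a rough back-of-the-envelope calculation. It assumes a linear (single-level) stream table and the 64-byte stream table entry size of SMMUv3; `STE_BYTES` and `linear_strtab_bytes` are names made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define STE_BYTES 64u   /* assumed size of one SMMUv3 stream table entry */

/* Bytes needed for a linear stream table covering sid_bits of stream ID. */
static uint64_t linear_strtab_bytes(unsigned int sid_bits)
{
    assert(sid_bits <= 32);
    return ((uint64_t)1 << sid_bits) * STE_BYTES;
}
```

Under these assumptions a 24-bit stream ID space already needs 1 GiB of table, and a full 32-bit space would need 256 GiB.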
    • R
      iommu: Introduce iommu_fwspec · 57f98d2f
      Robin Murphy authored
      Introduce a common structure to hold the per-device firmware data that
      most IOMMU drivers need to keep track of. This enables us to configure
      much of that data from common firmware code, and consolidate a lot of
      the equivalent implementations, device look-up tables, etc. which are
      currently strewn across IOMMU drivers.
      
      This will also enable us to address the outstanding "multiple IOMMUs
      on the platform bus" problem by tweaking IOMMU API calls to prefer
      dev->fwspec->ops before falling back to dev->bus->iommu_ops, and thus
      gracefully handle those troublesome systems which we currently cannot.
      
      As the first user, hook up the OF IOMMU configuration mechanism. The
      driver-defined nature of DT cells means that we still need the drivers
      to translate and add the IDs themselves, but future users such as the
      much less free-form ACPI IORT will be much simpler and self-contained.
      
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      57f98d2f
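The idea of a common per-device structure that accumulates firmware-provided IDs can be illustrated with the userspace sketch below. `struct fwspec_sketch` and `fwspec_add_ids` are hypothetical stand-ins chosen for this example; the kernel's actual structure layout and helper signatures are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-device firmware data: owning IOMMU plus its IDs. */
struct fwspec_sketch {
    const void *ops;        /* stand-in for the IOMMU driver's ops */
    unsigned int num_ids;
    uint32_t ids[];         /* grows as firmware code adds IDs */
};

/*
 * Append n IDs to *specp, allocating the structure on first use.
 * Returns 0 on success or -1 if the allocation fails, in which case
 * *specp is left untouched.
 */
static int fwspec_add_ids(struct fwspec_sketch **specp, const void *ops,
                          const uint32_t *ids, unsigned int n)
{
    struct fwspec_sketch *spec, *grown;
    unsigned int old;

    assert(specp != NULL && ids != NULL);
    spec = *specp;
    old = spec ? spec->num_ids : 0;

    grown = realloc(spec, sizeof(struct fwspec_sketch) +
                          (old + n) * sizeof(uint32_t));
    if (!grown)
        return -1;
    if (!spec)
        grown->ops = ops;   /* first call records the owning IOMMU */

    memcpy(&grown->ids[old], ids, n * sizeof(uint32_t));
    grown->num_ids = old + n;
    *specp = grown;
    return 0;
}
```

Because the ID list is variable-length, there is no fixed upper bound on the number of IDs per device, which is the property the later arm-smmu conversion relies on.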
    • R
      iommu/of: Handle iommu-map property for PCI · b996444c
      Robin Murphy authored
      Now that we have a way to pick up the RID translation and target IOMMU,
      hook up of_iommu_configure() to bring PCI devices into the of_xlate
      mechanism and allow them IOMMU-backed DMA ops without the need for
      driver-specific handling.
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b996444c
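The RID translation that "iommu-map" describes can be sketched as a simple range lookup. The example below assumes each map entry gives a RID base, an output (stream ID) base and a length, with an optional mask applied to the RID first; the target IOMMU reference in each entry is left out for brevity. `struct iommu_map_entry` and `map_rid` are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One "iommu-map" range, minus the reference to the target IOMMU. */
struct iommu_map_entry {
    uint32_t rid_base;  /* first PCI requester ID in the range */
    uint32_t out_base;  /* stream ID that rid_base translates to */
    uint32_t length;    /* number of consecutive RIDs covered */
};

/*
 * Translate a PCI requester ID to a stream ID. Returns true and stores
 * the result in *sid when some entry covers the masked RID.
 */
static bool map_rid(const struct iommu_map_entry *map, size_t n,
                    uint32_t map_mask, uint32_t rid, uint32_t *sid)
{
    uint32_t masked = rid & map_mask;

    assert(sid != NULL);
    for (size_t i = 0; i < n; i++) {
        if (masked >= map[i].rid_base &&
            masked - map[i].rid_base < map[i].length) {
            *sid = masked - map[i].rid_base + map[i].out_base;
            return true;
        }
    }
    return false;
}
```

A RID that no entry covers has no translation, so the caller would leave that device without IOMMU-backed DMA ops.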
    • W
      iommu/arm-smmu: Disable interrupts whilst holding the cmdq lock · 8ded2909
      Will Deacon authored
      The cmdq lock is taken whenever we issue commands into the command queue,
      which can occur in IRQ context (as a result of unmap) or in process
      context (as a result of a threaded IRQ handler or device probe).
      
      This can lead to a theoretical deadlock if the interrupt handler
      performing the unmap hits whilst the lock is taken, so explicitly use
      the {irqsave,irqrestore} spin_lock accessors for the cmdq lock.
      Tested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      8ded2909
    • J
      iommu/arm-smmu: Fix polling of command queue · bcfced15
      Jean-Philippe Brucker authored
      When the SMMUv3 driver attempts to send a command, it adds an entry to the
      command queue. This is a circular buffer, where both the producer and
      consumer have a wrap bit. When producer.index == consumer.index and
      producer.wrap == consumer.wrap, the list is empty. When producer.index ==
      consumer.index and producer.wrap != consumer.wrap, the list is full.
      
      If the list is full when the driver needs to add a command, it waits for
      the SMMU to consume one command, and advance the consumer pointer. The
      problem is that we currently rely on "X before Y" operation to know if
      entries have been consumed, which is a bit fiddly since it only makes
      sense when the distance between X and Y is less than or equal to the size
      of the queue. At the moment when the list is full, we use "Consumer before
      Producer + 1", which is out of range and returns a value opposite to what
      we expect: when the queue transitions to not full, we stay in the polling
      loop and time out, printing an error.
      
      Given that the actual bug was difficult to determine, simplify the polling
      logic by relying exclusively on queue_full and queue_empty, which don't
      have this range constraint. Polling the queue is now straightforward:
      
      * When we want to add a command and the list is full, wait until it isn't
        full and retry.
      * After adding a sync, wait for the list to be empty before returning.
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      bcfced15
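The full and empty conditions described in this commit message depend only on the index and wrap bits of the two pointers. A minimal sketch follows, assuming each pointer stores the index in its low `shift` bits with the wrap bit directly above; `struct queue_ptrs` and `queue_inc` are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Producer/consumer pointers of a circular queue with 2^shift entries. */
struct queue_ptrs {
    uint32_t prod;
    uint32_t cons;
    unsigned int shift;
};

#define Q_IDX(q, p) ((p) & ((1u << (q)->shift) - 1))  /* index bits */
#define Q_WRP(q, p) ((p) & (1u << (q)->shift))        /* wrap bit */

/* Same index, same wrap bit: nothing left to consume. */
static bool queue_empty(const struct queue_ptrs *q)
{
    return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) &&
           Q_WRP(q, q->prod) == Q_WRP(q, q->cons);
}

/* Same index, different wrap bit: producer is one full lap ahead. */
static bool queue_full(const struct queue_ptrs *q)
{
    return Q_IDX(q, q->prod) == Q_IDX(q, q->cons) &&
           Q_WRP(q, q->prod) != Q_WRP(q, q->cons);
}

/* Advance a pointer by one entry, carrying into and out of the wrap bit. */
static uint32_t queue_inc(const struct queue_ptrs *q, uint32_t p)
{
    assert(q->shift < 31);
    return (p + 1) & ((1u << (q->shift + 1)) - 1);
}
```

Neither predicate compares the ordering of the two pointers, so both stay valid however far apart producer and consumer are, which is what makes polling on them safe.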
    • R
      iommu/arm-smmu: Support v7s context format · 6070529b
      Robin Murphy authored
      Fill in the last bits of machinery required to drive a stage 1 context
      bank in v7 short descriptor format. By default we'll prefer to use it
      only when the CPUs are also using the same format, such that we're
      guaranteed that everything will be strictly 32-bit.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      6070529b
    • J
      iommu/arm-smmu: Fix event queues synchronization · b4163fb3
      Jean-Philippe Brucker authored
      SMMUv3 only sends interrupts for event queues (EVTQ and PRIQ) when they
      transition from empty to non-empty. At the moment, if the SMMU adds new
      items to a queue before the event thread has finished consuming a previous
      batch, the driver ignores any new item. The queue is then stuck in
      non-empty state and all subsequent events will be lost.
      
      As an example, consider the following flow, where (P, C) is the SMMU view
      of producer/consumer indices, and (p, c) the driver view.
      
                                                    P C | p c
        1. SMMU appends a PPR to the PRI queue,     1 0 | 0 0
           sends an MSI
        2. PRIQ handler is called.                  1 0 | 1 0
        3. SMMU appends a PPR to the PRI queue.     2 0 | 1 0
        4. PRIQ thread removes the first element.   2 1 | 1 1
      
        5. PRIQ thread believes that the queue is empty, goes into idle
           indefinitely.
      
      To avoid this, always synchronize the producer index and drain the queue
      once before leaving an event handler. In order to prevent races on the
      local producer index, move all event queue handling into the threads.
      Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      b4163fb3
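The "synchronize the producer index and drain once more before leaving" rule from this commit message can be modelled with the small simulation below. It is a simplified sketch: indices are plain counters without wrap handling, `hw_prod` stands in for the producer index owned by the SMMU, and `struct evtq` and `evtq_thread` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified event queue state; counters only ever increase here. */
struct evtq {
    uint32_t hw_prod;   /* producer index as seen by the SMMU */
    uint32_t prod;      /* driver's cached copy of the producer index */
    uint32_t cons;      /* consumer index, advanced by the driver */
};

/*
 * Event thread body: drain everything visible, then refresh the cached
 * producer index and go round again if more entries arrived meanwhile.
 * Returns the number of entries handled.
 */
static unsigned int evtq_thread(struct evtq *q)
{
    unsigned int handled = 0;

    assert(q != NULL);
    do {
        while (q->cons != q->prod) {
            q->cons++;          /* consume one entry */
            handled++;
        }
        q->prod = q->hw_prod;   /* resync before deciding to idle */
    } while (q->cons != q->prod);

    return handled;
}
```

Starting from the state at step 3 of the flow above (SMMU producer at 2, driver view at producer 1 and consumer 0), the loop handles both entries instead of idling after the first.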
    • P
      iommu/arm-smmu: Drop devm_free_irq when driver detach · e2d42311
      Peng Fan authored
      There is no need to call devm_free_irq when the driver detaches:
      devres_release_all, which is called after 'drv->remove', will
      release all managed resources.
      Signed-off-by: Peng Fan <van.freenix@gmail.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      e2d42311
  6. 15 Sep, 2016 1 commit
    • J
      iommu/amd: Don't put completion-wait semaphore on stack · 4bf5beef
      Joerg Roedel authored
      The semaphore used by the AMD IOMMU to signal command
      completion lived on the stack until now, which was safe as
      the driver busy-waited on the semaphore with IRQs disabled,
      so the stack can't go away under the driver.
      
      But the recently introduced vmap-based stacks break this as
      the physical address of the semaphore can't be determined
      easily anymore. The driver used the __pa() macro, but that
      only works in the direct-mapping. The result was
      Completion-Wait timeout errors seen by the IOMMU driver,
      breaking system boot.
      
      Since putting the semaphore on the stack is bad design
      anyway, move the semaphore into 'struct amd_iommu'. It is
      protected by the per-iommu lock and now in the direct
      mapping again. This fixes the Completion-Wait timeout errors
      and makes AMD IOMMU systems boot again with vmap-based
      stacks enabled.
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      4bf5beef
  7. 05 Sep, 2016 9 commits