- 15 January 2018, 1 commit
-
-
By Christoph Hellwig

Move the few remaining bits of swiotlb glue towards their callers, and remove the pointless swiotlb variable on ia64.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
-
- 18 November 2017, 1 commit
-
-
By Robin Murphy

The intel-iommu DMA ops fail to correctly handle scatterlists where sg->offset is greater than PAGE_SIZE - the IOVA allocation is computed appropriately based on the page-aligned portion of the offset, but the mapping is set up relative to sg->page, which means it fails to actually cover the whole buffer (and in the worst case doesn't cover it at all):

    (sg->dma_address + sg->dma_len) ----+
    sg->dma_address ---------+          |
    iov_pfn------+           |          |
                 |           |          |
                 v           v          v
    iova:    a        b        c        d        e        f
             |--------|--------|--------|--------|--------|
                             <...calculated....>
                 [_____mapped______]
    pfn:     0        1        2        3        4        5
             |--------|--------|--------|--------|--------|
                 ^           ^          ^
                 |           |          |
    sg->page ----+           |          |
    sg->offset --------------+          |
    (sg->offset + sg->length) ----------+

As a result, the caller ends up overrunning the mapping into whatever lies beyond, which usually goes badly:

    [  429.645492] DMAR: DRHD: handling fault status reg 2
    [  429.650847] DMAR: [DMA Write] Request device [02:00.4] fault addr f2682000 ...

Whilst this is a fairly rare occurrence, it can happen from the result of intermediate scatterlist processing such as scatterwalk_ffwd() in the crypto layer. Whilst that particular site could be fixed up, it still seems worthwhile to bring intel-iommu in line with other DMA API implementations in handling this robustly. To that end, fix the intel_map_sg() path to line up the mapping correctly (in units of MM pages rather than VT-d pages to match the aligned_nrpages() calculation) regardless of the offset, and use sg_phys() consistently for clarity.

Reported-by: Harsh Jain <Harsh@chelsio.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Tested-by: Jacob Pan <jacob.jun.pan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
-
- 12 October 2017, 1 commit
-
-
By Tomasz Nowicki

Since IOVA allocation failure is not an unusual case, we need to flush the CPUs' rcaches in the hope that we will succeed in the next round. However, it is useful to decide whether we need the rcache flush step, for two reasons:

- Scalability. On a large system with ~100 CPUs, iterating over and flushing the rcache of each CPU becomes a serious bottleneck, so we may want to defer it.

- free_cpu_cached_iovas() does not care about the max PFN we are interested in. Thus we may flush our rcaches and still get no new IOVA, as in the commonly used scenario:

    if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
        iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift);

    if (!iova)
        iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift);

1. The first alloc_iova_fast() call is limited to DMA_BIT_MASK(32) to get PCI devices a SAC address.
2. alloc_iova() fails due to a full 32-bit space.
3. The rcaches contain PFNs out of the 32-bit space, so free_cpu_cached_iovas() throws entries away for nothing and alloc_iova() fails again.
4. The next alloc_iova_fast() call cannot take advantage of the rcache, since we have just defeated the caches. In this case we pick the slowest option to proceed.

This patch reworks the flushed_rcache local flag into an additional function argument that controls the rcache flush step. It also updates all users to do the flush as the last chance.

Signed-off-by: Tomasz Nowicki <Tomasz.Nowicki@caviumnetworks.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
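A minimal sketch of the reworked call pattern, assuming the post-patch alloc_iova_fast() signature in which the rcache-flush decision becomes the final bool argument (variable names reused from the snippet above):

    /* first try is cheap: do not throw the rcaches away on failure */
    if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
        iova = alloc_iova_fast(iovad, iova_len,
                               DMA_BIT_MASK(32) >> shift, false);

    /* last chance: allow the expensive per-CPU rcache flush and retry */
    if (!iova)
        iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift, true);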
-
- 10 October 2017, 1 commit
-
-
By Christos Gkekas

The variable did_old is unsigned, so checking whether it is greater than or equal to zero is not necessary.

Signed-off-by: Christos Gkekas <chris.gekas@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
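For illustration, the always-true shape being removed (a contrived sketch, not the driver's exact code; both names are hypothetical):

    unsigned int did_old = get_old_domain_id();  /* hypothetical getter */

    if (did_old >= 0)  /* always true: did_old is unsigned */
        flush_old_domain(did_old);  /* hypothetical */

Only an upper-bound comparison on such a variable carries any meaning.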
-
- 06 October 2017, 1 commit
-
-
By Joerg Roedel

The notifier function will take the dmar_global_lock too, so lockdep complains about an inverse locking order when the notifier is registered under the dmar_global_lock.

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Fixes: 59ce0515 ('iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 27 September 2017, 1 commit
-
-
By Zhen Lei

Now that the cached node optimisation can apply to all allocations, the couple of users which were playing tricks with dma_32bit_pfn in order to benefit from it can stop doing so. Conversely, there is also no need for all the other users to explicitly calculate a 'real' 32-bit PFN, when init_iova_domain() can happily do that itself from the page granularity.

CC: Thierry Reding <thierry.reding@gmail.com>
CC: Jonathan Hunter <jonathanh@nvidia.com>
CC: David Airlie <airlied@linux.ie>
CC: Sudeep Dutt <sudeep.dutt@intel.com>
CC: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Zhen Lei <thunder.leizhen@huawei.com>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
[rm: use iova_shift(), rewrote commit message]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
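A before/after sketch of a typical caller, assuming the post-patch init_iova_domain() that drops the dma_32bit_pfn parameter (order stands in for the page-size shift):

    /* before: callers computed an explicit 32-bit PFN limit themselves */
    init_iova_domain(iovad, granule, start_pfn,
                     DMA_BIT_MASK(32) >> order);

    /* after: the cached-node optimisation applies to all allocations */
    init_iova_domain(iovad, granule, start_pfn);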
-
- 01 September 2017, 1 commit
-
-
By Filippo Sironi

Previously, we were invalidating the context cache and IOTLB globally when clearing one context entry. This is a tad too aggressive. Invalidate the context cache and IOTLB for the interested device only.

Signed-off-by: Filippo Sironi <sironi@amazon.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
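A sketch of the device-selective flush this describes, assuming the VT-d flush callbacks and an already-computed old domain id and source-id (bus/devfn values are illustrative):

    /* flush only this device's context entry and the old domain's
     * IOTLB entries, instead of a global invalidation */
    iommu->flush.flush_context(iommu, did_old,
                               (((u16)bus) << 8) | devfn,
                               DMA_CCMD_MASK_NOBIT,
                               DMA_CCMD_DEVICE_INVL);
    iommu->flush.flush_iotlb(iommu, did_old, 0, 0, DMA_TLB_DSI_FLUSH);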
-
- 31 August 2017, 1 commit
-
-
By Jon Derrick

VMD child devices must use the VMD endpoint's ID as the requester. Because of this, there needs to be a way to link the parent VMD endpoint's IOMMU group and associated mappings to the VMD child devices, such that attaching and detaching child devices modify the endpoint's mappings, while preventing early detaching on a singular device removal or unbinding.

The reassignment of individual VMD child devices to VMs is outside the scope of VMD, but may be implemented in the future. For now it is best to prevent any such attempts.

Prevent VMD child devices from returning an IOMMU, which prevents them from exposing an iommu_group sysfs directory and allowing subsequent binding by userspace-access drivers such as VFIO.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 30 August 2017, 1 commit
-
-
By Ashok Raj

New kernels with debug enabled show a panic() from the __phys_addr() checks. Avoid calling virt_to_phys() when the pasid_state_tbl pointer is null.

To: Joerg Roedel <joro@8bytes.org>
To: linux-kernel@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Fixes: 2f26e0a9 ('iommu/vt-d: Add basic SVM PASID support')
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
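The shape of the guard, sketched under the assumption that the table sits behind an iommu->pasid_state_table-style pointer in the SVM setup path:

    /* virt_to_phys(NULL) trips the __phys_addr() sanity checks on
     * debug kernels, so only translate the table when it exists */
    if (iommu->pasid_state_table)
        context[1].lo = (u64)virt_to_phys(iommu->pasid_state_table);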
-
- 16 August 2017, 1 commit
-
-
By Joerg Roedel

Remove the deferred flushing implementation in the Intel VT-d driver and use the one from the common iova code instead.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 15 August 2017, 1 commit
-
-
By Joerg Roedel

The struct iommu_device has a 'struct device' embedded into it, not as a pointer but as the whole struct. In the conversion of the iommu drivers to use struct iommu_device, it was forgotten that the release function for that struct device simply calls kfree() on the pointer. This frees memory that was never allocated and causes memory corruption.

To fix this issue, use a pointer to struct device instead of embedding the whole struct. This needs some updates in the iommu sysfs code as well as the Intel VT-d and AMD IOMMU drivers.

Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: 39ab9555 ('iommu: Add sysfs bindings for struct iommu_device')
Cc: stable@vger.kernel.org # >= v4.11
Signed-off-by: Joerg Roedel <jroedel@suse.de>
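The resulting shape of the struct, per the description above (field list abridged):

    struct iommu_device {
        struct list_head list;
        const struct iommu_ops *ops;
        struct fwnode_handle *fwnode;
        struct device *dev;  /* was an embedded "struct device dev;",
                              * whose release handler kfree()s the
                              * pointer it is handed */
    };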
-
- 26 July 2017, 1 commit
-
-
By David Dillow

When adding a large scatterlist entry that covers more than the L3 superpage size (1GB) but has an alignment such that we must use L2 superpages (2MB), we give dma_pte_free_level() a range that causes it to free the L3 pagetable we're about to populate. We fix this by telling dma_pte_free_pagetable() about the pagetable level we're about to populate, to prevent freeing it.

For example, mapping a scatterlist with entry lengths 854MB and 1194MB at IOVA 0xffff80000000 would, when processing the 2MB-aligned second entry, cause pfn_to_dma_pte() to create a L3 directory to hold L2 superpages for the mapping at IOVA 0xffffc0000000. We would previously call dma_pte_free_pagetable(domain, 0xffffc0000, 0xfffffffff), which would free the L3 directory pfn_to_dma_pte() just created for IO PFN 0xffffc0000. Telling dma_pte_free_pagetable() to retain the L3 directories while using L2 superpages avoids the erroneous free.

Signed-off-by: David Dillow <dillow@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 28 June 2017, 3 commits
-
-
By Christoph Hellwig

And instead wire it up as a method for all the dma_map_ops instances. Note that this also means the arch-specific check will be fully instead of partially applied in the AMD iommu driver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
-
By Arvind Yadav

Most dma_map_ops structures are never modified. Constify these structures such that they can be write-protected.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Sebastian Andrzej Siewior

get_cpu() disables preemption and returns the current CPU number. The CPU number is only used once, while retrieving the address of the local CPU's deferred_flush pointer. We can instead use raw_cpu_ptr() while remaining preemptible. The worst thing that can happen is that flush_unmaps_timeout() is invoked multiple times: once by taskA after seeing HIGH_WATER_MARK, and then, after being preempted to another CPU, by taskB which saw HIGH_WATER_MARK on the same CPU as taskA. It is also likely that ->size went from HIGH_WATER_MARK to 0 right after it was read, because another CPU invoked flush_unmaps_timeout() for this CPU.

The access to flush_data is protected by a spinlock, so even if we get migrated to another CPU or preempted, the data structure is protected.

While at it, I marked deferred_flush static since I can't find a reference to it outside of this file.

Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
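A sketch of the access-pattern change (deferred_flush as named in the text; the surrounding code is assumed):

    /* before: preemption disabled just to compute a per-CPU address */
    cpu = get_cpu();
    flush_data = per_cpu_ptr(&deferred_flush, cpu);
    /* ... use flush_data ... */
    put_cpu();

    /* after: stays preemptible; flush_data is protected by its spinlock */
    flush_data = raw_cpu_ptr(&deferred_flush);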
-
- 30 May 2017, 1 commit
-
-
By Peter Xu

We do find_domain() in __get_valid_domain_for_dev(), while we do the same thing in get_valid_domain_for_dev(). There is no need to do it twice.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 23 May 2017, 1 commit
-
-
By Thomas Gleixner

To enable smp_processor_id() and might_sleep() debug checks earlier, it's required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING. Adjust the system_state checks in dmar_parse_one_atsr() and dmar_iommu_notify_scope_dev() to handle the extra states.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170516184735.712365947@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
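The kind of adjustment this implies, sketched with a hypothetical body (the series' convention was to turn boot-state equality tests into range comparisons):

    /* before: "not booting" used to imply "fully running" */
    if (system_state != SYSTEM_BOOTING)
        handle_hotplug_path();  /* hypothetical */

    /* after: new states sit between SYSTEM_BOOTING and SYSTEM_RUNNING */
    if (system_state >= SYSTEM_RUNNING)
        handle_hotplug_path();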
-
- 17 May 2017, 1 commit
-
-
By KarimAllah Ahmed

Ever since commit 091d42e4 ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device's mappings will be destroyed once the driver for the respective device takes over.

This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either:

1) Unmapping did the proper IOMMU flushing, and it only ever flushes if the IOMMU unit supports caching invalid entries.

2) The system just booted and the initialization code took care of flushing all IOMMU caches.

This assumption is not true for the kdump kernel, since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings.

Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: stable@vger.kernel.org v4.2+
Fixes: 091d42e4 ("iommu/vt-d: Copy translation tables from old kernel")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 27 April 2017, 1 commit
-
-
By Shaohua Li

The IOMMU harms performance significantly when we run very fast networking workloads: 40Gb networking doing an XDP test. The software overhead is almost negligible; it is the IOTLB miss (based on our analysis) which kills the performance. We observed the same performance issue even with software passthrough (identity mapping); only hardware passthrough survives. The pps with the IOMMU (with software passthrough) is only about ~30% of that without it. This is a limitation in hardware based on our observation, so we'd like to disable the IOMMU force-on, but we do want to use TBOOT and we can sacrifice the DMA security bought by the IOMMU.

I must admit I know nothing about TBOOT, but the TBOOT guys (cc-ed) think not enabling the IOMMU is totally ok. So introduce a new boot option to disable the force-on. It's kind of silly that we need to run into intel_iommu_init even without force-on, but we need to disable the TBOOT PMR registers. For systems without the boot option, nothing is changed.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
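The resulting switch is passed on the kernel command line alongside the existing intel_iommu options, e.g.:

    intel_iommu=tboot_noforce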
-
- 29 March 2017, 1 commit
-
-
By Joerg Roedel

When booting into a kexec kernel with intel_iommu=off, while the previous kernel had intel_iommu=on, the IOMMU hardware is still enabled and is not disabled by the new kernel. This causes the boot to fail because DMA is blocked by the hardware. Disable the IOMMUs when we find them enabled in the kexec kernel and boot with intel_iommu=off.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 22 March 2017, 2 commits
-
-
By Robin Murphy

The introduction of reserved regions has left a couple of rough edges which we could do with sorting out sooner rather than later. Since we are not yet addressing the potential dynamic aspect of software-managed reservations and presenting them at arbitrary fixed addresses, it is incongruous that we end up displaying hardware vs. software-managed MSI regions to userspace differently, especially since ARM-based systems may actually require one or the other, or even potentially both at once (which iommu-dma currently has no hope of dealing with at all). Let's resolve the former user-visible inconsistency ASAP before the ABI has been baked into a kernel release, in a way that also lays the groundwork for the latter shortcoming to be addressed by follow-up patches.

For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use IOMMU_RESV_MSI to describe the hardware type, and document everything a little bit. Since the x86 MSI remapping hardware falls squarely under this meaning of IOMMU_RESV_MSI, apply that type to their regions as well, so that we tell the same story to userspace across all platforms.

Secondly, as the various region types require quite different handling, and it really makes little sense to ever try combining them, convert the bitfield-esque #defines to a plain enum in the process before anyone gets the wrong impression.

Fixes: d30ddcaa ("iommu: Add a new type field in iommu_resv_region")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: David Woodhouse <dwmw2@infradead.org>
CC: kvm@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
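The post-change types, sketched as the plain enum the text describes (comments are paraphrases, not the in-tree documentation):

    enum iommu_resv_type {
        IOMMU_RESV_DIRECT,    /* regions that must be directly mapped */
        IOMMU_RESV_RESERVED,  /* regions that must never be mapped */
        IOMMU_RESV_MSI,       /* hardware MSI region */
        IOMMU_RESV_SW_MSI,    /* software-managed MSI region */
    };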
-
By Koos Vriezen

The function device_to_iommu() in the Intel VT-d driver lacks a NULL-pointer check, resulting in this oops at boot on some platforms:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab
    IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
    PGD 0
    [...]
    Call Trace:
     ? find_or_alloc_domain.constprop.29+0x1a/0x300
     ? dw_dma_probe+0x561/0x580 [dw_dmac_core]
     ? __get_valid_domain_for_dev+0x39/0x120
     ? __intel_map_single+0x138/0x180
     ? intel_alloc_coherent+0xb6/0x120
     ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm]
     ? mutex_lock+0x9/0x30
     ? kernfs_add_one+0xdb/0x130
     ? devres_add+0x19/0x60
     ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm]
     ? platform_drv_probe+0x30/0x90
     ? driver_probe_device+0x1ed/0x2b0
     ? __driver_attach+0x8f/0xa0
     ? driver_probe_device+0x2b0/0x2b0
     ? bus_for_each_dev+0x55/0x90
     ? bus_add_driver+0x110/0x210
     ? 0xffffffffa11ea000
     ? driver_register+0x52/0xc0
     ? 0xffffffffa11ea000
     ? do_one_initcall+0x32/0x130
     ? free_vmap_area_noflush+0x37/0x70
     ? kmem_cache_alloc+0x88/0xd0
     ? do_init_module+0x51/0x1c4
     ? load_module+0x1ee9/0x2430
     ? show_taint+0x20/0x20
     ? kernel_read_file+0xfd/0x190
     ? SyS_finit_module+0xa3/0xb0
     ? do_syscall_64+0x4a/0xb0
     ? entry_SYSCALL64_slow_path+0x25/0x25
    Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49
    RIP [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
    RSP <ffffc90001457a78>
    CR2: 00000000000007ab
    ---[ end trace 16f974b6d58d0aad ]---

Add the missing pointer check.

Fixes: 1c387188 ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions")
Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com>
Cc: stable@vger.kernel.org # 4.8.15+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 28 February 2017, 1 commit
-
-
By Joerg Roedel

The link between the iommu sysfs device and the struct intel_iommu is no longer stored as driver data. Update the code to use the new access method.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Fixes: 39ab9555 ('iommu: Add sysfs bindings for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 25 February 2017, 1 commit
-
-
By Lucas Stach

The callers of the DMA alloc functions already provide the proper context GFP flags. Make sure to pass them through to the CMA allocator, to make the CMA compaction context-aware.

Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 10 February 2017, 3 commits
-
-
By Joerg Roedel

This makes the interface more consistent with iommu_device_sysfs_add/remove.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Joerg Roedel

There is currently support for iommu sysfs bindings, but those need to be implemented in the IOMMU drivers. Add a more generic version of this by adding a struct device to struct iommu_device, and use that for the sysfs bindings. Also convert the AMD and Intel IOMMU drivers to make use of it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Joerg Roedel

This struct represents one hardware iommu in the iommu core code. For now it only has the iommu-ops associated with it, but that will be extended soon. The register/unregister interface is also added, as well as making use of it in the Intel and AMD IOMMU drivers.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
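A driver-side sketch of the interface this series introduces (based on how the converted drivers use it; error handling trimmed):

    /* expose the hardware IOMMU to the core code and to sysfs */
    iommu_device_sysfs_add(&iommu->iommu, NULL, NULL, "%s", iommu->name);
    iommu_device_set_ops(&iommu->iommu, &intel_iommu_ops);
    iommu_device_register(&iommu->iommu);

    /* teardown mirrors it */
    iommu_device_sysfs_remove(&iommu->iommu);
    iommu_device_unregister(&iommu->iommu);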
-
- 31 January 2017, 2 commits
-
-
By David Dillow

dma_pte_free_level() recurses down the IOMMU page tables and frees directory pages that are entirely contained in the given PFN range. Unfortunately, it incorrectly calculates the starting address covered by the PTE under consideration, which can lead to it clearing an entry that is still in use.

This occurs if we have a scatterlist with an entry that has a length greater than 1026 MB and is aligned to 2 MB for both the IOMMU and physical addresses. For example, if __domain_mapping() is asked to map a two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000, then when dma_pte_free_pagetable() is asked to free the PFNs from 0xffff80200 to 0xffffc05ff, it will also incorrectly clear the PFNs from 0xffff80000 to 0xffff801ff because of this issue. The current code will set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside the range being cleared. Properly setting the level_pfn for the current level under consideration catches that this PTE is outside of the range being cleared.

This patch also changes the value passed into dma_pte_free_level() when it recurses. This only affects the first PTE of the range being cleared, and is handled by the existing code that ensures we start our cursor no lower than start_pfn.

This was found when using dma_map_sg() to map large chunks of contiguous memory, which immediately led to faults on the first access of the erroneously-deleted mappings.

Fixes: 3269ee0b ("intel-iommu: Fix leaks in pagetable freeing")
Reviewed-by: Benjamin Serebrin <serebrin@google.com>
Signed-off-by: David Dillow <dillow@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Ashok Raj

The check to set the identity map for Tylersburg is done too late. It needs to be done before the check for the identity_map domain is done.

To: Joerg Roedel <joro@8bytes.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Ashok Raj <ashok.raj@intel.com>
Fixes: 86080ccc ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reported-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 23 January 2017, 1 commit
-
-
By Eric Auger

This patch registers the [FEE0_0000h - FEF0_0000h] 1MB MSI range as a reserved region, and the RMRR regions as direct regions. This will allow reporting those reserved regions in the iommu-group sysfs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 04 January 2017, 2 commits
-
-
By Jacob Pan

Different encodings are used to represent the supported PASID bits and the number of PASID table entries. The current code assigns ecap_pss directly to the extended context table entry PTS, which is wrong and could result in writing non-zero bits to the reserved fields; IOMMU fault reason 11 will be reported when reserved bits are non-zero.

This patch converts ecap_pss to the extended context entry PTS encoding based on the VT-d spec, Chapter 9.4, as follows:

- number of PASID bits = ecap_pss + 1
- number of PASID table entries = 2^(pts + 5)

The software-assigned limit of the pasid_max value is also respected, to match the allocation limitation of the PASID table.

cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: 2f26e0a9 ('iommu/vt-d: Add basic SVM PASID support')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
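A hypothetical helper showing the conversion described (not the driver's actual function name):

    /*
     * ecap_pss encodes the supported PASID bits as pss + 1, while the
     * context-entry PTS encodes the table size as 2^(pts + 5) entries,
     * i.e. pts = pasid_bits - 5. The caller is assumed to have clamped
     * pasid_bits to the software pasid_max limit first.
     */
    static inline int pasid_bits_to_pts(int pasid_bits)
    {
        return pasid_bits > 5 ? pasid_bits - 5 : 0;
    }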
-
By Xunlei Pang

We met the DMAR fault on both hpsa P420i and P421 SmartArray controllers under kdump; it can be steadily reproduced on several different machines. The dmesg log looks like:

    HP HPSA Driver (v 3.4.16-0)
    hpsa 0000:02:00.0: using doorbell to reset controller
    hpsa 0000:02:00.0: board ready after hard reset.
    hpsa 0000:02:00.0: Waiting for controller to respond to no-op
    DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff]
    DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff]
    DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff]
    DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff]
    DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff]
    DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff]
    DMAR: DRHD: handling fault status reg 2
    DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set
    hpsa 0000:02:00.0: controller message 03:00 timed out
    hpsa 0000:02:00.0: no-op failed; re-trying

After some debugging, we found that the fault addr is from DMA initiated at the driver probe stage after reset (not in-flight DMA), and the corresponding pte entry value is correct; the fault is likely due to the old iommu caches of the in-flight DMA before it.

Thus we need to flush the old cache after context mapping is set up for the device, at which point the device is supposed to have finished reset at its driver probe stage and no in-flight DMA exists hereafter.

I'm not sure if the hardware is responsible for invalidating all the related caches allocated in the iommu hardware before, but it seems not to be the case for hpsa; actually many device drivers have problems in properly resetting the hardware. Anyway, flushing (again) by software in the kdump kernel when the device gets context mapped, which is a quite infrequent operation, does little harm.

With this patch, the problematic machine can survive the kdump tests.

CC: Myron Stowe <myron.stowe@gmail.com>
CC: Joseph Szczypek <jszczype@redhat.com>
CC: Don Brace <don.brace@microsemi.com>
CC: Baoquan He <bhe@redhat.com>
CC: Dave Young <dyoung@redhat.com>
Fixes: 091d42e4 ("iommu/vt-d: Copy translation tables from old kernel")
Fixes: dbcd861f ("iommu/vt-d: Do not re-use domain-ids from the old kernel")
Fixes: cf484d0e ("iommu/vt-d: Mark copied context entries")
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 02 December 2016, 1 commit
-
-
By Anna-Maria Gleixner

Install the callbacks via the state machine.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: rt@linutronix.de
Cc: David Woodhouse <dwmw2@infradead.org>
Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 08 November 2016, 1 commit
-
-
By Joerg Roedel

It turns out that the disable_dmar_iommu() code path tried to take the device_domain_lock recursively, which will deadlock when this code runs on dmar removal. Fix both code paths that could lead to the deadlock.

Fixes: 55d94043 ('iommu/vt-d: Get rid of domain->iommu_lock')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 30 October 2016, 1 commit
-
-
By Ashok Raj

The VT-d specification (§8.3.3) says:

    'Virtual Functions' of a 'Physical Function' are under the scope
    of the same remapping unit as the 'Physical Function'.

The BIOS is not required to list all the possible VFs in the scope tables, and arguably *shouldn't* make any attempt to do so, since there could be a huge number of them.

This has been broken basically for ever: the VF is never going to match against a specific unit's scope, so it ends up being assigned to the INCLUDE_ALL IOMMU. That was always actually correct by coincidence, but now that we're looking at Root-Complex integrated devices with SR-IOV support, it's going to start being wrong.

Fix it to simply use pci_physfn() before doing the lookup for PCI devices.

Cc: stable@vger.kernel.org
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
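The essence of the fix, sketched (surrounding lookup code assumed; pci_physfn() returns the device itself for non-VFs):

    if (dev_is_pci(dev)) {
        struct pci_dev *pdev = to_pci_dev(dev);

        /* VFs never appear in DMAR scope tables: resolve the
         * physical function and match the scope against that */
        dev = &pci_physfn(pdev)->dev;
    }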
-
- 05 September 2016, 2 commits
-
-
By Joerg Roedel

When a domain is allocated through the get_valid_domain_for_dev path, it will be context-mapped before the RMRR regions are mapped in the page table. This opens a short time window where device accesses to these regions fail, causing DMAR faults. Fix this by mapping the RMRR regions before the domain is context-mapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Joerg Roedel

Split the search for an already existing domain and the context mapping of the device to the new domain into separate steps. This allows mapping possible RMRR regions into the domain before it is context-mapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 04 August 2016, 1 commit
-
-
By Krzysztof Kozlowski

The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However, the attributes do not have to be a bitfield. Instead, unsigned long will do fine:

1. This is just simpler, both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing a pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;
    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 28 July 2016, 1 commit
-
-
By Linus Torvalds

Some of our "for_each_xyz()" macro constructs make gcc unhappy about lack of braces around if-statements inside or outside the loop, because the loop construct itself has an "if-then-else" statement inside of it. The resulting warnings look something like this:

    drivers/gpu/drm/i915/i915_debugfs.c: In function 'i915_dump_lrc':
    drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous 'else' [-Wparentheses]
       if (ctx != dev_priv->kernel_context)
       ^

even if the code itself is fine. Since the warning is fairly easy to avoid by adding braces around the if-statement near the for_each_xyz() construct, do so, rather than disabling the otherwise potentially useful warning.

(The if-then-else statements used in the "for_each_xyz()" constructs are designed to be inherently safe even with no braces, but in this case it's quite understandable that gcc isn't really able to tell that.)

This finally leaves the standard "allmodconfig" build with just a handful of remaining warnings, so new and valid warnings hopefully will stand out.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
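A contrived illustration of the shape involved (for_each_abc() is hypothetical, standing in for any iterator macro whose expansion ends in an if/else):

    /* warns: gcc suggests explicit braces, since the macro's internal
     * 'else' and this 'if' look ambiguous at the source level */
    for_each_abc(item)
        if (item != skip_me)
            use(item);

    /* quiet: explicit braces make the association obvious */
    for_each_abc(item) {
        if (item != skip_me)
            use(item);
    }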
-
- 14 July 2016, 1 commit
-
-
By Wei Yang

In commit 55d94043 ("iommu/vt-d: Get rid of domain->iommu_lock"), the error handling path was changed a little, which makes the function always return 0. This patch fixes that.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Fixes: 55d94043 ('iommu/vt-d: Get rid of domain->iommu_lock')
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-