Commits · 213e4eb2da7e4fa716560b20752b05b80b5b0da9 · openanolis / cloud-kernel

22 Mar, 2017 2 commits

iommu: Disambiguate MSI region types · 9d3a4de4

Authored by Robin Murphy on Mar 16, 2017

The introduction of reserved regions has left a couple of rough edges
which we could do with sorting out sooner rather than later. Since we
are not yet addressing the potential dynamic aspect of software-managed
reservations and presenting them at arbitrary fixed addresses, it is
incongruous that we end up displaying hardware vs. software-managed MSI
regions to userspace differently, especially since ARM-based systems may
actually require one or the other, or even potentially both at once,
(which iommu-dma currently has no hope of dealing with at all). Let's
resolve the former user-visible inconsistency ASAP before the ABI has
been baked into a kernel release, in a way that also lays the groundwork
for the latter shortcoming to be addressed by follow-up patches.

For clarity, rename the software-managed type to IOMMU_RESV_SW_MSI, use
IOMMU_RESV_MSI to describe the hardware type, and document everything a
little bit. Since the x86 MSI remapping hardware falls squarely under
this meaning of IOMMU_RESV_MSI, apply that type to their regions as well,
so that we tell the same story to userspace across all platforms.

Secondly, as the various region types require quite different handling,
and it really makes little sense to ever try combining them, convert the
bitfield-esque #defines to a plain enum in the process before anyone
gets the wrong impression.
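
As a rough illustration of the enum conversion described above, the sketch below lists the region types as mutually exclusive values. The IOMMU_RESV_MSI and IOMMU_RESV_SW_MSI names come from this message; the other two enumerators, the helper functions, and the label strings are assumptions made for the example and are not taken from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* A region has exactly one type, so a plain enum states that directly. */
enum iommu_resv_type {
	IOMMU_RESV_DIRECT,	/* assumed name: region that must be mapped 1:1 */
	IOMMU_RESV_RESERVED,	/* assumed name: region that must never be mapped */
	IOMMU_RESV_MSI,		/* hardware MSI region */
	IOMMU_RESV_SW_MSI,	/* software-managed MSI region */
};

/* Hypothetical helper: the label shown to userspace for a region type. */
static const char *resv_type_label(enum iommu_resv_type type)
{
	switch (type) {
	case IOMMU_RESV_DIRECT:
		return "direct";
	case IOMMU_RESV_RESERVED:
		return "reserved";
	case IOMMU_RESV_MSI:
	case IOMMU_RESV_SW_MSI:
		/* Both MSI flavours tell the same story to userspace. */
		return "msi";
	}
	return "unknown";
}

/* Hypothetical helper: true if two types look identical from userspace. */
static bool same_user_visible_type(enum iommu_resv_type a,
				   enum iommu_resv_type b)
{
	return strcmp(resv_type_label(a), resv_type_label(b)) == 0;
}
```

With an enum, a switch statement handles each type separately, and combining types with a bitwise OR no longer looks meaningful.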

Fixes: d30ddcaa ("iommu: Add a new type field in iommu_resv_region")
Reviewed-by: Eric Auger <eric.auger@redhat.com>
CC: Alex Williamson <alex.williamson@redhat.com>
CC: David Woodhouse <dwmw2@infradead.org>
CC: kvm@vger.kernel.org
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu/vt-d: Fix NULL pointer dereference in device_to_iommu · 5003ae1e

Authored by Koos Vriezen on Mar 01, 2017

The function device_to_iommu() in the Intel VT-d driver
lacks a NULL-ptr check, resulting in this oops at boot on
some platforms:

 BUG: unable to handle kernel NULL pointer dereference at 00000000000007ab
 IP: [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
 PGD 0

 [...]

 Call Trace:
   ? find_or_alloc_domain.constprop.29+0x1a/0x300
   ? dw_dma_probe+0x561/0x580 [dw_dmac_core]
   ? __get_valid_domain_for_dev+0x39/0x120
   ? __intel_map_single+0x138/0x180
   ? intel_alloc_coherent+0xb6/0x120
   ? sst_hsw_dsp_init+0x173/0x420 [snd_soc_sst_haswell_pcm]
   ? mutex_lock+0x9/0x30
   ? kernfs_add_one+0xdb/0x130
   ? devres_add+0x19/0x60
   ? hsw_pcm_dev_probe+0x46/0xd0 [snd_soc_sst_haswell_pcm]
   ? platform_drv_probe+0x30/0x90
   ? driver_probe_device+0x1ed/0x2b0
   ? __driver_attach+0x8f/0xa0
   ? driver_probe_device+0x2b0/0x2b0
   ? bus_for_each_dev+0x55/0x90
   ? bus_add_driver+0x110/0x210
   ? 0xffffffffa11ea000
   ? driver_register+0x52/0xc0
   ? 0xffffffffa11ea000
   ? do_one_initcall+0x32/0x130
   ? free_vmap_area_noflush+0x37/0x70
   ? kmem_cache_alloc+0x88/0xd0
   ? do_init_module+0x51/0x1c4
   ? load_module+0x1ee9/0x2430
   ? show_taint+0x20/0x20
   ? kernel_read_file+0xfd/0x190
   ? SyS_finit_module+0xa3/0xb0
   ? do_syscall_64+0x4a/0xb0
   ? entry_SYSCALL64_slow_path+0x25/0x25
 Code: 78 ff ff ff 4d 85 c0 74 ee 49 8b 5a 10 0f b6 9b e0 00 00 00 41 38 98 e0 00 00 00 77 da 0f b6 eb 49 39 a8 88 00 00 00 72 ce eb 8f <41> f6 82 ab 07 00 00 04 0f 85 76 ff ff ff 0f b6 4d 08 88 0e 49
 RIP  [<ffffffff8132234a>] device_to_iommu+0x11a/0x1a0
  RSP <ffffc90001457a78>
 CR2: 00000000000007ab
 ---[ end trace 16f974b6d58d0aad ]---

Add the missing pointer check.

Fixes: 1c387188 ("iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions")
Signed-off-by: Koos Vriezen <koos.vriezen@gmail.com>
Cc: stable@vger.kernel.org # 4.8.15+
Signed-off-by: Joerg Roedel <jroedel@suse.de>

28 Feb, 2017 1 commit

iommu/vt-d: Fix crash when accessing VT-d sysfs entries · a7fdb6e6

Authored by Joerg Roedel on Feb 28, 2017

The link between the iommu sysfs-device and the struct
intel_iommu is no longer stored as driver-data. Update the
code to use the new access method.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Fixes: 39ab9555 ('iommu: Add sysfs bindings for struct iommu_device')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

25 Feb, 2017 1 commit

mm: wire up GFP flag passing in dma_alloc_from_contiguous · 712c604d

Authored by Lucas Stach on Feb 24, 2017

The callers of the DMA alloc functions already provide the proper
context GFP flags.  Make sure to pass them through to the CMA allocator,
to make the CMA compaction context aware.

Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

10 Feb, 2017 3 commits

iommu: Make iommu_device_link/unlink take a struct iommu_device · e3d10af1

Authored by Joerg Roedel on Feb 01, 2017

This makes the interface more consistent with
iommu_device_sysfs_add/remove.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu: Add sysfs bindings for struct iommu_device · 39ab9555

Authored by Joerg Roedel on Feb 01, 2017

There is currently support for iommu sysfs bindings, but
those need to be implemented in the IOMMU drivers. Add a
more generic version of this by adding a struct device to
struct iommu_device and use that for the sysfs bindings.

Also convert the AMD and Intel IOMMU driver to make use of
it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu: Introduce new 'struct iommu_device' · b0119e87

Authored by Joerg Roedel on Feb 01, 2017

This struct represents one hardware iommu in the iommu core
code. For now it only has the iommu-ops associated with it,
but that will be extended soon.

The register/unregister interface is also added, as well as
making use of it in the Intel and AMD IOMMU drivers.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

31 Jan, 2017 2 commits

iommu/vt-d: Don't over-free page table directories · f7116e11

Authored by David Dillow on Jan 30, 2017

dma_pte_free_level() recurses down the IOMMU page tables and frees
directory pages that are entirely contained in the given PFN range.
Unfortunately, it incorrectly calculates the starting address covered
by the PTE under consideration, which can lead to it clearing an entry
that is still in use.

This occurs if we have a scatterlist with an entry that has a length
greater than 1026 MB and is aligned to 2 MB for both the IOMMU and
physical addresses. For example, if __domain_mapping() is asked to map a
two-entry scatterlist with 2 MB and 1028 MB segments to PFN 0xffff80000,
it will ask dma_pte_free_pagetable() to free PFNs from 0xffff80200 to
0xffffc05ff, which will also incorrectly clear the PFNs from
0xffff80000 to 0xffff801ff because of this issue. The current code will
set level_pfn to 0xffff80200, and 0xffff80200-0xffffc01ff fits inside
the range being cleared. Properly setting the level_pfn for the current
level under consideration catches that this PTE is outside of the range
being cleared.
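
The containment check can be sketched with the numbers from the example above. This is a simplified model rather than the driver code: it assumes 9 address bits per page-table level, and apart from level_pfn the function names are invented for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_LEVEL_STRIDE 9	/* assumption: 512 entries per table page */

/* Number of PFNs covered by one entry at the given level (level 1 = leaf). */
static uint64_t ex_level_size(int level)
{
	return (uint64_t)1 << ((level - 1) * EX_LEVEL_STRIDE);
}

/* Buggy variant: treats the cursor PFN as the start of the entry. */
static bool ex_covered_buggy(uint64_t pfn, int level,
			     uint64_t start_pfn, uint64_t last_pfn)
{
	uint64_t level_pfn = pfn;

	return start_pfn <= level_pfn &&
	       last_pfn >= level_pfn + ex_level_size(level) - 1;
}

/* Fixed variant: rounds the cursor down to the entry's real first PFN. */
static bool ex_covered_fixed(uint64_t pfn, int level,
			     uint64_t start_pfn, uint64_t last_pfn)
{
	uint64_t level_pfn = pfn & ~(ex_level_size(level) - 1);

	return start_pfn <= level_pfn &&
	       last_pfn >= level_pfn + ex_level_size(level) - 1;
}
```

For a level-3 entry (0x40000 PFNs) with the cursor at 0xffff80200 and a range of 0xffff80200 to 0xffffc05ff, the buggy check reports the entry as fully covered, while the fixed check rounds level_pfn down to 0xffff80000 and sees that the entry starts before the range does.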

This patch also changes the value passed into dma_pte_free_level() when
it recurses. This only affects the first PTE of the range being cleared,
and is handled by the existing code that ensures we start our cursor no
lower than start_pfn.

This was found when using dma_map_sg() to map large chunks of contiguous
memory, which immediately led to faults on the first access of the
erroneously-deleted mappings.

Fixes: 3269ee0b ("intel-iommu: Fix leaks in pagetable freeing")
Reviewed-by: Benjamin Serebrin <serebrin@google.com>
Signed-off-by: David Dillow <dillow@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu/vt-d: Tylersburg isoch identity map check is done too late. · 21e722c4

Authored by Ashok Raj on Jan 30, 2017

The check to set identity map for tylersburg is done too late. It needs
to be done before the check for identity_map domain is done.

To: Joerg Roedel <joro@8bytes.org>
To: David Woodhouse <dwmw2@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: Ashok Raj <ashok.raj@intel.com>

Fixes: 86080ccc ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Reported-by: Yunhong Jiang <yunhong.jiang@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

23 Jan, 2017 1 commit

iommu/vt-d: Implement reserved region get/put callbacks · 0659b8dc

Authored by Eric Auger on Jan 19, 2017

This patch registers the [FEE0_0000h - FEF0_0000h] 1MB MSI
range as a reserved region and RMRR regions as direct regions.

This allows those reserved regions to be reported in the
iommu-group sysfs.
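
The arithmetic behind the 1MB figure can be checked with a short sketch. The macro and function names below are made up for the example; only the two addresses come from the message, with FEF0_0000h treated as the exclusive end of the window.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_MSI_RANGE_START	0xfee00000ULL
#define EX_MSI_RANGE_END	0xfeefffffULL	/* last byte, inclusive */

/* Length of the MSI window in bytes. */
static uint64_t ex_msi_range_length(void)
{
	return EX_MSI_RANGE_END - EX_MSI_RANGE_START + 1;
}

/* True if addr falls inside the MSI window. */
static bool ex_in_msi_range(uint64_t addr)
{
	return addr >= EX_MSI_RANGE_START && addr <= EX_MSI_RANGE_END;
}
```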

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

04 Jan, 2017 2 commits

iommu/vt-d: Fix pasid table size encoding · 65ca7f5f

Authored by Jacob Pan on Dec 06, 2016

Different encodings are used to represent the supported PASID bits
and the number of PASID table entries.

The current code assigns ecap_pss directly to the extended context
table entry PTS field, which is wrong and could result in writing
non-zero bits to the reserved fields. IOMMU fault reason
11 will be reported when reserved bits are nonzero.

This patch converts ecap_pss to the extended context entry PTS encoding
based on the VT-d spec, Chapter 9.4, as follows:
 - number of PASID bits = ecap_pss + 1
 - number of PASID table entries = 2^(pts + 5)

The software-assigned limit of the pasid_max value is also respected to
match the allocation limitation of the PASID table.
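
Equating the two formulas gives pts = ecap_pss + 1 - 5. The sketch below works through that conversion under the assumption that a width below 5 bits clamps to pts = 0; the function names are invented for the example, and the pasid_max limit is left out.

```c
#include <stdint.h>

/* Number of PASID bits advertised by the capability field. */
static unsigned int ex_pasid_bits(unsigned int ecap_pss)
{
	return ecap_pss + 1;
}

/*
 * PTS value describing a table with 2^bits entries.
 * entries = 2^(pts + 5), so pts = bits - 5 (clamped at 0, an assumption).
 */
static unsigned int ex_pts_from_bits(unsigned int bits)
{
	return bits < 5 ? 0 : bits - 5;
}

/* Number of PASID table entries described by a PTS value. */
static uint64_t ex_entries_from_pts(unsigned int pts)
{
	return (uint64_t)1 << (pts + 5);
}
```

For ecap_pss = 19 this yields 20 PASID bits and pts = 15, which decodes back to 2^20 entries. Writing 19 into the PTS field directly would instead describe 2^24 entries.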

cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: 2f26e0a9 ('iommu/vt-d: Add basic SVM PASID support')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu/vt-d: Flush old iommu caches for kdump when the device gets context mapped · aec0e861

Authored by Xunlei Pang on Dec 05, 2016

We hit a DMAR fault on both hpsa P420i and P421 SmartArray controllers
under kdump. It can be reproduced reliably on several different machines;
the dmesg log looks like this:
HP HPSA Driver (v 3.4.16-0)
hpsa 0000:02:00.0: using doorbell to reset controller
hpsa 0000:02:00.0: board ready after hard reset.
hpsa 0000:02:00.0: Waiting for controller to respond to no-op
DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff]
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set
hpsa 0000:02:00.0: controller message 03:00 timed out
hpsa 0000:02:00.0: no-op failed; re-trying

After some debugging, we found that the fault address comes from DMA initiated at
the driver probe stage after reset (not in-flight DMA), and that the corresponding
PTE value is correct. The fault is likely due to stale IOMMU caches left by
the in-flight DMA before it.

Thus we need to flush the old caches after context mapping is set up for the
device, at which point the device is supposed to have finished its reset at the
driver probe stage and no in-flight DMA exists any longer.

I'm not sure whether the hardware is responsible for invalidating all the related
caches previously allocated in the IOMMU hardware, but that does not seem to be the
case for hpsa; in fact, many device drivers have problems properly resetting the
hardware. In any case, flushing (again) in software in the kdump kernel when the
device gets context mapped, which is a quite infrequent operation, does little harm.

With this patch, the problematic machine can survive the kdump tests.

CC: Myron Stowe <myron.stowe@gmail.com>
CC: Joseph Szczypek <jszczype@redhat.com>
CC: Don Brace <don.brace@microsemi.com>
CC: Baoquan He <bhe@redhat.com>
CC: Dave Young <dyoung@redhat.com>
Fixes: 091d42e4 ("iommu/vt-d: Copy translation tables from old kernel")
Fixes: dbcd861f ("iommu/vt-d: Do not re-use domain-ids from the old kernel")
Fixes: cf484d0e ("iommu/vt-d: Mark copied context entries")
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

02 Dec, 2016 1 commit

iommu/vt-d: Convert to hotplug state machine · 21647615

Authored by Anna-Maria Gleixner on Nov 27, 2016

Install the callbacks via the state machine.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: rt@linutronix.de
Cc: David Woodhouse <dwmw2@infradead.org>
Link: http://lkml.kernel.org/r/20161126231350.10321-14-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

08 Nov, 2016 1 commit

iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path · bea64033

Authored by Joerg Roedel on Nov 08, 2016

It turns out that the disable_dmar_iommu() code-path tried
to get the device_domain_lock recursively, which will
dead-lock when this code runs on dmar removal. Fix both
code-paths that could lead to the dead-lock.

Fixes: 55d94043 ('iommu/vt-d: Get rid of domain->iommu_lock')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

30 Oct, 2016 1 commit

iommu/vt-d: Fix IOMMU lookup for SR-IOV Virtual Functions · 1c387188

Authored by Ashok Raj on Oct 21, 2016

The VT-d specification (§8.3.3) says:
    ‘Virtual Functions’ of a ‘Physical Function’ are under the scope
    of the same remapping unit as the ‘Physical Function’.

The BIOS is not required to list all the possible VFs in the scope
tables, and arguably *shouldn't* make any attempt to do so, since there
could be a huge number of them.

This has been broken basically for ever — the VF is never going to match
against a specific unit's scope, so it ends up being assigned to the
INCLUDE_ALL IOMMU. Which was always actually correct by coincidence, but
now we're looking at Root-Complex integrated devices with SR-IOV support
it's going to start being wrong.

Fix it to simply use pci_physfn() before doing the lookup for PCI devices.

Cc: stable@vger.kernel.org
Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>

05 Sep, 2016 2 commits

iommu/vt-d: Make sure RMRRs are mapped before domain goes public · 1c5ebba9

Authored by Joerg Roedel on Aug 25, 2016

When a domain is allocated through the get_valid_domain_for_dev
path, it will be context-mapped before the RMRR regions are
mapped in the page-table. This opens a short time window
where device accesses to these regions fail and cause DMAR
faults.

Fix this by mapping the RMRR regions before the domain is
context-mapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

iommu/vt-d: Split up get_domain_for_dev function · 76208356

Authored by Joerg Roedel on Aug 25, 2016

Split out the search for an already existing domain and the
context mapping of the device to the new domain.

This allows possible RMRR regions to be mapped into the domain
before it is context mapped.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

04 Aug, 2016 1 commit

dma-mapping: use unsigned long for dma_attrs · 00085f1e

Authored by Krzysztof Kozlowski on Aug 03, 2016

The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.
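
To make the two points above concrete, here is a minimal sketch of attributes carried as plain bits in an unsigned long. The flag names and the helper are hypothetical and use an EX_ prefix to avoid suggesting they match the kernel's real attribute definitions.

```c
#include <stdbool.h>

/* Hypothetical attribute bits, one bit per attribute. */
#define EX_DMA_ATTR_WRITE_COMBINE	(1UL << 0)
#define EX_DMA_ATTR_NO_KERNEL_MAPPING	(1UL << 1)

/*
 * The callee receives its own copy of the attributes, so it cannot
 * modify the caller's value; "no attributes" is simply 0, not NULL.
 */
static bool ex_wants_write_combine(unsigned long attrs)
{
	return (attrs & EX_DMA_ATTR_WRITE_COMBINE) != 0;
}
```

A caller sets attributes with a bitwise OR at the call site, for example ex_wants_write_combine(EX_DMA_ATTR_WRITE_COMBINE | EX_DMA_ATTR_NO_KERNEL_MAPPING), instead of initializing a structure on the stack and passing a pointer to it.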

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

28 Jul, 2016 1 commit

Add braces to avoid "ambiguous ‘else’" compiler warnings · 194dc870

Authored by Linus Torvalds on Jul 27, 2016

Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.

The resulting warnings look something like this:

  drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
  drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
     if (ctx != dev_priv->kernel_context)
        ^

even if the code itself is fine.

Since the warning is fairly easy to avoid by adding braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.

(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).
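
The pattern can be sketched with a made-up iterator macro. The macro, the struct, and the function below are hypothetical and only mirror the shape of the kernel's for_each_xyz() helpers: the macro ends in an if-else, so an unbraced outer if around it is what gcc flags.

```c
/* Hypothetical element type for the example. */
struct ex_item {
	int valid;
	int value;
};

/*
 * Hypothetical iterator that skips invalid items. The trailing
 * "if (...) {} else" makes the loop body the else branch, which keeps
 * a stray "else" after the macro from binding to the wrong "if".
 */
#define ex_for_each_valid_item(it, arr, n)			\
	for ((it) = (arr); (it) < (arr) + (n); (it)++)		\
		if (!(it)->valid) {} else

static int ex_sum_valid(struct ex_item *arr, int n)
{
	struct ex_item *it;
	int sum = 0;

	/* Braces on the outer if avoid the "ambiguous else" warning. */
	if (n > 0) {
		ex_for_each_valid_item(it, arr, n)
			sum += it->value;
	}

	return sum;
}
```

The behaviour is the same with or without the outer braces; they are there to make the nesting explicit to the compiler and to readers.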

This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

14 Jul, 2016 1 commit

iommu/vt-d: Return error code in domain_context_mapping_one() · 5c365d18

Authored by Wei Yang on Jul 13, 2016

In 'commit <55d94043> ("iommu/vt-d: Get rid of domain->iommu_lock")',
the error handling path is changed a little, which makes the function
always return 0.

This patch fixes this.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Fixes: 55d94043 ('iommu/vt-d: Get rid of domain->iommu_lock')
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>

04 Jul, 2016 1 commit

iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas · 0caa7616

Authored by Aaron Campbell on Jul 02, 2016

Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum
number of possible domains is 64K; indeed this is the maximum value
that the cap_ndoms() macro will expand to.  Since the value 65536
will not fit in a u16, the 'did' variable must be promoted to an
int, otherwise the test for < 65536 will always be true and the
loop will never end.
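
The wraparound can be shown without actually looping forever. In the sketch below the function names are made up for the example; the first function shows what a 16-bit counter does at the top of its range, and the second shows the loop shape with an int counter.

```c
#include <stdint.h>

/* Incrementing a 16-bit counter at 65535 wraps back to 0. */
static unsigned int ex_next_did_u16(uint16_t did)
{
	did++;
	return did;
}

/* With an int counter the loop can reach ndoms and terminate. */
static int ex_count_domains(int ndoms)
{
	int did;
	int visited = 0;

	for (did = 0; did < ndoms; did++)
		visited++;

	return visited;
}
```

Since a u16 can never hold 65536, a comparison such as did < 65536 stays true after every increment, which matches the hang described above.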

The symptom, in my case, was a hung machine during suspend.

Fixes: 3bd4f911 ("iommu/vt-d: Fix overflow of iommu->domains array")
Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

27 Jun, 2016 1 commit

iommu/vt-d: Fix overflow of iommu->domains array · 3bd4f911

Authored by Jan Niehusmann on Jun 06, 2016

The valid range of 'did' in get_iommu_domain(*iommu, did)
is 0..cap_ndoms(iommu->cap), so don't exceed that
range in free_all_cpu_cached_iovas().

The user-visible impact of the out-of-bounds access is the machine
hanging on suspend-to-ram. It is, in fact, a kernel panic, but due
to already suspended devices, that's often not visible to the user.

Fixes: 22e2f9fa ("iommu/vt-d: Use per-cpu IOVA caching")
Signed-off-by: Jan Niehusmann <jan@gondor.com>
Tested-By: Marius Vlad <marius.c.vlad@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

17 Jun, 2016 1 commit

iommu/vt-d: Enable QI on all IOMMUs before setting root entry · a4c34ff1

Authored by Joerg Roedel on Jun 17, 2016

This seems to be required on some X58 chipsets on systems
with more than one IOMMU. QI does not work until it is
enabled on all IOMMUs in the system.

Reported-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Tested-by: Dheeraj CVR <cvr.dheeraj@gmail.com>
Fixes: 5f0a7f76 ('iommu/vt-d: Make root entry visible for hardware right after allocation')
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>

15 Jun, 2016 1 commit

iommu/vt-d: Reduce extra first level entry in iommu->domains · 86f004c7

Authored by Wei Yang on May 21, 2016

Commit <8bf47816> ("iommu/vt-d: Split up iommu->domains array")
splits iommu->domains into two levels. Each first-level entry contains 256
second-level entries. When ndomains is an exact multiple of
256, the current implementation allocates one extra first-level
entry.
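
One way to picture the off-by-one is to compare two ways of counting first-level entries. Both formulas below are a reconstruction for illustration and are not quoted from the patch; the macro and function names are assumptions.

```c
/* Round x up to the next multiple of a (a must be a power of two). */
#define EX_ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((a) - 1))

/* Always adds one entry, even when ndomains divides evenly by 256. */
static unsigned long ex_first_level_entries_old(unsigned long ndomains)
{
	return (ndomains >> 8) + 1;
}

/* Rounds up first, so exact multiples of 256 need no extra entry. */
static unsigned long ex_first_level_entries_new(unsigned long ndomains)
{
	return EX_ALIGN_UP(ndomains, 256UL) >> 8;
}
```

For ndomains = 256 the first formula gives 2 and the second gives 1, while for ndomains = 257 both give 2.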

This patch refines this calculation to reduce the extra first level entry.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

21 Apr, 2016 7 commits

iommu/vt-d: Use per-cpu IOVA caching · 22e2f9fa

Authored by Omer Peleg on Apr 20, 2016

Commit 9257b4a2 ('iommu/iova: introduce per-cpu caching to iova allocation')
introduced per-CPU IOVA caches to massively improve scalability. Use them.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: change intel-iommu to use IOVA frame numbers · 2aac6304

Authored by Omer Peleg on Apr 20, 2016

Make intel-iommu map/unmap/invalidate work with IOVA pfns instead of
pointers to "struct iova". This avoids using the iova struct from the IOVA
red-black tree and the resulting explicit find_iova() on unmap.

This patch will allow us to cache IOVAs in the next patch, in order to
avoid rbtree operations for the majority of map/unmap operations.

Note: In eliminating the find_iova() operation, we have also eliminated
the sanity check previously done in the unmap flow. Arguably, this was
overhead that is better avoided in production code, but it could be
brought back as a debug option for driver development.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, fixed to not break iova api, and reworded
 the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: avoid dev iotlb logic for domains with no dev iotlbs · 0824c592

Authored by Omer Peleg on Apr 20, 2016

This patch avoids taking the device_domain_lock in iommu_flush_dev_iotlb()
for domains with no dev iotlb devices.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[gvdl@google.com: fixed locking issues]
Signed-off-by: Godfrey van der Linden <gvdl@google.com>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: only unmap mapped entries · 769530e4

Authored by Omer Peleg on Apr 20, 2016

Current unmap implementation unmaps the entire area covered by the IOVA
range, which is a power-of-2 aligned region. The corresponding map,
however, only maps those pages originally mapped by the user. This
discrepancy can lead to unmapping of already unmapped entries, which is
unneeded work.

With this patch, only mapped pages are unmapped. This is also a baseline
for a map/unmap implementation based on IOVAs and not iova structures,
which will allow caching.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: correct flush_unmaps pfn usage · f5c0c08b

Authored by Omer Peleg on Apr 20, 2016

Change flush_unmaps() to correctly pass iommu_flush_iotlb_psi()
dma addresses.  (x86_64 mm and dma have the same size for pages
at the moment, but this usage improves consistency.)

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: per-cpu deferred invalidation queues · aa473240

Authored by Omer Peleg on Apr 20, 2016

The IOMMU's IOTLB invalidation is a costly process.  When iommu mode
is not set to "strict", it is done asynchronously. Current code
amortizes the cost of invalidating IOTLB entries by batching all the
invalidations in the system and performing a single global invalidation
instead. The code queues pending invalidations in a global queue that
is accessed under the global "async_umap_flush_lock" spinlock, which
can result in significant spinlock contention.

This patch splits this deferred queue into multiple per-cpu deferred
queues, and thus gets rid of the "async_umap_flush_lock" and its
contention.  To keep existing deferred invalidation behavior, it still
invalidates the pending invalidations of all CPUs whenever a CPU
reaches its watermark or a timeout occurs.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

iommu/vt-d: refactoring of deferred flush entries · 314f1dc1

Authored by Omer Peleg on Apr 20, 2016

Currently, deferred flushes' info is striped between several lists in
the flush tables. Instead, move all information about a specific flush
to a single entry in this table.

This patch does not introduce any functional change.

Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

07 Apr, 2016 1 commit

iommu/vt-d: Silence an uninitialized variable warning · 0b74ecdf

Authored by Dan Carpenter on Apr 06, 2016

My static checker complains that "dma_alias" is uninitialized unless we
are dealing with a pci device.  This is true but harmless.  Anyway, we
can flip the condition around to silence the warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

05 Apr, 2016 1 commit

x86/vt-d: Fix comment for dma_pte_free_pagetable() · 3d1a2442

Authored by Michael S. Tsirkin on Mar 23, 2016

dma_pte_free_pagetable no longer depends on last level ptes
being clear, it clears them itself.  Fix up the comment to
match.

Cc: Jiang Liu <jiang.liu@linux.intel.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

01 Mar, 2016 1 commit

iommu/vt-d: Use BUS_NOTIFY_REMOVED_DEVICE in hotplug path · e6a8c9b3

Authored by Joerg Roedel on Feb 29, 2016

In the PCI hotplug path of the Intel IOMMU driver, replace
the usage of the BUS_NOTIFY_DEL_DEVICE notifier, which is
executed before the driver is unbound from the device, with
BUS_NOTIFY_REMOVED_DEVICE, which runs after that.

This fixes a kernel BUG being triggered in the VT-d code
when the device driver tries to unmap DMA buffers and the
VT-d driver already destroyed all mappings.

Reported-by: Stefani Seibold <stefani@seibold.net>
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Joerg Roedel <jroedel@suse.de>

29 Jan, 2016 1 commit

iommu/vt-d: Don't skip PCI devices when disabling IOTLB · da972fb1

Authored by Jeremy McNicoll on Jan 14, 2016

Fix a simple typo when disabling IOTLB on PCI(e) devices.

Fixes: b16d0cb9 ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Cc: stable@vger.kernel.org  # v4.4
Signed-off-by: Jeremy McNicoll <jmcnicol@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

16 Dec, 2015 1 commit

Revert "scatterlist: use sg_phys()" · 3e6110fd

Authored by Dan Williams on Dec 15, 2015

commit db0fa0cb "scatterlist: use sg_phys()" did replacements of
the form:

    - phys_addr_t phys = page_to_phys(sg_page(s));
    + phys_addr_t phys = sg_phys(s) & PAGE_MASK;

However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.
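
One plausible reading of the breakage is that the page mask has the width of unsigned long, so ANDing it with a wider physical address clears the upper bits. The sketch below models this with fixed-width types; every type, macro, and function name in it is an assumption made for the illustration.

```c
#include <stdint.h>

typedef uint64_t ex_phys_addr_t;	/* stands in for a 64-bit phys_addr_t */
typedef uint32_t ex_ulong_t;		/* stands in for a 32-bit unsigned long */

#define EX_PAGE_SIZE	4096u
/* Mask computed at the narrower width: 0xfffff000. */
#define EX_PAGE_MASK	((ex_ulong_t)~(EX_PAGE_SIZE - 1))

/* The narrow mask is zero-extended, so bits 32 and up are cleared too. */
static ex_phys_addr_t ex_page_base_narrow(ex_phys_addr_t phys)
{
	return phys & EX_PAGE_MASK;
}

/* Building the mask at the address width keeps the upper bits. */
static ex_phys_addr_t ex_page_base_wide(ex_phys_addr_t phys)
{
	return phys & ~(ex_phys_addr_t)(EX_PAGE_SIZE - 1);
}
```

For the address 0x123456789 the narrow mask yields 0x23456000, which silently drops the bit above 4 GiB, while the wide mask yields 0x123456000.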

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

07 Nov, 2015 1 commit

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc

Authored by Mel Gorman on Nov 06, 2015

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimistic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites:

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.
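
The false-positive case in the third item can be modelled in a few lines. This is a user-space sketch under stated assumptions: the `DEMO_*` names and bit values are hypothetical and do not match the kernel's actual bit assignments, and `demo_allow_blocking()` only models the intent of gfpflags_allow_blocking(), which is assumed here to reduce to a test of the direct-reclaim bit.

```c
#include <stdbool.h>

typedef unsigned int demo_gfp_t;

/* Illustrative bit values only; the kernel's real assignments differ. */
#define DEMO_GFP_DIRECT_RECLAIM 0x1u  /* caller can sleep and enter direct reclaim */
#define DEMO_GFP_KSWAPD_RECLAIM 0x2u  /* caller wants kswapd woken */
/* __GFP_WAIT as redefined by the patch: both reclaim bits together. */
#define DEMO_GFP_WAIT (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* Historical check: any overlap with the wait mask counted as "may block". */
static bool legacy_wait_check(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_WAIT) != 0;
}

/* Models the helper's intent: only direct reclaim means the caller may sleep. */
static bool demo_allow_blocking(demo_gfp_t flags)
{
    return (flags & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* A mempool-backed caller: clears direct reclaim, leaves the kswapd bit alone. */
static demo_gfp_t demo_mempool_flags(demo_gfp_t flags)
{
    return flags & ~DEMO_GFP_DIRECT_RECLAIM;
}
```

With `demo_mempool_flags(DEMO_GFP_WAIT)`, only the kswapd bit remains set. The legacy check still reports that the caller may block, while the helper-style check correctly reports that it may not.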

The first key hazard to watch out for is callers that removed __GFP_WAIT
and were depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d0164adc

25 Oct, 2015 1 commit

iommu/vt-d: Clean up pasid_enabled() and ecs_enabled() dependencies · d42fde70

Committed by David Woodhouse on Oct 24, 2015

When booted with intel_iommu=ecs_off we were still allocating the PASID
tables even though we couldn't actually use them. We really want to make
the pasid_enabled() macro depend on ecs_enabled().

Which is unfortunate, because currently they're the other way round to
cope with the Broadwell/Skylake problems with ECS.

Instead of having ecs_enabled() depend on pasid_enabled(), which was never
something that made me happy anyway, make it depend in the normal case
on the "broken PASID" bit 28 *not* being set.

Then pasid_enabled() can depend on ecs_enabled() as it should. And we also
don't need to mess with it if we ever see an implementation that has some
features requiring ECS (like PRI) but which *doesn't* have PASID support.
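
The reversed dependency can be sketched as two predicates. This is a simplified model rather than the driver's code: the `demo_*` names are hypothetical, bit 28 is the "broken PASID" bit named above, the other two bit positions are assumptions for illustration, and the real macros may include further conditions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative capability bits. Bit 28 is the "broken PASID" bit from the
 * message; the ECS and PASID positions are assumptions for this sketch. */
#define DEMO_ECAP_ECS          (UINT64_C(1) << 24)
#define DEMO_ECAP_BROKEN_PASID (UINT64_C(1) << 28)
#define DEMO_ECAP_PASID        (UINT64_C(1) << 40)

/* ECS is usable when it is not disabled on the command line, the hardware
 * advertises it, and the broken-PASID bit is clear. */
static bool demo_ecs_enabled(uint64_t ecap, bool ecs_off)
{
    return !ecs_off && (ecap & DEMO_ECAP_ECS) &&
           !(ecap & DEMO_ECAP_BROKEN_PASID);
}

/* PASID now depends on ECS rather than the other way round. */
static bool demo_pasid_enabled(uint64_t ecap, bool ecs_off)
{
    return demo_ecs_enabled(ecap, ecs_off) && (ecap & DEMO_ECAP_PASID);
}
```

In this model, booting with ECS turned off makes `demo_pasid_enabled()` false even when the PASID capability bit is set, so no PASID tables would be allocated. A device with ECS but no PASID capability still reports ECS as enabled.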

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

d42fde70

22 Oct, 2015 1 commit

iommu: Add device_group call-back to x86 iommu drivers · a960fadb

Committed by Joerg Roedel on Oct 21, 2015

Set the device_group call-back to pci_device_group() for the
Intel VT-d and the AMD IOMMU driver.

Signed-off-by: Joerg Roedel <jroedel@suse.de>

a960fadb

19 Oct, 2015 1 commit

iommu/vt-d: Use dev_err(..) in intel_svm_device_to_iommu(..) · b9997e38

Committed by Sudeep Dutt on Oct 18, 2015

This will give a little bit of assistance to those developing drivers
using SVM. It might cause a slight annoyance to end-users whose kernel
disables the IOMMU when drivers are trying to use it. But the fix there
is to fix the kernel to enable the IOMMU.

Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

b9997e38
