- 11 April 2019, 3 commits
-
-
Committed by Lu Baolu
Move intel_iommu_enable_pasid() out of the scope of CONFIG_INTEL_IOMMU_SVM, since more and more features require the PASID function.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Jean-Philippe Brucker
Add bind() and unbind() operations to the IOMMU API. iommu_sva_bind_device() binds a device to an mm and returns a handle to the bond, which is released by calling iommu_sva_unbind_device().

Each mm bound to devices gets a PASID (by convention, a 20-bit system-wide ID representing the address space), which can be retrieved with iommu_sva_get_pasid(). When programming DMA addresses, device drivers include this PASID in a device-specific manner, to let the device access the given address space. Since the process memory may be paged out, the device and IOMMU must support I/O page faults (e.g. PCI PRI).

Using iommu_sva_set_ops(), device drivers provide an mm_exit() callback that is called by the IOMMU driver if the process exits before the device driver called unbind(). In mm_exit(), the device driver should disable DMA from the given context, so that the core IOMMU can reallocate the PASID. Whether the process exited or not, the device driver should always release the handle with unbind().

To use these functions, the device driver must first enable the IOMMU_DEV_FEAT_SVA device feature with iommu_dev_enable_feature().

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
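A minimal driver-side sketch of this flow follows. The mm_exit() prototype and the helper name are assumptions for illustration; only the bind/unbind/get_pasid/set_ops entry points are taken from the commit above.

	#include <linux/iommu.h>
	#include <linux/sched.h>	/* for current->mm */

	/* Hypothetical mm_exit handler; the exact prototype is an assumption. */
	static int my_mm_exit(struct device *dev, int pasid, void *drvdata)
	{
		/* Stop DMA for this context so the core can reallocate the PASID. */
		return 0;
	}

	static const struct iommu_sva_ops my_sva_ops = {
		.mm_exit = my_mm_exit,
	};

	static int my_bind_current_mm(struct device *dev)
	{
		struct iommu_sva *handle;
		int pasid;

		if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA))
			return -ENODEV;

		handle = iommu_sva_bind_device(dev, current->mm, NULL);
		if (IS_ERR(handle))
			return PTR_ERR(handle);
		iommu_sva_set_ops(handle, &my_sva_ops);

		pasid = iommu_sva_get_pasid(handle);
		/* ... program pasid into the device in a device-specific way ... */

		iommu_sva_unbind_device(handle);	/* always release the bond */
		return 0;
	}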
-
Committed by Lu Baolu
Sharing a physical PCI device in a finer-granularity way is becoming a consensus in the industry, and IOMMU vendors are making efforts to support such sharing as well as possible. Among those efforts, the capability to support finer-granularity DMA isolation is a common requirement for security reasons. With finer-granularity DMA isolation, subsets of a PCI function can be isolated from each other by the IOMMU. As a result, there is a need in software to attach multiple domains to a physical PCI device; one example of such a usage model is Intel Scalable IOV [1] [2]. The Intel VT-d 3.0 spec [3] introduces scalable mode, which enables PASID-granularity DMA isolation.

This adds the APIs to support multiple domains per device. To ease the discussion, we call it 'a domain in auxiliary mode', or simply 'auxiliary domain', when multiple domains are attached to a physical device. The APIs include:

* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Detect both IOMMU and PCI endpoint devices supporting the feature (aux-domain here) without any host driver dependency.

* iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_AUX)
  - Check the enabling status of the feature (aux-domain here). The aux-domain interfaces are available only if this returns true.

* iommu_dev_enable/disable_feature(dev, IOMMU_DEV_FEAT_AUX)
  - Enable/disable the device-specific aux-domain feature.

* iommu_aux_attach_device(domain, dev)
  - Attach @domain to @dev in auxiliary mode. Multiple domains can be attached to a single device in auxiliary mode, with each domain representing an isolated address space for an assignable subset of the device.

* iommu_aux_detach_device(domain, dev)
  - Detach @domain, which has been attached to @dev in auxiliary mode.

* iommu_aux_get_pasid(domain, dev)
  - Return the ID used for finer-granularity DMA translation. For the Intel Scalable IOV usage model, this will be a PASID. A device which supports Scalable IOV needs to write this ID to the device register so that DMA requests can be tagged with the right PASID prefix.

This has been updated with the latest proposal from Joerg posted here [5]. A usage sketch follows below.

Many people were involved in discussions of this design:

Kevin Tian <kevin.tian@intel.com>
Liu Yi L <yi.l.liu@intel.com>
Ashok Raj <ashok.raj@intel.com>
Sanjay Kumar <sanjay.k.kumar@intel.com>
Jacob Pan <jacob.jun.pan@linux.intel.com>
Alex Williamson <alex.williamson@redhat.com>
Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Joerg Roedel <joro@8bytes.org>

and some of the discussions can be found here [4] [5].

[1] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[2] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
[3] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[4] https://lkml.org/lkml/2018/7/26/4
[5] https://www.spinics.net/lists/iommu/msg31874.html

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
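A hedged sketch of how a driver might exercise these interfaces, using only the functions listed above (error handling trimmed; this is illustrative, not a definitive sequence):

	struct iommu_domain *domain;
	int pasid, ret;

	if (!iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_AUX))
		return -ENODEV;

	ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_AUX);
	if (ret)
		return ret;

	domain = iommu_domain_alloc(dev->bus);	/* one isolated address space */
	ret = iommu_aux_attach_device(domain, dev);
	if (ret)
		goto out_free;

	pasid = iommu_aux_get_pasid(domain, dev);
	/* ... write pasid to the device register so its DMA carries this tag ... */

	iommu_aux_detach_device(domain, dev);
out_free:
	iommu_domain_free(domain);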
-
- 30 March 2019, 2 commits
-
-
Committed by Nicolas Boichat
IOMMUs using the ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is defined (e.g. on arm64 platforms). For level 2 pages, allocate a slab cache with SLAB_CACHE_DMA32. Note that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is not strictly necessary and would cause a warning in mm/sl*b.c, since we did not update GFP_SLAB_BUG_MASK.

Also, print an error when the physical address does not fit in 32 bits, to make debugging easier in the future.

Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org
Fixes: ad67f5a6 ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Huaisheng Ye <yehs1@lenovo.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sasha Levin <Alexander.Levin@microsoft.com>
Cc: Tomasz Figa <tfiga@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yong Wu <yong.wu@mediatek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
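As a rough illustration of the two allocation paths described above; the cache name and the l1_size/l2_size variables are placeholders, not the driver's actual constants:

	/* Level-1 table: page allocation constrained below 4GB. */
	gfp_t gfp = GFP_KERNEL;

	if (IS_ENABLED(CONFIG_ZONE_DMA32))
		gfp |= GFP_DMA32;
	void *l1 = (void *)__get_free_pages(gfp | __GFP_ZERO, get_order(l1_size));

	/*
	 * Level-2 tables: the slab cache itself is created with
	 * SLAB_CACHE_DMA32; no GFP_DMA32 at allocation time, which would
	 * otherwise trip the GFP_SLAB_BUG_MASK warning in mm/sl*b.c.
	 */
	struct kmem_cache *l2_cache = kmem_cache_create("l2-tables", l2_size,
							l2_size,
							SLAB_CACHE_DMA32, NULL);
	void *l2 = kmem_cache_zalloc(l2_cache, GFP_KERNEL);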
-
Committed by Joerg Roedel
If a device has an exclusion range specified in the IVRS table, this region needs to be reserved in the iova-domain of that device. This hasn't happened until now, and can cause data corruption for data transferred with these devices.

Treat exclusion ranges as reserved regions in the iommu-core to fix the problem.

Fixes: be2a022c ('x86, AMD IOMMU: add functions to parse IOMMU memory mapping requirements for devices')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Gary R Hook <gary.hook@amd.com>
-
- 25 March 2019, 1 commit
-
-
Committed by Joerg Roedel
Print the warning about falling back to IOMMU_DOMAIN_DMA in iommu_group_get_for_dev() only when such a domain was actually allocated. Otherwise the user gets misleading warnings in the kernel log when the IOMMU driver in use doesn't support IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY.

Fixes: fccb4e3b ('iommu: Allow default domain type to be set on the kernel command line')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 22 March 2019, 3 commits
-
-
Committed by Lu Baolu
The driver sets a default domain id (FLPT_DEFAULT_DID) in the first-level-only pasid entry, but saves a different domain id in @sdev->did. The value saved in @sdev->did is used to invalidate the translation caches, so the driver might end up invalidating the caches with the wrong domain id.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Lu Baolu
The spec states in 10.4.16 that the Protected Memory Enable Register should be treated as read-only for implementations not supporting protected memory regions (PLMR and PHMR fields reported as Clear in the Capability register).

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: mark gross <mgross@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: f8bab735 ("intel-iommu: PMEN support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Robert Richter
If a 32-bit allocation request is too big to possibly succeed, it exits early with a failure, and should then never update max32_alloc_size. This patch fixes the current code so the size is only updated if the slow path failed while walking the tree. Without the fix, the allocation may enter the slow path again even if a request with the same or a smaller size already failed before. The shape of the fix is sketched below.

Cc: <stable@vger.kernel.org> # 4.20+
Fixes: bee60e94 ("iommu/iova: Optimise attempts to allocate iova from 32bit address range")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
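A simplified sketch of the corrected control flow, reconstructed from the description above (not the literal diff; the real function carries more state):

	/* Early exit on a hopeless 32-bit request: no bookkeeping update. */
	if (limit_pfn <= iovad->dma_32bit_pfn &&
	    size >= iovad->max32_alloc_size)
		goto iova32_full;

	/* ... walk the rb-tree looking for a hole that fits ... */

	if (!found) {
		/* Only a genuine slow-path failure records the size. */
		iovad->max32_alloc_size = size;
		goto iova32_full;
	}
	return 0;

iova32_full:
	return -ENOMEM;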
-
- 18 March 2019, 2 commits
-
-
Committed by Andy Shevchenko
Switch to bitmap_zalloc() to show clearly what we are allocating. Besides that, it returns a pointer of bitmap type instead of an opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
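The conversion pattern looks like this (generic example, not the driver's exact call site):

	unsigned long *map;

	/* Before: opaque void * and a hand-rolled size calculation. */
	map = kzalloc(BITS_TO_LONGS(nbits) * sizeof(long), GFP_KERNEL);

	/* After: intent and element type are explicit. */
	map = bitmap_zalloc(nbits, GFP_KERNEL);
	/* ... use map ... */
	bitmap_free(map);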
-
Committed by Stanislaw Gruszka
Take into account that sg->offset can be bigger than PAGE_SIZE when setting the segment's sg->dma_address. Otherwise sg->dma_address will point at a different page, which makes DMA impossible, with errors like this:

xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]

Additionally, with a wrong sg->dma_address, unmap_sg will free the wrong pages, which can cause crashes like this:

Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1
Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1
Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
Feb 28 19:27:45 kernel: Call Trace:
Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80
Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2
Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0
Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180
Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20
Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60
Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340
Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30
Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100
Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180
Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250
Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0
Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290
Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30
Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170
Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9

Cc: stable@vger.kernel.org
Reported-and-tested-by: Jan Viktorin <jan.viktorin@gmail.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Fixes: 80187fd3 ('iommu/amd: Optimize map_sg and unmap_sg')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
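An illustrative sketch of the corrected address math (not the literal patch): the whole pages hidden inside sg->offset must advance the bus address, leaving only the in-page remainder as the offset.

	/* sg->offset may exceed PAGE_SIZE for clustered scatterlist entries. */
	unsigned long extra_pages = sg->offset >> PAGE_SHIFT;
	dma_addr_t base = address + ((dma_addr_t)extra_pages << PAGE_SHIFT);

	/* Only the remainder within the page is a true offset. */
	sg->dma_address = base + (sg->offset & ~PAGE_MASK);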
-
- 15 March 2019, 1 commit
-
-
Committed by Aaron Ma
Add a non-NULL check to fix a potential NULL pointer dereference, and clean up the code so the function is called only once.

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Fixes: 2bf9a0a1 ('iommu/amd: Add iommu support for ACPI HID devices')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 6 March 2019, 1 commit
-
-
Committed by Anshuman Khandual
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

All these places for replacement were found by running the following grep patterns on the entire kernel code. Please let me know if this might have missed some instances. This might also have replaced some false positives. I will appreciate suggestions, inputs and review.

1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"

This patch (of 2):

At present there are multiple places where an invalid node number is encoded as -1. Even though implicitly understood, it is always better to have macros there. Replace these open encodings for an invalid node number with the global macro NUMA_NO_NODE. This helps remove NUMA-related assumptions like 'invalid node' from various places, redirecting them to a common definition.

Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
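The replacement itself is mechanical, e.g.:

	#include <linux/numa.h>

	/* Before: magic constant for "no node". */
	int nid = -1;

	/* After: self-documenting. */
	int nid = NUMA_NO_NODE;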
-
- 1 March 2019, 5 commits
-
-
Committed by Lu Baolu
After tearing down a pasid entry, the domain id is used to invalidate the translation caches. Retrieve the domain id from the pasid entry value before clearing the pasid entry; otherwise, we will always use domain id 0.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Fixes: 6f7db75e ("iommu/vt-d: Add second level page table interface")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Lu Baolu
The Intel IOMMU can be turned off with intel_iommu=off. If the Intel IOMMU is off, the intel_iommu struct will not be initialized, so a NULL pointer dereference will happen when device drivers call intel_svm_bind_mm(). Add a dmar_disabled check to avoid the NULL pointer dereference.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 2f26e0a9 ("iommu/vt-d: Add basic SVM PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Lu Baolu
Otherwise, the translation type field of a context entry for a PCI device will always be 0, and all translated DMA requests will be blocked by the IOMMU. As a result, PCI devices with PCI ATS (device IOTLB) support won't work as expected.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Fixes: 7373a8cc ("iommu/vt-d: Setup context and enable RID2PASID support")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Lu Baolu
Commit fb58fdcd ("iommu/vt-d: Do not enable ATS for untrusted devices") disables ATS support on devices which have been marked as untrusted. Unfortunately this is not enough to fix the DMA attack vulnerabilities, because the IOMMU driver allows translated requests as long as a device advertises the ATS capability. Hence a malicious peripheral device could use this to bypass the IOMMU.

This disables ATS support on untrusted devices by clearing the internal per-device ATS mark. As a result, the IOMMU driver will block any translated requests from any device marked as untrusted.

Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Fixes: fb58fdcd ("iommu/vt-d: Do not enable ATS for untrusted devices")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Yang Wei
Delete a superfluous semicolon in mtk_iommu_add_device().

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 28 February 2019, 1 commit
-
-
Committed by Lan Tianyu
On bare metal, enabling X2APIC mode requires the interrupt remapping function, which helps deliver irqs to cpus with 32-bit APIC IDs. Hyper-V doesn't provide an interrupt remapping function so far, and the Hyper-V MSI protocol already supports delivering interrupts to CPUs whose virtual processor index is greater than 255. IO-APIC interrupts still have the 8-bit APIC ID limitation.

This patch adds a Hyper-V stub IOMMU driver in order to enable X2APIC mode successfully in Hyper-V Linux guests. The driver reports X2APIC interrupt remapping capability when X2APIC mode is available. Otherwise, it creates a Hyper-V irq domain to limit IO-APIC interrupts' affinity and make sure cpus assigned an IO-APIC interrupt have an 8-bit APIC ID.

Define 24 IO-APIC remapping entries, because Hyper-V only exposes a single IO-APIC and one IO-APIC has 24 pins according to the IO-APIC spec (https://pdos.csail.mit.edu/6.828/2016/readings/ia32/ioapic.pdf).

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 26 February 2019, 8 commits
-
-
Committed by Lu Baolu
The Intel IOMMU driver puts devices into a static identity-mapped domain during boot if the kernel parameter "iommu=pt" is used. That means the IOMMU hardware will translate a DMA address into the same memory address. Unfortunately, hot-added devices are not subject to this, which results in some devices not working properly after being hot-added. A quick way to reproduce this issue is to boot a system with iommu=pt, then remove and re-add a pci device with:

echo 1 > /sys/bus/pci/devices/[pci_source_id]/remove
echo 1 > /sys/bus/pci/rescan

You will find the identity-mapped domain was replaced with a normal domain.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org
Reported-by: Jis Ben <jisben@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: James Dong <xmdong@google.com>
Fixes: 99dcaded ('intel-iommu: Support PCIe hot-plug')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Julia Cartwright
Commit 57384592 ("iommu/vt-d: Store bus information in RMRR PCI device path") changed the type of the path data; however, the change in path type was not reflected in the size calculations. Update the code to use the correct type and prevent a buffer overflow.

This bug manifests in systems with deep PCI hierarchies, and can lead to an overflow of the statically allocated buffer (dmar_pci_notify_info_buf) or of slab-allocated data.

BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
Write of size 1 at addr ffffffff90445d80 by task swapper/0/1
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.14.87-rt49-02406-gd0a0e96 #1
Call Trace:
 ? dump_stack+0x46/0x59
 ? print_address_description+0x1df/0x290
 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
 ? kasan_report+0x256/0x340
 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0
 ? e820__memblock_setup+0xb0/0xb0
 ? dmar_dev_scope_init+0x424/0x48f
 ? __down_write_common+0x1ec/0x230
 ? dmar_dev_scope_init+0x48f/0x48f
 ? dmar_free_unused_resources+0x109/0x109
 ? cpumask_next+0x16/0x20
 ? __kmem_cache_create+0x392/0x430
 ? kmem_cache_create+0x135/0x2f0
 ? e820__memblock_setup+0xb0/0xb0
 ? intel_iommu_init+0x170/0x1848
 ? _raw_spin_unlock_irqrestore+0x32/0x60
 ? migrate_enable+0x27a/0x5b0
 ? sched_setattr+0x20/0x20
 ? migrate_disable+0x1fc/0x380
 ? task_rq_lock+0x170/0x170
 ? try_to_run_init_process+0x40/0x40
 ? locks_remove_file+0x85/0x2f0
 ? dev_prepare_static_identity_mapping+0x78/0x78
 ? rt_spin_unlock+0x39/0x50
 ? lockref_put_or_lock+0x2a/0x40
 ? dput+0x128/0x2f0
 ? __rcu_read_unlock+0x66/0x80
 ? __fput+0x250/0x300
 ? __rcu_read_lock+0x1b/0x30
 ? mntput_no_expire+0x38/0x290
 ? e820__memblock_setup+0xb0/0xb0
 ? pci_iommu_init+0x25/0x63
 ? pci_iommu_init+0x25/0x63
 ? do_one_initcall+0x7e/0x1c0
 ? initcall_blacklisted+0x120/0x120
 ? kernel_init_freeable+0x27b/0x307
 ? rest_init+0xd0/0xd0
 ? kernel_init+0xf/0x120
 ? rest_init+0xd0/0xd0
 ? ret_from_fork+0x1f/0x40

The buggy address belongs to the variable:
 dmar_pci_notify_info_buf+0x40/0x60

Fixes: 57384592 ("iommu/vt-d: Store bus information in RMRR PCI device path")
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
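The essence of the fix, as a hedged sketch (the exact field names are assumptions): size the buffer from the element type actually stored in the array, so the calculation cannot drift if the path entry type changes again.

	/* Before: hard-coded element type, stale after the path type changed. */
	size = sizeof(*info) + level * sizeof(struct acpi_dmar_pci_path);

	/* After: derive the element size from the array member itself. */
	size = sizeof(*info) + level * sizeof(info->path[0]);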
-
Committed by Geert Uytterhoeven
A change made in the final version of IOMMU debugfs support replaced the public function iommu_debugfs_new_driver_dir() with the public dentry iommu_debugfs_dir in <linux/iommu.h>, but forgot to update both the implementation in iommu-debugfs.c and the patch description. Fix this by exporting iommu_debugfs_dir, and removing the reference to, and implementation of, iommu_debugfs_new_driver_dir().

Fixes: bad614b2 ("iommu: Enable debugfs exposure of IOMMU driver internals")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Kuppuswamy Sathyanarayanan
As per the Intel VT-d specification, Rev 3.0 (section 7.5.1.1, title "Page Request Descriptor"), the Intel IOMMU page request descriptor only uses bits[63:12] of the page address. Hence the Intel IOMMU driver will only permit devices that advertise they will send only Page Aligned Requests to participate in the ATS service.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Kuppuswamy Sathyanarayanan
In the Intel IOMMU, if the Page Request Queue (PRQ) is full, it automatically responds to the device with a success message as a keep-alive. When sending that success message, the IOMMU includes a PASID in the Response Message if the Page Request carried a PASID in the Request Message, and it does not check against the PRG Response PASID requirement of the device before sending the response. If the device receives a PRG response with a PASID when it is not expecting one, the device behavior is undefined. So if PASID is enabled in the device, enable PRI only if the device expects a PASID in the PRG Response Message.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Logan Gunthorpe
When a device has multiple aliases that are all from the same bus, we program the IRTE to accept requests from any matching device on the bus. This is so NTB devices, which can issue requests from multiple bus-devfns, can pass MSI interrupts through across the bridge.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Logan Gunthorpe
The current code uses set_irte_sid() with SVT_VERIFY_BUS and PCI_DEVID to set the SID value. However, this is very confusing because, with SVT_VERIFY_BUS, the SID value is not a PCI devfn address but the start and end bus numbers to match against.

According to the Intel Virtualization Technology for Directed I/O Architecture Specification, Rev. 3.0, page 9-36: the most significant 8 bits of the SID field contain the Startbus#, and the least significant 8 bits contain the Endbus#. Interrupt requests that reference this IRTE must have a requester-id whose bus# (the most significant 8 bits of the requester-id) has a value equal to or within the Startbus# to Endbus# range.

So to make this clearer, introduce a new set_irte_verify_bus() that explicitly takes a start bus and an end bus, so that we can stop abusing the PCI_DEVID macro. This helper function will be called a second time in a subsequent patch.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
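A rough sketch of such a helper, assuming the existing set_irte_sid() entry point and the SQ_ALL_16 qualifier; the in-tree helper may differ in detail:

	static void set_irte_verify_bus(struct irte *irte, unsigned int bus_start,
					unsigned int bus_end)
	{
		/* SID is a bus range here: Startbus# in bits 15:8, Endbus# in 7:0. */
		set_irte_sid(irte, SVT_VERIFY_BUS, SQ_ALL_16,
			     (bus_start << 8) | bus_end);
	}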
-
Committed by Nicolas Boichat
L1 tables are allocated with __get_dma_pages, and are therefore already ignored by kmemleak. Without this, the kernel would print this error message on boot, when the first L1 table is allocated:

[ 2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black
[ 2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S 4.19.16 #8
[ 2.831227] Workqueue: events deferred_probe_work_func
[ 2.836353] Call trace:
...
[ 2.852532] paint_ptr+0xa0/0xa8
[ 2.855750] kmemleak_ignore+0x38/0x6c
[ 2.859490] __arm_v7s_alloc_table+0x168/0x1f4
[ 2.863922] arm_v7s_alloc_pgtable+0x114/0x17c
[ 2.868354] alloc_io_pgtable_ops+0x3c/0x78
...

Fixes: e5fc9753 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 11 February 2019, 7 commits
-
-
Committed by Bjorn Helgaas
The "Domain 0 is reserved, so dont process it" comment suggests that a NULL pointer corresponds to domain 0. I don't think that's true, and in any case, every caller supplies a non-NULL domain pointer that has already been dereferenced, so the test is unnecessary. Remove the test for a NULL "domain" pointer. No functional change intended.

This NULL pointer check was added by 5e98c4b1 ("Allocation and free functions of virtual machine domain").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Bjorn Helgaas
domain_remove_dev_info() takes a struct dmar_domain * argument but doesn't use it. Remove it. No functional change intended.

The last use of this argument was removed by 127c7615 ("iommu/vt-d: Pass device_domain_info to __dmar_remove_one_dev_info").

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Bjorn Helgaas
A local variable initialization is a hint that the variable will be used in an unusual way. If the initialization is unnecessary, that hint becomes a distraction. Remove unnecessary initializations. No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Bjorn Helgaas
Use dev_printk() when possible so the IOMMU messages are more consistent with other messages related to the device.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Bjorn Helgaas
Use dev_printk() when possible so the IOMMU messages are more consistent with other messages related to the device.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Bjorn Helgaas
Use dev_printk() when possible so the IOMMU messages are more consistent with other messages related to the device. E.g., I think these messages related to surprise hotplug:

pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
iommu: Removing device 0000:87:00.0 from group 12
pciehp 0000:80:10.0:pcie004: Slot(36): Card present
pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec

would be easier to read as these (which also requires some PCI changes not included here):

pci 0000:80:10.0: Slot(36): Link Down
pci 0000:87:00.0: Removing from iommu group 12
pci 0000:80:10.0: Slot(36): Card present
pci 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Rob Herring
Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to use the page table library; specifically, some ARM Mali GPUs use the ARM page table formats.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
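A hedged sketch of what an out-of-tree-style caller might now do; the cfg fields and my_tlb_ops are assumptions about what a caller must supply:

	#include <linux/io-pgtable.h>

	struct io_pgtable_cfg cfg = {
		.pgsize_bitmap	= SZ_4K | SZ_2M,
		.ias		= 48,		/* input (virtual) address bits */
		.oas		= 40,		/* output (physical) address bits */
		.tlb		= &my_tlb_ops,	/* driver-supplied TLB flush callbacks */
	};
	struct io_pgtable_ops *ops;

	ops = alloc_io_pgtable_ops(ARM_64_LPAE_S1, &cfg, cookie);
	if (!ops)
		return -ENOMEM;

	/* Map and unmap through the library instead of open-coded walkers. */
	ops->map(ops, iova, paddr, SZ_4K, IOMMU_READ | IOMMU_WRITE);
	ops->unmap(ops, iova, SZ_4K);

	free_io_pgtable_ops(ops);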
-
- 1 February 2019, 1 commit
-
-
Committed by Rafael J. Wysocki
The device links used by rockchip-iommu and exynos-iommu are completely managed by these drivers within the IOMMU framework, so there is no reason to involve the driver core in the management of these links. For this reason, make rockchip-iommu and exynos-iommu pass DL_FLAG_STATELESS in flags to device_link_add(), so that the device links used by them are stateless.

[Note that this change is a prerequisite for a subsequent one that will rework the management of stateful device links in the driver core, which will no longer be compatible with the two drivers in question.]

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
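For instance, the call shape looks like this; DL_FLAG_PM_RUNTIME is an assumption about what such IOMMU drivers also request, not something stated above:

	struct device_link *link;

	/* consumer = the master device, supplier = the IOMMU device; the
	 * link is created and destroyed by the IOMMU driver itself. */
	link = device_link_add(dev, iommu_dev,
			       DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
	if (!link)
		dev_err(dev, "Unable to link %s\n", dev_name(iommu_dev));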
-
- 31 January 2019, 5 commits
-
-
Committed by Peter Xu
The change_pte() interface is tailored for PFN updates, while the other notifier, invalidate_range(), should be enough for Intel IOMMU cache flushing. We actually already did a similar thing for the AMD IOMMU in 8301da53 ("iommu/amd: Remove change_pte mmu_notifier call-back", 2014-07-30), but the Intel IOMMU driver still has it.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Peter Xu
The AMD IOMMU driver is using clear_flush_young() to do cache flushing, but that's actually already covered by invalidate_range(). Remove the extra notifier and the related chunks.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Jerry Snitselaar
Since there are multiple possible failures in iommu_map_page, it would be useful to know which case is being hit when the error message is printed in map_sg. While here, fix up a checkpatch complaint about using the function name in a string instead of __func__.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Lu Baolu
Commit 765b6a98 ("iommu/vt-d: Enumerate the scalable mode capability") enables VT-d scalable mode if the hardware advertises the capability. As we will bring different features and use cases upstream in different patch series, there will be intermediate kernel versions which support only partial features, and end users might run into problems when they use such kernels on bare metal or in virtualization environments.

This leaves scalable mode off by default, so end users can turn it on with "intel-iommu=sm_on" only when they have a clear idea of which scalable features are supported in the kernel.

Cc: Liu Yi L <yi.l.liu@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Suggested-by: Ashok Raj <ashok.raj@intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
Committed by Logan Gunthorpe
Currently the Intel IOMMU uses the default dma_[un]map_resource() implementations, which do nothing and simply return the physical address unmodified. However, this doesn't create the IOVA entries necessary for addresses mapped this way to work when the IOMMU is enabled. Thus, when the IOMMU is enabled, drivers relying on dma_map_resource() will trigger DMAR errors. We see this when running ntb_transport with the IOMMU enabled, DMA, and switchtec hardware.

The implementation for intel_map_resource() is nearly identical to intel_map_page(); we just have to re-create __intel_map_single(). dma_unmap_resource() uses intel_unmap_page() directly, as the functions are identical.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
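With this in place, a peer-to-peer user can map MMIO space through the standard DMA API; a minimal usage sketch (mmio_phys and size are placeholders):

	dma_addr_t bus;

	bus = dma_map_resource(dev, mmio_phys, size, DMA_BIDIRECTIONAL, 0);
	if (dma_mapping_error(dev, bus))
		return -EIO;

	/* ... program a DMA engine with 'bus' as the target address ... */

	dma_unmap_resource(dev, bus, size, DMA_BIDIRECTIONAL, 0);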
-