提交 · 66930e7e1e58880046a0d39eacccf67e8027d980 · openeuler / Kernel

18 11月, 2020 1 次提交

iommu/vt-d: include conditionally on CONFIG_INTEL_IOMMU_SVM · 68dd9d89

由 Lukas Bulwahn 提交于 11月 15, 2020

Commit 6ee1b77b ("iommu/vt-d: Add svm/sva invalidate function")
introduced intel_iommu_sva_invalidate() when CONFIG_INTEL_IOMMU_SVM.
This function uses the dedicated static variable inv_type_granu_table
and functions to_vtd_granularity() and to_vtd_size().

These parts are unused when !CONFIG_INTEL_IOMMU_SVM, and hence,
make CC=clang W=1 warns with an -Wunused-function warning.

Include these parts conditionally on CONFIG_INTEL_IOMMU_SVM.

Fixes: 6ee1b77b ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: NLukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201115205951.20698-1-lukas.bulwahn@gmail.comSigned-off-by: NWill Deacon <will@kernel.org>

68dd9d89

03 11月, 2020 3 次提交

iommu/vt-d: Fix a bug for PDP check in prq_event_thread · 71cd8e2d

由 Liu, Yi L 提交于 10月 30, 2020

In prq_event_thread(), the QI_PGRP_PDP is wrongly set by
'req->pasid_present' which should be replaced to
'req->priv_data_present'.

Fixes: 5b438f4b ("iommu/vt-d: Support page request in scalable mode")
Signed-off-by: NLiu, Yi L <yi.l.liu@intel.com>
Signed-off-by: NYi Sun <yi.y.sun@linux.intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1604025444-6954-3-git-send-email-yi.y.sun@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

71cd8e2d

iommu/vt-d: Fix sid not set issue in intel_svm_bind_gpasid() · eea4e29a

由 Liu Yi L 提交于 10月 30, 2020

Should get correct sid and set it into sdev. Because we execute
'sdev->sid != req->rid' in the loop of prq_event_thread().

Fixes: eb8d93ea ("iommu/vt-d: Report page request faults for guest SVA")
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NYi Sun <yi.y.sun@linux.intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1604025444-6954-2-git-send-email-yi.y.sun@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

eea4e29a

iommu/vt-d: Fix kernel NULL pointer dereference in find_domain() · 6097df45

由 Lu Baolu 提交于 10月 28, 2020

If calling find_domain() for a device which hasn't been probed by the
iommu core, below kernel NULL pointer dereference issue happens.

[  362.736947] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  362.743953] #PF: supervisor read access in kernel mode
[  362.749115] #PF: error_code(0x0000) - not-present page
[  362.754278] PGD 0 P4D 0
[  362.756843] Oops: 0000 [#1] SMP NOPTI
[  362.760528] CPU: 0 PID: 844 Comm: cat Not tainted 5.9.0-rc4-intel-next+ #1
[  362.767428] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake
               U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3384.A02.1909200816
               09/20/2019
[  362.781109] RIP: 0010:find_domain+0xd/0x40
[  362.785234] Code: 48 81 fb 60 28 d9 b2 75 de 5b 41 5c 41 5d 5d c3 0f 1f 00 66
                     2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 e0 02 00
                     00 55 <48> 8b 40 38 48 89 e5 48 83 f8 fe 0f 94 c1 48 85 ff
                     0f 94 c2 08 d1
[  362.804041] RSP: 0018:ffffb09cc1f0bd38 EFLAGS: 00010046
[  362.809292] RAX: 0000000000000000 RBX: ffff905b98e4fac8 RCX: 0000000000000000
[  362.816452] RDX: 0000000000000001 RSI: ffff905b98e4fac8 RDI: ffff905b9ccd40d0
[  362.823617] RBP: ffffb09cc1f0bda0 R08: ffffb09cc1f0bd48 R09: 000000000000000f
[  362.830778] R10: ffffffffb266c080 R11: ffff905b9042602d R12: ffff905b98e4fac8
[  362.837944] R13: ffffb09cc1f0bd48 R14: ffff905b9ccd40d0 R15: ffff905b98e4fac8
[  362.845108] FS:  00007f8485460740(0000) GS:ffff905b9fc00000(0000)
               knlGS:0000000000000000
[  362.853227] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.858996] CR2: 0000000000000038 CR3: 00000004627a6003 CR4: 0000000000770ef0
[  362.866161] PKRU: fffffffc
[  362.868890] Call Trace:
[  362.871363]  ? show_device_domain_translation+0x32/0x100
[  362.876700]  ? bind_store+0x110/0x110
[  362.880387]  ? klist_next+0x91/0x120
[  362.883987]  ? domain_translation_struct_show+0x50/0x50
[  362.889237]  bus_for_each_dev+0x79/0xc0
[  362.893121]  domain_translation_struct_show+0x36/0x50
[  362.898204]  seq_read+0x135/0x410
[  362.901545]  ? handle_mm_fault+0xeb8/0x1750
[  362.905755]  full_proxy_read+0x5c/0x90
[  362.909526]  vfs_read+0xa6/0x190
[  362.912782]  ksys_read+0x61/0xe0
[  362.916037]  __x64_sys_read+0x1a/0x20
[  362.919725]  do_syscall_64+0x37/0x80
[  362.923329]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  362.928405] RIP: 0033:0x7f84855c5e95

Filter out those devices to avoid such error.

Fixes: e2726dae ("iommu/vt-d: debugfs: Add support to show page table internals")
Reported-and-tested-by: NXu Pengfei <pengfei.xu@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org#v5.6+
Link: https://lore.kernel.org/r/20201028070725.24979-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

6097df45

02 11月, 2020 1 次提交

swiotlb: remove the tbl_dma_addr argument to swiotlb_tbl_map_single · fc0021aa

由 Christoph Hellwig 提交于 10月 23, 2020

The tbl_dma_addr argument is used to check the DMA boundary for the
allocations, and thus needs to be a dma_addr_t. swiotlb-xen instead
passed a physical address, which could lead to incorrect results for
strange offsets. Fix this by removing the parameter entirely and hard
code the DMA address for io_tlb_start instead.

Fixes: 91ffe4ad ("swiotlb-xen: introduce phys_to_dma/dma_to_phys translations")
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NStefano Stabellini <sstabellini@kernel.org>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

fc0021aa

19 10月, 2020 1 次提交

iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not built · 9def3b1a

由 Bartosz Golaszewski 提交于 10月 13, 2020

Since commit c40aaaac ("iommu/vt-d: Gracefully handle DMAR units
with no supported address widths") dmar.c needs struct iommu_device to
be selected. We can drop this dependency by not dereferencing struct
iommu_device if IOMMU_API is not selected and by reusing the information
stored in iommu->drhd->ignored instead.

This fixes the following build error when IOMMU_API is not selected:

drivers/iommu/intel/dmar.c: In function ‘free_iommu’:
drivers/iommu/intel/dmar.c:1139:41: error: ‘struct iommu_device’ has no member named ‘ops’
 1139 |  if (intel_iommu_enabled && iommu->iommu.ops) {
                                                ^

Fixes: c40aaaac ("iommu/vt-d: Gracefully handle DMAR units with no supported address widths")
Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Acked-by: NDavid Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/20201013073055.11262-1-brgl@bgdev.plSigned-off-by: NJoerg Roedel <jroedel@suse.de>

9def3b1a

07 10月, 2020 1 次提交

iommu/vt-d: Gracefully handle DMAR units with no supported address widths · c40aaaac

由 David Woodhouse 提交于 9月 24, 2020

Instead of bailing out completely, such a unit can still be used for
interrupt remapping.
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/549928db2de6532117f36c9c810373c14cf76f51.camel@infradead.org/Signed-off-by: NJoerg Roedel <jroedel@suse.de>

c40aaaac

06 10月, 2020 2 次提交

dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> · 0b1abd1f

由 Christoph Hellwig 提交于 9月 11, 2020

Merge dma-contiguous.h into dma-map-ops.h, after removing the comment
describing the contiguous allocator into kernel/dma/contigous.c.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0b1abd1f

dma-mapping: split <linux/dma-mapping.h> · 0a0f0d8b

由 Christoph Hellwig 提交于 9月 22, 2020

Split out all the bits that are purely for dma_map_ops implementations
and related code into a new <linux/dma-map-ops.h> header so that they
don't get pulled into all the drivers. That also means the architecture
specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h>
any more, which leads to a missing includes that were pulled in by the
x86 or arm versions in a few not overly portable drivers.
Signed-off-by: NChristoph Hellwig <hch@lst.de>

0a0f0d8b

01 10月, 2020 3 次提交

iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb() · 1a3f2fd7

由 Lu Baolu 提交于 9月 27, 2020

Lock(&iommu->lock) without disabling irq causes lockdep warnings.

[   12.703950] ========================================================
[   12.703962] WARNING: possible irq lock inversion dependency detected
[   12.703975] 5.9.0-rc6+ #659 Not tainted
[   12.703983] --------------------------------------------------------
[   12.703995] systemd-udevd/284 just changed the state of lock:
[   12.704007] ffffffffbd6ff4d8 (device_domain_lock){..-.}-{2:2}, at:
               iommu_flush_dev_iotlb.part.57+0x2e/0x90
[   12.704031] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   12.704043]  (&iommu->lock){+.+.}-{2:2}
[   12.704045]

               and interrupts could create inverse lock ordering between
               them.

[   12.704073]
               other info that might help us debug this:
[   12.704085]  Possible interrupt unsafe locking scenario:

[   12.704097]        CPU0                    CPU1
[   12.704106]        ----                    ----
[   12.704115]   lock(&iommu->lock);
[   12.704123]                                local_irq_disable();
[   12.704134]                                lock(device_domain_lock);
[   12.704146]                                lock(&iommu->lock);
[   12.704158]   <Interrupt>
[   12.704164]     lock(device_domain_lock);
[   12.704174]
                *** DEADLOCK ***
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200927062428.13713-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

1a3f2fd7

iommu/vt-d: Check UAPI data processed by IOMMU core · 6278eecb

由 Jacob Pan 提交于 9月 25, 2020

IOMMU generic layer already does sanity checks on UAPI data for version
match and argsz range based on generic information.

This patch adjusts the following data checking responsibilities:
- removes the redundant version check from VT-d driver
- removes the check for vendor specific data size
- adds check for the use of reserved/undefined flags
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/1601051567-54787-7-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

6278eecb

iommu/uapi: Use named union for user data · 8d3bb3b8

由 Jacob Pan 提交于 9月 25, 2020

IOMMU UAPI data size is filled by the user space which must be validated
by the kernel. To ensure backward compatibility, user data can only be
extended by either re-purpose padding bytes or extend the variable sized
union at the end. No size change is allowed before the union. Therefore,
the minimum size is the offset of the union.

To use offsetof() on the union, we must make it named.
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/linux-iommu/20200611145518.0c2817d6@x1.home/
Link: https://lore.kernel.org/r/1601051567-54787-4-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

8d3bb3b8

25 9月, 2020 1 次提交

dma-mapping: add a new dma_alloc_pages API · efa70f2f

由 Christoph Hellwig 提交于 9月 01, 2020

This API is the equivalent of alloc_pages, except that the returned memory
is guaranteed to be DMA addressable by the passed in device.  The
implementation will also be used to provide a more sensible replacement
for DMA_ATTR_NON_CONSISTENT flag.

Additionally dma_alloc_noncoherent is switched over to use dma_alloc_pages
as its backend.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)

efa70f2f

24 9月, 2020 2 次提交

ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT · 01feba59

由 Jonathan Cameron 提交于 8月 18, 2020

Several ACPI static tables contain references to proximity domains.

ACPI 6.3 has clarified that only entries in SRAT may define a new
domain (sec 5.2.16).

Those tables described in the ACPI spec have additional clarifying text.

NFIT: Table 5-132,

"Integer that represents the proximity domain to which the memory
 belongs. This number must match with corresponding entry in the
 SRAT table."

HMAT: Table 5-145,

"... This number must match with the corresponding entry in the SRAT
 table's processor affinity structure ... if the initiator is a processor,
 or the Generic Initiator Affinity Structure if the initiator is a generic
 initiator".

IORT and DMAR are defined by external specifications.

Intel Virtualization Technology for Directed I/O Rev 3.1 does not make any
explicit statements, but the general SRAT statement above will still apply.
https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf

IO Remapping Table, Platform Design Document rev D, also makes not explicit
statement, but refers to ACPI SRAT table for more information and again the
generic SRAT statement above applies.
https://developer.arm.com/documentation/den0049/d/

In conclusion, any proximity domain specified in these tables, should be a
reference to a proximity domain also found in SRAT, and they should not be
able to instantiate a new domain.  Hence we switch to pxm_to_node() which
will only return existing nodes.
Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: NBarry Song <song.bao.hua@hisilicon.com>
Reviewed-by: NHanjun Guo <guohanjun@huawei.com>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

01feba59

iommu/vt-d: Use device numa domain if RHSA is missing · d2ef0962

由 Lu Baolu 提交于 9月 22, 2020

If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
table, we could default to the device NUMA domain as fall back.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200904010303.2961-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20200922060843.31546-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

d2ef0962

18 9月, 2020 3 次提交

x86/mmu: Allocate/free a PASID · 20f0afd1

由 Fenghua Yu 提交于 9月 15, 2020

A PASID is allocated for an "mm" the first time any thread binds to an
SVA-capable device and is freed from the "mm" when the SVA is unbound
by the last thread. It's possible for the "mm" to have different PASID
values in different binding/unbinding SVA cycles.

The mm's PASID (non-zero for valid PASID or 0 for invalid PASID) is
propagated to a per-thread PASID MSR for all threads within the mm
through IPI, context switch, or inherited. This is done to ensure that a
running thread has the right PASID in the MSR matching the mm's PASID.

 [ bp: s/SVM/SVA/g; massage. ]
Suggested-by: NAndy Lutomirski <luto@kernel.org>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/1600187413-163670-10-git-send-email-fenghua.yu@intel.com

20f0afd1

iommu/vt-d: Change flags type to unsigned int in binding mm · 2a5054c6

由 Fenghua Yu 提交于 9月 15, 2020

"flags" passed to intel_svm_bind_mm() is a bit mask and should be
defined as "unsigned int" instead of "int".

Change its type to "unsigned int".
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-3-git-send-email-fenghua.yu@intel.com

2a5054c6

drm, iommu: Change type of pasid to u32 · c7b6bac9

由 Fenghua Yu 提交于 9月 15, 2020

PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".

No PASID type change in uapi although it defines PASID as __u64 in
some places.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com

c7b6bac9

16 9月, 2020 9 次提交

iommu/vt-d: Remove domain search for PCI/MSI[X] · 9f0ffb4b

由 Thomas Gleixner 提交于 8月 26, 2020

Now that the domain can be retrieved through device::msi_domain the domain
search for PCI_MSI[X] is not longer required. Remove it.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112334.305699301@linutronix.de

9f0ffb4b

iommm/vt-d: Store irq domain in struct device · 85a8dfc5

由 Thomas Gleixner 提交于 8月 26, 2020

As a first step to make X86 utilize the direct MSI irq domain operations
store the irq domain pointer in the device struct when a device is probed.

This is done from dmar_pci_bus_add_dev() because it has to work even when
DMA remapping is disabled. It only overrides the irqdomain of devices which
are handled by a regular PCI/MSI irq domain which protects PCI devices
behind special busses like VMD which have their own irq domain.

No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interupt.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112333.714566121@linutronix.de

85a8dfc5

x86/msi: Consolidate MSI allocation · 3b9c1d37

由 Thomas Gleixner 提交于 8月 26, 2020

Convert the interrupt remap drivers to retrieve the pci device from the msi
descriptor and use info::hwirq.

This is the first step to prepare x86 for using the generic MSI domain ops.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NWei Liu <wei.liu@kernel.org>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.466405395@linutronix.de

3b9c1d37

x86_ioapic_Consolidate_IOAPIC_allocation · 33a65ba4

由 Thomas Gleixner 提交于 8月 26, 2020

Move the IOAPIC specific fields into their own struct and reuse the common
devid. Get rid of the #ifdeffery as it does not matter at all whether the
alloc info is a couple of bytes longer or not.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NWei Liu <wei.liu@kernel.org>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112332.054367732@linutronix.de

33a65ba4

x86/msi: Consolidate HPET allocation · 2bf1e7bc

由 Thomas Gleixner 提交于 8月 26, 2020

None of the magic HPET fields are required in any way.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.943993771@linutronix.de

2bf1e7bc

iommu/irq_remapping: Consolidate irq domain lookup · 6b6256e6

由 Thomas Gleixner 提交于 8月 26, 2020

Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN
types, consolidate the two getter functions.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.741909337@linutronix.de

6b6256e6

iommu/vt-d: Consolidate irq domain getter · 60e5a939

由 Thomas Gleixner 提交于 8月 26, 2020

The irq domain request mode is now indicated in irq_alloc_info::type.

Consolidate the two getter functions into one.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.530546013@linutronix.de

60e5a939

x86/irq: Add allocation type for parent domain retrieval · b4c364da

由 Thomas Gleixner 提交于 8月 26, 2020

irq_remapping_ir_irq_domain() is used to retrieve the remapping parent
domain for an allocation type. irq_remapping_irq_domain() is for retrieving
the actual device domain for allocating interrupts for a device.

The two functions are similar and can be unified by using explicit modes
for parent irq domain retrieval.

Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu
implementations. Drop the parent domain retrieval for PCI_MSI/X as that is
unused.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.436350257@linutronix.de

b4c364da

x86_irq_Rename_X86_IRQ_ALLOC_TYPE_MSI_to_reflect_PCI_dependency · 801b5e4c

由 Thomas Gleixner 提交于 8月 26, 2020

No functional change.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20200826112331.343103175@linutronix.de

801b5e4c

11 9月, 2020 1 次提交

dma-direct: rename and cleanup __phys_to_dma · 5ceda740

由 Christoph Hellwig 提交于 8月 17, 2020

The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious.  Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencryped, and not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NRobin Murphy <robin.murphy@arm.com>

5ceda740

04 9月, 2020 4 次提交

iommu/vt-d: Handle 36bit addressing for x86-32 · 29aaebbc

由 Chris Wilson 提交于 8月 22, 2020

Beware that the address size for x86-32 may exceed unsigned long.

[    0.368971] UBSAN: shift-out-of-bounds in drivers/iommu/intel/iommu.c:128:14
[    0.369055] shift exponent 36 is too large for 32-bit type 'long unsigned int'

If we don't handle the wide addresses, the pages are mismapped and the
device read/writes go astray, detected as DMAR faults and leading to
device failure. The behaviour changed (from working to broken) in commit
fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer"), but
the error looks older.

Fixes: fa954e68 ("iommu/vt-d: Delegate the dma domain to upper layer")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: James Sewart <jamessewart@arista.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: <stable@vger.kernel.org> # v5.3+
Link: https://lore.kernel.org/r/20200822160209.28512-1-chris@chris-wilson.co.ukSigned-off-by: NJoerg Roedel <jroedel@suse.de>

29aaebbc

iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set() · 2d33b7d6

由 Lu Baolu 提交于 9月 03, 2020

The dev_iommu_priv_set() must be called after probe_device(). This fixes
a NULL pointer deference bug when booting a system with kernel cmdline
"intel_iommu=on,igfx_off", where the dev_iommu_priv_set() is abused.

The following stacktrace was produced:

 Command line: BOOT_IMAGE=/isolinux/bzImage console=tty1 intel_iommu=on,igfx_off
 ...
 DMAR: Host address width 39
 DMAR: DRHD base: 0x000000fed90000 flags: 0x0
 DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 19e2ff0505e
 DMAR: DRHD base: 0x000000fed91000 flags: 0x1
 DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
 DMAR: RMRR base: 0x0000009aa9f000 end: 0x0000009aabefff
 DMAR: RMRR base: 0x0000009d000000 end: 0x0000009f7fffff
 DMAR: No ATSR found
 BUG: kernel NULL pointer dereference, address: 0000000000000038
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP PTI
 CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.9.0-devel+ #2
 Hardware name: LENOVO 20HGS0TW00/20HGS0TW00, BIOS N1WET46S (1.25s ) 03/30/2018
 RIP: 0010:intel_iommu_init+0xed0/0x1136
 Code: fe e9 61 02 00 00 bb f4 ff ff ff e9 57 02 00 00 48 63 d1 48 c1 e2 04 48
       03 50 20 48 8b 12 48 85 d2 74 0b 48 8b 92 d0 02 00 00 48 89 7a 38 ff c1
       e9 15 f5 ff ff 48 c7 c7 60 99 ac a7 49 c7 c7 a0
 RSP: 0000:ffff96d180073dd0 EFLAGS: 00010282
 RAX: ffff8c91037a7d20 RBX: 0000000000000000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffff
 RBP: ffff96d180073e90 R08: 0000000000000001 R09: ffff8c91039fe3c0
 R10: 0000000000000226 R11: 0000000000000226 R12: 000000000000000b
 R13: ffff8c910367c650 R14: ffffffffa8426d60 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff8c9107480000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000038 CR3: 00000004b100a001 CR4: 00000000003706e0
 Call Trace:
  ? _raw_spin_unlock_irqrestore+0x1f/0x30
  ? call_rcu+0x10e/0x320
  ? trace_hardirqs_on+0x2c/0xd0
  ? rdinit_setup+0x2c/0x2c
  ? e820__memblock_setup+0x8b/0x8b
  pci_iommu_init+0x16/0x3f
  do_one_initcall+0x46/0x1e4
  kernel_init_freeable+0x169/0x1b2
  ? rest_init+0x9f/0x9f
  kernel_init+0xa/0x101
  ret_from_fork+0x22/0x30
 Modules linked in:
 CR2: 0000000000000038
 ---[ end trace 3653722a6f936f18 ]---

Fixes: 01b9d4e2 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Reported-by: NTorsten Hilbrich <torsten.hilbrich@secunet.com>
Reported-by: NWendy Wang <wendy.wang@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NTorsten Hilbrich <torsten.hilbrich@secunet.com>
Link: https://lore.kernel.org/linux-iommu/96717683-70be-7388-3d2f-61131070a96a@secunet.com/
Link: https://lore.kernel.org/r/20200903065132.16879-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

2d33b7d6

iommu/vt-d: Serialize IOMMU GCMD register modifications · 6e4e9ec6

由 Lu Baolu 提交于 8月 28, 2020

The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each. It
also checks the status register before writing command register to avoid
unnecessary register write.

Fixes: af8d102f ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess")
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

6e4e9ec6

iommu/vt-d: Drop kerneldoc marker from regular comment · 3207fa32

由 Krzysztof Kozlowski 提交于 7月 28, 2020

Fix W=1 compile warnings (invalid kerneldoc):

drivers/iommu/intel/dmar.c:389: warning: Function parameter or member 'header' not described in 'dmar_parse_one_drhd'
Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200728170859.28143-2-krzk@kernel.orgSigned-off-by: NJoerg Roedel <jroedel@suse.de>

3207fa32

24 8月, 2020 1 次提交

treewide: Use fallthrough pseudo-keyword · df561f66

由 Gustavo A. R. Silva 提交于 8月 23, 2020

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>

df561f66

13 8月, 2020 1 次提交

mm: do page fault accounting in handle_mm_fault · bce617ed

由 Peter Xu 提交于 8月 11, 2020

Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b982 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NPeter Xu <peterx@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.comSigned-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bce617ed

06 8月, 2020 1 次提交

x86/headers: Remove APIC headers from <asm/smp.h> · 13c01139

由 Ingo Molnar 提交于 8月 06, 2020

The APIC headers are relatively complex and bring in additional
header dependencies - while smp.h is a relatively simple header
included from high level headers.

Remove the dependency and add in the missing #include's in .c
files where they gained it indirectly before.
Signed-off-by: NIngo Molnar <mingo@kernel.org>

13c01139

29 7月, 2020 1 次提交

iommu/vt-d: Move Kconfig and Makefile bits down into intel directory · ab65ba57

由 Jerry Snitselaar 提交于 6月 30, 2020

Move Intel Kconfig and Makefile bits down into intel directory
with the rest of the Intel specific files.
Signed-off-by: NJerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200630200636.48600-2-jsnitsel@redhat.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

ab65ba57

24 7月, 2020 4 次提交

PCI/ATS: Add pci_pri_supported() to check device or associated PF · 3f9a7a13

由 Ashok Raj 提交于 7月 23, 2020

For SR-IOV, the PF PRI is shared between the PF and any associated VFs, and
the PRI Capability is allowed for PFs but not for VFs.  Searching for the
PRI Capability on a VF always fails, even if its associated PF supports
PRI.

Add pci_pri_supported() to check whether device or its associated PF
supports PRI.

[bhelgaas: commit log, avoid "!!"]
Fixes: b16d0cb9 ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Link: https://lore.kernel.org/r/1595543849-19692-1-git-send-email-ashok.raj@intel.comSigned-off-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Cc: stable@vger.kernel.org	# v4.4+

3f9a7a13

iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu · b1012ca8

由 Lu Baolu 提交于 7月 23, 2020

The VT-d spec requires (10.4.4 Global Command Register, TE field) that:

Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.

Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.

This provides a quirk list for those devices and skips TE disabling if
the qurik hits.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NKoba Ko <koba.ko@canonical.com>
Tested-by: NJun Miao <jun.miao@windriver.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200723013437.2268-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

b1012ca8

iommu/vt-d: Rename intel-pasid.h to pasid.h · 02f3effd

由 Lu Baolu 提交于 7月 24, 2020

As Intel VT-d files have been moved to its own subdirectory, the prefix
makes no sense. No functional changes.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-13-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

02f3effd

iommu/vt-d: Add page response ops support · 8b737121

由 Lu Baolu 提交于 7月 24, 2020

After page requests are handled, software must respond to the device
which raised the page request with the result. This is done through
the iommu ops.page_response if the request was reported to outside of
vendor iommu driver through iommu_report_device_fault(). This adds the
VT-d implementation of page_response ops.
Co-developed-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Co-developed-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200724014925.15523-12-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

8b737121

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功