提交 · 2e1a44c1c4acf209c0dd7bc04421d101b9e80d11 · openeuler / Kernel

07 4月, 2021 3 次提交

iommu/vt-d: Remove svm_dev_ops · 2e1a44c1

由 Lu Baolu 提交于 3月 23, 2021

The svm_dev_ops has never been referenced in the tree, and there's no
plan to have anything to use it. Remove it to make the code neat.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

2e1a44c1

iommu/vt-d: Don't set then clear private data in prq_event_thread() · 1d421058

由 Lu Baolu 提交于 3月 20, 2021

The VT-d specification (section 7.6) requires that the value in the
Private Data field of a Page Group Response Descriptor must match
the value in the Private Data field of the respective Page Request
Descriptor.

The private data field of a page group response descriptor is set then
immediately cleared in prq_event_thread(). This breaks the rule defined
by the VT-d specification. Fix it by moving clearing code up.

Fixes: 5b438f4b ("iommu/vt-d: Support page request in scalable mode")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

1d421058

iommu/vt-d: Fix lockdep splat in intel_pasid_get_entry() · 803766cb

由 Lu Baolu 提交于 3月 20, 2021

The pasid_lock is used to synchronize different threads from modifying a
same pasid directory entry at the same time. It causes below lockdep splat.

[   83.296538] ========================================================
[   83.296538] WARNING: possible irq lock inversion dependency detected
[   83.296539] 5.12.0-rc3+ #25 Tainted: G        W
[   83.296539] --------------------------------------------------------
[   83.296540] bash/780 just changed the state of lock:
[   83.296540] ffffffff82b29c98 (device_domain_lock){..-.}-{2:2}, at:
           iommu_flush_dev_iotlb.part.0+0x32/0x110
[   83.296547] but this lock took another, SOFTIRQ-unsafe lock in the past:
[   83.296547]  (pasid_lock){+.+.}-{2:2}
[   83.296548]

           and interrupts could create inverse lock ordering between them.

[   83.296549] other info that might help us debug this:
[   83.296549] Chain exists of:
                 device_domain_lock --> &iommu->lock --> pasid_lock
[   83.296551]  Possible interrupt unsafe locking scenario:

[   83.296551]        CPU0                    CPU1
[   83.296552]        ----                    ----
[   83.296552]   lock(pasid_lock);
[   83.296553]                                local_irq_disable();
[   83.296553]                                lock(device_domain_lock);
[   83.296554]                                lock(&iommu->lock);
[   83.296554]   <Interrupt>
[   83.296554]     lock(device_domain_lock);
[   83.296555]
                *** DEADLOCK ***

Fix it by replacing the pasid_lock with an atomic exchange operation.
Reported-and-tested-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320020916.640115-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

803766cb

18 3月, 2021 6 次提交

iommu/vt-d: Calculate and set flags for handle_mm_fault · 396bd6f3

由 Jacob Pan 提交于 3月 02, 2021

Page requests are originated from the user page fault. Therefore, we
shall set FAULT_FLAG_USER.

FAULT_FLAG_REMOTE indicates that we are walking an mm which is not
guaranteed to be the same as the current->mm and should not be subject
to protection key enforcement. Therefore, we should set FAULT_FLAG_REMOTE
to avoid faults when both SVM and PKEY are used.

References: commit 1b2ee126 ("mm/core: Do not enforce PKEY permissions on remote mm access")
Reviewed-by: NRaj Ashok <ashok.raj@intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-5-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

396bd6f3

iommu/vt-d: Reject unsupported page request modes · 78a523fe

由 Jacob Pan 提交于 3月 02, 2021

When supervisor/privilige mode SVM is used, we bind init_mm.pgd with
a supervisor PASID. There should not be any page fault for init_mm.
Execution request with DMA read is also not supported.

This patch checks PRQ descriptor for both unsupported configurations,
reject them both with invalid responses.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

78a523fe

iommu/vt-d: Enable write protect propagation from guest · bb0f6153

由 Jacob Pan 提交于 3月 02, 2021

Write protect bit, when set, inhibits supervisor writes to the read-only
pages. In guest supervisor shared virtual addressing (SVA), write-protect
should be honored upon guest bind supervisor PASID request.

This patch extends the VT-d portion of the IOMMU UAPI to include WP bit.
WPE bit of the supervisor PASID entry will be set to match CPU CR0.WP bit.
Signed-off-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-3-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

bb0f6153

iommu/vt-d: Enable write protect for supervisor SVM · f68c7f53

由 Jacob Pan 提交于 3月 02, 2021

Write protect bit, when set, inhibits supervisor writes to the read-only
pages. In supervisor shared virtual addressing (SVA), where page tables
are shared between CPU and DMA, IOMMU PASID entry WPE bit should match
CR0.WP bit in the CPU.
This patch sets WPE bit for supervisor PASIDs if CR0.WP is set.
Signed-off-by: NSanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-2-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

f68c7f53

iommu/vt-d: Report more information about invalidation errors · 6ca69e58

由 Lu Baolu 提交于 3月 18, 2021

When the invalidation queue errors are encountered, dump the information
logged by the VT-d hardware together with the pending queue invalidation
descriptors.
Signed-off-by: NAshok Raj <ashok.raj@intel.com>
Tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210318005340.187311-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

6ca69e58

iommu/vt-d: Disable SVM when ATS/PRI/PASID are not enabled in the device · dec991e4

由 Kyung Min Park 提交于 3月 14, 2021

Currently, the Intel VT-d supports Shared Virtual Memory (SVM) only when
IO page fault is supported. Otherwise, shared memory pages can not be
swapped out and need to be pinned. The device needs the Address Translation
Service (ATS), Page Request Interface (PRI) and Process Address Space
Identifier (PASID) capabilities to be enabled to support IO page fault.

Disable SVM when ATS, PRI and PASID are not enabled in the device.
Signed-off-by: NKyung Min Park <kyung.min.park@intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210314201534.918-1-kyung.min.park@intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

dec991e4

04 3月, 2021 1 次提交

iommu/vt-d: Fix status code for Allocate/Free PASID command · 444d66a2

由 Zenghui Yu 提交于 2月 27, 2021

As per Intel vt-d spec, Rev 3.0 (section 10.4.45 "Virtual Command Response
Register"), the status code of "No PASID available" error in response to
the Allocate PASID command is 2, not 1. The same for "Invalid PASID" error
in response to the Free PASID command.

We will otherwise see confusing kernel log under the command failure from
guest side. Fix it.

Fixes: 24f27d32 ("iommu/vt-d: Enlightened PASID allocation")
Signed-off-by: NZenghui Yu <yuzenghui@huawei.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210227073909.432-1-yuzenghui@huawei.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

444d66a2

04 2月, 2021 4 次提交

iommu/vt-d: Parse SATC reporting structure · 31a75cbb

由 Yian Chen 提交于 2月 04, 2021

Software should parse every SATC table and all devices in the tables
reported by the BIOS and keep the information in kernel list for further
reference.
Signed-off-by: NYian Chen <yian.chen@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210203093329.1617808-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-7-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

31a75cbb

iommu/vt-d: Add iotlb_sync_map callback · 933fcd01

由 Lu Baolu 提交于 2月 04, 2021

Some Intel VT-d hardware implementations don't support memory coherency
for page table walk (presented by the Page-Walk-coherency bit in the
ecap register), so that software must flush the corresponding CPU cache
lines explicitly after each page table entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through iommu_ops callback. As
the result, a single sg mapping may lead to multiple cache line flushes,
which leads to the degradation of I/O performance after the commit
<c588072b> ("iommu/vt-d: Convert intel iommu driver to the iommu
ops").

Fix this by adding iotlb_sync_map callback and centralizing the clflush
operations after all sg mappings.

Fixes: c588072b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Reported-by: NChuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/D81314ED-5673-44A6-B597-090E3CB83EB0@oracle.com/Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: Robin Murphy <robin.murphy@arm.com>
[ cel: removed @first_pte, which is no longer used ]
Signed-off-by: NChuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/linux-iommu/161177763962.1311.15577661784296014186.stgit@manet.1015granger.net
Link: https://lore.kernel.org/r/20210204014401.2846425-5-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

933fcd01

iommu/vt-d: Move capability check code to cap_audit files · 010bf565

由 Kyung Min Park 提交于 2月 04, 2021

Move IOMMU capability check and sanity check code to cap_audit files.
Also implement some helper functions for sanity checks.
Signed-off-by: NKyung Min Park <kyung.min.park@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-4-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

010bf565

iommu/vt-d: Audit IOMMU Capabilities and add helper functions · ad3d1902

由 Kyung Min Park 提交于 2月 04, 2021

Audit IOMMU Capability/Extended Capability and check if the IOMMUs have
the consistent value for features. Report out or scale to the lowest
supported when IOMMU features have incompatibility among IOMMUs.

Report out features when below features are mismatched:
  - First Level 5 Level Paging Support (FL5LP)
  - First Level 1 GByte Page Support (FL1GP)
  - Read Draining (DRD)
  - Write Draining (DWD)
  - Page Selective Invalidation (PSI)
  - Zero Length Read (ZLR)
  - Caching Mode (CM)
  - Protected High/Low-Memory Region (PHMR/PLMR)
  - Required Write-Buffer Flushing (RWBF)
  - Advanced Fault Logging (AFL)
  - RID-PASID Support (RPS)
  - Scalable Mode Page Walk Coherency (SMPWC)
  - First Level Translation Support (FLTS)
  - Second Level Translation Support (SLTS)
  - No Write Flag Support (NWFS)
  - Second Level Accessed/Dirty Support (SLADS)
  - Virtual Command Support (VCS)
  - Scalable Mode Translation Support (SMTS)
  - Device TLB Invalidation Throttle (DIT)
  - Page Drain Support (PDS)
  - Process Address Space ID Support (PASID)
  - Extended Accessed Flag Support (EAFS)
  - Supervisor Request Support (SRS)
  - Execute Request Support (ERS)
  - Page Request Support (PRS)
  - Nested Translation Support (NEST)
  - Snoop Control (SC)
  - Pass Through (PT)
  - Device TLB Support (DT)
  - Queued Invalidation (QI)
  - Page walk Coherency (C)

Set capability to the lowest supported when below features are mismatched:
  - Maximum Address Mask Value (MAMV)
  - Number of Fault Recording Registers (NFR)
  - Second Level Large Page Support (SLLPS)
  - Fault Recording Offset (FRO)
  - Maximum Guest Address Width (MGAW)
  - Supported Adjusted Guest Address Width (SAGAW)
  - Number of Domains supported (NDOMS)
  - Pasid Size Supported (PSS)
  - Maximum Handle Mask Value (MHMV)
  - IOTLB Register Offset (IRO)
Signed-off-by: NKyung Min Park <kyung.min.park@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130184452.31711-1-kyung.min.park@intel.com
Link: https://lore.kernel.org/r/20210204014401.2846425-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

ad3d1902

02 2月, 2021 1 次提交

iommu/vt-d: Fix compile error [-Werror=implicit-function-declaration] · e1ed66ac

由 Lu Baolu 提交于 1月 30, 2021

trace_qi_submit() could be used when interrupt remapping is supported,
but DMA remapping is not. In this case, the following compile error
occurs.

../drivers/iommu/intel/dmar.c: In function 'qi_submit_sync':
../drivers/iommu/intel/dmar.c:1311:3: error: implicit declaration of function 'trace_qi_submit';
  did you mean 'ftrace_nmi_exit'? [-Werror=implicit-function-declaration]
   trace_qi_submit(iommu, desc[i].qw0, desc[i].qw1,
   ^~~~~~~~~~~~~~~
   ftrace_nmi_exit

Fixes: f2dd8717 ("iommu/vt-d: Add qi_submit trace event")
Reported-by: Nkernel test robot <lkp@intel.com>
Reported-by: NRandy Dunlap <rdunlap@infradead.org>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210130151907.3929148-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

e1ed66ac

29 1月, 2021 2 次提交

iommu/vt-d: Use INVALID response code instead of FAILURE · 3aa7c62c

由 Lu Baolu 提交于 1月 26, 2021

The VT-d IOMMU response RESPONSE_FAILURE for a page request in below
cases:

- When it gets a Page_Request with no PASID;
- When it gets a Page_Request with PASID that is not in use for this
  device.

This is allowed by the spec, but IOMMU driver doesn't support such cases
today. When the device receives RESPONSE_FAILURE, it sends the device
state machine to HALT state. Now if we try to unload the driver, it hangs
since the device doesn't send any outbound transactions to host when the
driver is trying to clear things up. The only possible responses would be
for invalidation requests.

Let's use RESPONSE_INVALID instead for now, so that the device state
machine doesn't enter HALT state.
Suggested-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210126080730.2232859-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

3aa7c62c

iommu/vt-d: Clear PRQ overflow only when PRQ is empty · 28a77185

由 Lu Baolu 提交于 1月 26, 2021

It is incorrect to always clear PRO when it's set w/o first checking
whether the overflow condition has been cleared. Current code assumes
that if an overflow condition occurs it must have been cleared by earlier
loop. However since the code runs in a threaded context, the overflow
condition could occur even after setting the head to the tail under some
extreme condition. To be sane, we should read both head/tail again when
seeing a pending PRO and only clear PRO after all pending PRs have been
handled.
Suggested-by: NKevin Tian <kevin.tian@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/
Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

28a77185

28 1月, 2021 5 次提交

iommu/vt-d: Do not use flush-queue when caching-mode is on · 29b32839

由 Nadav Amit 提交于 1月 27, 2021

When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.

Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f5 ("intel-iommu: Avoid global flushes with caching
mode.")

This behavior changed after commit 13cf0174 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.

Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.

Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.

Fixes: 13cf0174 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: NNadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

29b32839

iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid() · 494b3688

由 Lu Baolu 提交于 1月 19, 2021

An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid()
to check the address alignment. This leads to a lot of spurious kernel
warnings:

[  485.837093] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[  485.837098] DMAR: Invalidate non-aligned address 7f76f47f9000, order 0
[  492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed
[  492.494147] DMAR: Invalidate non-aligned address 7f7728800000, order 11
[  492.508965] DMAR: Invalidate non-aligned address 7f7728800000, order 11

Fix it by checking the alignment in right way.

Fixes: 288d08e7 ("iommu/vt-d: Handle non-page aligned address")
Reported-and-tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

494b3688

iommu/vt-d: Preset Access/Dirty bits for IOVA over FL · a8ce9ebb

由 Lu Baolu 提交于 1月 15, 2021

The Access/Dirty bits in the first level page table entry will be set
whenever a page table entry was used for address translation or write
permission was successfully translated. This is always true when using
the first-level page table for kernel IOVA. Instead of wasting hardware
cycles to update the certain bits, it's better to set them up at the
beginning.
Suggested-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210115004202.953965-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

a8ce9ebb

iommu/vt-d: Add qi_submit trace event · f2dd8717

由 Lu Baolu 提交于 1月 14, 2021

This adds a new trace event to track the submissions of requests to the
invalidation queue. This event will provide the information like:
- IOMMU name
- Invalidation type
- Descriptor raw data

A sample output like:
| qi_submit: iotlb_inv dmar1: 0x100e2 0x0 0x0 0x0
| qi_submit: dev_tlb_inv dmar1: 0x1000000003 0x7ffffffffffff001 0x0 0x0
| qi_submit: iotlb_inv dmar2: 0x800f2 0xf9a00005 0x0 0x0

This will be helpful for queued invalidation related debugging.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210114090400.736104-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

f2dd8717

iommu/vt-d: Consolidate duplicate cache invaliation code · 9872f9bd

由 Lu Baolu 提交于 1月 14, 2021

The pasid based IOTLB and devTLB invalidation code is duplicate in
several places. Consolidate them by using the common helpers.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210114085021.717041-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

9872f9bd

13 1月, 2021 1 次提交

iommu/vt-d: Fix duplicate included linux/dma-map-ops.h · 694a1c0a

由 Tian Tao 提交于 12月 28, 2020

linux/dma-map-ops.h is included more than once, Remove the one that
isn't necessary.
Signed-off-by: NTian Tao <tiantao6@hisilicon.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609118774-10083-1-git-send-email-tiantao6@hisilicon.comSigned-off-by: NWill Deacon <will@kernel.org>

694a1c0a

12 1月, 2021 1 次提交

iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev() · 2d6ffc63

由 Lu Baolu 提交于 12月 31, 2020

The VT-d hardware will ignore those Addr bits which have been masked by
the AM field in the PASID-based-IOTLB invalidation descriptor. As the
result, if the starting address in the descriptor is not aligned with
the address mask, some IOTLB caches might not invalidate. Hence people
will see below errors.

[ 1093.704661] dmar_fault: 29 callbacks suppressed
[ 1093.704664] DMAR: DRHD: handling fault status reg 3
[ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2
               fault addr 7f81c968d000 [fault reason 113]
               SM: Present bit in first-level paging entry is clear

Fix this by using aligned address for PASID-based-IOTLB invalidation.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-and-tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

2d6ffc63

07 1月, 2021 5 次提交

iommu/vt-d: Fix ineffective devTLB invalidation for subdevices · 7c29ada5

由 Liu Yi L 提交于 1月 07, 2021

iommu_flush_dev_iotlb() is called to invalidate caches on a device but
only loops over the devices which are fully-attached to the domain. For
sub-devices, this is ineffective and can result in invalid caching
entries left on the device.

Fix the missing invalidation by adding a loop over the subdevices and
ensuring that 'domain->has_iotlb_device' is updated when attaching to
subdevices.

Fixes: 67b8e02b ("iommu/vt-d: Aux-domain specific domain attach/detach")
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-4-git-send-email-yi.l.liu@intel.comSigned-off-by: NWill Deacon <will@kernel.org>

7c29ada5

iommu/vt-d: Fix general protection fault in aux_detach_device() · 18abda7a

由 Liu Yi L 提交于 1月 07, 2021

The aux-domain attach/detach are not tracked, some data structures might
be used after free. This causes general protection faults when multiple
subdevices are created and assigned to a same guest machine:

  | general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] SMP NOPTI
  | RIP: 0010:intel_iommu_aux_detach_device+0x12a/0x1f0
  | [...]
  | Call Trace:
  |  iommu_aux_detach_device+0x24/0x70
  |  vfio_mdev_detach_domain+0x3b/0x60
  |  ? vfio_mdev_set_domain+0x50/0x50
  |  iommu_group_for_each_dev+0x4f/0x80
  |  vfio_iommu_detach_group.isra.0+0x22/0x30
  |  vfio_iommu_type1_detach_group.cold+0x71/0x211
  |  ? find_exported_symbol_in_section+0x4a/0xd0
  |  ? each_symbol_section+0x28/0x50
  |  __vfio_group_unset_container+0x4d/0x150
  |  vfio_group_try_dissolve_container+0x25/0x30
  |  vfio_group_put_external_user+0x13/0x20
  |  kvm_vfio_group_put_external_user+0x27/0x40 [kvm]
  |  kvm_vfio_destroy+0x45/0xb0 [kvm]
  |  kvm_put_kvm+0x1bb/0x2e0 [kvm]
  |  kvm_vm_release+0x22/0x30 [kvm]
  |  __fput+0xcc/0x260
  |  ____fput+0xe/0x10
  |  task_work_run+0x8f/0xb0
  |  do_exit+0x358/0xaf0
  |  ? wake_up_state+0x10/0x20
  |  ? signal_wake_up_state+0x1a/0x30
  |  do_group_exit+0x47/0xb0
  |  __x64_sys_exit_group+0x18/0x20
  |  do_syscall_64+0x57/0x1d0
  |  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix the crash by tracking the subdevices when attaching and detaching
aux-domains.

Fixes: 67b8e02b ("iommu/vt-d: Aux-domain specific domain attach/detach")
Co-developed-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-3-git-send-email-yi.l.liu@intel.comSigned-off-by: NWill Deacon <will@kernel.org>

18abda7a

iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev · 9ad9f45b

由 Liu Yi L 提交于 1月 07, 2021

'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.

In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Reported-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.comSigned-off-by: NWill Deacon <will@kernel.org>

9ad9f45b

iommu/vt-d: Fix lockdep splat in sva bind()/unbind() · 420d42f6

由 Lu Baolu 提交于 12月 31, 2020

Lock(&iommu->lock) without disabling irq causes lockdep warnings.

========================================================
WARNING: possible irq lock inversion dependency detected
5.11.0-rc1+ #828 Not tainted
--------------------------------------------------------
kworker/0:1H/120 just changed the state of lock:
ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at:
iommu_flush_dev_iotlb.part.0+0x32/0x120
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&iommu->lock){+.+.}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&iommu->lock);
                               local_irq_disable();
                               lock(device_domain_lock);
                               lock(&iommu->lock);
  <Interrupt>
    lock(device_domain_lock);

 *** DEADLOCK ***
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

420d42f6

iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb() · 1efd17e7

由 Lu Baolu 提交于 12月 31, 2020

Use IS_ALIGNED() instead. Otherwise, an unaligned address will be ignored.

Fixes: 33cd6e64 ("iommu/vt-d: Flush PASID-based iotlb for iova over first level")
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-1-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

1efd17e7

06 1月, 2021 1 次提交

iommu/intel: Fix memleak in intel_irq_remapping_alloc · ff2b46d7

由 Dinghao Liu 提交于 1月 05, 2021

When irq_domain_get_irq_data() or irqd_cfg() fails
at i == 0, data allocated by kzalloc() has not been
freed before returning, which leads to memleak.

Fixes: b106ee63 ("irq_remapping/vt-d: Enhance Intel IR driver to support hierarchical irqdomains")
Signed-off-by: NDinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210105051837.32118-1-dinghao.liu@zju.edu.cnSigned-off-by: NWill Deacon <will@kernel.org>

ff2b46d7

01 12月, 2020 1 次提交

iommu/vt-d: Avoid GFP_ATOMIC where it is not needed · 33e07157

由 Christophe JAILLET 提交于 12月 01, 2020

There is no reason to use GFP_ATOMIC in a 'suspend' function.
Use GFP_KERNEL instead to give more opportunities to allocate the
requested memory.
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201030182630.5154-1-christophe.jaillet@wanadoo.frSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201201013149.2466272-2-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

33e07157

27 11月, 2020 1 次提交

iommu/vt-d: Remove set but not used variable · 405a43cc

由 Lu Baolu 提交于 11月 27, 2020

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/iommu/intel/iommu.c:5643:27: warning: variable 'last_pfn' set but not used [-Wunused-but-set-variable]
5643 |  unsigned long start_pfn, last_pfn;
     |                           ^~~~~~~~

This variable is never used, so remove it.

Fixes: 2a2b8eaa ("iommu: Handle freelists when using deferred flushing in iommu drivers")
Reported-by: Nkernel test robot <lkp@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201127013308.1833610-1-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

405a43cc

26 11月, 2020 1 次提交

iommu/vt-d: Don't read VCCAP register unless it exists · d76b42e9

由 David Woodhouse 提交于 11月 26, 2020

My virtual IOMMU implementation is whining that the guest is reading a
register that doesn't exist. Only read the VCCAP_REG if the corresponding
capability is set in ECAP_REG to indicate that it actually exists.

Fixes: 3375303e ("iommu/vt-d: Add custom allocator for IOASID")
Signed-off-by: NDavid Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: NLiu Yi L <yi.l.liu@intel.com>
Cc: stable@vger.kernel.org # v5.7+
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/de32b150ffaa752e0cff8571b17dfb1213fbe71c.camel@infradead.orgSigned-off-by: NWill Deacon <will@kernel.org>

d76b42e9

25 11月, 2020 5 次提交

iommu: Move def_domain type check for untrusted device into core · 28b41e2c

由 Lu Baolu 提交于 11月 24, 2020

So that the vendor iommu drivers are no more required to provide the
def_domain_type callback to always isolate the untrusted devices.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Link: https://lore.kernel.org/linux-iommu/243ce89c33fe4b9da4c56ba35acebf81@huawei.com/
Link: https://lore.kernel.org/r/20201124130604.2912899-2-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

28b41e2c

iommu/vt-d: Cleanup after converting to dma-iommu ops · 58a8bb39

由 Lu Baolu 提交于 11月 24, 2020

Some cleanups after converting the driver to use dma-iommu ops.
- Remove nobounce option;
- Cleanup and simplify the path in domain mapping.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NLogan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-8-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

58a8bb39

iommu/vt-d: Convert intel iommu driver to the iommu ops · c588072b

由 Tom Murphy 提交于 11月 24, 2020

Convert the intel iommu driver to the dma-iommu api. Remove the iova
handling and reserve region code from the intel iommu driver.
Signed-off-by: NTom Murphy <murphyt7@tcd.ie>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NLogan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-7-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

c588072b

iommu/vt-d: Update domain geometry in iommu_ops.at(de)tach_dev · c062db03

由 Lu Baolu 提交于 11月 24, 2020

The iommu-dma constrains IOVA allocation based on the domain geometry
that the driver reports. Update domain geometry everytime a domain is
attached to or detached from a device.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NLogan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-6-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

c062db03

iommu: Handle freelists when using deferred flushing in iommu drivers · 2a2b8eaa

由 Tom Murphy 提交于 11月 24, 2020

Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.

This is useful for iommu drivers (in this case the intel iommu driver)
which need to wait for the ioTLB to be flushed before newly
free/unmapped page table pages can be freed. This way we can still batch
ioTLB free operations and handle the freelists.
Signed-off-by: NTom Murphy <murphyt7@tcd.ie>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Tested-by: NLogan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/20201124082057.2614359-2-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

2a2b8eaa

23 11月, 2020 1 次提交

iommu/ioasid: Add ioasid references · cb4789b0

由 Jean-Philippe Brucker 提交于 11月 06, 2020

Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference number is zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.orgSigned-off-by: NWill Deacon <will@kernel.org>

cb4789b0

19 11月, 2020 1 次提交

Revert "iommu/vt-d: Take CONFIG_PCI_ATS into account" · 01cf158e

由 Thomas Gleixner 提交于 11月 19, 2020

This reverts commit 8986f223.

The proper fix is queued in Will's tree now
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

01cf158e

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功