提交 · e65a6897be5e4939d477c4969a05e12d90b08409 · openeuler / Kernel

02 12月, 2022 1 次提交

iommu/vt-d: Add a fix for devices need extra dtlb flush · e65a6897

由 Jacob Pan 提交于 12月 01, 2022

QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
address translation service (ATS). These devices may inadvertently issue
ATS invalidation completion before posted writes initiated with
translated address that utilized translations matching the invalidation
address range, violating the invalidation completion ordering.

This patch adds an extra device TLB invalidation for the affected devices,
it is needed to ensure no more posted writes with translated address
following the invalidation completion. Therefore, the ordering is
preserved and data-corruption is prevented.

Device TLBs are invalidated under the following six conditions:
1. Device driver does DMA API unmap IOVA
2. Device driver unbind a PASID from a process, sva_unbind_device()
3. PASID is torn down, after PASID cache is flushed. e.g. process
exit_mmap() due to crash
4. Under SVA usage, called by mmu_notifier.invalidate_range() where
VM has to free pages that were unmapped
5. userspace driver unmaps a DMA buffer
6. Cache invalidation in vSVA usage (upcoming)

For #1 and #2, device drivers are responsible for stopping DMA traffic
before unmap/unbind. For #3, iommu driver gets mmu_notifier to
invalidate TLB the same way as normal user unmap which will do an extra
invalidation. The dTLB invalidation after PASID cache flush does not
need an extra invalidation.

Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
covered by this patch due to common code path with #5.
Tested-by: NYuzhang Luo <yuzhang.luo@intel.com>
Reviewed-by: NAshok Raj <ashok.raj@intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20221130062449.1360063-1-jacob.jun.pan@linux.intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

e65a6897

26 9月, 2022 2 次提交

iommu/vt-d: Rename cap_5lp_support to cap_fl5lp_support · b722cb32

由 Yi Liu 提交于 9月 26, 2022

This renaming better describes it is for first level page table (a.k.a
first stage page table since VT-d spec 3.4).
Signed-off-by: NYi Liu <yi.l.liu@intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220916071326.2223901-1-yi.l.liu@intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

b722cb32

iommu/vt-d: Remove unnecessary SVA data accesses in page fault path · 06f4b8d0

由 Lu Baolu 提交于 9月 26, 2022

The existing I/O page fault handling code accesses the per-PASID SVA data
structures. This is unnecessary and makes the fault handling code only
suitable for SVA scenarios. This removes the SVA data accesses from the
I/O page fault reporting and responding code, so that the fault handling
code could be generic.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220914011821.400986-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

06f4b8d0

15 7月, 2022 5 次提交

iommu/vt-d: Refactor iommu information of each domain · ba949f4c

由 Lu Baolu 提交于 7月 12, 2022

When a DMA domain is attached to a device, it needs to allocate a domain
ID from its IOMMU. Currently, the domain ID information is stored in two
static arrays embedded in the domain structure. This can lead to memory
waste when the driver is running on a small platform.

This optimizes these static arrays by replacing them with an xarray and
consuming memory on demand.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Reviewed-by: NSteve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20220702015610.2849494-4-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

ba949f4c

iommu/vt-d: Acquiring lock in pasid manipulation helpers · 8430fd3f

由 Lu Baolu 提交于 7月 12, 2022

The iommu->lock is used to protect the per-IOMMU pasid directory table
and pasid table. Move the spinlock acquisition/release into the helpers
to make the code self-contained.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220706025524.2904370-8-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

8430fd3f

iommu/vt-d: Replace spin_lock_irqsave() with spin_lock() · ffd5869d

由 Lu Baolu 提交于 7月 12, 2022

The iommu->lock is used to protect changes in root/context/pasid tables
and domain ID allocation. There's no use case to change these resources
in any interrupt context. Therefore, it is unnecessary to disable the
interrupts when the spinlock is held. The same thing happens on the
device_domain_lock side, which protects the device domain attachment
information. This replaces spin_lock/unlock_irqsave/irqrestore() calls
with the normal spin_lock/unlock().
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220706025524.2904370-6-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

ffd5869d

iommu/vt-d: Move include/linux/intel-iommu.h under iommu · 2585a279

由 Lu Baolu 提交于 7月 12, 2022

This header file is private to the Intel IOMMU driver. Move it to the
driver folder.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
Reviewed-by: NSteve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20220514014322.2927339-8-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

2585a279

iommu/vt-d: Move trace/events/intel_iommu.h under iommu · 933ab6d3

由 Lu Baolu 提交于 7月 12, 2022

This header file is private to the Intel IOMMU driver. Move it to the
driver folder.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
Reviewed-by: NSteve Wahl <steve.wahl@hpe.com>
Link: https://lore.kernel.org/r/20220514014322.2927339-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

933ab6d3

28 4月, 2022 1 次提交

iommu/vt-d: Drop stop marker messages · da8669ff

由 Lu Baolu 提交于 4月 23, 2022

The page fault handling framework in the IOMMU core explicitly states
that it doesn't handle PCI PASID Stop Marker and the IOMMU drivers must
discard them before reporting faults. This handles Stop Marker messages
in prq_event_thread() before reporting events to the core.

The VT-d driver explicitly drains the pending page requests when a CPU
page table (represented by a mm struct) is unbound from a PASID according
to the procedures defined in the VT-d spec. The Stop Marker messages do
not need a response. Hence, it is safe to drop the Stop Marker messages
silently if any of them is found in the page request queue.

Fixes: d5b9e4bf ("iommu/vt-d: Report prq to io-pgfault framework")
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NKevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20220421113558.3504874-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220423082330.3897867-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

da8669ff

04 3月, 2022 2 次提交

iommu/vt-d: Remove unused function intel_svm_capable() · b897a1b7

由 YueHaibing 提交于 3月 01, 2022

This is unused since commit 40483774 ("iommu/vt-d: Use
iommu_sva_alloc(free)_pasid() helpers").
Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20220216113851.25004-1-yuehaibing@huawei.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220301020159.633356-12-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

b897a1b7

iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO · 586081d3

由 Lu Baolu 提交于 3月 01, 2022

Allocate and set the per-device iommu private data during iommu device
probe. This makes the per-device iommu private data always available
during iommu_probe_device() and iommu_release_device(). With this changed,
the dummy DEFER_DEVICE_DOMAIN_INFO pointer could be removed. The wrappers
for getting the private data and domain are also cleaned.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20220301020159.633356-6-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

586081d3

28 2月, 2022 1 次提交

iommu/vt-d: Remove guest pasid related callbacks · 989192ac

由 Lu Baolu 提交于 2月 16, 2022

The guest pasid related callbacks are not called in the tree. Remove them
to avoid dead code.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Reviewed-by: NJason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220216025249.3459465-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

989192ac

15 2月, 2022 1 次提交

iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit · 701fac40

由 Fenghua Yu 提交于 2月 07, 2022

PASIDs are process-wide. It was attempted to use refcounted PASIDs to
free them when the last thread drops the refcount. This turned out to
be complex and error prone. Given the fact that the PASID space is 20
bits, which allows up to 1M processes to have a PASID associated
concurrently, PASID resource exhaustion is not a realistic concern.

Therefore, it was decided to simplify the approach and stick with lazy
on demand PASID allocation, but drop the eager free approach and make an
allocated PASID's lifetime bound to the lifetime of the process.

Get rid of the refcounting mechanisms and replace/rename the interfaces
to reflect this new approach.

  [ bp: Massage commit message. ]
Suggested-by: NDave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
Acked-by: NJoerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com

701fac40

18 10月, 2021 1 次提交

iommu/vt-d: Clean up unused PASID updating functions · 00ecd540

由 Fenghua Yu 提交于 10月 14, 2021

update_pasid() and its call chain are currently unused in the tree because
Thomas disabled the ENQCMD feature. The feature will be re-enabled shortly
using a different approach and update_pasid() and its call chain will not
be used in the new approach.

Remove the useless functions.
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Reviewed-by: NTony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210920192349.2602141-1-fenghua.yu@intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211014053839.727419-8-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

00ecd540

09 9月, 2021 2 次提交

iommu/vt-d: Fix a deadlock in intel_svm_drain_prq() · 6ef05051

由 Fenghua Yu 提交于 8月 28, 2021

pasid_mutex and dev->iommu->param->lock are held while unbinding mm is
flushing IO page fault workqueue and waiting for all page fault works to
finish. But an in-flight page fault work also need to hold the two locks
while unbinding mm are holding them and waiting for the work to finish.
This may cause an ABBA deadlock issue as shown below:

	idxd 0000:00:0a.0: unbind PASID 2
	======================================================
	WARNING: possible circular locking dependency detected
	5.14.0-rc7+ #549 Not tainted [  186.615245] ----------
	dsa_test/898 is trying to acquire lock:
	ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at:
	iopf_queue_flush_dev+0x29/0x60
	but task is already holding lock:
	ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
	intel_svm_unbind+0x34/0x1e0
	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #2 (pasid_mutex){+.+.}-{3:3}:
	       __mutex_lock+0x75/0x730
	       mutex_lock_nested+0x1b/0x20
	       intel_svm_page_response+0x8e/0x260
	       iommu_page_response+0x122/0x200
	       iopf_handle_group+0x1c2/0x240
	       process_one_work+0x2a5/0x5a0
	       worker_thread+0x55/0x400
	       kthread+0x13b/0x160
	       ret_from_fork+0x22/0x30

	-> #1 (&param->fault_param->lock){+.+.}-{3:3}:
	       __mutex_lock+0x75/0x730
	       mutex_lock_nested+0x1b/0x20
	       iommu_report_device_fault+0xc2/0x170
	       prq_event_thread+0x28a/0x580
	       irq_thread_fn+0x28/0x60
	       irq_thread+0xcf/0x180
	       kthread+0x13b/0x160
	       ret_from_fork+0x22/0x30

	-> #0 (&param->lock){+.+.}-{3:3}:
	       __lock_acquire+0x1134/0x1d60
	       lock_acquire+0xc6/0x2e0
	       __mutex_lock+0x75/0x730
	       mutex_lock_nested+0x1b/0x20
	       iopf_queue_flush_dev+0x29/0x60
	       intel_svm_drain_prq+0x127/0x210
	       intel_svm_unbind+0xc5/0x1e0
	       iommu_sva_unbind_device+0x62/0x80
	       idxd_cdev_release+0x15a/0x200 [idxd]
	       __fput+0x9c/0x250
	       ____fput+0xe/0x10
	       task_work_run+0x64/0xa0
	       exit_to_user_mode_prepare+0x227/0x230
	       syscall_exit_to_user_mode+0x2c/0x60
	       do_syscall_64+0x48/0x90
	       entry_SYSCALL_64_after_hwframe+0x44/0xae

	other info that might help us debug this:

	Chain exists of:
	  &param->lock --> &param->fault_param->lock --> pasid_mutex

	 Possible unsafe locking scenario:

	       CPU0                    CPU1
	       ----                    ----
	  lock(pasid_mutex);
				       lock(&param->fault_param->lock);
				       lock(pasid_mutex);
	  lock(&param->lock);

	 *** DEADLOCK ***

	2 locks held by dsa_test/898:
	 #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at:
	 iommu_sva_unbind_device+0x53/0x80
	 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
	 intel_svm_unbind+0x34/0x1e0

	stack backtrace:
	CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549
	Hardware name: Intel Corporation Kabylake Client platform/KBL S
	DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016
	Call Trace:
	 dump_stack_lvl+0x5b/0x74
	 dump_stack+0x10/0x12
	 print_circular_bug.cold+0x13d/0x142
	 check_noncircular+0xf1/0x110
	 __lock_acquire+0x1134/0x1d60
	 lock_acquire+0xc6/0x2e0
	 ? iopf_queue_flush_dev+0x29/0x60
	 ? pci_mmcfg_read+0xde/0x240
	 __mutex_lock+0x75/0x730
	 ? iopf_queue_flush_dev+0x29/0x60
	 ? pci_mmcfg_read+0xfd/0x240
	 ? iopf_queue_flush_dev+0x29/0x60
	 mutex_lock_nested+0x1b/0x20
	 iopf_queue_flush_dev+0x29/0x60
	 intel_svm_drain_prq+0x127/0x210
	 ? intel_pasid_tear_down_entry+0x22e/0x240
	 intel_svm_unbind+0xc5/0x1e0
	 iommu_sva_unbind_device+0x62/0x80
	 idxd_cdev_release+0x15a/0x200

pasid_mutex protects pasid and svm data mapping data. It's unnecessary
to hold pasid_mutex while flushing the workqueue. To fix the deadlock
issue, unlock pasid_pasid during flushing the workqueue to allow the works
to be handled.

Fixes: d5b9e4bf ("iommu/vt-d: Report prq to io-pgfault framework")
Reported-and-tested-by: NDave Jiang <dave.jiang@intel.com>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com
[joro: Removed timing information from kernel log messages]
Signed-off-by: NJoerg Roedel <jroedel@suse.de>

6ef05051

iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm() · a21518cb

由 Fenghua Yu 提交于 8月 28, 2021

The mm->pasid will be used in intel_svm_free_pasid() after load_pasid()
during unbinding mm. Clearing it in load_pasid() will cause PASID cannot
be freed in intel_svm_free_pasid().

Additionally mm->pasid was updated already before load_pasid() during pasid
allocation. No need to update it again in load_pasid() during binding mm.
Don't update mm->pasid to avoid the issues in both binding mm and unbinding
mm.

Fixes: 40483774 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Reported-and-tested-by: NDave Jiang <dave.jiang@intel.com>
Co-developed-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

a21518cb

19 8月, 2021 1 次提交

iommu/vt-d: Allow devices to have more than 32 outstanding PRs · 48811c44

由 Lu Baolu 提交于 8月 18, 2021

The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries
than the hardcoded limit of 32 in the current VT-d code. Some devices can
support up to 512 outstanding PRQs but underutilized by this limit of 32.
Although, 32 gives some rough fairness when multiple devices share the same
IOMMU PRQ queue, but far from optimal for customized use case. This extends
the per-IOMMU PRQ queue size to four 4K pages and let the devices have as
many outstanding page requests as they can.
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

48811c44

18 8月, 2021 1 次提交

iommu/vt-d: Fix PASID reference leak · 62ef907a

由 Fenghua Yu 提交于 8月 17, 2021

A PASID reference is increased whenever a device is bound to an mm (and
its PASID) successfully (i.e. the device's sdev user count is increased).
But the reference is not dropped every time the device is unbound
successfully from the mm (i.e. the device's sdev user count is decreased).
The reference is dropped only once by calling intel_svm_free_pasid() when
there isn't any device bound to the mm. intel_svm_free_pasid() drops the
reference and only frees the PASID on zero reference.

Fix the issue by dropping the PASID reference and freeing the PASID when
no reference on successful unbinding the device by calling
intel_svm_free_pasid() .

Fixes: 40483774 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Signed-off-by: NFenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.comSigned-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

62ef907a

10 6月, 2021 9 次提交

iommu/vt-d: Fix out-bounds-warning in intel/svm.c · 606636dc

由 Gustavo A. R. Silva 提交于 6月 10, 2021

Replace a couple of calls to memcpy() with simple assignments in order
to fix the following out-of-bounds warning:

drivers/iommu/intel/svm.c:1198:4: warning: 'memcpy' offset [25, 32] from
    the object at 'desc' is out of the bounds of referenced subobject
    'qw2' with type 'long long unsigned int' at offset 16 [-Warray-bounds]

The problem is that the original code is trying to copy data into a
couple of struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &desc.qw2 and &resp.qw2, respectively.

This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().

Link: https://github.com/KSPP/linux/issues/109Signed-off-by: NGustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210414201403.GA392764@embeddedor
Link: https://lore.kernel.org/r/20210610020115.1637656-18-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

606636dc

iommu/vt-d: Add PRQ handling latency sampling · 0f4834ab

由 Lu Baolu 提交于 6月 10, 2021

The execution time for page fault request handling is performance critical
and needs to be monitored. This adds code to sample the execution time of
page fault request handling.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-17-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

0f4834ab

iommu/vt-d: Add prq_report trace event · e93a67f5

由 Lu Baolu 提交于 6月 10, 2021

This adds a new trace event to track the page fault request report.
This event will provide almost all information defined in a page
request descriptor.

A sample output:
| prq_report: dmar0/0000:00:0a.0 seq# 1: rid=0x50 addr=0x559ef6f97 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 2: rid=0x50 addr=0x559ef6f9c rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 3: rid=0x50 addr=0x559ef6f98 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 4: rid=0x50 addr=0x559ef6f9d rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 5: rid=0x50 addr=0x559ef6f99 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 6: rid=0x50 addr=0x559ef6f9e rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 7: rid=0x50 addr=0x559ef6f9a r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 8: rid=0x50 addr=0x559ef6f9f rw--l pasid=0x2 index=0x1

This will be helpful for I/O page fault related debugging.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-13-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

e93a67f5

iommu/vt-d: Report prq to io-pgfault framework · d5b9e4bf

由 Lu Baolu 提交于 6月 10, 2021

Let the IO page fault requests get handled through the io-pgfault
framework.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

d5b9e4bf

iommu/vt-d: Allocate/register iopf queue for sva devices · 4c82b886

由 Lu Baolu 提交于 6月 10, 2021

This allocates and registers the iopf queue infrastructure for devices
which want to support IO page fault for SVA.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

4c82b886

iommu/vt-d: Refactor prq_event_thread() · ae7f09b1

由 Lu Baolu 提交于 6月 10, 2021

Refactor prq_event_thread() by moving handling single prq event out of
the main loop.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-10-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

ae7f09b1

iommu/vt-d: Use common helper to lookup svm devices · 9e52cc0f

由 Lu Baolu 提交于 6月 10, 2021

It's common to iterate the svm device list and find a matched device. Add
common helpers to do this and consolidate the code.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-9-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

9e52cc0f

iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers · 40483774

由 Lu Baolu 提交于 6月 10, 2021

Align the pasid alloc/free code with the generic helpers defined in the
iommu core. This also refactored the SVA binding code to improve the
readability.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

40483774

iommu/vt-d: Add pasid private data helpers · 100b8a14

由 Lu Baolu 提交于 6月 10, 2021

We are about to use iommu_sva_alloc/free_pasid() helpers in iommu core.
That means the pasid life cycle will be managed by iommu core. Use a
local array to save the per pasid private data instead of attaching it
the real pasid.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-7-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

100b8a14

07 4月, 2021 4 次提交

iommu/vt-d: Report the right page fault address · 03d20509

由 Lu Baolu 提交于 3月 20, 2021

The Address field of the Page Request Descriptor only keeps bit [63:12]
of the offending address. Convert it to a full address before reporting
it to device drivers.

Fixes: eb8d93ea ("iommu/vt-d: Report page request faults for guest SVA")
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320025415.641201-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

03d20509

iommu/vt-d: Remove SVM_FLAG_PRIVATE_PASID · 06905ea8

由 Lu Baolu 提交于 3月 23, 2021

The SVM_FLAG_PRIVATE_PASID has never been referenced in the tree, and
there's no plan to have anything to use it. So cleanup it.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-4-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

06905ea8

iommu/vt-d: Remove svm_dev_ops · 2e1a44c1

由 Lu Baolu 提交于 3月 23, 2021

The svm_dev_ops has never been referenced in the tree, and there's no
plan to have anything to use it. Remove it to make the code neat.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210323010600.678627-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

2e1a44c1

iommu/vt-d: Don't set then clear private data in prq_event_thread() · 1d421058

由 Lu Baolu 提交于 3月 20, 2021

The VT-d specification (section 7.6) requires that the value in the
Private Data field of a Page Group Response Descriptor must match
the value in the Private Data field of the respective Page Request
Descriptor.

The private data field of a page group response descriptor is set then
immediately cleared in prq_event_thread(). This breaks the rule defined
by the VT-d specification. Fix it by moving clearing code up.

Fixes: 5b438f4b ("iommu/vt-d: Support page request in scalable mode")
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Reviewed-by: NLiu Yi L <yi.l.liu@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210320024156.640798-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

1d421058

18 3月, 2021 2 次提交

iommu/vt-d: Calculate and set flags for handle_mm_fault · 396bd6f3

由 Jacob Pan 提交于 3月 02, 2021

Page requests are originated from the user page fault. Therefore, we
shall set FAULT_FLAG_USER.

FAULT_FLAG_REMOTE indicates that we are walking an mm which is not
guaranteed to be the same as the current->mm and should not be subject
to protection key enforcement. Therefore, we should set FAULT_FLAG_REMOTE
to avoid faults when both SVM and PKEY are used.

References: commit 1b2ee126 ("mm/core: Do not enforce PKEY permissions on remote mm access")
Reviewed-by: NRaj Ashok <ashok.raj@intel.com>
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-5-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

396bd6f3

iommu/vt-d: Reject unsupported page request modes · 78a523fe

由 Jacob Pan 提交于 3月 02, 2021

When supervisor/privilige mode SVM is used, we bind init_mm.pgd with
a supervisor PASID. There should not be any page fault for init_mm.
Execution request with DMA read is also not supported.

This patch checks PRQ descriptor for both unsupported configurations,
reject them both with invalid responses.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: NJacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/1614680040-1989-4-git-send-email-jacob.jun.pan@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

78a523fe

29 1月, 2021 2 次提交

iommu/vt-d: Use INVALID response code instead of FAILURE · 3aa7c62c

由 Lu Baolu 提交于 1月 26, 2021

The VT-d IOMMU response RESPONSE_FAILURE for a page request in below
cases:

- When it gets a Page_Request with no PASID;
- When it gets a Page_Request with PASID that is not in use for this
  device.

This is allowed by the spec, but IOMMU driver doesn't support such cases
today. When the device receives RESPONSE_FAILURE, it sends the device
state machine to HALT state. Now if we try to unload the driver, it hangs
since the device doesn't send any outbound transactions to host when the
driver is trying to clear things up. The only possible responses would be
for invalidation requests.

Let's use RESPONSE_INVALID instead for now, so that the device state
machine doesn't enter HALT state.
Suggested-by: NAshok Raj <ashok.raj@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210126080730.2232859-3-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

3aa7c62c

iommu/vt-d: Clear PRQ overflow only when PRQ is empty · 28a77185

由 Lu Baolu 提交于 1月 26, 2021

It is incorrect to always clear PRO when it's set w/o first checking
whether the overflow condition has been cleared. Current code assumes
that if an overflow condition occurs it must have been cleared by earlier
loop. However since the code runs in a threaded context, the overflow
condition could occur even after setting the head to the tail under some
extreme condition. To be sane, we should read both head/tail again when
seeing a pending PRO and only clear PRO after all pending PRs have been
handled.
Suggested-by: NKevin Tian <kevin.tian@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/
Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

28a77185

28 1月, 2021 1 次提交

iommu/vt-d: Consolidate duplicate cache invaliation code · 9872f9bd

由 Lu Baolu 提交于 1月 14, 2021

The pasid based IOTLB and devTLB invalidation code is duplicate in
several places. Consolidate them by using the common helpers.
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210114085021.717041-1-baolu.lu@linux.intel.comSigned-off-by: NJoerg Roedel <jroedel@suse.de>

9872f9bd

12 1月, 2021 1 次提交

iommu/vt-d: Fix unaligned addresses for intel_flush_svm_range_dev() · 2d6ffc63

由 Lu Baolu 提交于 12月 31, 2020

The VT-d hardware will ignore those Addr bits which have been masked by
the AM field in the PASID-based-IOTLB invalidation descriptor. As the
result, if the starting address in the descriptor is not aligned with
the address mask, some IOTLB caches might not invalidate. Hence people
will see below errors.

[ 1093.704661] dmar_fault: 29 callbacks suppressed
[ 1093.704664] DMAR: DRHD: handling fault status reg 3
[ 1093.712738] DMAR: [DMA Read] Request device [7a:02.0] PASID 2
               fault addr 7f81c968d000 [fault reason 113]
               SM: Present bit in first-level paging entry is clear

Fix this by using aligned address for PASID-based-IOTLB invalidation.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Reported-and-tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-2-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

2d6ffc63

07 1月, 2021 2 次提交

iommu/vt-d: Move intel_iommu info from struct intel_svm to struct intel_svm_dev · 9ad9f45b

由 Liu Yi L 提交于 1月 07, 2021

'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.

In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.

Fixes: 1c4f88b7 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Raj Ashok <ashok.raj@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Reported-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Reported-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: NXin Zeng <xin.zeng@intel.com>
Signed-off-by: NLiu Yi L <yi.l.liu@intel.com>
Tested-by: NGuo Kaijie <Kaijie.Guo@intel.com>
Cc: stable@vger.kernel.org # v5.0+
Acked-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.comSigned-off-by: NWill Deacon <will@kernel.org>

9ad9f45b

iommu/vt-d: Fix lockdep splat in sva bind()/unbind() · 420d42f6

由 Lu Baolu 提交于 12月 31, 2020

Lock(&iommu->lock) without disabling irq causes lockdep warnings.

========================================================
WARNING: possible irq lock inversion dependency detected
5.11.0-rc1+ #828 Not tainted
--------------------------------------------------------
kworker/0:1H/120 just changed the state of lock:
ffffffffad9ea1b8 (device_domain_lock){..-.}-{2:2}, at:
iommu_flush_dev_iotlb.part.0+0x32/0x120
but this lock took another, SOFTIRQ-unsafe lock in the past:
 (&iommu->lock){+.+.}-{2:2}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&iommu->lock);
                               local_irq_disable();
                               lock(device_domain_lock);
                               lock(&iommu->lock);
  <Interrupt>
    lock(device_domain_lock);

 *** DEADLOCK ***
Signed-off-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201231005323.2178523-5-baolu.lu@linux.intel.comSigned-off-by: NWill Deacon <will@kernel.org>

420d42f6

23 11月, 2020 1 次提交

iommu/ioasid: Add ioasid references · cb4789b0

由 Jean-Philippe Brucker 提交于 11月 06, 2020

Let IOASID users take references to existing ioasids with ioasid_get().
ioasid_put() drops a reference and only frees the ioasid when its
reference number is zero. It returns true if the ioasid was freed.
For drivers that don't call ioasid_get(), ioasid_put() is the same as
ioasid_free().
Signed-off-by: NJean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: NEric Auger <eric.auger@redhat.com>
Reviewed-by: NLu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20201106155048.997886-2-jean-philippe@linaro.orgSigned-off-by: NWill Deacon <will@kernel.org>

cb4789b0

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功