Unverified commit 4ab3abdd, authored by openeuler-ci-bot, committed by Gitee

!223 SPR: IDXD driver (on top of OLK-5.10) - DSA/IAA incremental backporting patches until upstream 6.1

Merge Pull Request from: @xiaochenshen 
 
 **IDXD kernel driver:** 
The IDXD driver is the common driver framework for the Intel Data Streaming Accelerator (DSA) and the Intel In-memory Analytics Accelerator (IAA). This patch set covers the incremental backport of kernel patches up to upstream 6.1. It fixes the following issues:
1. https://gitee.com/openeuler/intel-kernel/issues/I596WO 
2. https://gitee.com/openeuler/intel-kernel/issues/I590PB

 **DSA – Intel Data Streaming Accelerator:** 
Intel DSA is a high-performance data copy and transformation accelerator integrated into Intel Sapphire Rapids (SPR) processors. It targets streaming data movement and transformation operations common in high-performance storage, networking, persistent memory, and various data processing applications. See the DSA specification for more details:
https://software.intel.com/content/www/us/en/develop/articles/intel-data-streaming-accelerator-architecture-specification.html

 **IAA - Intel In-memory Analytics Accelerator:** 
Intel In-memory Analytics Accelerator (IAA) is an accelerator integrated into Intel Sapphire Rapids (SPR) processors that accelerates analytics primitives (scan, filter, etc.), CRC calculations, compression, decompression, and more. See the IAA specification for more details:
https://cdrdv2.intel.com/v1/dl/getContent/721858

 **This patch set contains 173 patches in total. It covers:** 
1. IDXD driver incremental patches between 5.10 LTS and upstream 6.1 (Shared WQ, SVM, IAA, driver refactoring, and bug fixes).
2. ENQCMD and PASID re-enabling patches (dependencies of the IDXD driver).
3. Other dependencies in the IOMMU driver.
4. kABI fixes for openEuler.
5. Enabling the necessary kernel configs in openeuler_defconfig.

 **Passed tests:** 
1. Unit tests: passed
- accel-config test
- accel-config/test dsa_user_test_runner.sh
- accel-config/test iaa_user_test_runner.sh
- Kernel dmatest test (SVA disabled: "modprobe idxd sva=0"); see the sketch after this list
- Intel internal DSA config test suite (dsa_config_bat_tests, dsa_config_func_tests)
- Intel internal IAX config test suite (iax_config_bat_tests, iax_config_func_tests)
2. Build test: passed.
3. Boot test: passed.
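
For reference, here is a minimal sketch of the kernel dmatest run listed above (an illustration, not part of the patch set). It assumes a dmaengine-type WQ is configured and enabled between the two steps; the channel name dma0chan0 is a placeholder, and the dmatest module parameters used (channel, timeout, iterations, run) are the standard ones.
```
# Load the idxd driver with SVA disabled, as in the test list above
modprobe idxd sva=0

# ... configure and enable a dmaengine-type WQ here (e.g. with accel-config) ...

# Run dmatest against the resulting idxd dmaengine channel
# (the channel name dma0chan0 is a placeholder)
modprobe dmatest
echo dma0chan0 > /sys/module/dmatest/parameters/channel
echo 2000 > /sys/module/dmatest/parameters/timeout
echo 10 > /sys/module/dmatest/parameters/iterations
echo 1 > /sys/module/dmatest/parameters/run

# Check the dmatest summary in the kernel log
dmesg | tail
```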

 **Kernel config changes against default:**
```
@@ -6381,7 +6381,11 @@ CONFIG_DMA_VIRTUAL_CHANNELS=y
 CONFIG_DMA_ACPI=y
 # CONFIG_ALTERA_MSGDMA is not set
 CONFIG_INTEL_IDMA64=m
+CONFIG_INTEL_IDXD_BUS=m
 CONFIG_INTEL_IDXD=m
+# CONFIG_INTEL_IDXD_COMPAT is not set
+CONFIG_INTEL_IDXD_SVM=y
+CONFIG_INTEL_IDXD_PERFMON=y
 CONFIG_INTEL_IOATDMA=m
 # CONFIG_PLX_DMA is not set
 # CONFIG_QCOM_HIDMA_MGMT is not set
@@ -6632,11 +6636,12 @@ CONFIG_IOMMU_SUPPORT=y
 # CONFIG_IOMMU_DEBUGFS is not set
 CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
 CONFIG_IOMMU_DMA=y
+CONFIG_IOMMU_SVA=y
 CONFIG_AMD_IOMMU=y
 CONFIG_AMD_IOMMU_V2=m
 CONFIG_DMAR_TABLE=y
 CONFIG_INTEL_IOMMU=y
-# CONFIG_INTEL_IOMMU_SVM is not set
+CONFIG_INTEL_IOMMU_SVM=y
 # CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
 CONFIG_INTEL_IOMMU_FLOPPY_WA=y
 # CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set
```

 **Kernel command line to enable Intel IOMMU scalable mode (in grub.cfg):**
```
intel_iommu=on,sm_on
``` 
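
As a quick sanity check (an illustration, not part of the patch set): after rebooting with the option above, the setting and the pasid_enabled attribute documented in this series can be read back. The dsa0 device name is a placeholder.
```
# Confirm the kernel command line took effect
grep -o "intel_iommu=[^ ]*" /proc/cmdline

# Check that PASID is enabled for a DSA device (device name is a placeholder)
cat /sys/bus/dsa/devices/dsa0/pasid_enabled
```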
 
Link: https://gitee.com/openeuler/kernel/pulls/223 
Reviewed-by: Zheng Zengkai <zhengzengkai@huawei.com> 
Reviewed-by: Chen Wei <chenwei@xfusion.com> 
Reviewed-by: Liu Chao <liuchao173@huawei.com> 
Reviewed-by: Jun Tian <jun.j.tian@intel.com> 
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> 
...@@ -22,6 +22,7 @@ Date: Oct 25, 2019 ...@@ -22,6 +22,7 @@ Date: Oct 25, 2019
KernelVersion: 5.6.0 KernelVersion: 5.6.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The largest number of work descriptors in a batch. Description: The largest number of work descriptors in a batch.
It's not visible when the device does not support batch.
What: /sys/bus/dsa/devices/dsa<m>/max_work_queues_size What: /sys/bus/dsa/devices/dsa<m>/max_work_queues_size
Date: Oct 25, 2019 Date: Oct 25, 2019
...@@ -41,14 +42,16 @@ KernelVersion: 5.6.0 ...@@ -41,14 +42,16 @@ KernelVersion: 5.6.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The maximum number of groups can be created under this device. Description: The maximum number of groups can be created under this device.
What: /sys/bus/dsa/devices/dsa<m>/max_tokens What: /sys/bus/dsa/devices/dsa<m>/max_read_buffers
Date: Oct 25, 2019 Date: Dec 10, 2021
KernelVersion: 5.6.0 KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The total number of bandwidth tokens supported by this device. Description: The total number of read buffers supported by this device.
The bandwidth tokens represent resources within the DSA The read buffers represent resources within the DSA
implementation, and these resources are allocated by engines to implementation, and these resources are allocated by engines to
support operations. support operations. See DSA spec v1.2 9.2.4 Total Read Buffers.
It's not visible when the device does not support Read Buffer
allocation control.
What: /sys/bus/dsa/devices/dsa<m>/max_transfer_size What: /sys/bus/dsa/devices/dsa<m>/max_transfer_size
Date: Oct 25, 2019 Date: Oct 25, 2019
...@@ -77,6 +80,13 @@ Contact: dmaengine@vger.kernel.org ...@@ -77,6 +80,13 @@ Contact: dmaengine@vger.kernel.org
Description: The operation capability bit mask specify the operation types Description: The operation capability bit mask specify the operation types
supported by the this device. supported by the this device.
What: /sys/bus/dsa/devices/dsa<m>/pasid_enabled
Date: Oct 27, 2020
KernelVersion: 5.11.0
Contact: dmaengine@vger.kernel.org
Description: To indicate if PASID (process address space identifier) is
enabled or not for this device.
What: /sys/bus/dsa/devices/dsa<m>/state What: /sys/bus/dsa/devices/dsa<m>/state
Date: Oct 25, 2019 Date: Oct 25, 2019
KernelVersion: 5.6.0 KernelVersion: 5.6.0
...@@ -108,19 +118,30 @@ KernelVersion: 5.6.0 ...@@ -108,19 +118,30 @@ KernelVersion: 5.6.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: To indicate if this device is configurable or not. Description: To indicate if this device is configurable or not.
What: /sys/bus/dsa/devices/dsa<m>/token_limit What: /sys/bus/dsa/devices/dsa<m>/read_buffer_limit
Date: Oct 25, 2019 Date: Dec 10, 2021
KernelVersion: 5.6.0 KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The maximum number of bandwidth tokens that may be in use at Description: The maximum number of read buffers that may be in use at
one time by operations that access low bandwidth memory in the one time by operations that access low bandwidth memory in the
device. device. See DSA spec v1.2 9.2.8 GENCFG on Global Read Buffer Limit.
It's not visible when the device does not support Read Buffer
allocation control.
What: /sys/bus/dsa/devices/dsa<m>/cmd_status What: /sys/bus/dsa/devices/dsa<m>/cmd_status
Date: Aug 28, 2020 Date: Aug 28, 2020
KernelVersion: 5.10.0 KernelVersion: 5.10.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The last executed device administrative command's status/error. Description: The last executed device administrative command's status/error.
The last configuration error is also overloaded onto this attribute.
Writing to it will clear the status.
What: /sys/bus/dsa/devices/wq<m>.<n>/block_on_fault
Date: Oct 27, 2020
KernelVersion: 5.11.0
Contact: dmaengine@vger.kernel.org
Description: To indicate whether block on fault is allowed for the work queue
to support on-demand paging.
What: /sys/bus/dsa/devices/wq<m>.<n>/group_id What: /sys/bus/dsa/devices/wq<m>.<n>/group_id
Date: Oct 25, 2019 Date: Oct 25, 2019
...@@ -189,9 +210,95 @@ KernelVersion: 5.10.0 ...@@ -189,9 +210,95 @@ KernelVersion: 5.10.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The max batch size for this workqueue. Cannot exceed device Description: The max batch size for this workqueue. Cannot exceed device
max batch size. Configurable parameter. max batch size. Configurable parameter.
It's not visible when the device does not support batch.
What: /sys/bus/dsa/devices/wq<m>.<n>/ats_disable
Date: Nov 13, 2020
KernelVersion: 5.11.0
Contact: dmaengine@vger.kernel.org
Description: Indicate whether ATS disable is turned on for the workqueue.
0 indicates ATS is on, and 1 indicates ATS is off for the workqueue.
What: /sys/bus/dsa/devices/wq<m>.<n>/occupancy
Date: May 25, 2021
KernelVersion: 5.14.0
Contact: dmaengine@vger.kernel.org
Description: Show the current number of entries in this WQ if WQ Occupancy
Support bit in WQ capabilities is 1.
What: /sys/bus/dsa/devices/wq<m>.<n>/enqcmds_retries
Date: Oct 29, 2021
KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org
Description: Indicate the number of retries for an enqcmds submission on a shared wq.
The maximum value that can be set for this attribute is capped at 64.
What: /sys/bus/dsa/devices/wq<m>.<n>/op_config
Date: Sept 14, 2022
KernelVersion: 6.0.0
Contact: dmaengine@vger.kernel.org
Description: Shows the operation capability bits displayed in bitmap format
presented by %*pb printk() output format specifier.
The attribute can be configured when the WQ is disabled in
order to configure the WQ to accept specific bits that
correlates to the operations allowed. It's visible only
on platforms that support the capability.
What: /sys/bus/dsa/devices/engine<m>.<n>/group_id What: /sys/bus/dsa/devices/engine<m>.<n>/group_id
Date: Oct 25, 2019 Date: Oct 25, 2019
KernelVersion: 5.6.0 KernelVersion: 5.6.0
Contact: dmaengine@vger.kernel.org Contact: dmaengine@vger.kernel.org
Description: The group that this engine belongs to. Description: The group that this engine belongs to.
What: /sys/bus/dsa/devices/group<m>.<n>/use_read_buffer_limit
Date: Dec 10, 2021
KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org
Description: Enable the use of global read buffer limit for the group. See DSA
spec v1.2 9.2.18 GRPCFG Use Global Read Buffer Limit.
It's not visible when the device does not support Read Buffer
allocation control.
What: /sys/bus/dsa/devices/group<m>.<n>/read_buffers_allowed
Date: Dec 10, 2021
KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org
Description: Indicates max number of read buffers that may be in use at one time
by all engines in the group. See DSA spec v1.2 9.2.18 GRPCFG Read
Buffers Allowed.
It's not visible when the device does not support Read Buffer
allocation control.
What: /sys/bus/dsa/devices/group<m>.<n>/read_buffers_reserved
Date: Dec 10, 2021
KernelVersion: 5.17.0
Contact: dmaengine@vger.kernel.org
Description: Indicates the number of Read Buffers reserved for the use of
engines in the group. See DSA spec v1.2 9.2.18 GRPCFG Read Buffers
Reserved.
It's not visible when the device does not support Read Buffer
allocation control.
What: /sys/bus/dsa/devices/group<m>.<n>/desc_progress_limit
Date: Sept 14, 2022
KernelVersion: 6.0.0
Contact: dmaengine@vger.kernel.org
Description: Allows control of the number of work descriptors that can be
concurrently processed by an engine in the group as a fraction
of the Maximum Work Descriptors in Progress value specified in
the ENGCAP register. The acceptable values are 0 (default),
1 (1/2 of max value), 2 (1/4 of the max value), and 3 (1/8 of
the max value). It's visible only on platforms that support
the capability.
What: /sys/bus/dsa/devices/group<m>.<n>/batch_progress_limit
Date: Sept 14, 2022
KernelVersion: 6.0.0
Contact: dmaengine@vger.kernel.org
Description: Allows control of the number of batch descriptors that can be
concurrently processed by an engine in the group as a fraction
of the Maximum Batch Descriptors in Progress value specified in
the ENGCAP register. The acceptable values are 0 (default),
1 (1/2 of max value), 2 (1/4 of the max value), and 3 (1/8 of
the max value). It's visible only on platforms that support
the capability.
What: /sys/bus/event_source/devices/dsa*/format
Date: April 2021
KernelVersion: 5.13
Contact: Tom Zanussi <tom.zanussi@linux.intel.com>
Description: Read-only. Attribute group to describe the magic bits
that go into perf_event_attr.config or
perf_event_attr.config1 for the IDXD DSA pmu. (See also
ABI/testing/sysfs-bus-event_source-devices-format).
Each attribute in this group defines a bit range in
perf_event_attr.config or perf_event_attr.config1.
All supported attributes are listed below (See the
IDXD DSA Spec for possible attribute values)::
event_category = "config:0-3" - event category
event = "config:4-31" - event ID
filter_wq = "config1:0-31" - workqueue filter
filter_tc = "config1:32-39" - traffic class filter
filter_pgsz = "config1:40-43" - page size filter
filter_sz = "config1:44-51" - transfer size filter
filter_eng = "config1:52-59" - engine filter
What: /sys/bus/event_source/devices/dsa*/cpumask
Date: April 2021
KernelVersion: 5.13
Contact: Tom Zanussi <tom.zanussi@linux.intel.com>
Description: Read-only. This file always returns the cpu to which the
IDXD DSA pmu is bound for access to all dsa pmu
performance monitoring events.
...@@ -1747,6 +1747,17 @@ ...@@ -1747,6 +1747,17 @@
In such case C2/C3 won't be used again. In such case C2/C3 won't be used again.
idle=nomwait: Disable mwait for CPU C-states idle=nomwait: Disable mwait for CPU C-states
idxd.sva= [HW]
Format: <bool>
Allow force disabling of Shared Virtual Memory (SVA)
support for the idxd driver. By default it is set to
true (1).
idxd.tc_override= [HW]
Format: <bool>
Allow override of default traffic class configuration
for the device. By default it is set to false (0).
ieee754= [MIPS] Select IEEE Std 754 conformance mode ieee754= [MIPS] Select IEEE Std 754 conformance mode
Format: { strict | legacy | 2008 | relaxed } Format: { strict | legacy | 2008 | relaxed }
Default: strict Default: strict
......
...@@ -104,18 +104,47 @@ The MSR must be configured on each logical CPU before any application ...@@ -104,18 +104,47 @@ The MSR must be configured on each logical CPU before any application
thread can interact with a device. Threads that belong to the same thread can interact with a device. Threads that belong to the same
process share the same page tables, thus the same MSR value. process share the same page tables, thus the same MSR value.
PASID is cleared when a process is created. The PASID allocation and MSR PASID Life Cycle Management
programming may occur long after a process and its threads have been created. ===========================
One thread must call iommu_sva_bind_device() to allocate the PASID for the
process. If a thread uses ENQCMD without the MSR first being populated, a #GP PASID is initialized as INVALID_IOASID (-1) when a process is created.
will be raised. The kernel will update the PASID MSR with the PASID for all
threads in the process. A single process PASID can be used simultaneously Only processes that access SVA-capable devices need to have a PASID
with multiple devices since they all share the same address space. allocated. This allocation happens when a process opens/binds an SVA-capable
device but finds no PASID for this process. Subsequent binds of the same, or
One thread can call iommu_sva_unbind_device() to free the allocated PASID. other devices will share the same PASID.
The kernel will clear the PASID MSR for all threads belonging to the process.
Although the PASID is allocated to the process by opening a device,
New threads inherit the MSR value from the parent. it is not active in any of the threads of that process. It's loaded to the
IA32_PASID MSR lazily when a thread tries to submit a work descriptor
to a device using the ENQCMD.
That first access will trigger a #GP fault because the IA32_PASID MSR
has not been initialized with the PASID value assigned to the process
when the device was opened. The Linux #GP handler notes that a PASID has
been allocated for the process, and so initializes the IA32_PASID MSR
and returns so that the ENQCMD instruction is re-executed.
On fork(2) or exec(2) the PASID is removed from the process as it no
longer has the same address space that it had when the device was opened.
On clone(2) the new task shares the same address space, so will be
able to use the PASID allocated to the process. The IA32_PASID is not
preemptively initialized as the PASID value might not be allocated yet or
the kernel does not know whether this thread is going to access the device
and the cleared IA32_PASID MSR reduces context switch overhead by xstate
init optimization. Since #GP faults have to be handled on any threads that
were created before the PASID was assigned to the mm of the process, newly
created threads might as well be treated in a consistent way.
Due to complexity of freeing the PASID and clearing all IA32_PASID MSRs in
all threads in unbind, free the PASID lazily only on mm exit.
If a process does a close(2) of the device file descriptor and munmap(2)
of the device MMIO portal, then the driver will unbind the device. The
PASID is still marked VALID in the PASID_MSR for any threads in the
process that accessed the device. But this is harmless as without the
MMIO portal they cannot submit new work to the device.
Relationships Relationships
============= =============
......
...@@ -8949,7 +8949,8 @@ S: Supported ...@@ -8949,7 +8949,8 @@ S: Supported
Q: https://patchwork.kernel.org/project/linux-dmaengine/list/ Q: https://patchwork.kernel.org/project/linux-dmaengine/list/
F: drivers/dma/ioat* F: drivers/dma/ioat*
INTEL IADX DRIVER INTEL IDXD DRIVER
M: Fenghua Yu <fenghua.yu@intel.com>
M: Dave Jiang <dave.jiang@intel.com> M: Dave Jiang <dave.jiang@intel.com>
L: dmaengine@vger.kernel.org L: dmaengine@vger.kernel.org
S: Supported S: Supported
......
...@@ -6357,7 +6357,11 @@ CONFIG_DMA_VIRTUAL_CHANNELS=y ...@@ -6357,7 +6357,11 @@ CONFIG_DMA_VIRTUAL_CHANNELS=y
CONFIG_DMA_ACPI=y CONFIG_DMA_ACPI=y
# CONFIG_ALTERA_MSGDMA is not set # CONFIG_ALTERA_MSGDMA is not set
CONFIG_INTEL_IDMA64=m CONFIG_INTEL_IDMA64=m
CONFIG_INTEL_IDXD_BUS=m
CONFIG_INTEL_IDXD=m CONFIG_INTEL_IDXD=m
# CONFIG_INTEL_IDXD_COMPAT is not set
CONFIG_INTEL_IDXD_SVM=y
CONFIG_INTEL_IDXD_PERFMON=y
CONFIG_INTEL_IOATDMA=m CONFIG_INTEL_IOATDMA=m
# CONFIG_PLX_DMA is not set # CONFIG_PLX_DMA is not set
# CONFIG_QCOM_HIDMA_MGMT is not set # CONFIG_QCOM_HIDMA_MGMT is not set
...@@ -6606,11 +6610,12 @@ CONFIG_IOMMU_SUPPORT=y ...@@ -6606,11 +6610,12 @@ CONFIG_IOMMU_SUPPORT=y
# CONFIG_IOMMU_DEBUGFS is not set # CONFIG_IOMMU_DEBUGFS is not set
CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y
CONFIG_IOMMU_DMA=y CONFIG_IOMMU_DMA=y
CONFIG_IOMMU_SVA=y
CONFIG_AMD_IOMMU=y CONFIG_AMD_IOMMU=y
CONFIG_AMD_IOMMU_V2=m CONFIG_AMD_IOMMU_V2=m
CONFIG_DMAR_TABLE=y CONFIG_DMAR_TABLE=y
CONFIG_INTEL_IOMMU=y CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_SVM is not set CONFIG_INTEL_IOMMU_SVM=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set # CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_INTEL_IOMMU_FLOPPY_WA=y CONFIG_INTEL_IOMMU_FLOPPY_WA=y
# CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set # CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set
......
...@@ -75,8 +75,11 @@ ...@@ -75,8 +75,11 @@
# define DISABLE_UNRET (1 << (X86_FEATURE_UNRET & 31)) # define DISABLE_UNRET (1 << (X86_FEATURE_UNRET & 31))
#endif #endif
/* Force disable because it's broken beyond repair */ #ifdef CONFIG_INTEL_IOMMU_SVM
#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31)) # define DISABLE_ENQCMD 0
#else
# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
#endif
#ifdef CONFIG_X86_SGX #ifdef CONFIG_X86_SGX
# define DISABLE_SGX 0 # define DISABLE_SGX 0
......
...@@ -81,8 +81,6 @@ extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name); ...@@ -81,8 +81,6 @@ extern int cpu_has_xfeatures(u64 xfeatures_mask, const char **feature_name);
*/ */
#define PASID_DISABLED 0 #define PASID_DISABLED 0
static inline void update_pasid(void) { }
/* Trap handling */ /* Trap handling */
extern int fpu__exception_code(struct fpu *fpu, int trap_nr); extern int fpu__exception_code(struct fpu *fpu, int trap_nr);
extern void fpu_sync_fpstate(struct fpu *fpu); extern void fpu_sync_fpstate(struct fpu *fpu);
......
...@@ -231,10 +231,10 @@ static inline void serialize(void) ...@@ -231,10 +231,10 @@ static inline void serialize(void)
} }
/* The dst parameter must be 64-bytes aligned */ /* The dst parameter must be 64-bytes aligned */
static inline void movdir64b(void *dst, const void *src) static inline void movdir64b(void __iomem *dst, const void *src)
{ {
const struct { char _[64]; } *__src = src; const struct { char _[64]; } *__src = src;
struct { char _[64]; } *__dst = dst; struct { char _[64]; } __iomem *__dst = dst;
/* /*
* MOVDIR64B %(rdx), rax. * MOVDIR64B %(rdx), rax.
......
...@@ -502,6 +502,13 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags) ...@@ -502,6 +502,13 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags)
fpu_inherit_perms(dst_fpu); fpu_inherit_perms(dst_fpu);
fpregs_unlock(); fpregs_unlock();
/*
* Children never inherit PASID state.
* Force it to have its init value:
*/
if (use_xsave())
dst_fpu->fpstate->regs.xsave.header.xfeatures &= ~XFEATURE_MASK_PASID;
trace_x86_fpu_copy_src(src_fpu); trace_x86_fpu_copy_src(src_fpu);
trace_x86_fpu_copy_dst(dst_fpu); trace_x86_fpu_copy_dst(dst_fpu);
......
...@@ -39,6 +39,7 @@ ...@@ -39,6 +39,7 @@
#include <linux/io.h> #include <linux/io.h>
#include <linux/hardirq.h> #include <linux/hardirq.h>
#include <linux/atomic.h> #include <linux/atomic.h>
#include <linux/ioasid.h>
#include <asm/stacktrace.h> #include <asm/stacktrace.h>
#include <asm/processor.h> #include <asm/processor.h>
...@@ -562,6 +563,57 @@ static bool fixup_iopl_exception(struct pt_regs *regs) ...@@ -562,6 +563,57 @@ static bool fixup_iopl_exception(struct pt_regs *regs)
return true; return true;
} }
/*
* The unprivileged ENQCMD instruction generates #GPs if the
* IA32_PASID MSR has not been populated. If possible, populate
* the MSR from a PASID previously allocated to the mm.
*/
static bool try_fixup_enqcmd_gp(void)
{
#ifdef CONFIG_IOMMU_SVA
u32 pasid;
/*
* MSR_IA32_PASID is managed using XSAVE. Directly
* writing to the MSR is only possible when fpregs
* are valid and the fpstate is not. This is
* guaranteed when handling a userspace exception
* in *before* interrupts are re-enabled.
*/
lockdep_assert_irqs_disabled();
/*
* Hardware without ENQCMD will not generate
* #GPs that can be fixed up here.
*/
if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
return false;
pasid = current->mm->pasid;
/*
* If the mm has not been allocated a
* PASID, the #GP can not be fixed up.
*/
if (!pasid_valid(pasid))
return false;
/*
* Did this thread already have its PASID activated?
* If so, the #GP must be from something else.
*/
if (current->pasid_activated)
return false;
wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
current->pasid_activated = 1;
return true;
#else
return false;
#endif
}
DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
{ {
char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR; char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
...@@ -570,6 +622,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection) ...@@ -570,6 +622,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
unsigned long gp_addr; unsigned long gp_addr;
int ret; int ret;
if (user_mode(regs) && try_fixup_enqcmd_gp())
return;
cond_local_irq_enable(regs); cond_local_irq_enable(regs);
if (static_cpu_has(X86_FEATURE_UMIP)) { if (static_cpu_has(X86_FEATURE_UMIP)) {
......
...@@ -283,10 +283,15 @@ config INTEL_IDMA64 ...@@ -283,10 +283,15 @@ config INTEL_IDMA64
Enable DMA support for Intel Low Power Subsystem such as found on Enable DMA support for Intel Low Power Subsystem such as found on
Intel Skylake PCH. Intel Skylake PCH.
config INTEL_IDXD_BUS
tristate
default INTEL_IDXD
config INTEL_IDXD config INTEL_IDXD
tristate "Intel Data Accelerators support" tristate "Intel Data Accelerators support"
depends on PCI && X86_64 && !UML depends on PCI && X86_64 && !UML
depends on PCI_MSI depends on PCI_MSI
depends on PCI_PASID
depends on SBITMAP depends on SBITMAP
select DMA_ENGINE select DMA_ENGINE
help help
...@@ -297,6 +302,45 @@ config INTEL_IDXD ...@@ -297,6 +302,45 @@ config INTEL_IDXD
If unsure, say N. If unsure, say N.
config INTEL_IDXD_COMPAT
bool "Legacy behavior for idxd driver"
depends on PCI && X86_64
select INTEL_IDXD_BUS
help
Compatible driver to support old /sys/bus/dsa/drivers/dsa behavior.
The old behavior performed driver bind/unbind for device and wq
devices all under the dsa driver. The compat driver will emulate
the legacy behavior in order to allow existing support apps (i.e.
accel-config) to continue function. It is expected that accel-config
v3.2 and earlier will need the compat mode. A distro with later
accel-config version can disable this compat config.
Say Y if you have old applications that require such behavior.
If unsure, say N.
# Config symbol that collects all the dependencies that's necessary to
# support shared virtual memory for the devices supported by idxd.
config INTEL_IDXD_SVM
bool "Accelerator Shared Virtual Memory Support"
depends on INTEL_IDXD
depends on INTEL_IOMMU_SVM
depends on PCI_PRI
depends on PCI_PASID
depends on PCI_IOV
config INTEL_IDXD_PERFMON
bool "Intel Data Accelerators performance monitor support"
depends on INTEL_IDXD
help
Enable performance monitor (pmu) support for the Intel(R)
data accelerators present in Intel Xeon CPU. With this
enabled, perf can be used to monitor the DSA (Intel Data
Streaming Accelerator) events described in the Intel DSA
spec.
If unsure, say N.
config INTEL_IOATDMA config INTEL_IOATDMA
tristate "Intel I/OAT DMA support" tristate "Intel I/OAT DMA support"
depends on PCI && X86_64 && !UML depends on PCI && X86_64 && !UML
......
...@@ -42,7 +42,7 @@ obj-$(CONFIG_IMX_DMA) += imx-dma.o ...@@ -42,7 +42,7 @@ obj-$(CONFIG_IMX_DMA) += imx-dma.o
obj-$(CONFIG_IMX_SDMA) += imx-sdma.o obj-$(CONFIG_IMX_SDMA) += imx-sdma.o
obj-$(CONFIG_INTEL_IDMA64) += idma64.o obj-$(CONFIG_INTEL_IDMA64) += idma64.o
obj-$(CONFIG_INTEL_IOATDMA) += ioat/ obj-$(CONFIG_INTEL_IOATDMA) += ioat/
obj-$(CONFIG_INTEL_IDXD) += idxd/ obj-y += idxd/
obj-$(CONFIG_INTEL_IOP_ADMA) += iop-adma.o obj-$(CONFIG_INTEL_IOP_ADMA) += iop-adma.o
obj-$(CONFIG_K3_DMA) += k3dma.o obj-$(CONFIG_K3_DMA) += k3dma.o
obj-$(CONFIG_LPC18XX_DMAMUX) += lpc18xx-dmamux.o obj-$(CONFIG_LPC18XX_DMAMUX) += lpc18xx-dmamux.o
......
ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=IDXD
obj-$(CONFIG_INTEL_IDXD) += idxd.o obj-$(CONFIG_INTEL_IDXD) += idxd.o
idxd-y := init.o irq.o device.o sysfs.o submit.o dma.o cdev.o idxd-y := init.o irq.o device.o sysfs.o submit.o dma.o cdev.o
idxd-$(CONFIG_INTEL_IDXD_PERFMON) += perfmon.o
obj-$(CONFIG_INTEL_IDXD_BUS) += idxd_bus.o
idxd_bus-y := bus.o
obj-$(CONFIG_INTEL_IDXD_COMPAT) += idxd_compat.o
idxd_compat-y := compat.o
// SPDX-License-Identifier: GPL-2.0
/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/device.h>
#include "idxd.h"
int __idxd_driver_register(struct idxd_device_driver *idxd_drv, struct module *owner,
const char *mod_name)
{
struct device_driver *drv = &idxd_drv->drv;
if (!idxd_drv->type) {
pr_debug("driver type not set (%ps)\n", __builtin_return_address(0));
return -EINVAL;
}
drv->name = idxd_drv->name;
drv->bus = &dsa_bus_type;
drv->owner = owner;
drv->mod_name = mod_name;
return driver_register(drv);
}
EXPORT_SYMBOL_GPL(__idxd_driver_register);
void idxd_driver_unregister(struct idxd_device_driver *idxd_drv)
{
driver_unregister(&idxd_drv->drv);
}
EXPORT_SYMBOL_GPL(idxd_driver_unregister);
static int idxd_config_bus_match(struct device *dev,
struct device_driver *drv)
{
struct idxd_device_driver *idxd_drv =
container_of(drv, struct idxd_device_driver, drv);
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
int i = 0;
while (idxd_drv->type[i] != IDXD_DEV_NONE) {
if (idxd_dev->type == idxd_drv->type[i])
return 1;
i++;
}
return 0;
}
static int idxd_config_bus_probe(struct device *dev)
{
struct idxd_device_driver *idxd_drv =
container_of(dev->driver, struct idxd_device_driver, drv);
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return idxd_drv->probe(idxd_dev);
}
static int idxd_config_bus_remove(struct device *dev)
{
struct idxd_device_driver *idxd_drv =
container_of(dev->driver, struct idxd_device_driver, drv);
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
idxd_drv->remove(idxd_dev);
return 0;
}
struct bus_type dsa_bus_type = {
.name = "dsa",
.match = idxd_config_bus_match,
.probe = idxd_config_bus_probe,
.remove = idxd_config_bus_remove,
};
EXPORT_SYMBOL_GPL(dsa_bus_type);
static int __init dsa_bus_init(void)
{
return bus_register(&dsa_bus_type);
}
module_init(dsa_bus_init);
static void __exit dsa_bus_exit(void)
{
bus_unregister(&dsa_bus_type);
}
module_exit(dsa_bus_exit);
MODULE_DESCRIPTION("IDXD driver dsa_bus_type driver");
MODULE_LICENSE("GPL v2");
...@@ -11,6 +11,7 @@ ...@@ -11,6 +11,7 @@
#include <linux/cdev.h> #include <linux/cdev.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/poll.h> #include <linux/poll.h>
#include <linux/iommu.h>
#include <uapi/linux/idxd.h> #include <uapi/linux/idxd.h>
#include "registers.h" #include "registers.h"
#include "idxd.h" #include "idxd.h"
...@@ -27,21 +28,24 @@ struct idxd_cdev_context { ...@@ -27,21 +28,24 @@ struct idxd_cdev_context {
*/ */
static struct idxd_cdev_context ictx[IDXD_TYPE_MAX] = { static struct idxd_cdev_context ictx[IDXD_TYPE_MAX] = {
{ .name = "dsa" }, { .name = "dsa" },
{ .name = "iax" }
}; };
struct idxd_user_context { struct idxd_user_context {
struct idxd_wq *wq; struct idxd_wq *wq;
struct task_struct *task; struct task_struct *task;
unsigned int pasid;
unsigned int flags; unsigned int flags;
struct iommu_sva *sva;
}; };
static void idxd_cdev_dev_release(struct device *dev) static void idxd_cdev_dev_release(struct device *dev)
{ {
struct idxd_cdev *idxd_cdev = container_of(dev, struct idxd_cdev, dev); struct idxd_cdev *idxd_cdev = dev_to_cdev(dev);
struct idxd_cdev_context *cdev_ctx; struct idxd_cdev_context *cdev_ctx;
struct idxd_wq *wq = idxd_cdev->wq; struct idxd_wq *wq = idxd_cdev->wq;
cdev_ctx = &ictx[wq->idxd->type]; cdev_ctx = &ictx[wq->idxd->data->type];
ida_simple_remove(&cdev_ctx->minor_ida, idxd_cdev->minor); ida_simple_remove(&cdev_ctx->minor_ida, idxd_cdev->minor);
kfree(idxd_cdev); kfree(idxd_cdev);
} }
...@@ -72,6 +76,8 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp) ...@@ -72,6 +76,8 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
struct idxd_wq *wq; struct idxd_wq *wq;
struct device *dev; struct device *dev;
int rc = 0; int rc = 0;
struct iommu_sva *sva;
unsigned int pasid;
wq = inode_wq(inode); wq = inode_wq(inode);
idxd = wq->idxd; idxd = wq->idxd;
...@@ -92,6 +98,35 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp) ...@@ -92,6 +98,35 @@ static int idxd_cdev_open(struct inode *inode, struct file *filp)
ctx->wq = wq; ctx->wq = wq;
filp->private_data = ctx; filp->private_data = ctx;
if (device_user_pasid_enabled(idxd)) {
sva = iommu_sva_bind_device(dev, current->mm, NULL);
if (IS_ERR(sva)) {
rc = PTR_ERR(sva);
dev_err(dev, "pasid allocation failed: %d\n", rc);
goto failed;
}
pasid = iommu_sva_get_pasid(sva);
if (pasid == IOMMU_PASID_INVALID) {
iommu_sva_unbind_device(sva);
rc = -EINVAL;
goto failed;
}
ctx->sva = sva;
ctx->pasid = pasid;
if (wq_dedicated(wq)) {
rc = idxd_wq_set_pasid(wq, pasid);
if (rc < 0) {
iommu_sva_unbind_device(sva);
dev_err(dev, "wq set pasid failed: %d\n", rc);
goto failed;
}
}
}
idxd_wq_get(wq); idxd_wq_get(wq);
mutex_unlock(&wq->wq_lock); mutex_unlock(&wq->wq_lock);
return 0; return 0;
...@@ -108,13 +143,27 @@ static int idxd_cdev_release(struct inode *node, struct file *filep) ...@@ -108,13 +143,27 @@ static int idxd_cdev_release(struct inode *node, struct file *filep)
struct idxd_wq *wq = ctx->wq; struct idxd_wq *wq = ctx->wq;
struct idxd_device *idxd = wq->idxd; struct idxd_device *idxd = wq->idxd;
struct device *dev = &idxd->pdev->dev; struct device *dev = &idxd->pdev->dev;
int rc;
dev_dbg(dev, "%s called\n", __func__); dev_dbg(dev, "%s called\n", __func__);
filep->private_data = NULL; filep->private_data = NULL;
/* Wait for in-flight operations to complete. */ /* Wait for in-flight operations to complete. */
idxd_wq_drain(wq); if (wq_shared(wq)) {
idxd_device_drain_pasid(idxd, ctx->pasid);
} else {
if (device_user_pasid_enabled(idxd)) {
/* The wq disable in the disable pasid function will drain the wq */
rc = idxd_wq_disable_pasid(wq);
if (rc < 0)
dev_err(dev, "wq disable pasid failed.\n");
} else {
idxd_wq_drain(wq);
}
}
if (ctx->sva)
iommu_sva_unbind_device(ctx->sva);
kfree(ctx); kfree(ctx);
mutex_lock(&wq->wq_lock); mutex_lock(&wq->wq_lock);
idxd_wq_put(wq); idxd_wq_put(wq);
...@@ -169,14 +218,13 @@ static __poll_t idxd_cdev_poll(struct file *filp, ...@@ -169,14 +218,13 @@ static __poll_t idxd_cdev_poll(struct file *filp,
struct idxd_user_context *ctx = filp->private_data; struct idxd_user_context *ctx = filp->private_data;
struct idxd_wq *wq = ctx->wq; struct idxd_wq *wq = ctx->wq;
struct idxd_device *idxd = wq->idxd; struct idxd_device *idxd = wq->idxd;
unsigned long flags;
__poll_t out = 0; __poll_t out = 0;
poll_wait(filp, &wq->err_queue, wait); poll_wait(filp, &wq->err_queue, wait);
spin_lock_irqsave(&idxd->dev_lock, flags); spin_lock(&idxd->dev_lock);
if (idxd->sw_err.valid) if (idxd->sw_err.valid)
out = EPOLLIN | EPOLLRDNORM; out = EPOLLIN | EPOLLRDNORM;
spin_unlock_irqrestore(&idxd->dev_lock, flags); spin_unlock(&idxd->dev_lock);
return out; return out;
} }
...@@ -191,7 +239,7 @@ static const struct file_operations idxd_cdev_fops = { ...@@ -191,7 +239,7 @@ static const struct file_operations idxd_cdev_fops = {
int idxd_cdev_get_major(struct idxd_device *idxd) int idxd_cdev_get_major(struct idxd_device *idxd)
{ {
return MAJOR(ictx[idxd->type].devt); return MAJOR(ictx[idxd->data->type].devt);
} }
int idxd_wq_add_cdev(struct idxd_wq *wq) int idxd_wq_add_cdev(struct idxd_wq *wq)
...@@ -207,10 +255,11 @@ int idxd_wq_add_cdev(struct idxd_wq *wq) ...@@ -207,10 +255,11 @@ int idxd_wq_add_cdev(struct idxd_wq *wq)
if (!idxd_cdev) if (!idxd_cdev)
return -ENOMEM; return -ENOMEM;
idxd_cdev->idxd_dev.type = IDXD_DEV_CDEV;
idxd_cdev->wq = wq; idxd_cdev->wq = wq;
cdev = &idxd_cdev->cdev; cdev = &idxd_cdev->cdev;
dev = &idxd_cdev->dev; dev = cdev_dev(idxd_cdev);
cdev_ctx = &ictx[wq->idxd->type]; cdev_ctx = &ictx[wq->idxd->data->type];
minor = ida_simple_get(&cdev_ctx->minor_ida, 0, MINORMASK, GFP_KERNEL); minor = ida_simple_get(&cdev_ctx->minor_ida, 0, MINORMASK, GFP_KERNEL);
if (minor < 0) { if (minor < 0) {
kfree(idxd_cdev); kfree(idxd_cdev);
...@@ -219,13 +268,12 @@ int idxd_wq_add_cdev(struct idxd_wq *wq) ...@@ -219,13 +268,12 @@ int idxd_wq_add_cdev(struct idxd_wq *wq)
idxd_cdev->minor = minor; idxd_cdev->minor = minor;
device_initialize(dev); device_initialize(dev);
dev->parent = &wq->conf_dev; dev->parent = wq_confdev(wq);
dev->bus = idxd_get_bus_type(idxd); dev->bus = &dsa_bus_type;
dev->type = &idxd_cdev_device_type; dev->type = &idxd_cdev_device_type;
dev->devt = MKDEV(MAJOR(cdev_ctx->devt), minor); dev->devt = MKDEV(MAJOR(cdev_ctx->devt), minor);
rc = dev_set_name(dev, "%s/wq%u.%u", idxd_get_dev_name(idxd), rc = dev_set_name(dev, "%s/wq%u.%u", idxd->data->name_prefix, idxd->id, wq->id);
idxd->id, wq->id);
if (rc < 0) if (rc < 0)
goto err; goto err;
...@@ -248,15 +296,88 @@ int idxd_wq_add_cdev(struct idxd_wq *wq) ...@@ -248,15 +296,88 @@ int idxd_wq_add_cdev(struct idxd_wq *wq)
void idxd_wq_del_cdev(struct idxd_wq *wq) void idxd_wq_del_cdev(struct idxd_wq *wq)
{ {
struct idxd_cdev *idxd_cdev; struct idxd_cdev *idxd_cdev;
struct idxd_cdev_context *cdev_ctx;
cdev_ctx = &ictx[wq->idxd->type];
idxd_cdev = wq->idxd_cdev; idxd_cdev = wq->idxd_cdev;
wq->idxd_cdev = NULL; wq->idxd_cdev = NULL;
cdev_device_del(&idxd_cdev->cdev, &idxd_cdev->dev); cdev_device_del(&idxd_cdev->cdev, cdev_dev(idxd_cdev));
put_device(&idxd_cdev->dev); put_device(cdev_dev(idxd_cdev));
} }
static int idxd_user_drv_probe(struct idxd_dev *idxd_dev)
{
struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
struct idxd_device *idxd = wq->idxd;
int rc;
if (idxd->state != IDXD_DEV_ENABLED)
return -ENXIO;
/*
* User type WQ is enabled only when SVA is enabled for two reasons:
* - If no IOMMU or IOMMU Passthrough without SVA, userspace
* can directly access physical address through the WQ.
* - The IDXD cdev driver does not provide any ways to pin
* user pages and translate the address from user VA to IOVA or
* PA without IOMMU SVA. Therefore the application has no way
* to instruct the device to perform DMA function. This makes
* the cdev not usable for normal application usage.
*/
if (!device_user_pasid_enabled(idxd)) {
idxd->cmd_status = IDXD_SCMD_WQ_USER_NO_IOMMU;
dev_dbg(&idxd->pdev->dev,
"User type WQ cannot be enabled without SVA.\n");
return -EOPNOTSUPP;
}
mutex_lock(&wq->wq_lock);
wq->type = IDXD_WQT_USER;
rc = drv_enable_wq(wq);
if (rc < 0)
goto err;
rc = idxd_wq_add_cdev(wq);
if (rc < 0) {
idxd->cmd_status = IDXD_SCMD_CDEV_ERR;
goto err_cdev;
}
idxd->cmd_status = 0;
mutex_unlock(&wq->wq_lock);
return 0;
err_cdev:
drv_disable_wq(wq);
err:
wq->type = IDXD_WQT_NONE;
mutex_unlock(&wq->wq_lock);
return rc;
}
static void idxd_user_drv_remove(struct idxd_dev *idxd_dev)
{
struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
mutex_lock(&wq->wq_lock);
idxd_wq_del_cdev(wq);
drv_disable_wq(wq);
wq->type = IDXD_WQT_NONE;
mutex_unlock(&wq->wq_lock);
}
static enum idxd_dev_type dev_types[] = {
IDXD_DEV_WQ,
IDXD_DEV_NONE,
};
struct idxd_device_driver idxd_user_drv = {
.probe = idxd_user_drv_probe,
.remove = idxd_user_drv_remove,
.name = "user",
.type = dev_types,
};
EXPORT_SYMBOL_GPL(idxd_user_drv);
int idxd_cdev_register(void) int idxd_cdev_register(void)
{ {
int rc, i; int rc, i;
......
// SPDX-License-Identifier: GPL-2.0
/* Copyright(c) 2021 Intel Corporation. All rights rsvd. */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/device/bus.h>
#include "idxd.h"
extern int device_driver_attach(struct device_driver *drv, struct device *dev);
extern void device_driver_detach(struct device *dev);
#define DRIVER_ATTR_IGNORE_LOCKDEP(_name, _mode, _show, _store) \
struct driver_attribute driver_attr_##_name = \
__ATTR_IGNORE_LOCKDEP(_name, _mode, _show, _store)
static ssize_t unbind_store(struct device_driver *drv, const char *buf, size_t count)
{
struct bus_type *bus = drv->bus;
struct device *dev;
int rc = -ENODEV;
dev = bus_find_device_by_name(bus, NULL, buf);
if (dev && dev->driver) {
device_driver_detach(dev);
rc = count;
}
return rc;
}
static DRIVER_ATTR_IGNORE_LOCKDEP(unbind, 0200, NULL, unbind_store);
static ssize_t bind_store(struct device_driver *drv, const char *buf, size_t count)
{
struct bus_type *bus = drv->bus;
struct device *dev;
struct device_driver *alt_drv = NULL;
int rc = -ENODEV;
struct idxd_dev *idxd_dev;
dev = bus_find_device_by_name(bus, NULL, buf);
if (!dev || dev->driver || drv != &dsa_drv.drv)
return -ENODEV;
idxd_dev = confdev_to_idxd_dev(dev);
if (is_idxd_dev(idxd_dev)) {
alt_drv = driver_find("idxd", bus);
} else if (is_idxd_wq_dev(idxd_dev)) {
struct idxd_wq *wq = confdev_to_wq(dev);
if (is_idxd_wq_kernel(wq))
alt_drv = driver_find("dmaengine", bus);
else if (is_idxd_wq_user(wq))
alt_drv = driver_find("user", bus);
}
if (!alt_drv)
return -ENODEV;
rc = device_driver_attach(alt_drv, dev);
if (rc < 0)
return rc;
return count;
}
static DRIVER_ATTR_IGNORE_LOCKDEP(bind, 0200, NULL, bind_store);
static struct attribute *dsa_drv_compat_attrs[] = {
&driver_attr_bind.attr,
&driver_attr_unbind.attr,
NULL,
};
static const struct attribute_group dsa_drv_compat_attr_group = {
.attrs = dsa_drv_compat_attrs,
};
static const struct attribute_group *dsa_drv_compat_groups[] = {
&dsa_drv_compat_attr_group,
NULL,
};
static int idxd_dsa_drv_probe(struct idxd_dev *idxd_dev)
{
return -ENODEV;
}
static void idxd_dsa_drv_remove(struct idxd_dev *idxd_dev)
{
}
static enum idxd_dev_type dev_types[] = {
IDXD_DEV_NONE,
};
struct idxd_device_driver dsa_drv = {
.name = "dsa",
.probe = idxd_dsa_drv_probe,
.remove = idxd_dsa_drv_remove,
.type = dev_types,
.drv = {
.suppress_bind_attrs = true,
.groups = dsa_drv_compat_groups,
},
};
module_idxd_driver(dsa_drv);
MODULE_IMPORT_NS(IDXD);
This diff has been collapsed.
...@@ -21,20 +21,27 @@ static inline struct idxd_wq *to_idxd_wq(struct dma_chan *c) ...@@ -21,20 +21,27 @@ static inline struct idxd_wq *to_idxd_wq(struct dma_chan *c)
} }
void idxd_dma_complete_txd(struct idxd_desc *desc, void idxd_dma_complete_txd(struct idxd_desc *desc,
enum idxd_complete_type comp_type) enum idxd_complete_type comp_type,
bool free_desc)
{ {
struct idxd_device *idxd = desc->wq->idxd;
struct dma_async_tx_descriptor *tx; struct dma_async_tx_descriptor *tx;
struct dmaengine_result res; struct dmaengine_result res;
int complete = 1; int complete = 1;
if (desc->completion->status == DSA_COMP_SUCCESS) if (desc->completion->status == DSA_COMP_SUCCESS) {
res.result = DMA_TRANS_NOERROR; res.result = DMA_TRANS_NOERROR;
else if (desc->completion->status) } else if (desc->completion->status) {
if (idxd->request_int_handles && comp_type != IDXD_COMPLETE_ABORT &&
desc->completion->status == DSA_COMP_INT_HANDLE_INVAL &&
idxd_queue_int_handle_resubmit(desc))
return;
res.result = DMA_TRANS_WRITE_FAILED; res.result = DMA_TRANS_WRITE_FAILED;
else if (comp_type == IDXD_COMPLETE_ABORT) } else if (comp_type == IDXD_COMPLETE_ABORT) {
res.result = DMA_TRANS_ABORTED; res.result = DMA_TRANS_ABORTED;
else } else {
complete = 0; complete = 0;
}
tx = &desc->txd; tx = &desc->txd;
if (complete && tx->cookie) { if (complete && tx->cookie) {
...@@ -44,6 +51,9 @@ void idxd_dma_complete_txd(struct idxd_desc *desc, ...@@ -44,6 +51,9 @@ void idxd_dma_complete_txd(struct idxd_desc *desc,
tx->callback = NULL; tx->callback = NULL;
tx->callback_result = NULL; tx->callback_result = NULL;
} }
if (free_desc)
idxd_free_desc(desc->wq, desc);
} }
static void op_flag_setup(unsigned long flags, u32 *desc_flags) static void op_flag_setup(unsigned long flags, u32 *desc_flags)
...@@ -64,22 +74,17 @@ static inline void idxd_prep_desc_common(struct idxd_wq *wq, ...@@ -64,22 +74,17 @@ static inline void idxd_prep_desc_common(struct idxd_wq *wq,
u64 addr_f1, u64 addr_f2, u64 len, u64 addr_f1, u64 addr_f2, u64 len,
u64 compl, u32 flags) u64 compl, u32 flags)
{ {
struct idxd_device *idxd = wq->idxd;
hw->flags = flags; hw->flags = flags;
hw->opcode = opcode; hw->opcode = opcode;
hw->src_addr = addr_f1; hw->src_addr = addr_f1;
hw->dst_addr = addr_f2; hw->dst_addr = addr_f2;
hw->xfer_size = len; hw->xfer_size = len;
hw->priv = !!(wq->type == IDXD_WQT_KERNEL);
hw->completion_addr = compl;
/* /*
* Descriptor completion vectors are 1-8 for MSIX. We will round * For dedicated WQ, this field is ignored and HW will use the WQCFG.priv
* robin through the 8 vectors. * field instead. This field should be set to 1 for kernel descriptors.
*/ */
wq->vec_ptr = (wq->vec_ptr % idxd->num_wq_irqs) + 1; hw->priv = 1;
hw->int_handle = wq->vec_ptr; hw->completion_addr = compl;
} }
static struct dma_async_tx_descriptor * static struct dma_async_tx_descriptor *
...@@ -245,7 +250,7 @@ void idxd_unregister_dma_device(struct idxd_device *idxd) ...@@ -245,7 +250,7 @@ void idxd_unregister_dma_device(struct idxd_device *idxd)
dma_async_device_unregister(&idxd->idxd_dma->dma); dma_async_device_unregister(&idxd->idxd_dma->dma);
} }
int idxd_register_dma_channel(struct idxd_wq *wq) static int idxd_register_dma_channel(struct idxd_wq *wq)
{ {
struct idxd_device *idxd = wq->idxd; struct idxd_device *idxd = wq->idxd;
struct dma_device *dma = &idxd->idxd_dma->dma; struct dma_device *dma = &idxd->idxd_dma->dma;
...@@ -277,12 +282,12 @@ int idxd_register_dma_channel(struct idxd_wq *wq) ...@@ -277,12 +282,12 @@ int idxd_register_dma_channel(struct idxd_wq *wq)
wq->idxd_chan = idxd_chan; wq->idxd_chan = idxd_chan;
idxd_chan->wq = wq; idxd_chan->wq = wq;
get_device(&wq->conf_dev); get_device(wq_confdev(wq));
return 0; return 0;
} }
void idxd_unregister_dma_channel(struct idxd_wq *wq) static void idxd_unregister_dma_channel(struct idxd_wq *wq)
{ {
struct idxd_dma_chan *idxd_chan = wq->idxd_chan; struct idxd_dma_chan *idxd_chan = wq->idxd_chan;
struct dma_chan *chan = &idxd_chan->chan; struct dma_chan *chan = &idxd_chan->chan;
...@@ -292,5 +297,68 @@ void idxd_unregister_dma_channel(struct idxd_wq *wq) ...@@ -292,5 +297,68 @@ void idxd_unregister_dma_channel(struct idxd_wq *wq)
list_del(&chan->device_node); list_del(&chan->device_node);
kfree(wq->idxd_chan); kfree(wq->idxd_chan);
wq->idxd_chan = NULL; wq->idxd_chan = NULL;
put_device(&wq->conf_dev); put_device(wq_confdev(wq));
}
static int idxd_dmaengine_drv_probe(struct idxd_dev *idxd_dev)
{
struct device *dev = &idxd_dev->conf_dev;
struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
struct idxd_device *idxd = wq->idxd;
int rc;
if (idxd->state != IDXD_DEV_ENABLED)
return -ENXIO;
mutex_lock(&wq->wq_lock);
wq->type = IDXD_WQT_KERNEL;
rc = drv_enable_wq(wq);
if (rc < 0) {
dev_dbg(dev, "Enable wq %d failed: %d\n", wq->id, rc);
rc = -ENXIO;
goto err;
}
rc = idxd_register_dma_channel(wq);
if (rc < 0) {
idxd->cmd_status = IDXD_SCMD_DMA_CHAN_ERR;
dev_dbg(dev, "Failed to register dma channel\n");
goto err_dma;
}
idxd->cmd_status = 0;
mutex_unlock(&wq->wq_lock);
return 0;
err_dma:
drv_disable_wq(wq);
err:
wq->type = IDXD_WQT_NONE;
mutex_unlock(&wq->wq_lock);
return rc;
} }
static void idxd_dmaengine_drv_remove(struct idxd_dev *idxd_dev)
{
struct idxd_wq *wq = idxd_dev_to_wq(idxd_dev);
mutex_lock(&wq->wq_lock);
__idxd_wq_quiesce(wq);
idxd_unregister_dma_channel(wq);
drv_disable_wq(wq);
mutex_unlock(&wq->wq_lock);
}
static enum idxd_dev_type dev_types[] = {
IDXD_DEV_WQ,
IDXD_DEV_NONE,
};
struct idxd_device_driver idxd_dmaengine_drv = {
.probe = idxd_dmaengine_drv_probe,
.remove = idxd_dmaengine_drv_remove,
.name = "dmaengine",
.type = dev_types,
};
EXPORT_SYMBOL_GPL(idxd_dmaengine_drv);
...@@ -8,14 +8,37 @@ ...@@ -8,14 +8,37 @@
#include <linux/percpu-rwsem.h> #include <linux/percpu-rwsem.h>
#include <linux/wait.h> #include <linux/wait.h>
#include <linux/cdev.h> #include <linux/cdev.h>
#include <linux/idr.h>
#include <linux/pci.h>
#include <linux/ioasid.h>
#include <linux/bitmap.h>
#include <linux/perf_event.h>
#include <uapi/linux/idxd.h>
#include "registers.h" #include "registers.h"
#define IDXD_DRIVER_VERSION "1.00" #define IDXD_DRIVER_VERSION "1.00"
extern struct kmem_cache *idxd_desc_pool; extern struct kmem_cache *idxd_desc_pool;
extern bool tc_override;
struct idxd_device;
struct idxd_wq; struct idxd_wq;
struct idxd_dev;
enum idxd_dev_type {
IDXD_DEV_NONE = -1,
IDXD_DEV_DSA = 0,
IDXD_DEV_IAX,
IDXD_DEV_WQ,
IDXD_DEV_GROUP,
IDXD_DEV_ENGINE,
IDXD_DEV_CDEV,
IDXD_DEV_MAX_TYPE,
};
struct idxd_dev {
struct device conf_dev;
enum idxd_dev_type type;
};
#define IDXD_REG_TIMEOUT 50 #define IDXD_REG_TIMEOUT 50
#define IDXD_DRAIN_TIMEOUT 5000 #define IDXD_DRAIN_TIMEOUT 5000
...@@ -23,34 +46,83 @@ struct idxd_wq; ...@@ -23,34 +46,83 @@ struct idxd_wq;
enum idxd_type { enum idxd_type {
IDXD_TYPE_UNKNOWN = -1, IDXD_TYPE_UNKNOWN = -1,
IDXD_TYPE_DSA = 0, IDXD_TYPE_DSA = 0,
IDXD_TYPE_MAX IDXD_TYPE_IAX,
IDXD_TYPE_MAX,
}; };
#define IDXD_NAME_SIZE 128 #define IDXD_NAME_SIZE 128
#define IDXD_PMU_EVENT_MAX 64
#define IDXD_ENQCMDS_RETRIES 32
#define IDXD_ENQCMDS_MAX_RETRIES 64
struct idxd_device_driver { struct idxd_device_driver {
const char *name;
enum idxd_dev_type *type;
int (*probe)(struct idxd_dev *idxd_dev);
void (*remove)(struct idxd_dev *idxd_dev);
struct device_driver drv; struct device_driver drv;
}; };
extern struct idxd_device_driver dsa_drv;
extern struct idxd_device_driver idxd_drv;
extern struct idxd_device_driver idxd_dmaengine_drv;
extern struct idxd_device_driver idxd_user_drv;
#define INVALID_INT_HANDLE -1
struct idxd_irq_entry { struct idxd_irq_entry {
struct idxd_device *idxd;
int id; int id;
int vector;
struct llist_head pending_llist; struct llist_head pending_llist;
struct list_head work_list; struct list_head work_list;
/*
* Lock to protect access between irq thread process descriptor
* and irq thread processing error descriptor.
*/
spinlock_t list_lock;
int int_handle;
ioasid_t pasid;
}; };
struct idxd_group { struct idxd_group {
struct device conf_dev; struct idxd_dev idxd_dev;
struct idxd_device *idxd; struct idxd_device *idxd;
struct grpcfg grpcfg; struct grpcfg grpcfg;
int id; int id;
int num_engines; int num_engines;
int num_wqs; int num_wqs;
bool use_token_limit; bool use_rdbuf_limit;
u8 tokens_allowed; u8 rdbufs_allowed;
u8 tokens_reserved; u8 rdbufs_reserved;
int tc_a; int tc_a;
int tc_b; int tc_b;
int desc_progress_limit;
int batch_progress_limit;
};
struct idxd_pmu {
struct idxd_device *idxd;
struct perf_event *event_list[IDXD_PMU_EVENT_MAX];
int n_events;
DECLARE_BITMAP(used_mask, IDXD_PMU_EVENT_MAX);
struct pmu pmu;
char name[IDXD_NAME_SIZE];
int cpu;
int n_counters;
int counter_width;
int n_event_categories;
bool per_counter_caps_supported;
unsigned long supported_event_categories;
unsigned long supported_filters;
int n_filters;
struct hlist_node cpuhp_node;
}; };
#define IDXD_MAX_PRIORITY 0xf #define IDXD_MAX_PRIORITY 0xf
...@@ -62,6 +134,8 @@ enum idxd_wq_state { ...@@ -62,6 +134,8 @@ enum idxd_wq_state {
enum idxd_wq_flag { enum idxd_wq_flag {
WQ_FLAG_DEDICATED = 0, WQ_FLAG_DEDICATED = 0,
WQ_FLAG_BLOCK_ON_FAULT,
WQ_FLAG_ATS_DISABLE,
}; };
enum idxd_wq_type { enum idxd_wq_type {
...@@ -73,7 +147,7 @@ enum idxd_wq_type { ...@@ -73,7 +147,7 @@ enum idxd_wq_type {
struct idxd_cdev { struct idxd_cdev {
struct idxd_wq *wq; struct idxd_wq *wq;
struct cdev cdev; struct cdev cdev;
struct device dev; struct idxd_dev idxd_dev;
int minor; int minor;
}; };
...@@ -81,6 +155,10 @@ struct idxd_cdev { ...@@ -81,6 +155,10 @@ struct idxd_cdev {
#define WQ_NAME_SIZE 1024 #define WQ_NAME_SIZE 1024
#define WQ_TYPE_SIZE 10 #define WQ_TYPE_SIZE 10
#define WQ_DEFAULT_QUEUE_DEPTH 16
#define WQ_DEFAULT_MAX_XFER SZ_2M
#define WQ_DEFAULT_MAX_BATCH 32
enum idxd_op_type { enum idxd_op_type {
IDXD_OP_BLOCK = 0, IDXD_OP_BLOCK = 0,
IDXD_OP_NONBLOCK = 1, IDXD_OP_NONBLOCK = 1,
...@@ -89,6 +167,7 @@ enum idxd_op_type { ...@@ -89,6 +167,7 @@ enum idxd_op_type {
enum idxd_complete_type { enum idxd_complete_type {
IDXD_COMPLETE_NORMAL = 0, IDXD_COMPLETE_NORMAL = 0,
IDXD_COMPLETE_ABORT, IDXD_COMPLETE_ABORT,
IDXD_COMPLETE_DEV_FAIL,
}; };
struct idxd_dma_chan { struct idxd_dma_chan {
...@@ -97,12 +176,18 @@ struct idxd_dma_chan { ...@@ -97,12 +176,18 @@ struct idxd_dma_chan {
}; };
struct idxd_wq { struct idxd_wq {
void __iomem *dportal; void __iomem *portal;
struct device conf_dev; u32 portal_offset;
unsigned int enqcmds_retries;
struct percpu_ref wq_active;
struct completion wq_dead;
struct completion wq_resurrect;
struct idxd_dev idxd_dev;
struct idxd_cdev *idxd_cdev; struct idxd_cdev *idxd_cdev;
struct wait_queue_head err_queue; struct wait_queue_head err_queue;
struct idxd_device *idxd; struct idxd_device *idxd;
int id; int id;
struct idxd_irq_entry ie;
enum idxd_wq_type type; enum idxd_wq_type type;
struct idxd_group *group; struct idxd_group *group;
int client_count; int client_count;
...@@ -113,10 +198,14 @@ struct idxd_wq { ...@@ -113,10 +198,14 @@ struct idxd_wq {
enum idxd_wq_state state; enum idxd_wq_state state;
unsigned long flags; unsigned long flags;
union wqcfg *wqcfg; union wqcfg *wqcfg;
u32 vec_ptr; /* interrupt steering */ unsigned long *opcap_bmap;
struct dsa_hw_desc **hw_descs; struct dsa_hw_desc **hw_descs;
int num_descs; int num_descs;
struct dsa_completion_record *compls; union {
struct dsa_completion_record *compls;
struct iax_completion_record *iax_compls;
};
dma_addr_t compls_addr; dma_addr_t compls_addr;
int compls_size; int compls_size;
struct idxd_desc **descs; struct idxd_desc **descs;
...@@ -128,7 +217,7 @@ struct idxd_wq { ...@@ -128,7 +217,7 @@ struct idxd_wq {
}; };
struct idxd_engine { struct idxd_engine {
struct device conf_dev; struct idxd_dev idxd_dev;
int id; int id;
struct idxd_group *group; struct idxd_group *group;
struct idxd_device *idxd; struct idxd_device *idxd;
...@@ -142,18 +231,20 @@ struct idxd_hw { ...@@ -142,18 +231,20 @@ struct idxd_hw {
union group_cap_reg group_cap; union group_cap_reg group_cap;
union engine_cap_reg engine_cap; union engine_cap_reg engine_cap;
struct opcap opcap; struct opcap opcap;
u32 cmd_cap;
}; };
enum idxd_device_state { enum idxd_device_state {
IDXD_DEV_HALTED = -1, IDXD_DEV_HALTED = -1,
IDXD_DEV_DISABLED = 0, IDXD_DEV_DISABLED = 0,
IDXD_DEV_CONF_READY,
IDXD_DEV_ENABLED, IDXD_DEV_ENABLED,
}; };
enum idxd_device_flag { enum idxd_device_flag {
IDXD_FLAG_CONFIGURABLE = 0, IDXD_FLAG_CONFIGURABLE = 0,
IDXD_FLAG_CMD_RUNNING, IDXD_FLAG_CMD_RUNNING,
IDXD_FLAG_PASID_ENABLED,
IDXD_FLAG_USER_PASID_ENABLED,
}; };
struct idxd_dma_dev { struct idxd_dma_dev {
...@@ -161,27 +252,42 @@ struct idxd_dma_dev { ...@@ -161,27 +252,42 @@ struct idxd_dma_dev {
struct dma_device dma; struct dma_device dma;
}; };
struct idxd_device { struct idxd_driver_data {
const char *name_prefix;
enum idxd_type type; enum idxd_type type;
struct device conf_dev; struct device_type *dev_type;
int compl_size;
int align;
};
struct idxd_device {
struct idxd_dev idxd_dev;
struct idxd_driver_data *data;
struct list_head list; struct list_head list;
struct idxd_hw hw; struct idxd_hw hw;
enum idxd_device_state state; enum idxd_device_state state;
unsigned long flags; unsigned long flags;
int id; int id;
int major; int major;
u8 cmd_status; u32 cmd_status;
struct idxd_irq_entry ie; /* misc irq, msix 0 */
struct pci_dev *pdev; struct pci_dev *pdev;
void __iomem *reg_base; void __iomem *reg_base;
spinlock_t dev_lock; /* spinlock for device */ spinlock_t dev_lock; /* spinlock for device */
spinlock_t cmd_lock; /* spinlock for device commands */
struct completion *cmd_done; struct completion *cmd_done;
struct idxd_group *groups; struct idxd_group **groups;
struct idxd_wq *wqs; struct idxd_wq **wqs;
struct idxd_engine *engines; struct idxd_engine **engines;
struct iommu_sva *sva;
unsigned int pasid;
int num_groups; int num_groups;
int irq_cnt;
bool request_int_handles;
u32 msix_perm_offset; u32 msix_perm_offset;
u32 wqcfg_offset; u32 wqcfg_offset;
...@@ -192,29 +298,37 @@ struct idxd_device { ...@@ -192,29 +298,37 @@ struct idxd_device {
u32 max_batch_size; u32 max_batch_size;
int max_groups; int max_groups;
int max_engines; int max_engines;
int max_tokens; int max_rdbufs;
int max_wqs; int max_wqs;
int max_wq_size; int max_wq_size;
int token_limit; int rdbuf_limit;
int nr_tokens; /* non-reserved tokens */ int nr_rdbufs; /* non-reserved read buffers */
unsigned int wqcfg_size; unsigned int wqcfg_size;
unsigned long *wq_enable_map;
union sw_err_reg sw_err; union sw_err_reg sw_err;
wait_queue_head_t cmd_waitq; wait_queue_head_t cmd_waitq;
struct msix_entry *msix_entries;
int num_wq_irqs;
struct idxd_irq_entry *irq_entries;
struct idxd_dma_dev *idxd_dma; struct idxd_dma_dev *idxd_dma;
struct workqueue_struct *wq; struct workqueue_struct *wq;
struct work_struct work; struct work_struct work;
struct idxd_pmu *idxd_pmu;
unsigned long *opcap_bmap;
}; };
/* IDXD software descriptor */ /* IDXD software descriptor */
struct idxd_desc { struct idxd_desc {
struct dsa_hw_desc *hw; union {
struct dsa_hw_desc *hw;
struct iax_hw_desc *iax_hw;
};
dma_addr_t desc_dma; dma_addr_t desc_dma;
struct dsa_completion_record *completion; union {
struct dsa_completion_record *completion;
struct iax_completion_record *iax_completion;
};
dma_addr_t compl_dma; dma_addr_t compl_dma;
struct dma_async_tx_descriptor txd; struct dma_async_tx_descriptor txd;
struct llist_node llnode; struct llist_node llnode;
...@@ -224,21 +338,172 @@ struct idxd_desc { ...@@ -224,21 +338,172 @@ struct idxd_desc {
struct idxd_wq *wq; struct idxd_wq *wq;
}; };
#define confdev_to_idxd(dev) container_of(dev, struct idxd_device, conf_dev) /*
#define confdev_to_wq(dev) container_of(dev, struct idxd_wq, conf_dev) * This is software defined error for the completion status. We overload the error code
* that will never appear in completion status and only SWERR register.
*/
enum idxd_completion_status {
IDXD_COMP_DESC_ABORT = 0xff,
};
#define idxd_confdev(idxd) &idxd->idxd_dev.conf_dev
#define wq_confdev(wq) &wq->idxd_dev.conf_dev
#define engine_confdev(engine) &engine->idxd_dev.conf_dev
#define group_confdev(group) &group->idxd_dev.conf_dev
#define cdev_dev(cdev) &cdev->idxd_dev.conf_dev
#define confdev_to_idxd_dev(dev) container_of(dev, struct idxd_dev, conf_dev)
#define idxd_dev_to_idxd(idxd_dev) container_of(idxd_dev, struct idxd_device, idxd_dev)
#define idxd_dev_to_wq(idxd_dev) container_of(idxd_dev, struct idxd_wq, idxd_dev)
static inline struct idxd_device *confdev_to_idxd(struct device *dev)
{
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return idxd_dev_to_idxd(idxd_dev);
}
static inline struct idxd_wq *confdev_to_wq(struct device *dev)
{
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return idxd_dev_to_wq(idxd_dev);
}
static inline struct idxd_engine *confdev_to_engine(struct device *dev)
{
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return container_of(idxd_dev, struct idxd_engine, idxd_dev);
}
static inline struct idxd_group *confdev_to_group(struct device *dev)
{
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return container_of(idxd_dev, struct idxd_group, idxd_dev);
}
static inline struct idxd_cdev *dev_to_cdev(struct device *dev)
{
struct idxd_dev *idxd_dev = confdev_to_idxd_dev(dev);
return container_of(idxd_dev, struct idxd_cdev, idxd_dev);
}
static inline void idxd_dev_set_type(struct idxd_dev *idev, int type)
{
if (type >= IDXD_DEV_MAX_TYPE) {
idev->type = IDXD_DEV_NONE;
return;
}
idev->type = type;
}
static inline struct idxd_irq_entry *idxd_get_ie(struct idxd_device *idxd, int idx)
{
return (idx == 0) ? &idxd->ie : &idxd->wqs[idx - 1]->ie;
}
static inline struct idxd_wq *ie_to_wq(struct idxd_irq_entry *ie)
{
return container_of(ie, struct idxd_wq, ie);
}
static inline struct idxd_device *ie_to_idxd(struct idxd_irq_entry *ie)
{
return container_of(ie, struct idxd_device, ie);
}
extern struct bus_type dsa_bus_type;
extern bool support_enqcmd;
extern struct ida idxd_ida;
extern struct device_type dsa_device_type;
extern struct device_type iax_device_type;
extern struct device_type idxd_wq_device_type;
extern struct device_type idxd_engine_device_type;
extern struct device_type idxd_group_device_type;
static inline bool is_dsa_dev(struct idxd_dev *idxd_dev)
{
return idxd_dev->type == IDXD_DEV_DSA;
}
static inline bool is_iax_dev(struct idxd_dev *idxd_dev)
{
return idxd_dev->type == IDXD_DEV_IAX;
}
static inline bool is_idxd_dev(struct idxd_dev *idxd_dev)
{
return is_dsa_dev(idxd_dev) || is_iax_dev(idxd_dev);
}
static inline bool is_idxd_wq_dev(struct idxd_dev *idxd_dev)
{
return idxd_dev->type == IDXD_DEV_WQ;
}
static inline bool is_idxd_wq_dmaengine(struct idxd_wq *wq)
{
if (wq->type == IDXD_WQT_KERNEL && strcmp(wq->name, "dmaengine") == 0)
return true;
return false;
}
static inline bool is_idxd_wq_user(struct idxd_wq *wq)
{
return wq->type == IDXD_WQT_USER;
}
static inline bool is_idxd_wq_kernel(struct idxd_wq *wq)
{
return wq->type == IDXD_WQT_KERNEL;
}
static inline bool wq_dedicated(struct idxd_wq *wq)
{
return test_bit(WQ_FLAG_DEDICATED, &wq->flags);
}
static inline bool wq_shared(struct idxd_wq *wq)
{
return !test_bit(WQ_FLAG_DEDICATED, &wq->flags);
}
static inline bool device_pasid_enabled(struct idxd_device *idxd)
{
return test_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
}
static inline bool device_user_pasid_enabled(struct idxd_device *idxd)
{
return test_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
}
static inline bool wq_pasid_enabled(struct idxd_wq *wq)
{
return (is_idxd_wq_kernel(wq) && device_pasid_enabled(wq->idxd)) ||
(is_idxd_wq_user(wq) && device_user_pasid_enabled(wq->idxd));
}
static inline bool wq_shared_supported(struct idxd_wq *wq)
{
return (support_enqcmd && wq_pasid_enabled(wq));
}
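For illustration, here is a minimal sketch (assumed, not part of this patch set) of how a wq-enable path can use the helpers above to reject a shared WQ when ENQCMD/PASID support is missing; example_validate_wq() is a hypothetical name:
```
/*
 * Illustrative sketch only: a shared WQ needs ENQCMD plus PASID support,
 * a dedicated WQ does not. Roughly what a drv_enable_wq()-style path checks.
 */
static int example_validate_wq(struct idxd_wq *wq)
{
	if (wq_shared(wq) && !wq_shared_supported(wq))
		return -EOPNOTSUPP;

	return 0;
}
```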
enum idxd_portal_prot {
IDXD_PORTAL_UNLIMITED = 0,
IDXD_PORTAL_LIMITED,
};
enum idxd_interrupt_type {
IDXD_IRQ_MSIX = 0,
IDXD_IRQ_IMS,
};
static inline int idxd_get_wq_portal_offset(enum idxd_portal_prot prot)
{
return prot * 0x1000;
@@ -250,14 +515,22 @@ static inline int idxd_get_wq_portal_full_offset(int wq_id,
return ((wq_id * 4) << PAGE_SHIFT) + idxd_get_wq_portal_offset(prot);
}

#define IDXD_PORTAL_MASK	(PAGE_SIZE - 1)

/*
* Even though this function can be accessed by multiple threads, it is safe to use.
* At worst the address gets used more than once before it gets incremented. We don't
* hit a threshold until iops becomes many millions of times per second. So the occasional
* reuse of the same address is tolerable compared to using an atomic variable. This is
* safe on a system that has atomic load/store for 32-bit integers. Given that this is an
* Intel iEP device, that should not be a problem.
*/
static inline void __iomem *idxd_wq_portal_addr(struct idxd_wq *wq)
{
int ofs = wq->portal_offset;

wq->portal_offset = (ofs + sizeof(struct dsa_raw_desc)) & IDXD_PORTAL_MASK;
return wq->portal + ofs;
}
static inline void idxd_wq_get(struct idxd_wq *wq)
@@ -275,58 +548,113 @@ static inline int idxd_wq_refcount(struct idxd_wq *wq)
return wq->client_count;
};
/*
* Intel IAA does not support batch processing.
* The max batch size of the device, the max batch size of a wq and
* the max batch shift in wqcfg should always be 0 on IAA.
*/
static inline void idxd_set_max_batch_size(int idxd_type, struct idxd_device *idxd,
u32 max_batch_size)
{
if (idxd_type == IDXD_TYPE_IAX)
idxd->max_batch_size = 0;
else
idxd->max_batch_size = max_batch_size;
}
static inline void idxd_wq_set_max_batch_size(int idxd_type, struct idxd_wq *wq,
u32 max_batch_size)
{
if (idxd_type == IDXD_TYPE_IAX)
wq->max_batch_size = 0;
else
wq->max_batch_size = max_batch_size;
}
static inline void idxd_wqcfg_set_max_batch_shift(int idxd_type, union wqcfg *wqcfg,
u32 max_batch_shift)
{
if (idxd_type == IDXD_TYPE_IAX)
wqcfg->max_batch_shift = 0;
else
wqcfg->max_batch_shift = max_batch_shift;
}
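A hedged sketch of a typical call site for these helpers, mirroring how the capability-parsing path applies the IAA restriction (the exact call site and helper name below are assumed):
```
/*
 * Illustrative only: apply the IAA batch restriction while parsing GENCAP.
 * GENCAP reports the max batch size as a power-of-two shift.
 */
static void example_read_batch_cap(struct idxd_device *idxd)
{
	idxd_set_max_batch_size(idxd->data->type, idxd,
				1U << idxd->hw.gen_cap.max_batch_shift);
}
```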
int __must_check __idxd_driver_register(struct idxd_device_driver *idxd_drv,
struct module *module, const char *mod_name);
#define idxd_driver_register(driver) \
__idxd_driver_register(driver, THIS_MODULE, KBUILD_MODNAME)
void idxd_driver_unregister(struct idxd_device_driver *idxd_drv);
#define module_idxd_driver(__idxd_driver) \
module_driver(__idxd_driver, idxd_driver_register, idxd_driver_unregister)
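To show how the registration macros above are meant to be used, here is a minimal, hypothetical IDXD sub-driver that binds to WQ devices on the dsa bus; the callbacks and the "example" name are assumptions, not code from this patch set:
```
/* Illustrative sketch of an IDXD sub-driver using module_idxd_driver(). */
static enum idxd_dev_type example_dev_types[] = {
	IDXD_DEV_WQ,
	IDXD_DEV_NONE,	/* terminator */
};

static int example_drv_probe(struct idxd_dev *idxd_dev)
{
	/* claim the wq, allocate resources, etc. */
	return 0;
}

static void example_drv_remove(struct idxd_dev *idxd_dev)
{
	/* release the wq */
}

static struct idxd_device_driver example_idxd_drv = {
	.probe = example_drv_probe,
	.remove = example_drv_remove,
	.name = "example",
	.type = example_dev_types,
};
module_idxd_driver(example_idxd_drv);
```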
int idxd_register_bus_type(void);
void idxd_unregister_bus_type(void);
int idxd_register_devices(struct idxd_device *idxd);
void idxd_unregister_devices(struct idxd_device *idxd);
int idxd_register_driver(void);
void idxd_unregister_driver(void);
void idxd_wqs_quiesce(struct idxd_device *idxd);
bool idxd_queue_int_handle_resubmit(struct idxd_desc *desc);

/* device interrupt control */
irqreturn_t idxd_misc_thread(int vec, void *data);
irqreturn_t idxd_wq_thread(int irq, void *data);
void idxd_mask_error_interrupts(struct idxd_device *idxd);
void idxd_unmask_error_interrupts(struct idxd_device *idxd);

/* device control */
int idxd_register_idxd_drv(void);
void idxd_unregister_idxd_drv(void);
int idxd_device_drv_probe(struct idxd_dev *idxd_dev);
void idxd_device_drv_remove(struct idxd_dev *idxd_dev);
int drv_enable_wq(struct idxd_wq *wq);
void drv_disable_wq(struct idxd_wq *wq);
int idxd_device_init_reset(struct idxd_device *idxd);
int idxd_device_enable(struct idxd_device *idxd);
int idxd_device_disable(struct idxd_device *idxd);
void idxd_device_reset(struct idxd_device *idxd);
void idxd_device_clear_state(struct idxd_device *idxd);
int idxd_device_config(struct idxd_device *idxd);
void idxd_device_drain_pasid(struct idxd_device *idxd, int pasid);
int idxd_device_load_config(struct idxd_device *idxd);
int idxd_device_request_int_handle(struct idxd_device *idxd, int idx, int *handle,
enum idxd_interrupt_type irq_type);
int idxd_device_release_int_handle(struct idxd_device *idxd, int handle,
enum idxd_interrupt_type irq_type);

/* work queue control */
void idxd_wqs_unmap_portal(struct idxd_device *idxd);
int idxd_wq_alloc_resources(struct idxd_wq *wq);
void idxd_wq_free_resources(struct idxd_wq *wq);
int idxd_wq_enable(struct idxd_wq *wq);
int idxd_wq_disable(struct idxd_wq *wq, bool reset_config);
void idxd_wq_drain(struct idxd_wq *wq);
void idxd_wq_reset(struct idxd_wq *wq);
int idxd_wq_map_portal(struct idxd_wq *wq);
void idxd_wq_unmap_portal(struct idxd_wq *wq);
int idxd_wq_set_pasid(struct idxd_wq *wq, int pasid);
int idxd_wq_disable_pasid(struct idxd_wq *wq);
void __idxd_wq_quiesce(struct idxd_wq *wq);
void idxd_wq_quiesce(struct idxd_wq *wq);
int idxd_wq_init_percpu_ref(struct idxd_wq *wq);
void idxd_wq_free_irq(struct idxd_wq *wq);
int idxd_wq_request_irq(struct idxd_wq *wq);

/* submission */
int idxd_submit_desc(struct idxd_wq *wq, struct idxd_desc *desc);
struct idxd_desc *idxd_alloc_desc(struct idxd_wq *wq, enum idxd_op_type optype);
void idxd_free_desc(struct idxd_wq *wq, struct idxd_desc *desc);
int idxd_enqcmds(struct idxd_wq *wq, void __iomem *portal, const void *desc);

/* dmaengine */
int idxd_register_dma_device(struct idxd_device *idxd);
void idxd_unregister_dma_device(struct idxd_device *idxd);
int idxd_register_dma_channel(struct idxd_wq *wq);
void idxd_unregister_dma_channel(struct idxd_wq *wq);
void idxd_parse_completion_status(u8 status, enum dmaengine_tx_result *res);
void idxd_dma_complete_txd(struct idxd_desc *desc,
enum idxd_complete_type comp_type, bool free_desc);

/* cdev */
int idxd_cdev_register(void);
@@ -335,4 +663,19 @@ int idxd_cdev_get_major(struct idxd_device *idxd);
int idxd_wq_add_cdev(struct idxd_wq *wq);
void idxd_wq_del_cdev(struct idxd_wq *wq);
/* perfmon */
#if IS_ENABLED(CONFIG_INTEL_IDXD_PERFMON)
int perfmon_pmu_init(struct idxd_device *idxd);
void perfmon_pmu_remove(struct idxd_device *idxd);
void perfmon_counter_overflow(struct idxd_device *idxd);
void perfmon_init(void);
void perfmon_exit(void);
#else
static inline int perfmon_pmu_init(struct idxd_device *idxd) { return 0; }
static inline void perfmon_pmu_remove(struct idxd_device *idxd) {}
static inline void perfmon_counter_overflow(struct idxd_device *idxd) {}
static inline void perfmon_init(void) {}
static inline void perfmon_exit(void) {}
#endif
#endif
(This diff has been collapsed.)
@@ -6,11 +6,27 @@
#include <linux/pci.h>
#include <linux/io-64-nonatomic-lo-hi.h>
#include <linux/dmaengine.h>
#include <linux/delay.h>
#include <uapi/linux/idxd.h>
#include "../dmaengine.h"
#include "idxd.h"
#include "registers.h"
enum irq_work_type {
IRQ_WORK_NORMAL = 0,
IRQ_WORK_PROCESS_FAULT,
};
struct idxd_resubmit {
struct work_struct work;
struct idxd_desc *desc;
};
struct idxd_int_handle_revoke {
struct work_struct work;
struct idxd_device *idxd;
};
static void idxd_device_reinit(struct work_struct *work)
{
struct idxd_device *idxd = container_of(work, struct idxd_device, work);
@@ -27,13 +43,14 @@ static void idxd_device_reinit(struct work_struct *work)
goto out;

for (i = 0; i < idxd->max_wqs; i++) {
if (test_bit(i, idxd->wq_enable_map)) {
struct idxd_wq *wq = idxd->wqs[i];

rc = idxd_wq_enable(wq);
if (rc < 0) {
clear_bit(i, idxd->wq_enable_map);
dev_warn(dev, "Unable to re-enable wq %s\n",
dev_name(wq_confdev(wq)));
}
}
}
@@ -41,16 +58,163 @@ static void idxd_device_reinit(struct work_struct *work)
return;

out:
idxd_device_clear_state(idxd);
}
/*
* The function sends a drain descriptor for the interrupt handle. The drain ensures
* all descriptors with this interrupt handle are flushed and the interrupt
* will allow the cleanup of the outstanding descriptors.
*/
static void idxd_int_handle_revoke_drain(struct idxd_irq_entry *ie)
{
struct idxd_wq *wq = ie_to_wq(ie);
struct idxd_device *idxd = wq->idxd;
struct device *dev = &idxd->pdev->dev;
struct dsa_hw_desc desc = {};
void __iomem *portal;
int rc;

/* Issue a simple drain operation with interrupt but no completion record */
desc.flags = IDXD_OP_FLAG_RCI;
desc.opcode = DSA_OPCODE_DRAIN;
desc.priv = 1;

if (ie->pasid != INVALID_IOASID)
desc.pasid = ie->pasid;
desc.int_handle = ie->int_handle;
portal = idxd_wq_portal_addr(wq);

/*
* The wmb() makes sure that the descriptor is all there before we
* issue.
*/
wmb();
if (wq_dedicated(wq)) {
iosubmit_cmds512(portal, &desc, 1);
} else {
rc = idxd_enqcmds(wq, portal, &desc);
/* This should not fail unless hardware failed. */
if (rc < 0)
dev_warn(dev, "Failed to submit drain desc on wq %d\n", wq->id);
}
}
static void idxd_abort_invalid_int_handle_descs(struct idxd_irq_entry *ie)
{
LIST_HEAD(flist);
struct idxd_desc *d, *t;
struct llist_node *head;
spin_lock(&ie->list_lock);
head = llist_del_all(&ie->pending_llist);
if (head) {
llist_for_each_entry_safe(d, t, head, llnode)
list_add_tail(&d->list, &ie->work_list);
}
list_for_each_entry_safe(d, t, &ie->work_list, list) {
if (d->completion->status == DSA_COMP_INT_HANDLE_INVAL)
list_move_tail(&d->list, &flist);
}
spin_unlock(&ie->list_lock);
list_for_each_entry_safe(d, t, &flist, list) {
list_del(&d->list);
idxd_dma_complete_txd(d, IDXD_COMPLETE_ABORT, true);
}
}
static void idxd_int_handle_revoke(struct work_struct *work)
{
struct idxd_int_handle_revoke *revoke =
container_of(work, struct idxd_int_handle_revoke, work);
struct idxd_device *idxd = revoke->idxd;
struct pci_dev *pdev = idxd->pdev;
struct device *dev = &pdev->dev;
int i, new_handle, rc;
if (!idxd->request_int_handles) {
kfree(revoke);
dev_warn(dev, "Unexpected int handle refresh interrupt.\n");
return;
}
/*
* The loop attempts to acquire a new interrupt handle for all interrupt
* vectors that support a handle. If a new interrupt handle is acquired and the
* wq is of kernel type, the driver will kill the percpu_ref to pause all
* ongoing descriptor submissions. The interrupt handle is then changed.
* After the change, the percpu_ref is revived and all the pending submissions
* are woken to try again. A drain is sent for the interrupt handle
* at the end to make sure all descriptors with an invalid int handle are processed.
*/
for (i = 1; i < idxd->irq_cnt; i++) {
struct idxd_irq_entry *ie = idxd_get_ie(idxd, i);
struct idxd_wq *wq = ie_to_wq(ie);
if (ie->int_handle == INVALID_INT_HANDLE)
continue;
rc = idxd_device_request_int_handle(idxd, i, &new_handle, IDXD_IRQ_MSIX);
if (rc < 0) {
dev_warn(dev, "get int handle %d failed: %d\n", i, rc);
/*
* Failed to acquire new interrupt handle. Kill the WQ
* and release all the pending submitters. The submitters will
* get error return code and handle appropriately.
*/
ie->int_handle = INVALID_INT_HANDLE;
idxd_wq_quiesce(wq);
idxd_abort_invalid_int_handle_descs(ie);
continue;
}
/* No change in interrupt handle, nothing needs to be done */
if (ie->int_handle == new_handle)
continue;
if (wq->state != IDXD_WQ_ENABLED || wq->type != IDXD_WQT_KERNEL) {
/*
* All the MSIX interrupts are allocated at once during probe.
* Therefore we need to update all interrupts even if the WQ
* isn't supporting interrupt operations.
*/
ie->int_handle = new_handle;
continue;
}
mutex_lock(&wq->wq_lock);
reinit_completion(&wq->wq_resurrect);
/* Kill percpu_ref to pause additional descriptor submissions */
percpu_ref_kill(&wq->wq_active);
/* Wait for all submitters quiesce before we change interrupt handle */
wait_for_completion(&wq->wq_dead);
ie->int_handle = new_handle;
/* Revive percpu ref and wake up all the waiting submitters */
percpu_ref_reinit(&wq->wq_active);
complete_all(&wq->wq_resurrect);
mutex_unlock(&wq->wq_lock);
/*
* The delay here is to wait for all possible MOVDIR64B that
* are issued before percpu_ref_kill() has happened to have
* reached the PCIe domain before the drain is issued. The driver
* needs to ensure that the drain descriptor issued does not pass
* all the other issued descriptors that contain the invalid
* interrupt handle in order to ensure that the drain descriptor
* interrupt will allow the cleanup of all the descriptors with
* invalid interrupt handle.
*/
if (wq_dedicated(wq))
udelay(100);
idxd_int_handle_revoke_drain(ie);
}
kfree(revoke);
}
static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
@@ -61,8 +225,11 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
int i;
bool err = false;

if (cause & IDXD_INTC_HALT_STATE)
goto halt;

if (cause & IDXD_INTC_ERR) {
spin_lock(&idxd->dev_lock);
for (i = 0; i < 4; i++)
idxd->sw_err.bits[i] = ioread64(idxd->reg_base +
IDXD_SWERR_OFFSET + i * sizeof(u64));
@@ -72,7 +239,7 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
if (idxd->sw_err.valid && idxd->sw_err.wq_idx_valid) {
int id = idxd->sw_err.wq_idx;
struct idxd_wq *wq = idxd->wqs[id];

if (wq->type == IDXD_WQT_USER)
wake_up_interruptible(&wq->err_queue);
@@ -80,14 +247,14 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
int i;

for (i = 0; i < idxd->max_wqs; i++) {
struct idxd_wq *wq = idxd->wqs[i];

if (wq->type == IDXD_WQT_USER)
wake_up_interruptible(&wq->err_queue);
}
}

spin_unlock(&idxd->dev_lock);
val |= IDXD_INTC_ERR;

for (i = 0; i < 4; i++)
@@ -96,6 +263,23 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
err = true;
}
if (cause & IDXD_INTC_INT_HANDLE_REVOKED) {
struct idxd_int_handle_revoke *revoke;
val |= IDXD_INTC_INT_HANDLE_REVOKED;
revoke = kzalloc(sizeof(*revoke), GFP_ATOMIC);
if (revoke) {
revoke->idxd = idxd;
INIT_WORK(&revoke->work, idxd_int_handle_revoke);
queue_work(idxd->wq, &revoke->work);
} else {
dev_err(dev, "Failed to allocate work for int handle revoke\n");
idxd_wqs_quiesce(idxd);
}
}
if (cause & IDXD_INTC_CMD) {
val |= IDXD_INTC_CMD;
complete(idxd->cmd_done);
@@ -107,11 +291,8 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
}

if (cause & IDXD_INTC_PERFMON_OVFL) {
val |= IDXD_INTC_PERFMON_OVFL;
perfmon_counter_overflow(idxd);
}

val ^= cause;
@@ -122,6 +303,7 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
if (!err)
return 0;

halt:
gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET);
if (gensts.state == IDXD_DEVICE_STATE_HALT) {
idxd->state = IDXD_DEV_HALTED;
@@ -134,13 +316,14 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
INIT_WORK(&idxd->work, idxd_device_reinit);
queue_work(idxd->wq, &idxd->work);
} else {
idxd->state = IDXD_DEV_HALTED;
idxd_wqs_quiesce(idxd);
idxd_wqs_unmap_portal(idxd);
idxd_device_clear_state(idxd);
dev_err(&idxd->pdev->dev,
"idxd halted, need %s.\n",
gensts.reset_type == IDXD_DEVICE_RESET_FLR ?
"FLR" : "system reset");
return -ENXIO;
}
}
@@ -151,7 +334,7 @@ static int process_misc_interrupts(struct idxd_device *idxd, u32 cause)
irqreturn_t idxd_misc_thread(int vec, void *data)
{
struct idxd_irq_entry *irq_entry = data;
struct idxd_device *idxd = ie_to_idxd(irq_entry);
int rc;
u32 cause;
@@ -168,67 +351,126 @@ irqreturn_t idxd_misc_thread(int vec, void *data)
iowrite32(cause, idxd->reg_base + IDXD_INTCAUSE_OFFSET);
}

return IRQ_HANDLED;
}
static void idxd_int_handle_resubmit_work(struct work_struct *work)
{
struct idxd_resubmit *irw = container_of(work, struct idxd_resubmit, work);
struct idxd_desc *desc = irw->desc;
struct idxd_wq *wq = desc->wq;
int rc;
desc->completion->status = 0;
rc = idxd_submit_desc(wq, desc);
if (rc < 0) {
dev_dbg(&wq->idxd->pdev->dev, "Failed to resubmit desc %d to wq %d.\n",
desc->id, wq->id);
/*
* If the error is not -EAGAIN, it means the submission failed due to wq
* has been killed instead of ENQCMDS failure. Here the driver needs to
* notify the submitter of the failure by reporting abort status.
*
* -EAGAIN comes from ENQCMDS failure. idxd_submit_desc() will handle the
* abort.
*/
if (rc != -EAGAIN) {
desc->completion->status = IDXD_COMP_DESC_ABORT;
idxd_dma_complete_txd(desc, IDXD_COMPLETE_ABORT, false);
}
idxd_free_desc(wq, desc);
}
kfree(irw);
}
bool idxd_queue_int_handle_resubmit(struct idxd_desc *desc)
{
struct idxd_wq *wq = desc->wq;
struct idxd_device *idxd = wq->idxd;
struct idxd_resubmit *irw;
irw = kzalloc(sizeof(*irw), GFP_KERNEL);
if (!irw)
return false;
irw->desc = desc;
INIT_WORK(&irw->work, idxd_int_handle_resubmit_work);
queue_work(idxd->wq, &irw->work);
return true;
}
static void irq_process_pending_llist(struct idxd_irq_entry *irq_entry)
{
struct idxd_desc *desc, *t;
struct llist_node *head;

head = llist_del_all(&irq_entry->pending_llist);
if (!head)
return;

llist_for_each_entry_safe(desc, t, head, llnode) {
u8 status = desc->completion->status & DSA_COMP_STATUS_MASK;

if (status) {
/*
* Check against the original status as ABORT is software defined
* and 0xff, which DSA_COMP_STATUS_MASK can mask out.
*/
if (unlikely(desc->completion->status == IDXD_COMP_DESC_ABORT)) {
idxd_dma_complete_txd(desc, IDXD_COMPLETE_ABORT, true);
continue;
}

idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL, true);
} else {
spin_lock(&irq_entry->list_lock);
list_add_tail(&desc->list,
&irq_entry->work_list);
spin_unlock(&irq_entry->list_lock);
}
}
}
static void irq_process_work_list(struct idxd_irq_entry *irq_entry)
{
LIST_HEAD(flist);
struct idxd_desc *desc, *n;

/*
* This lock protects list corruption from access of list outside of the irq handler
* thread.
*/
spin_lock(&irq_entry->list_lock);
if (list_empty(&irq_entry->work_list)) {
spin_unlock(&irq_entry->list_lock);
return;
}

list_for_each_entry_safe(desc, n, &irq_entry->work_list, list) {
if (desc->completion->status) {
list_move_tail(&desc->list, &flist);
}
}

spin_unlock(&irq_entry->list_lock);

list_for_each_entry(desc, &flist, list) {
/*
* Check against the original status as ABORT is software defined
* and 0xff, which DSA_COMP_STATUS_MASK can mask out.
*/
if (unlikely(desc->completion->status == IDXD_COMP_DESC_ABORT)) {
idxd_dma_complete_txd(desc, IDXD_COMPLETE_ABORT, true);
continue;
}

idxd_dma_complete_txd(desc, IDXD_COMPLETE_NORMAL, true);
}
}
irqreturn_t idxd_wq_thread(int irq, void *data)
{
struct idxd_irq_entry *irq_entry = data;

/*
* There are two lists we are processing. The pending_llist is where
@@ -247,31 +489,9 @@ static int idxd_desc_process(struct idxd_irq_entry *irq_entry)
* and process the completed entries.
* 4. If the entry is still waiting on hardware, list_add_tail() to
* the work_list.
*/
irq_process_work_list(irq_entry);
irq_process_pending_llist(irq_entry);

return IRQ_HANDLED;
}
(This diff has been collapsed.)
/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright(c) 2020 Intel Corporation. All rights rsvd. */
#ifndef _PERFMON_H_
#define _PERFMON_H_
#include <linux/slab.h>
#include <linux/pci.h>
#include <linux/sbitmap.h>
#include <linux/dmaengine.h>
#include <linux/percpu-rwsem.h>
#include <linux/wait.h>
#include <linux/cdev.h>
#include <linux/uuid.h>
#include <linux/idxd.h>
#include <linux/perf_event.h>
#include "registers.h"
static inline struct idxd_pmu *event_to_pmu(struct perf_event *event)
{
struct idxd_pmu *idxd_pmu;
struct pmu *pmu;
pmu = event->pmu;
idxd_pmu = container_of(pmu, struct idxd_pmu, pmu);
return idxd_pmu;
}
static inline struct idxd_device *event_to_idxd(struct perf_event *event)
{
struct idxd_pmu *idxd_pmu;
struct pmu *pmu;
pmu = event->pmu;
idxd_pmu = container_of(pmu, struct idxd_pmu, pmu);
return idxd_pmu->idxd;
}
static inline struct idxd_device *pmu_to_idxd(struct pmu *pmu)
{
struct idxd_pmu *idxd_pmu;
idxd_pmu = container_of(pmu, struct idxd_pmu, pmu);
return idxd_pmu->idxd;
}
enum dsa_perf_events {
DSA_PERF_EVENT_WQ = 0,
DSA_PERF_EVENT_ENGINE,
DSA_PERF_EVENT_ADDR_TRANS,
DSA_PERF_EVENT_OP,
DSA_PERF_EVENT_COMPL,
DSA_PERF_EVENT_MAX,
};
enum filter_enc {
FLT_WQ = 0,
FLT_TC,
FLT_PG_SZ,
FLT_XFER_SZ,
FLT_ENG,
FLT_MAX,
};
#define CONFIG_RESET 0x0000000000000001
#define CNTR_RESET 0x0000000000000002
#define CNTR_ENABLE 0x0000000000000001
#define INTR_OVFL 0x0000000000000002
#define COUNTER_FREEZE 0x00000000FFFFFFFF
#define COUNTER_UNFREEZE 0x0000000000000000
#define OVERFLOW_SIZE 32
#define CNTRCFG_ENABLE BIT(0)
#define CNTRCFG_IRQ_OVERFLOW BIT(1)
#define CNTRCFG_CATEGORY_SHIFT 8
#define CNTRCFG_EVENT_SHIFT 32
#define PERFMON_TABLE_OFFSET(_idxd) \
({ \
typeof(_idxd) __idxd = (_idxd); \
((__idxd)->reg_base + (__idxd)->perfmon_offset); \
})
#define PERFMON_REG_OFFSET(idxd, offset) \
(PERFMON_TABLE_OFFSET(idxd) + (offset))
#define PERFCAP_REG(idxd) (PERFMON_REG_OFFSET(idxd, IDXD_PERFCAP_OFFSET))
#define PERFRST_REG(idxd) (PERFMON_REG_OFFSET(idxd, IDXD_PERFRST_OFFSET))
#define OVFSTATUS_REG(idxd) (PERFMON_REG_OFFSET(idxd, IDXD_OVFSTATUS_OFFSET))
#define PERFFRZ_REG(idxd) (PERFMON_REG_OFFSET(idxd, IDXD_PERFFRZ_OFFSET))
#define FLTCFG_REG(idxd, cntr, flt) \
(PERFMON_REG_OFFSET(idxd, IDXD_FLTCFG_OFFSET) + ((cntr) * 32) + ((flt) * 4))
#define CNTRCFG_REG(idxd, cntr) \
(PERFMON_REG_OFFSET(idxd, IDXD_CNTRCFG_OFFSET) + ((cntr) * 8))
#define CNTRDATA_REG(idxd, cntr) \
(PERFMON_REG_OFFSET(idxd, IDXD_CNTRDATA_OFFSET) + ((cntr) * 8))
#define CNTRCAP_REG(idxd, cntr) \
(PERFMON_REG_OFFSET(idxd, IDXD_CNTRCAP_OFFSET) + ((cntr) * 8))
#define EVNTCAP_REG(idxd, category) \
(PERFMON_REG_OFFSET(idxd, IDXD_EVNTCAP_OFFSET) + ((category) * 8))
#define DEFINE_PERFMON_FORMAT_ATTR(_name, _format) \
static ssize_t __perfmon_idxd_##_name##_show(struct kobject *kobj, \
struct kobj_attribute *attr, \
char *page) \
{ \
BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE); \
return sprintf(page, _format "\n"); \
} \
static struct kobj_attribute format_attr_idxd_##_name = \
__ATTR(_name, 0444, __perfmon_idxd_##_name##_show, NULL)
#endif
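As a usage note, the perfmon implementation is expected to instantiate sysfs "format" attributes with the macro above so that perf tooling can encode events; the field names and bit ranges below are illustrative, not a definitive listing:
```
/* Illustrative: expose counter-config fields to perf tooling via sysfs. */
DEFINE_PERFMON_FORMAT_ATTR(event_category, "config:0-3");
DEFINE_PERFMON_FORMAT_ATTR(event, "config:4-31");
```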
@@ -5,6 +5,10 @@
/* PCI Config */
#define PCI_DEVICE_ID_INTEL_DSA_SPR0	0x0b25
#define PCI_DEVICE_ID_INTEL_IAX_SPR0	0x0cfe

#define DEVICE_VERSION_1		0x100
#define DEVICE_VERSION_2		0x200

#define IDXD_MMIO_BAR			0
#define IDXD_WQ_BAR			2
@@ -23,8 +27,8 @@ union gen_cap_reg {
u64 overlap_copy:1;
u64 cache_control_mem:1;
u64 cache_control_cache:1;
u64 cmd_cap:1;
u64 rsvd:3;
u64 int_handle_req:1;
u64 dest_readback:1;
u64 drain_readback:1;
u64 rsvd2:6;
@@ -32,8 +36,7 @@ union gen_cap_reg {
u64 max_batch_shift:4;
u64 max_ims_mult:6;
u64 config_en:1;
u64 rsvd3:32;
};
u64 bits;
} __packed;
@@ -47,11 +50,12 @@ union wq_cap_reg {
u64 rsvd:20;
u64 shared_mode:1;
u64 dedicated_mode:1;
u64 wq_ats_support:1;
u64 priority:1;
u64 occupancy:1;
u64 occupancy_int:1;
u64 op_config:1;
u64 rsvd3:9;
};
u64 bits;
} __packed;
@@ -61,10 +65,11 @@ union wq_cap_reg {
union group_cap_reg {
struct {
u64 num_groups:8;
u64 total_rdbufs:8;	/* formerly total_tokens */
u64 rdbuf_ctrl:1;	/* formerly token_en */
u64 rdbuf_limit:1;	/* formerly token_limit */
u64 progress_limit:1;	/* descriptor and batch descriptor */
u64 rsvd:45;
};
u64 bits;
} __packed;
@@ -87,6 +92,8 @@ struct opcap {
u64 bits[4];
};

#define IDXD_MAX_OPCAP_BITS		256U

#define IDXD_OPCAP_OFFSET		0x40

#define IDXD_TABLE_OFFSET		0x60
@@ -102,10 +109,12 @@ union offsets_reg {
u64 bits[2];
} __packed;

#define IDXD_TABLE_MULT			0x100

#define IDXD_GENCFG_OFFSET		0x80
union gencfg_reg {
struct {
u32 rdbuf_limit:8;
u32 rsvd:4;
u32 user_int_en:1;
u32 rsvd2:19;
@@ -117,7 +126,8 @@ union gencfg_reg {
union genctrl_reg {
struct {
u32 softerr_int_en:1;
u32 halt_int_en:1;
u32 rsvd:30;
};
u32 bits;
} __packed;
@@ -151,6 +161,8 @@ enum idxd_device_reset_type {
#define IDXD_INTC_CMD			0x02
#define IDXD_INTC_OCCUPY		0x04
#define IDXD_INTC_PERFMON_OVFL		0x08
#define IDXD_INTC_HALT_STATE		0x10
#define IDXD_INTC_INT_HANDLE_REVOKED	0x80000000

#define IDXD_CMD_OFFSET			0xa0
union idxd_command_reg {
@@ -177,8 +189,11 @@ enum idxd_cmd {
IDXD_CMD_DRAIN_PASID,
IDXD_CMD_ABORT_PASID,
IDXD_CMD_REQUEST_INT_HANDLE,
IDXD_CMD_RELEASE_INT_HANDLE,
};

#define CMD_INT_HANDLE_IMS		0x10000

#define IDXD_CMDSTS_OFFSET		0xa8
union cmdsts_reg {
struct {
@@ -190,6 +205,8 @@ union cmdsts_reg {
u32 bits;
} __packed;
#define IDXD_CMDSTS_ACTIVE		0x80000000
#define IDXD_CMDSTS_ERR_MASK		0xff
#define IDXD_CMDSTS_RES_SHIFT		8

enum idxd_cmdsts_err {
IDXD_CMDSTS_SUCCESS = 0,
@@ -225,6 +242,8 @@ enum idxd_cmdsts_err {
IDXD_CMDSTS_ERR_NO_HANDLE,
};

#define IDXD_CMDCAP_OFFSET		0xb0

#define IDXD_SWERR_OFFSET		0xc0
#define IDXD_SWERR_VALID		0x00000001
#define IDXD_SWERR_OVERFLOW		0x00000002
@@ -270,16 +289,20 @@ union msix_perm {
union group_flags {
struct {
u64 tc_a:3;
u64 tc_b:3;
u64 rsvd:1;
u64 use_rdbuf_limit:1;
u64 rdbufs_reserved:8;
u64 rsvd2:4;
u64 rdbufs_allowed:8;
u64 rsvd3:4;
u64 desc_progress_limit:2;
u64 rsvd4:2;
u64 batch_progress_limit:2;
u64 rsvd5:26;
};
u64 bits;
} __packed;
struct grpcfg {
@@ -301,7 +324,8 @@ union wqcfg {
/* bytes 8-11 */
u32 mode:1;	/* shared or dedicated */
u32 bof:1;	/* block on fault */
u32 wq_ats_disable:1;
u32 rsvd2:1;
u32 priority:4;
u32 pasid:20;
u32 pasid_en:1;
@@ -332,10 +356,19 @@ union wqcfg {
/* bytes 28-31 */
u32 rsvd8;

/* bytes 32-63 */
u64 op_config[4];
};
u32 bits[16];
} __packed;
#define WQCFG_PASID_IDX 2
#define WQCFG_PRIVL_IDX 2
#define WQCFG_OCCUP_IDX 6
#define WQCFG_OCCUP_MASK 0xffff
/*
* This macro calculates the offset into the WQCFG register
* idxd - struct idxd *
@@ -354,4 +387,130 @@ union wqcfg {
#define WQCFG_STRIDES(_idxd_dev) ((_idxd_dev)->wqcfg_size / sizeof(u32))
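A hedged sketch of how the occupancy index and mask defined above can be used together with the WQCFG offset macro to read a work queue's occupancy over MMIO (the driver's sysfs occupancy attribute does roughly this; the helper name here is hypothetical):
```
/*
 * Illustrative only: WQCFG_OFFSET() locates the 32-bit dword at index
 * WQCFG_OCCUP_IDX for this wq; the low bits hold the occupancy count.
 */
static u32 example_wq_occupancy(struct idxd_wq *wq)
{
	struct idxd_device *idxd = wq->idxd;
	u32 offset = WQCFG_OFFSET(idxd, wq->id, WQCFG_OCCUP_IDX);

	return ioread32(idxd->reg_base + offset) & WQCFG_OCCUP_MASK;
}
```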
#define GRPCFG_SIZE 64
#define GRPWQCFG_STRIDES 4
/*
 * This macro calculates the offset into the GRPCFG register
 * idxd - struct idxd *
 * n - group id
 * ofs - the index of the 64-bit word for the config register
 *
 * The GRPCFG register block holds one group of sub-registers per group. The n index
 * moves to the register block for that particular group, and ofs selects the
 * 64-bit word within the GRPWQCFG portion of that block.
 */
#define GRPWQCFG_OFFSET(idxd_dev, n, ofs) ((idxd_dev)->grpcfg_offset +\
(n) * GRPCFG_SIZE + sizeof(u64) * (ofs))
#define GRPENGCFG_OFFSET(idxd_dev, n) ((idxd_dev)->grpcfg_offset + (n) * GRPCFG_SIZE + 32)
#define GRPFLGCFG_OFFSET(idxd_dev, n) ((idxd_dev)->grpcfg_offset + (n) * GRPCFG_SIZE + 40)
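For context, here is a sketch (assumed, based on the driver's device-configuration path) of how one group's GRPWQCFG/GRPENGCFG/GRPFLGCFG registers can be programmed with the offset macros above; the helper name is hypothetical:
```
/* Illustrative sketch: write one group's config registers. */
static void example_group_config_write(struct idxd_group *group)
{
	struct idxd_device *idxd = group->idxd;
	u32 grpcfg_offset;
	int i;

	/* 256 bits of WQ membership, written as four 64-bit strides */
	for (i = 0; i < GRPWQCFG_STRIDES; i++) {
		grpcfg_offset = GRPWQCFG_OFFSET(idxd, group->id, i);
		iowrite64(group->grpcfg.wqs[i], idxd->reg_base + grpcfg_offset);
	}

	/* engine membership bitmap */
	grpcfg_offset = GRPENGCFG_OFFSET(idxd, group->id);
	iowrite64(group->grpcfg.engines, idxd->reg_base + grpcfg_offset);

	/* group flags: read buffer limits, traffic class, progress limits */
	grpcfg_offset = GRPFLGCFG_OFFSET(idxd, group->id);
	iowrite64(group->grpcfg.flags.bits, idxd->reg_base + grpcfg_offset);
}
```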
/* The following are the performance monitor registers */
#define IDXD_PERFCAP_OFFSET 0x0
union idxd_perfcap {
struct {
u64 num_perf_counter:6;
u64 rsvd1:2;
u64 counter_width:8;
u64 num_event_category:4;
u64 global_event_category:16;
u64 filter:8;
u64 rsvd2:8;
u64 cap_per_counter:1;
u64 writeable_counter:1;
u64 counter_freeze:1;
u64 overflow_interrupt:1;
u64 rsvd3:8;
};
u64 bits;
} __packed;
#define IDXD_EVNTCAP_OFFSET 0x80
union idxd_evntcap {
struct {
u64 events:28;
u64 rsvd:36;
};
u64 bits;
} __packed;
struct idxd_event {
union {
struct {
u32 event_category:4;
u32 events:28;
};
u32 val;
};
} __packed;
#define IDXD_CNTRCAP_OFFSET 0x800
struct idxd_cntrcap {
union {
struct {
u32 counter_width:8;
u32 rsvd:20;
u32 num_events:4;
};
u32 val;
};
struct idxd_event events[];
} __packed;
#define IDXD_PERFRST_OFFSET 0x10
union idxd_perfrst {
struct {
u32 perfrst_config:1;
u32 perfrst_counter:1;
u32 rsvd:30;
};
u32 val;
} __packed;
#define IDXD_OVFSTATUS_OFFSET 0x30
#define IDXD_PERFFRZ_OFFSET 0x20
#define IDXD_CNTRCFG_OFFSET 0x100
union idxd_cntrcfg {
struct {
u64 enable:1;
u64 interrupt_ovf:1;
u64 global_freeze_ovf:1;
u64 rsvd1:5;
u64 event_category:4;
u64 rsvd2:20;
u64 events:28;
u64 rsvd3:4;
};
u64 val;
} __packed;
#define IDXD_FLTCFG_OFFSET 0x300
#define IDXD_CNTRDATA_OFFSET 0x200
union idxd_cntrdata {
struct {
u64 event_count_value;
};
u64 val;
} __packed;
union event_cfg {
struct {
u64 event_cat:4;
u64 event_enc:28;
};
u64 val;
} __packed;
union filter_cfg {
struct {
u64 wq:32;
u64 tc:8;
u64 pg_sz:4;
u64 xfer_sz:8;
u64 eng:8;
};
u64 val;
} __packed;
#endif #endif
(This diff has been collapsed.)
@@ -103,8 +103,8 @@ config IOMMU_DMA
	select IRQ_MSI_IOMMU
	select NEED_SG_DMA_LENGTH

# Shared Virtual Addressing
config IOMMU_SVA
	bool
	select IOASID
@@ -318,7 +318,7 @@ config ARM_SMMU_V3
config ARM_SMMU_V3_SVA
	bool "Shared Virtual Addressing support for the ARM SMMUv3"
	depends on ARM_SMMU_V3
	select IOMMU_SVA
	select MMU_NOTIFIER
	help
	  Support for sharing process address spaces with devices using the
(The remaining file diffs have been collapsed.)