- 10 August 2021, 1 commit
-
-
By Sai Prakash Ranjan

Some clocks for the SMMU can have XO as their parent, such as gpu_cc_hub_cx_int_clk of the GPU SMMU in the QTI SC7280 SoC. To enter deep sleep states in such cases, we need to drop the XO clock vote in the unprepare call; the unprepare callback for XO lives in the RPMh (Resource Power Manager-Hardened) clock driver, which controls RPMh-managed clock resources for new QTI SoCs.

Since we cannot make sleeping calls such as clk_bulk_prepare() and clk_bulk_unprepare() in the arm-smmu runtime PM callbacks (iommu operations like map and unmap can run in atomic context and are on the fast path), add the prepare and unprepare calls, which drop the XO vote, only in the system PM callbacks. System PM is not a fast path, and we expect the system to enter deep sleep states via system PM rather than runtime PM. This mirrors the sequence of clock requests (prepare, enable and disable, unprepare) in arm-smmu probe and remove.

Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Co-developed-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Link: https://lore.kernel.org/r/20210810064808.32486-1-saiprakash.ranjan@codeaurora.org
Signed-off-by: Will Deacon <will@kernel.org>
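A minimal sketch of how such system PM callbacks could look, assuming the driver's existing bulk-clock fields (num_clks, clks) and its runtime PM helpers; this is illustrative, not the exact patch:

```c
/* System PM may sleep, so clk_bulk_prepare()/clk_bulk_unprepare() are
 * allowed here; runtime PM keeps using only atomic-safe enable/disable. */
static int __maybe_unused arm_smmu_pm_resume(struct device *dev)
{
	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
	int ret;

	ret = clk_bulk_prepare(smmu->num_clks, smmu->clks);	/* may sleep */
	if (ret)
		return ret;

	ret = arm_smmu_runtime_resume(dev);	/* enables the clocks */
	if (ret)
		clk_bulk_unprepare(smmu->num_clks, smmu->clks);

	return ret;
}

static int __maybe_unused arm_smmu_pm_suspend(struct device *dev)
{
	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
	int ret;

	ret = arm_smmu_runtime_suspend(dev);	/* disables the clocks */
	if (!ret)
		clk_bulk_unprepare(smmu->num_clks, smmu->clks);	/* drops the XO vote */

	return ret;
}
```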
-
- 02 August 2021, 1 commit
-
-
By John Garry

Members of struct "llq" are zero-initialised, apart from member max_n_shift. But we write llq.val straight after the init, so it is pointless to zero-init those other members. As such, initialise member max_n_shift on its own. In addition, struct "head" was initialised from "llq" only so that member max_n_shift would be set; that member is never referenced for "head", so remove the init there. Removing these initialisations is a small performance optimisation, as this code is a (very) hot path.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1624293394-202509-1-git-send-email-john.garry@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
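A before/after sketch of the idea, simplified from the command-queue code; treat field and variable names as illustrative:

```c
/* Before: the designated initialiser zero-inits every other member of
 * llq, and head is initialised from llq only to copy max_n_shift. */
struct arm_smmu_ll_queue llq = {
	.max_n_shift = cmdq->q.llq.max_n_shift,
}, head = llq;

/* After: set only the member that is actually read before llq.val is
 * overwritten, and leave head uninitialised. */
struct arm_smmu_ll_queue llq, head;

llq.max_n_shift = cmdq->q.llq.max_n_shift;
llq.val = READ_ONCE(cmdq->q.llq.val);	/* written straight after the init */
```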
-
- 14 July 2021, 5 commits
-
-
By Benjamin Gaignard

Restore bits 39 to 32 at the correct position. This reverses the operation done in rk_dma_addr_dte_v2().

Fixes: c55356c5 ("iommu: rockchip: Add support for iommu v2")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Link: https://lore.kernel.org/r/20210712101232.318589-1-benjamin.gaignard@collabora.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
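To illustrate the pattern, here is a hedged sketch of packing and unpacking a 40-bit address through a 32-bit descriptor; the masks and shift are made up for illustration and are not the actual Rockchip v2 layout:

```c
#include <linux/bits.h>
#include <linux/types.h>

#define ADDR_LO_MASK	GENMASK_ULL(31, 12)	/* illustrative values only */
#define ADDR_HI_MASK	GENMASK_ULL(39, 32)
#define DTE_HI_SHIFT	28

/* Pack: fold address bits [39:32] into spare low bits of the DTE. */
static u32 pack_dte(u64 pa)
{
	return (pa & ADDR_LO_MASK) | ((pa & ADDR_HI_MASK) >> DTE_HI_SHIFT);
}

/* Unpack: the bug class being fixed is restoring the high bits at the
 * wrong position; they must be shifted back exactly as they were packed. */
static u64 unpack_dte(u32 dte)
{
	return (dte & ADDR_LO_MASK) |
	       (((u64)dte << DTE_HI_SHIFT) & ADDR_HI_MASK);
}
```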
-
By Lu Baolu

Commit 2b0140c6 ("iommu/vt-d: Use pci_real_dma_dev() for mapping") fixed an issue where, when a sub-device is removed, the context entry is cleared for all aliases. But that commit did not consider the PASID entry and PASID table in VT-d scalable mode. This fix extends the coverage to scalable mode.

Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Fixes: 8038bdb8 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c6 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable@vger.kernel.org # v5.6+
Cc: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071712.3416949-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Sanjay Kumar

This fixes a bug in the context cache clear operation. The code was not following the correct invalidation flow: a global device-TLB invalidation should be added after the IOTLB invalidation. At the same time, the code used the domain ID from the context entry, but in scalable mode the domain ID is in the PASID table entry, not the context entry.

Fixes: 7373a8cc ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Sanjay Kumar <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210712071315.3416543-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Marek Szyprowski

The QCOM IOMMU driver calls bus_set_iommu() for every IOMMU device controller, which fails for the second and subsequent IOMMU devices. This is intended and must not be fatal to the driver registration process. Also, the cleanup path should take care of the runtime PM state, which is missing in the current code. Revert the relevant changes to the QCOM IOMMU driver until a proper fix is prepared. This partially reverts commit 249c9dc6.

Fixes: 249c9dc6 ("iommu/arm: Cleanup resources in case of probe error path")
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210705065657.30356-1-m.szyprowski@samsung.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Gustavo A. R. Silva

Fix the following fallthrough warning (arm64-randconfig with Clang):

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:382:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%25lkp@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
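The usual remedy for this Clang warning is an explicit fallthrough annotation; a sketch with assumed case labels, not necessarily the ones at the warned line:

```c
switch (ent->opcode) {
case CMDQ_OP_TLBI_EL2_VA:
	/* ... fill in the VA-specific fields ... */
	fallthrough;	/* intentional: the next label shares these fields */
case CMDQ_OP_TLBI_EL2_ASID:
	/* ... fill in the ASID field ... */
	break;
default:
	return -ENOENT;
}
```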
-
- 25 June 2021, 3 commits
-
-
By Jean-Philippe Brucker

With the VIOT support in place, x86 platforms can now use the virtio-iommu. Because the other x86 IOMMU drivers aren't yet ready to use the acpi_dma_setup() path, x86 doesn't implement arch_setup_dma_ops() at the moment. Similarly to VT-d and AMD IOMMU, clear the DMA ops and call iommu_setup_dma_ops() from probe_finalize().

Acked-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20210618152059.1194210-6-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Jean-Philippe Brucker

Passing a 64-bit address width to iommu_setup_dma_ops() is valid on virtual platforms, but isn't currently possible: the overflow check in iommu_dma_init_domain() prevents it even when @dma_base isn't 0. Pass a limit address instead of a size, so callers don't have to fake a size to work around the check.

The base and limit parameters are being phased out, because:
* they are redundant for x86 callers: dma-iommu already reserves the first page, and the upper limit is already in domain->geometry;
* they can now be obtained from dev->dma_range_map on Arm.

But removing them on Arm isn't completely straightforward, so that is left for future work. As an intermediate step, simplify the x86 callers by passing dummy limits, as sketched below.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20210618152059.1194210-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
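A hedged sketch of the x86 caller change described above; U64_MAX as the dummy limit is an assumption here:

```c
/* before: iommu_setup_dma_ops(dev, dma_base, size);
 * after: pass an inclusive limit instead of a size. */
iommu_setup_dma_ops(dev, 0, U64_MAX);	/* dummy base/limit on x86; the real
					 * bounds come from dma-iommu and
					 * domain->geometry */
```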
-
By Jean-Philippe Brucker

The ACPI Virtual I/O Translation Table (VIOT) describes the topology of para-virtual platforms, similarly to the vendor tables DMAR, IVRS and IORT. For now it describes the relation between the virtio-iommu and the endpoints it manages.

Three steps are needed to configure DMA of endpoints:
(1) acpi_viot_init(): parse the VIOT table, find or create the fwnode associated with each vIOMMU device. This needs to happen after acpi_scan_init(), because it relies on the struct device and its fwnode being available.
(2) When probing the vIOMMU device, the driver registers its IOMMU ops within the IOMMU subsystem. This step doesn't require any intervention from the VIOT driver.
(3) viot_iommu_configure(): before binding the endpoint to a driver, find the associated IOMMU ops. Register them, along with the endpoint ID, in the device's iommu_fwspec.

If step (3) happens before step (2), it is deferred until the IOMMU is initialized, then retried.

Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210618152059.1194210-4-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 23 June 2021, 3 commits
-
-
By Rob Clark

Add, via the adreno-smmu-priv interface, a way for the GPU to request the SMMU to stall translation on faults, and to later resume the translation, either retrying or terminating the current translation. This will be used on the GPU side to "freeze" the GPU while we snapshot useful state for devcoredump.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jordan Crouse <jordan@cosmicpenguin.net>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-5-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
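Roughly, the private interface gains two callbacks shaped like the following; this is a sketch of adreno_smmu_priv, not an authoritative copy of the header:

```c
struct adreno_smmu_priv {
	const void *cookie;
	const struct io_pgtable_cfg *(*get_ttbr1_cfg)(const void *cookie);
	int (*set_ttbr0_cfg)(const void *cookie,
			     const struct io_pgtable_cfg *cfg);
	struct adreno_smmu_fault_info (*get_fault_info)(const void *cookie);
	/* added: stall translation on fault, then resume it later,
	 * optionally terminating the faulting transaction */
	void (*set_stall)(const void *cookie, bool enabled);
	void (*resume_translation)(const void *cookie, bool terminate);
};
```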
-
By Jordan Crouse

Add a callback in adreno-smmu-priv to read interesting SMMU registers, providing an opportunity for a richer debug experience in the GPU driver.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
-
By Jordan Crouse

Call report_iommu_fault() to allow upper-level drivers to register their own fault handlers.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210610214431.539029-2-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
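In a context-fault handler this typically looks like the sketch below: registered handlers get first crack at the fault, and it is only logged as unhandled if nobody claims it (register and field names assumed from arm-smmu):

```c
ret = report_iommu_fault(domain, dev, iova,
			 fsynr & ARM_SMMU_FSYNR0_WNR ?
				IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
if (ret == -ENOSYS)	/* no handler registered: fall back to logging */
	dev_err_ratelimited(dev,
		"Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x\n",
		fsr, iova, fsynr);
```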
-
- 18 June 2021, 1 commit
-
-
By Colin Ian King

The assignment of iommu from info->iommu occurs before info is null-checked, leading to a potential null pointer dereference. Fix this by assigning iommu, and checking whether it is null, after the null check on info.

Addresses-Coverity: ("Dereference before null check")
Fixes: 4c82b886 ("iommu/vt-d: Allocate/register iopf queue for sva devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210611135024.32781-1-colin.king@canonical.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
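The general shape of this class of fix, as a sketch with assumed variable and helper names:

```c
struct device_domain_info *info = get_domain_info(dev);
struct intel_iommu *iommu;

if (!info)		/* check the pointer first ... */
	return -ENODEV;

iommu = info->iommu;	/* ... then dereference it */
if (!iommu)
	return -ENODEV;
```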
-
- 16 June 2021, 2 commits
-
-
By Will Deacon

Commit 0d97174a ("iommu/arm-smmu: Implement ->probe_finalize()") added a new optional ->probe_finalize callback to 'struct arm_smmu_impl' but neglected to check that 'smmu->impl' is present before checking whether the new callback is present. Add the missing check, which avoids dereferencing NULL when probing an SMMU that doesn't require any implementation-specific callbacks:

| Unable to handle kernel NULL pointer dereference at virtual address
| 0000000000000070
|
| Call trace:
|  arm_smmu_probe_finalize+0x14/0x48
|  of_iommu_configure+0xe4/0x1b8
|  of_dma_configure_id+0xf8/0x2d8
|  pci_dma_configure+0x44/0x88
|  really_probe+0xc0/0x3c0

Fixes: 0d97174a ("iommu/arm-smmu: Implement ->probe_finalize()")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will@kernel.org>
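The fix boils down to guarding both pointers; a sketch of the finalize hook with the check in place:

```c
static void arm_smmu_probe_finalize(struct device *dev)
{
	struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
	struct arm_smmu_device *smmu = cfg->smmu;

	/* was: if (smmu->impl->probe_finalize), a NULL deref without an impl */
	if (smmu->impl && smmu->impl->probe_finalize)
		smmu->impl->probe_finalize(smmu, dev);
}
```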
-
By Zhen Lei

Fixes a scripts/checkpatch.pl warning:

WARNING: Possible unnecessary 'out of memory' message

Removing the message also saves a bit of memory.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210609125438.14369-1-thunder.leizhen@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
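The pattern checkpatch flags, as a generic sketch: the allocator already produces a diagnostic on failure (unless __GFP_NOWARN is set), so the extra print is redundant:

```c
p = devm_kzalloc(dev, sizeof(*p), GFP_KERNEL);
if (!p) {
	/* removed: dev_err(dev, "failed to allocate p\n"); */
	return -ENOMEM;
}
```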
-
- 11 June 2021, 5 commits
-
-
By Xiyu Yang

A reference counting issue occurs in several exception handling paths of arm_smmu_iova_to_phys_hard(). In those error scenarios, the function forgets to decrease the refcount of "smmu" increased by arm_smmu_rpm_get(), causing a refcount leak. Fix this by jumping to the "out" label when those error scenarios occur.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/1623293391-17261-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
-
By Xiyu Yang

arm_smmu_rpm_get() invokes pm_runtime_get_sync(), which increases the refcount of "smmu" even when the return value is less than 0. In some error handling paths, callers of arm_smmu_rpm_get() forget to decrease the refcount that it took, causing a refcount leak. Fix this by calling pm_runtime_resume_and_get() instead of pm_runtime_get_sync() in arm_smmu_rpm_get(), which keeps the refcount balanced in case of failure.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Link: https://lore.kernel.org/r/1623293672-17954-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Will Deacon <will@kernel.org>
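A sketch of the helper after the change; pm_runtime_resume_and_get() drops the reference itself when resume fails, so callers see a balanced refcount either way:

```c
static inline int arm_smmu_rpm_get(struct arm_smmu_device *smmu)
{
	/* was pm_runtime_get_sync(), which keeps its reference even on
	 * failure and forced every caller to clean up by hand */
	if (pm_runtime_enabled(smmu->dev))
		return pm_runtime_resume_and_get(smmu->dev);

	return 0;
}
```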
-
By Thierry Reding

Tegra186 requires the same SID override programming as Tegra194 in order to seamlessly transition from the firmware framebuffer to the Linux framebuffer, so the Tegra implementation needs to be used on Tegra186 devices as well.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-7-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
-
By Thierry Reding

The secure firmware keeps some SID override registers set to passthrough in order to allow devices such as the display controller to operate with no knowledge of SMMU translations until an operating system driver takes over. This is needed to seamlessly transition from the firmware framebuffer to the OS framebuffer. Upon successfully attaching a device to the SMMU, and in the process creating identity mappings for the memory regions being accessed, the Tegra implementation calls into the memory controller driver to program the override SIDs appropriately.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-6-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
-
By Thierry Reding

Parse the reg property in device tree and detect the number of instances represented by a device tree node. This is subsequently needed in order to support single-instance SMMUs with the Tegra implementation, because additional programming is needed to properly configure the SID override registers in the memory controller.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20210603164632.1000458-5-thierry.reding@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
-
- 10 June 2021, 19 commits
-
-
By Joerg Roedel

A recent commit broke the build on 32-bit x86. The linker throws these messages:

ld: drivers/iommu/intel/perf.o: in function `dmar_latency_snapshot':
perf.c:(.text+0x40c): undefined reference to `__udivdi3'
ld: perf.c:(.text+0x458): undefined reference to `__udivdi3'

The reason is the 64-bit divides in dmar_latency_snapshot(). Use the div_u64() helper function for those.

Fixes: 55ee5e67 ("iommu/vt-d: Add common code for dmar latency performance monitors")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210610083120.29224-1-joro@8bytes.org
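On 32-bit targets a plain '/' on u64 operands makes the compiler emit a call to the libgcc helper __udivdi3, which the kernel does not provide; div_u64() from <linux/math64.h> is the portable 64-by-32 divide. A sketch (the helper and its fields are assumptions, not the actual perf.c code):

```c
#include <linux/math64.h>

static u64 latency_avg(u64 total_ns, u32 count)	/* illustrative helper */
{
	if (!count)
		return 0;

	/* was: total_ns / count, which needs __udivdi3 on 32-bit builds */
	return div_u64(total_ns, count);
}
```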
-
By Parav Pandit

Page directory assignment via alloc_pgtable_page() or phys_to_virt() doesn't need a cast, as both routines return void *. Hence, remove the casts from both calls.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-24-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
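In C, void * converts implicitly to any object pointer type, so the casts were pure noise; the pattern of the cleanup:

```c
struct dma_pte *pgd;

/* was: pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid); */
pgd = alloc_pgtable_page(domain->nid);	/* void * converts implicitly */
```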
-
By Parav Pandit

No need for braces around a single-line statement in an if() block.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-22-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Parav Pandit

The DMAR domain keeps a per-DMAR refcount, indexed by iommu seq_id. The old iommu_count is only ever incremented and decremented; no decisions are taken based on it, so it is of little use. Hence, remove iommu_count and further simplify domain_detach_iommu() by making it return void.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-21-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Parav Pandit

IOTLB device presence, iommu coherency and snooping are boolean capabilities. Use them as bits and keep them adjacent.

Structure layout before the reorg:

```
$ pahole -C dmar_domain drivers/iommu/intel/dmar.o
struct dmar_domain {
	int nid;                              /*    0     4 */
	unsigned int iommu_refcnt[128];       /*    4   512 */
	/* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
	u16 iommu_did[128];                   /*  516   256 */
	/* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
	bool has_iotlb_device;                /*  772     1 */

	/* XXX 3 bytes hole, try to pack */

	struct list_head devices;             /*  776    16 */
	struct list_head subdevices;          /*  792    16 */
	struct iova_domain iovad __attribute__((__aligned__(8))); /*  808  2320 */
	/* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
	struct dma_pte * pgd;                 /* 3128     8 */
	/* --- cacheline 49 boundary (3136 bytes) --- */
	int gaw;                              /* 3136     4 */
	int agaw;                             /* 3140     4 */
	int flags;                            /* 3144     4 */
	int iommu_coherency;                  /* 3148     4 */
	int iommu_snooping;                   /* 3152     4 */
	int iommu_count;                      /* 3156     4 */
	int iommu_superpage;                  /* 3160     4 */

	/* XXX 4 bytes hole, try to pack */

	u64 max_addr;                         /* 3168     8 */
	u32 default_pasid;                    /* 3176     4 */

	/* XXX 4 bytes hole, try to pack */

	struct iommu_domain domain;           /* 3184    72 */

	/* size: 3256, cachelines: 51, members: 18 */
	/* sum members: 3245, holes: 3, sum holes: 11 */
	/* forced alignments: 1 */
	/* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));
```

After rearranging for natural padding and turning the flags into u8 bitfields, the struct shrinks by 8 bytes:

```
struct dmar_domain {
	int nid;                              /*    0     4 */
	unsigned int iommu_refcnt[128];       /*    4   512 */
	/* --- cacheline 8 boundary (512 bytes) was 4 bytes ago --- */
	u16 iommu_did[128];                   /*  516   256 */
	/* --- cacheline 12 boundary (768 bytes) was 4 bytes ago --- */
	u8 has_iotlb_device:1;                /*  772: 0   1 */
	u8 iommu_coherency:1;                 /*  772: 1   1 */
	u8 iommu_snooping:1;                  /*  772: 2   1 */

	/* XXX 5 bits hole, try to pack */
	/* XXX 3 bytes hole, try to pack */

	struct list_head devices;             /*  776    16 */
	struct list_head subdevices;          /*  792    16 */
	struct iova_domain iovad __attribute__((__aligned__(8))); /*  808  2320 */
	/* --- cacheline 48 boundary (3072 bytes) was 56 bytes ago --- */
	struct dma_pte * pgd;                 /* 3128     8 */
	/* --- cacheline 49 boundary (3136 bytes) --- */
	int gaw;                              /* 3136     4 */
	int agaw;                             /* 3140     4 */
	int flags;                            /* 3144     4 */
	int iommu_count;                      /* 3148     4 */
	int iommu_superpage;                  /* 3152     4 */

	/* XXX 4 bytes hole, try to pack */

	u64 max_addr;                         /* 3160     8 */
	u32 default_pasid;                    /* 3168     4 */

	/* XXX 4 bytes hole, try to pack */

	struct iommu_domain domain;           /* 3176    72 */

	/* size: 3248, cachelines: 51, members: 18 */
	/* sum members: 3236, holes: 3, sum holes: 11 */
	/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 5 bits */
	/* forced alignments: 1 */
	/* last cacheline: 48 bytes */
} __attribute__((__aligned__(8)));
```

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210530075053.264218-1-parav@nvidia.com
Link: https://lore.kernel.org/r/20210610020115.1637656-20-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By YueHaibing

Use the DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(), which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210528130229.22108-1-yuehaibing@huawei.com
Link: https://lore.kernel.org/r/20210610020115.1637656-19-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
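DEVICE_ATTR_RO(name) expands to a read-only attribute wired to name_show(); a sketch of the pattern (the specific attribute and show body are assumed for illustration):

```c
static ssize_t address_show(struct device *dev,
			    struct device_attribute *attr, char *buf)
{
	struct intel_iommu *iommu = dev_to_intel_iommu(dev);

	return sysfs_emit(buf, "%llx\n", iommu->reg_phys);
}
/* was: static DEVICE_ATTR(address, S_IRUGO, address_show, NULL); */
static DEVICE_ATTR_RO(address);
```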
-
By Gustavo A. R. Silva

Replace a couple of calls to memcpy() with simple assignments in order to fix the following out-of-bounds warning:

drivers/iommu/intel/svm.c:1198:4: warning: 'memcpy' offset [25, 32] from the object at 'desc' is out of the bounds of referenced subobject 'qw2' with type 'long long unsigned int' at offset 16 [-Warray-bounds]

The problem is that the original code tried to copy data into two struct members adjacent to each other with a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &desc.qw2 and &resp.qw2, respectively. This helps the ongoing efforts to globally enable -Warray-bounds and gets us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy().

Link: https://github.com/KSPP/linux/issues/109
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210414201403.GA392764@embeddedor
Link: https://lore.kernel.org/r/20210610020115.1637656-18-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
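The shape of the fix, sketched with assumed source names: one assignment per member instead of a memcpy() that spans both:

```c
/* was: memcpy(&desc.qw2, priv, sizeof(u64) * 2);
 * which the compiler sees as overrunning the qw2 subobject */
desc.qw2 = priv[0];	/* per-member assignments stay in bounds */
desc.qw3 = priv[1];
```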
-
By Lu Baolu

The execution time of page fault request handling is performance-critical and needs to be monitored. This adds code to sample the execution time of page fault request handling.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

Queued invalidation execution time is performance-critical and needs to be monitored. This adds code to sample the execution time of IOTLB/devTLB/ICE cache invalidation.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-16-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

A debugfs interface /sys/kernel/debug/iommu/intel/dmar_perf_latency is created to control, and show counts of, execution time ranges for various types per DMAR. The interface may help debug any potential performance issue.

By default, the interface is disabled. Possible write values of /sys/kernel/debug/iommu/intel/dmar_perf_latency:

0 - disable sampling all latency data
1 - enable sampling IOTLB invalidation latency data
2 - enable sampling devTLB invalidation latency data
3 - enable sampling intr entry cache invalidation latency data
4 - enable sampling prq handling latency data

Reading /sys/kernel/debug/iommu/intel/dmar_perf_latency gives a snapshot of the sampling results of all enabled monitors.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-15-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

The execution time of some operations, such as cache invalidation and PRQ processing, is performance-critical. This adds common code to monitor the execution time range of those operations. The interfaces include enabling/disabling the monitor, checking its status, updating sampling data, and providing a common string format for users.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-14-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

This adds a new trace event to track page fault request reporting. The event provides almost all of the information defined in a page request descriptor. A sample output:

| prq_report: dmar0/0000:00:0a.0 seq# 1: rid=0x50 addr=0x559ef6f97 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 2: rid=0x50 addr=0x559ef6f9c rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 3: rid=0x50 addr=0x559ef6f98 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 4: rid=0x50 addr=0x559ef6f9d rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 5: rid=0x50 addr=0x559ef6f99 r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 6: rid=0x50 addr=0x559ef6f9e rw--l pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 7: rid=0x50 addr=0x559ef6f9a r---- pasid=0x2 index=0x1
| prq_report: dmar0/0000:00:0a.0 seq# 8: rid=0x50 addr=0x559ef6f9f rw--l pasid=0x2 index=0x1

This will be helpful for I/O page fault related debugging.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-13-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

Let IO page fault requests be handled through the io-pgfault framework.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

This allocates and registers the iopf queue infrastructure for devices that want to support IO page faults for SVA.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

Refactor prq_event_thread() by moving the handling of a single prq event out of the main loop.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

It's common to iterate over the svm device list to find a matching device. Add common helpers to do this and consolidate the code.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

Align the pasid alloc/free code with the generic helpers defined in the iommu core. This also refactors the SVA binding code to improve readability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-8-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

We are about to use the iommu_sva_alloc/free_pasid() helpers in the iommu core, which means the pasid life cycle will be managed by the iommu core. Use a local array to save the per-pasid private data instead of attaching it to the real pasid.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210520031531.712333-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
By Lu Baolu

The current VT-d implementation supports nested translation only if all underlying IOMMUs support the nested capability. This is unnecessary, as the upper layer is allowed to create different containers and set them up with different types of iommu backends. The IOMMU driver needs to guarantee that devices attached to a nested-mode iommu_domain support the nested capability.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210517065701.5078-1-baolu.lu@linux.intel.com
Link: https://lore.kernel.org/r/20210610020115.1637656-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-