- 05 June 2021, 1 commit

Submitted by Eric Huang

This provides more TLB flush type options for different scenarios.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
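
A rough sketch of the shape of such a change (the enum and call names here are assumptions based on the commit description, not copied from the tree):

```c
/* Hypothetical sketch: distinguish TLB flush "weights" so callers can
 * request the cheapest flush that is sufficient for their scenario. */
enum tlb_flush_type_sketch {
	TLB_FLUSH_LEGACY = 0,
	TLB_FLUSH_LIGHTWEIGHT,
	TLB_FLUSH_HEAVYWEIGHT,
};

/* Callers would then pass the desired type to the flush helper, e.g.
 * kfd_flush_tlb(pdd, TLB_FLUSH_LEGACY), instead of relying on a single
 * hard-coded flush behavior. */
```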

- 21 April 2021, 6 commits

Submitted by Felix Kuehling

Control whether to build SVM support into amdgpu with a Kconfig option. This makes it easier to disable it in production kernels if this new feature causes problems in production environments. Use "depends on" instead of "select" for DEVICE_PRIVATE, as is recommended for visible options.

Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Alex Sierra

XNACK retries are used for page fault recovery. Some AMD chip families support continuous retry while page table entries are invalid. The driver must handle the page fault interrupt and fill in a valid entry for the GPU to continue. This ioctl allows enabling/disabling XNACK retries per KFD process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Philip Yang

The svm range structure stores the range start address, size, attributes, flags, prefetch location and a GPU bitmap indicating which GPUs the range maps to. The same virtual address is shared by the CPU and GPUs. Each process has an svm range list, which uses both an interval tree and a list to store all svm ranges registered by the process. The interval tree is used by the GPU VM fault handler and the CPU page fault handler to look up the svm range structure for a specific address. The list is used to scan all ranges in the eviction/restore work. No overlapping range intervals [start, last] exist in the svms object's interval tree. If a process registers a new range that overlaps an old range, the old range is split into two ranges, depending on whether the overlap is at the head or the tail of the old range. Apply the attributes (preferred location, prefetch location, mapping flags, migration granularity) to the svm range, and store the mapping GPU index in the bitmap.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
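
A much simplified sketch of the bookkeeping described above (field names are illustrative approximations, not a verbatim copy of struct svm_range):

```c
#include <linux/interval_tree.h>
#include <linux/list.h>
#include <linux/bitmap.h>
#include <linux/types.h>

/* Illustrative only: one registered SVM range. The interval tree node
 * gives address-based lookup for the GPU/CPU fault handlers; the list
 * head lets the eviction/restore worker walk every registered range. */
struct svm_range_sketch {
	struct interval_tree_node it_node; /* [start, last] in the svms interval tree */
	struct list_head list;             /* linked into the per-process range list */
	u64 prefetch_loc;                  /* prefetch location attribute */
	u32 preferred_loc;                 /* preferred location attribute */
	u32 flags;                         /* mapping flags */
	u32 granularity;                   /* migration granularity */
	DECLARE_BITMAP(bitmap_access, 32); /* which GPUs map this range */
};
```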

Submitted by Philip Yang

Add the svm (shared virtual memory) ioctl data structure and API definition. The svm ioctl API is designed to be extensible in the future. All operations are provided by a single ioctl to preserve ioctl number space. The argument structure ends with a variable-size array of attributes that can be used to set or get one or multiple attributes.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
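
The extensible layout described here looks roughly like the following (treat the struct and field names as an approximation of the kfd_ioctl.h definitions rather than an exact copy):

```c
#include <linux/types.h>

/* One attribute: a type/value pair, so new attributes can be added later
 * without introducing new ioctls. */
struct kfd_ioctl_svm_attribute_sketch {
	__u32 type;
	__u32 value;
};

/* Single SVM ioctl argument block: op selects the operation (set or get
 * attributes), and a variable number of attributes follows the header. */
struct kfd_ioctl_svm_args_sketch {
	__u64 start_addr;
	__u64 size;
	__u32 op;
	__u32 nattr;
	struct kfd_ioctl_svm_attribute_sketch attrs[];
};
```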

Submitted by Felix Kuehling

DRM render node file handles are used for CPU mapping of BOs using mmap by the Thunk. It uses the DRM render node of the GPU where the BO was allocated. DRM allows mmap access automatically when it creates a GEM handle for a BO. KFD BOs don't have GEM handles, so KFD needs to manage access manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was used in the kfd_ioctl_acquire_vm call for the same GPU.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
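
In practice this amounts to calling drm_vma_node_allow() on the BO's mmap offset node with the DRM file that acquired the VM (and drm_vma_node_revoke() on free). A minimal sketch, with the exact BO member path and error handling elided:

```c
#include <drm/drm_gem.h>
#include <drm/drm_vma_manager.h>

/* Sketch: grant the render-node DRM file that acquired the VM permission
 * to mmap this KFD-allocated BO through the DRM mmap offset machinery. */
static int kfd_allow_bo_mmap_sketch(struct drm_gem_object *gobj,
				    struct drm_file *drm_file)
{
	return drm_vma_node_allow(&gobj->vma_node, drm_file);
}
```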

Submitted by Felix Kuehling

amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 10 April 2021, 1 commit

Submitted by Alex Sierra

Remove per_device_list from kfd_process and replace it with a kfd_process_device pointer array of MAX_GPU_INSTANCES size. This makes it easier to manage the kfd_process_devices bound to a specific kfd_process. The functions used by kfd_chardev to iterate over the list were also removed, since they are no longer valid; they are replaced by local loops iterating over the array.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
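
The resulting per-process bookkeeping and iteration pattern, in simplified form (names and the array bound are stand-ins for the real definitions):

```c
#include <linux/types.h>

struct kfd_process_device;              /* opaque in this sketch */

#define MAX_GPU_INSTANCES_SKETCH 64     /* stand-in for the real constant */

struct kfd_process_sketch {
	/* Fixed-size pointer array replaces the old per_device_list. */
	struct kfd_process_device *pdds[MAX_GPU_INSTANCES_SKETCH];
	u32 n_pdds;
};

/* Callers now use a plain indexed loop instead of list iterators:
 *
 *	for (i = 0; i < p->n_pdds; i++)
 *		use(p->pdds[i]);
 */
```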

- 06 March 2021, 1 commit

Submitted by Jay Cornwall

The trap handler is set per process, per device, and is unrelated to queue management. Move the implementation closer to the TMA setup code.

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 09 December 2020, 2 commits

Submitted by Felix Kuehling

Release the dmabuf reference before returning from kfd_ioctl_import_dmabuf. amdgpu_amdkfd_gpuvm_import_dmabuf takes a reference to the underlying GEM BO and doesn't keep the reference to the dmabuf wrapper.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
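
The fix pattern, roughly (a sketch of the relevant tail of kfd_ioctl_import_dmabuf, not the exact hunk):

```c
/* Sketch: amdgpu_amdkfd_gpuvm_import_dmabuf() grabs its own reference to
 * the underlying GEM BO, so the wrapper reference obtained earlier from
 * dma_buf_get() must be dropped before the ioctl returns, on both the
 * success and the error path. */
r = amdgpu_amdkfd_gpuvm_import_dmabuf(/* dev, dmabuf, va, &mem, ... */);

dma_buf_put(dmabuf);	/* this put was missing, leaking the dma_buf */

if (r)
	return r;
```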

Submitted by Felix Kuehling

Release the dmabuf reference before returning from kfd_ioctl_import_dmabuf. amdgpu_amdkfd_gpuvm_import_dmabuf takes a reference to the underlying GEM BO and doesn't keep the reference to the dmabuf wrapper.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 23 September 2020, 2 commits

Submitted by Mukul Joshi

Move doorbell allocation for a process into the kfd device and allocate doorbell space in each PDD during process creation. Currently, KFD manages its own doorbell space, but for some devices amdgpu would allocate the complete doorbell space instead of leaving a chunk of doorbell space for KFD to manage. In a system with a mix of such devices, KFD would need to request process doorbell space based on the type of device, either from amdgpu or from its own doorbell space.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Felix Kuehling

Remember the KFD module initialization status in a global variable. Skip KFD device probing when the module was not initialized. Other amdgpu_amdkfd calls are then protected by the adev->kfd.dev check. Also print a clear error message when KFD disables itself. amdgpu continues its initialization even when KFD fails.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 27 August 2020, 1 commit

Submitted by Huang Rui

We still have a few IOMMU issues that need to be addressed, so force Raven onto the "dgpu" path for the moment. This adds a fallback path to bypass the IOMMU if IOMMU v2 is disabled or the ACPI CRAT table is not correct.

v2: use the ignore_crat parameter to decide whether to go with IOMMU v2.
v3: align with the existing Thunk; don't change the behavior of Raven, only Renoir will use the "dgpu" path by default.
v4: don't update the global ignore_crat in the driver, and revise the fallback function for a broken CRAT.
v5: refine the "ACPI CRAT good but no IOMMU support" case, and rename the title.
v6: fix the issue of a dGPU being initialized first; just modify the reported value in node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 16 July 2020, 1 commit

Submitted by Amber Lin

When compute is malfunctioning or performance drops, the system admin uses the SMI (System Management Interface) tool to monitor and diagnose what went wrong. This patch provides an event watch interface for user space to register devices and subscribe to the events it is interested in. Once registered, the user can poll the anonymous file descriptor with a specified wait time and wait for events to happen. When an event happens, the user can use read() to retrieve information related to that event. The VM fault event is implemented in this patch.

v2:
- remove UNREGISTER and add event ENABLE/DISABLE
- correct kfifo usage
- move the event message API to kfd_ioctl.h
v3: send the event message as text rather than binary
v4: support multiple clients
v5: move event enablement from the ioctl to an fd write
v6: sparse fix

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
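
From user space the flow is: obtain the anonymous event fd from KFD, enable the events of interest by writing to it, then poll() and read() it. A minimal consumer sketch follows; how the fd is obtained (the KFD SMI-events ioctl) is assumed and omitted, and only standard poll/read usage is shown:

```c
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* smi_fd: anonymous file descriptor returned by the KFD SMI-events ioctl
 * (obtaining it is not shown here). Events arrive as text records. */
static void watch_smi_events(int smi_fd)
{
	struct pollfd pfd = { .fd = smi_fd, .events = POLLIN };
	char buf[512];

	while (poll(&pfd, 1, 5000 /* ms */) > 0 && (pfd.revents & POLLIN)) {
		ssize_t n = read(smi_fd, buf, sizeof(buf) - 1);
		if (n <= 0)
			break;
		buf[n] = '\0';
		printf("KFD SMI event: %s\n", buf); /* e.g. a VM fault record */
	}
}
```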

- 01 May 2020, 1 commit

Submitted by Mukul Joshi

Track GPU VRAM usage on a per-process basis and report it through sysfs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 29 April 2020, 3 commits

Submitted by Joseph Greathouse

The current GWS usage model only allows a single GWS-enabled process to be active on the GPU at once. This ensures that a barrier-using kernel gets a known amount of GPU hardware, to prevent deadlock due to an inability to go beyond the GWS barrier. The HWS watches how many GWS entries are assigned to each process and goes into over-subscription mode when two processes need more than the 64 that are available. The current KFD method for working with this is to allocate all 64 GWS entries to each GWS-capable process. When more than one GWS-enabled process is in the runlist, we must make sure the runlist is in over-subscription mode, so that the HWS gets a chained RUN_LIST packet and continues scheduling kernels.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Joseph Greathouse

Rather than only enabling GWS support based on the hws_gws_support modparam, also check whether the GPU's HWS firmware supports GWS. Leave the old modparam in place in case users want to test GWS on GPUs not yet in the support list.

v2: fix broken syntax from the first patch.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Oak Zeng

Add a new KFD ioctl to allocate queue GWS. Queue GWS is released on queue destroy.

v2: re-introduce this API with the following fixes squashed in:
- drm/amdkfd: fix null pointer dereference on dev
- drm/amdkfd: Return proper error code for gws alloc API
- drm/amdkfd: Remove GPU ID in GWS queue creation

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 29 February 2020, 1 commit

Submitted by Yong Zhao

Given that we can query all the ASIC-specific information from amdgpu_gfx_config, we can make get_tile_config() generic.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 13 February 2020, 1 commit

Submitted by Rajneesh Bhardwaj

During system suspend the KFD driver acquires a lock that prohibits further KFD actions until the GPU is resumed. This adds some info which can be useful while debugging.

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 10 January 2020, 1 commit

Submitted by Felix Kuehling

Use filep->private_data to store a pointer to the kfd_process data structure. Take an extra reference for that, which gets released in the kfd_release callback. Check that the process calling kfd_ioctl is the same one that opened the file descriptor. Return -EBADF if it's not, so that this error can be distinguished in user mode.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
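
In sketch form the idea looks like this (simplified; helper and field names are from memory and should be treated as approximate, and the real open path also handles first-time process creation):

```c
/* Sketch: bind the kfd_process to the open file at open time, then verify
 * at ioctl time that the caller is that same process. */
static int kfd_open_sketch(struct inode *inode, struct file *filep)
{
	struct kfd_process *process = kfd_create_process(filep); /* takes a reference */

	if (IS_ERR(process))
		return PTR_ERR(process);

	filep->private_data = process;	/* extra ref dropped in kfd_release */
	return 0;
}

static long kfd_ioctl_sketch(struct file *filep, unsigned int cmd, unsigned long arg)
{
	struct kfd_process *process = filep->private_data;

	/* Reject ioctls from a process other than the one that opened
	 * /dev/kfd, so user mode can tell this case apart (-EBADF). */
	if (process->lead_thread != current->group_leader)
		return -EBADF;

	/* ... dispatch cmd as before ... */
	return 0;
}
```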

- 14 November 2019, 2 commits

Submitted by Yong Zhao

doorbell_off in the queue properties is mainly used for the doorbell dword offset in the PCI BAR. We should not set it to the doorbell byte offset in the process doorbell pages. This makes the code much easier to read.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Yong Zhao

The new code uses straightforward bit shifts and thus has better readability.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 23 October 2019, 1 commit

Submitted by Arnd Bergmann

The .ioctl and .compat_ioctl file operations have the same prototype, so they can both point to the same function, which works great almost all the time when all the commands are compatible. One exception is the s390 architecture, where a compat pointer is only 31 bits wide, and converting it into a 64-bit pointer requires calling compat_ptr(). Most drivers here will never run on s390, but since we now have a generic helper for it, it's easy enough to use it consistently. I double-checked all these drivers to ensure that all ioctl arguments are used as pointers or are ignored, and are not interpreted as integer values.

Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: David Sterba <dsterba@suse.com>
Acked-by: Darren Hart (VMware) <dvhart@infradead.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
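
For a driver like amdkfd, whose ioctl arguments are all pointers, the conversion is just wiring the generic helper into the file_operations. A representative sketch (not the exact amdkfd hunk):

```c
#include <linux/fs.h>
#include <linux/module.h>

static const struct file_operations kfd_fops_sketch = {
	.owner		= THIS_MODULE,
	.unlocked_ioctl	= kfd_ioctl,		/* native 64-bit path */
	/* compat_ptr_ioctl() applies compat_ptr() to the argument (which
	 * matters on s390) and then calls ->unlocked_ioctl. */
	.compat_ioctl	= compat_ptr_ioctl,
	.open		= kfd_open,
	.mmap		= kfd_mmap,
};
```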

- 03 October 2019, 3 commits

Submitted by Yong Zhao

The code uses hex defines, so the printing should be in hex as well.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Yong Zhao

Since the KFD PASID starts from 0x8000 (32768 in decimal), it is better perceived as a hex number. Meanwhile, change the pasid type from unsigned int to uint16_t to be consistent throughout the code.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Yong Zhao

Currently this function pointer is missing for GFX10. Considering it has been a void function since GFX9, fix it by checking the function pointer before dereferencing it.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
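
The defensive pattern is the usual one for optional kfd2kgd hooks; a sketch (the hook name below is a hypothetical placeholder, since the commit does not name it here):

```c
/* Sketch: a hook that became a no-op on GFX9+ may simply not be wired up
 * on newer ASICs, so check the pointer instead of calling unconditionally. */
if (dev->kfd2kgd->some_optional_hook)
	dev->kfd2kgd->some_optional_hook(dev->kgd, args);
```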

- 07 August 2019, 1 commit

Submitted by Alex Deucher

This reverts commit 1a058c33. This interface is still in too much flux. Revert until it's sorted out.

Acked-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 22 June 2019, 1 commit

Submitted by Jason A. Donenfeld

This makes boot uniformly boottime and tai uniformly clocktai, to address the remaining oversights.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20190621203249.3909-2-Jason@zx2c4.com

- 31 May 2019, 1 commit

Submitted by Oak Zeng

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 30 May 2019, 1 commit

Submitted by Colin Ian King

The pointer dev is set to NULL yet it is being dereferenced when checking dev->dqm->sched_policy. Fix this by performing the check on dev->dqm->sched_policy after dev has been assigned and NULL-checked. Also remove the redundant NULL assignment to dev.

Addresses-Coverity: ("Explicit null dereference")
Fixes: 1a058c33 ("drm/amdkfd: New IOCTL to allocate queue GWS")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
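
The fix is the standard ordering change: assign, NULL-check, then dereference. Roughly (the return values and log message shown here are illustrative):

```c
/* Before (broken): dev->dqm was dereferenced while dev could still be NULL.
 * After (sketch): look up the device, bail out if missing, then use it. */
dev = kfd_device_by_id(args->gpu_id);
if (!dev)
	return -EINVAL;

if (dev->dqm->sched_policy == KFD_SCHED_POLICY_NO_HWS) {
	pr_debug("Cannot allocate GWS without HWS support\n");
	return -EINVAL;
}
```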

- 29 May 2019, 2 commits

Submitted by Oak Zeng

Add a new KFD ioctl to allocate queue GWS. Queue GWS is released on queue destroy.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Oak Zeng

TTM doesn't support CPU mapping of sg-type BOs (under which the MMIO BO is created). Switch mmapping of the MMIO page to the KFD device file.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 25 May 2019, 3 commits

Submitted by Oak Zeng

The existing QUEUE_TYPE_SDMA means PCIe-optimized SDMA queues. Introduce a new QUEUE_TYPE_SDMA_XGMI, which is optimized for non-PCIe transfers such as XGMI.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Kent Russell

Fix some spacing issues, log output, uses of !=NULL/==NULL, and unneeded extra lines, and clean up a declaration from =1 to =true for clarity.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Submitted by Oak Zeng

Introduce a new memory type (KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) and expose the MMIO page of the HDP registers to user space through this new memory type.

v2: moved the remapped HDP regs to the adev struct
v3: rename the new memory type to ALLOC_MEM_FLAGS_MMIO_REMAP
v4: use a more generic function name
v5: fail remapped MMIO allocation for ASICs before gfx9

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

- 04 January 2019, 1 commit

Submitted by Linus Torvalds

Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted in this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases:

- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it)
- microblaze used the type argument for a debug printout

Other than those oddities, this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
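
For reference, the mechanical shape of the conversion in driver code (old vs. new form of the check):

```c
/* Old form: the first argument (VERIFY_READ / VERIFY_WRITE) had been
 * meaningless for a long time. */
if (!access_ok(VERIFY_WRITE, ubuf, size))
	return -EFAULT;

/* New form after this patch: just the user pointer and the length. */
if (!access_ok(ubuf, size))
	return -EFAULT;
```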

- 19 December 2018, 1 commit

Submitted by Felix Kuehling

On errors, dma_buf_get returns a negative error code, rather than NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
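
In other words dma_buf_get() returns an ERR_PTR-encoded error rather than NULL, so the check must use IS_ERR()/PTR_ERR(). Roughly:

```c
struct dma_buf *dmabuf = dma_buf_get(args->dmabuf_fd);

/* dma_buf_get() returns ERR_PTR(-errno) on failure, never NULL, so a
 * NULL test would let an error pointer slip through to later code. */
if (IS_ERR(dmabuf))
	return PTR_ERR(dmabuf);
```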

- 08 December 2018, 1 commit

Submitted by Felix Kuehling

This allows user mode to map doorbell pages into the GPUVM address space. That way GPUs can submit to user mode queues (self-dispatch).

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>