- 18 November 2021, 4 commits
-
-
Committed by Graham Sider

These get functions simply return an adev field. Replace the functions and their callers with direct field accesses instead.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
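A minimal sketch of the pattern being removed, using simplified stand-in definitions rather than the exact amdgpu structures and signatures:

```c
/* Illustrative sketch only: the struct, field, and signatures are simplified
 * stand-ins, not the real amdgpu definitions. */
struct amdgpu_device {
	unsigned int num_gws;	/* example of a field a trivial getter returned */
};

/* Before: a wrapper that only dereferenced adev (simplified signature). */
static unsigned int get_num_gws_wrapper(struct amdgpu_device *adev)
{
	return adev->num_gws;
}

/* After: callers read the field directly and the wrapper goes away. */
static unsigned int example_caller(struct amdgpu_device *adev)
{
	return adev->num_gws;
}
```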
-
Committed by Graham Sider

Modified definitions:
- amdgpu_amdkfd_gpuvm_acquire_process_vm
- amdgpu_amdkfd_gpuvm_release_process_vm
- amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu
- amdgpu_amdkfd_gpuvm_free_memory_of_gpu
- amdgpu_amdkfd_gpuvm_map_memory_to_gpu
- amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu
- amdgpu_amdkfd_gpuvm_sync_memory
- amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel
- amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel
- amdgpu_amdkfd_gpuvm_get_vm_fault_info
- amdgpu_amdkfd_gpuvm_import_dmabuf
- amdgpu_amdkfd_get_tile_config

Removed:
- get_amdgpu_device

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Graham Sider

Modified definitions:
- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes

Also replaces kfd_device_by_kgd with kfd_device_by_adev, which now searches by adev rather than by kgd.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Graham Sider

Modified definitions:
- program_sh_mem_settings
- set_pasid_vmid_mapping
- init_interrupts
- address_watch_disable
- address_watch_execute
- wave_control_execute
- address_watch_get_offset
- get_atc_vmid_pasid_mapping_info
- set_scratch_backing_va
- set_vm_context_page_table_base
- read_vmid_from_vmfault_reg
- get_cu_occupancy
- program_trap_handler_settings

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 29 October 2021, 2 commits
-
-
Committed by Lang Yu

The cu_mask has already been copied to MQD memory and does not have to persist in queue_properties, so remove it from there. Use struct mqd_update_info to store such properties instead, and pass it to the update queue operation.

v2:
* Rename pqm_update_queue to pqm_update_queue_properties.
* Rename struct queue_update_info to struct mqd_update_info.
* Rename pqm_set_cu_mask to pqm_update_mqd.

Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
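A rough sketch of the idea, assuming a layout similar to (but not necessarily identical to) the struct mqd_update_info this patch introduces:

```c
#include <linux/types.h>

/* Hedged sketch: field names approximate the new update-info container that
 * carries CU-mask data from the ioctl to the MQD update path. */
struct mqd_update_info_sketch {
	union {
		struct {
			u32 count;	/* number of 32-bit words in the mask */
			u32 *ptr;	/* CU mask copied from user space */
		} cu_mask;
	};
};

/* Usage idea: the ioctl handler fills this in and hands it to the renamed
 * pqm_update_mqd() instead of caching cu_mask in queue_properties. */
```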
-
Committed by Lang Yu

Currently, all KFD BOs use the same destruction routine, but pinned BOs are not unpinned properly. Separate them from the general routine.

v2 (Felix):
* Add a safeguard to prevent user space from freeing the signal BO.
* Kunmap the signal BO if setting the event page fails.
* Just kunmap the signal BO to avoid duplicating the code.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 14 October 2021, 3 commits
-
-
Committed by Alex Sierra

The svm_range_list svms declaration is removed to avoid a -Werror failure when CONFIG_HSA_AMD_SVM is not enabled.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Yifan Zhang

[ RUN ] KFDSVMRangeTest.PartialUnmapSysMemTest
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:245: Failure
Value of: (hsaKmtAllocMemory(m_Node, m_Size, m_Flags, &m_pBuf))
Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:248: Failure
Value of: (hsaKmtMapMemoryToGPUNodes(m_pBuf, m_Size, __null, mapFlags, 1, &m_Node))
Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDTestUtil.cpp:306: Failure
Expected: ((void *)__null) != (ptr), actual: NULL vs NULL
Segmentation fault (core dumped)
[ ] Profile: Full Test
[ ] HW capabilities: 0x9

kernel log:
[ 102.029150] ret_from_fork+0x22/0x30
[ 102.029158] ---[ end trace 15c34e782714f9a3 ]---
[ 3613.603598] amdgpu: Address: 0x7f7149ccc000 already allocated by SVM
[ 3613.610620] show_signal_msg: 27 callbacks suppressed

There is a race with deferred actions from previous memory map changes (e.g. munmap). Flush the pending deferred work to avoid such cases.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
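A minimal sketch of the flush idea; the structure and member names here are simplified stand-ins for the real deferred-list handling in kfd_svm.c:

```c
#include <linux/workqueue.h>

/* Sketch under assumptions: svm_range_list and its deferred work member are
 * stand-ins; only the drain pattern itself is the point. */
struct svm_range_list_sketch {
	struct work_struct deferred_list_work;	/* processes deferred unmap actions */
};

/* Before handling a new allocation, drain deferred actions left over from
 * earlier mapping changes (e.g. munmap) so stale ranges cannot race with
 * the new request. */
static void svm_range_drain_pending(struct svm_range_list_sketch *svms)
{
	flush_work(&svms->deferred_list_work);
}
```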
-
Committed by Alex Sierra

[Why]
Avoid conflicts between address ranges already mapped by the SVM mechanism and ranges that the same process tries to allocate again through ioctl_alloc, and vice versa.

[How]
For ioctl_alloc_memory_of_gpu allocations, check that the address range passed in does not already exist in the kfd_process svms->objects interval tree. For SVM allocations, look up the address range in the VM's VA interval tree for each pdd used by the kfd_process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
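A sketch of the overlap check on the SVM side, assuming svms->objects is a kernel interval tree of existing SVM ranges; the helper name is hypothetical:

```c
#include <linux/interval_tree.h>
#include <linux/types.h>

/* Minimal sketch: returns true if [start, last] intersects any existing SVM
 * range, in which case a regular ioctl_alloc of that VA range is rejected. */
static bool svm_range_overlaps(struct rb_root_cached *objects,
			       unsigned long start, unsigned long last)
{
	return interval_tree_iter_first(objects, start, last) != NULL;
}
```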
-
- 03 August 2021, 4 commits
-
-
Committed by Eric Huang

This is to work around a HW bug on other ASICs, and is based on reverting back the two commits below:
  drm/amdkfd: Add heavy-weight TLB flush after unmapping
  drm/amdkfd: Add memory sync before TLB flush on unmap
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 4bba567c. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 7ed9876c. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 430f8e6e. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 29 July 2021, 3 commits
-
-
Committed by Eric Huang

This reverts commit 4bba567c. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 7ed9876c. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 430f8e6e. Revert reason: the issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 13 July 2021, 6 commits
-
-
Committed by Eric Huang

This reverts commit 1098d658. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 31f33243. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 3be4dca1. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 1098d658. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 31f33243. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This reverts commit 3be4dca1. Reason for revert: it causes regressions on several ASICs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 16 June 2021, 1 commit
-
-
Committed by Felix Kuehling

When some GPUs don't support SVM, don't disable it for the entire process. That would be inconsistent with the information the process got from the topology, which indicates SVM support per GPU. Instead, disable SVM support only for the unsupported GPUs. This is done by checking any per-device attributes against the bitmap of supported GPUs. Also use the supported-GPU bitmap to initialize access bitmaps for new SVM address ranges. Don't handle recoverable page faults from unsupported GPUs. (I don't think there will be unsupported GPUs that can generate recoverable page faults, but better safe than sorry.)
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
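A sketch of the per-GPU check idea; MAX_GPU_INSTANCE and the bitmap member are stand-ins for the real kfd_process fields:

```c
#include <linux/bitmap.h>
#include <linux/types.h>

#define MAX_GPU_INSTANCE_SKETCH 64	/* illustrative bound, not the real constant */

/* Reject a per-device SVM attribute that names a GPU outside the supported
 * set, instead of failing SVM for the whole process. */
static bool svm_gpu_supported(const unsigned long *svm_supported_bitmap,
			      int gpu_idx)
{
	return gpu_idx >= 0 && gpu_idx < MAX_GPU_INSTANCE_SKETCH &&
	       test_bit(gpu_idx, svm_supported_bitmap);
}
```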
-
- 12 June 2021, 1 commit
-
-
Committed by Eric Huang

This fixes a failure when SDMA updates PTEs.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 05 June 2021, 3 commits
-
-
Committed by Eric Huang

This optimizes memory-mapping latency and also avoids a page fault in a corner case of changing a valid PDE into a PTE.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This is part of the memory-mapping optimization.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Eric Huang

This provides more TLB flush type options for different scenarios.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
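Illustrative only: the real enum and the flush entry point live in the KFD headers; the names and comments here approximate the intent (distinguishing cheap flushes after adding mappings from full flushes after unmapping):

```c
/* Hypothetical sketch of distinct TLB flush types; callers would pass one of
 * these to the flush helper instead of a single one-size-fits-all flush. */
enum kfd_tlb_flush_type_sketch {
	TLB_FLUSH_LEGACY,	/* default flush behaviour */
	TLB_FLUSH_LIGHTWEIGHT,	/* cheaper flush when only new mappings were added */
	TLB_FLUSH_HEAVYWEIGHT,	/* full flush, e.g. required after unmapping */
};
```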
-
- 21 April 2021, 6 commits
-
-
Committed by Felix Kuehling

Control whether to build SVM support into amdgpu with a Kconfig option. This makes it easier to disable it in production kernels if this new feature causes problems in production environments. Use "depends on" instead of "select" for DEVICE_PRIVATE, as is recommended for visible options.
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Alex Sierra

XNACK retries are used for page-fault recovery. Some AMD chip families support continuous retry while page table entries are invalid: the driver must handle the page-fault interrupt and fill in a valid entry for the GPU to continue. This ioctl allows enabling/disabling XNACK retries per KFD process.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
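A hypothetical sketch of what a per-process enable/disable argument for such an ioctl could look like; the authoritative layout and ioctl number are the ones added to the kfd_ioctl.h UAPI header, not these:

```c
#include <stdint.h>

/* Illustrative argument block: a single flag toggling XNACK retry for the
 * calling KFD process. */
struct kfd_xnack_mode_args_sketch {
	int32_t xnack_enabled;	/* non-zero: enable XNACK retry; 0: disable */
};

/* Userspace usage idea (ioctl request code is a placeholder, not the real one):
 *	struct kfd_xnack_mode_args_sketch args = { .xnack_enabled = 1 };
 *	ioctl(kfd_fd, KFD_IOC_SET_XNACK_MODE_SKETCH, &args);
 */
```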
-
Committed by Philip Yang

The svm range structure stores the range start address, size, attributes, flags, prefetch location, and a GPU bitmap indicating which GPUs the range maps to. The same virtual address is shared by the CPU and the GPUs. Each process has an svm range list that uses both an interval tree and a list to store all svm ranges registered by the process. The interval tree is used by the GPU VM fault handler and the CPU page fault handler to find the svm range structure for a specific address; the list is used to scan all ranges in the eviction restore work. No overlapping intervals [start, last] exist in the svms object interval tree: if the process registers a new range that overlaps an old range, the old range is split into two ranges, depending on whether the overlap is at the head or the tail of the old range. The attributes (preferred location, prefetch location, mapping flags, migration granularity) are applied to the svm range, and the mapping GPU index is stored in the bitmap.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
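A condensed sketch of the bookkeeping described above; the real struct svm_range in kfd_svm.h carries many more fields, so this only shows the dual interval-tree/list anchoring and the attribute storage:

```c
#include <linux/interval_tree.h>
#include <linux/list.h>
#include <linux/types.h>

/* Illustrative stand-in, not the real definition. */
struct svm_range_sketch {
	struct interval_tree_node it_node;	/* [start, last] lookup for fault handlers */
	struct list_head list;			/* walked by the eviction/restore work */
	u64 npages;				/* size of the range in pages */
	u32 preferred_loc;			/* preferred location attribute */
	u32 prefetch_loc;			/* prefetch location attribute */
	u32 flags;				/* mapping flags */
	u32 granularity;			/* migration granularity */
	unsigned long bitmap_access[2];		/* which GPUs may map this range */
};
```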
-
Committed by Philip Yang

Add the svm (shared virtual memory) ioctl data structure and API definition. The svm ioctl API is designed to be extensible in the future: all operations are provided by a single ioctl to preserve ioctl number space, and the argument structure ends with a variable-size array of attributes that can be used to set or get one or multiple attributes.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
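A sketch of the shape such a single extensible ioctl argument block takes; the field names are illustrative and the authoritative layout is the one added to kfd_ioctl.h:

```c
#include <stdint.h>

/* One (type, value) pair per attribute to set or get. */
struct svm_attribute_sketch {
	uint32_t type;
	uint32_t value;
};

/* Single argument block for all SVM operations: a fixed header plus a
 * variable-size attribute tail keeps the ABI extensible without consuming
 * additional ioctl numbers. */
struct svm_ioctl_args_sketch {
	uint64_t start_addr;			/* start of the address range */
	uint64_t size;				/* size of the address range */
	uint32_t op;				/* set attributes, get attributes, ... */
	uint32_t nattr;				/* number of entries in attrs[] */
	struct svm_attribute_sketch attrs[];	/* variable-size attribute array */
};
```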
-
Committed by Felix Kuehling

DRM render node file handles are used by the Thunk for CPU mapping of BOs via mmap, using the DRM render node of the GPU where the BO was allocated. DRM allows mmap access automatically when it creates a GEM handle for a BO, but KFD BOs don't have GEM handles, so KFD needs to manage access manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was used in the kfd_ioctl_acquire_vm call for the same GPU.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
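A minimal sketch of the grant, assuming the BO's drm_vma_offset_node and the drm_file saved at acquire-VM time are available; the helper name is hypothetical:

```c
#include <drm/drm_file.h>
#include <drm/drm_vma_manager.h>

/* Without a GEM handle, mmap permission has to be granted explicitly for the
 * render-node file that owns the VM; drm_vma_node_revoke() would undo it on
 * free. */
static int kfd_bo_allow_mmap(struct drm_vma_offset_node *vma_node,
			     struct drm_file *drm_file)
{
	return drm_vma_node_allow(vma_node, drm_file);
}
```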
-
Committed by Felix Kuehling

amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 10 April 2021, 1 commit
-
-
Committed by Alex Sierra

Remove per_device_list from kfd_process and replace it with an array of kfd_process_device pointers of MAX_GPU_INSTANCES size. This helps manage the kfd_process_devices bound to a specific kfd_process. The functions used by kfd_chardev to iterate over the list are also removed, since they are no longer valid; they are replaced by local loops iterating over the array.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
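A simplified sketch of the change in shape; the constant, member names, and counts are stand-ins for the real kfd_process layout:

```c
#include <stdint.h>

#define MAX_GPU_INSTANCES_SKETCH 64	/* illustrative bound */

struct kfd_process_device;		/* opaque here */

/* After the change: a fixed-size pointer array plus a count replaces the
 * linked per_device_list. */
struct kfd_process_sketch {
	struct kfd_process_device *pdds[MAX_GPU_INSTANCES_SKETCH];
	uint32_t n_pdds;
};

/* Iteration becomes a plain indexed loop instead of list walking:
 *	for (i = 0; i < p->n_pdds; i++)
 *		do_something(p->pdds[i]);
 */
```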
-
- 06 March 2021, 1 commit
-
-
Committed by Jay Cornwall

The trap handler is set per-process, per-device and is unrelated to queue management. Move the implementation closer to the TMA setup code.
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 09 December 2020, 2 commits
-
-
Committed by Felix Kuehling

Release the dmabuf reference before returning from kfd_ioctl_import_dmabuf. amdgpu_amdkfd_gpuvm_import_dmabuf takes a reference to the underlying GEM BO and doesn't keep the reference to the dmabuf wrapper.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
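A minimal sketch of the corrected reference flow; the function name is hypothetical and the actual import call and its error handling are elided:

```c
#include <linux/dma-buf.h>
#include <linux/err.h>

static int kfd_import_dmabuf_sketch(int dmabuf_fd)
{
	struct dma_buf *dmabuf = dma_buf_get(dmabuf_fd);
	int ret = 0;

	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	/* ... the import path takes its own reference to the underlying
	 * GEM BO here ... */

	/* Drop the wrapper reference before returning, success or failure,
	 * so the dma_buf is not leaked. */
	dma_buf_put(dmabuf);
	return ret;
}
```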
-
Committed by Felix Kuehling

Release the dmabuf reference before returning from kfd_ioctl_import_dmabuf. amdgpu_amdkfd_gpuvm_import_dmabuf takes a reference to the underlying GEM BO and doesn't keep the reference to the dmabuf wrapper.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 23 September 2020, 2 commits
-
-
Committed by Mukul Joshi

Move doorbell allocation for a process into the kfd device and allocate doorbell space in each PDD during process creation. Currently, KFD manages its own doorbell space, but for some devices amdgpu allocates the complete doorbell space instead of leaving a chunk for KFD to manage. In a system with a mix of such devices, KFD needs to request process doorbell space based on the type of device, either from amdgpu or from its own doorbell space.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Committed by Felix Kuehling

Remember the KFD module initialization status in a global variable and skip KFD device probing when the module was not initialized. Other amdgpu_amdkfd calls are then protected by the adev->kfd.dev check. Also print a clear error message when KFD disables itself; amdgpu continues its initialization even when KFD failed.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
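A sketch of the guard described above; the variable and function names are illustrative stand-ins, not the exact amdgpu_amdkfd symbols:

```c
#include <linux/types.h>

/* Set once module init succeeds; consulted before probing KFD devices. */
static bool kfd_initialized;

static void kgd2kfd_probe_sketch(void)
{
	if (!kfd_initialized) {
		/* KFD module init failed earlier: skip device probing so that
		 * later amdgpu_amdkfd calls see a NULL adev->kfd.dev and
		 * become no-ops, while amdgpu keeps initializing. */
		return;
	}
	/* ... normal KFD device probing ... */
}
```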
-
- 27 August 2020, 1 commit
-
-
Committed by Huang Rui

We still have a few IOMMU issues to address, so force Raven onto the "dgpu" path for the moment. This adds the fallback path to bypass the IOMMU if IOMMU v2 is disabled or the ACPI CRAT table is not correct.

v2: Use the ignore_crat parameter to decide whether to go with IOMMUv2.
v3: Align with the existing Thunk; don't change the way Raven works, only Renoir will use the "dgpu" path by default.
v4: Don't update the global ignore_crat in the driver, and revise the fallback function if the CRAT is broken.
v5: Refine the "ACPI CRAT good but no IOMMU support" case, and rename the title.
v6: Fix the issue of the dGPU being initialized first; just modify the reported value in node_show().

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-