- July 3, 2019 (1 commit)
-
-
Submitted by Jack Xiao
Just for cleanup.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- July 2, 2019 (1 commit)
-
-
Submitted by Jack Xiao
Since amdgpu always requests PCIe atomics, kfd does not need to duplicate the PCIe atomics enablement; referring to the result of amdgpu's request is enough.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
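As a rough illustration of the pattern described in this commit (one driver performs the PCIe atomics request and its consumer simply reads the cached result), a minimal userspace C sketch follows. The struct and function names (gpu_dev, pci_atomics_ok, base_driver_probe) are hypothetical, not the actual amdgpu/kfd code.

    /* Sketch only: "request once, read the cached result". */
    #include <stdbool.h>
    #include <stdio.h>

    struct gpu_dev {
        bool pci_atomics_ok;   /* filled in once by the base driver */
    };

    /* Base driver (amdgpu in the text) probes the capability exactly once. */
    static void base_driver_probe(struct gpu_dev *dev, bool root_supports_atomics)
    {
        dev->pci_atomics_ok = root_supports_atomics;
    }

    /* The compute driver (kfd in the text) no longer re-requests the
     * capability; it just consults the cached result. */
    static bool compute_driver_needs_fallback(const struct gpu_dev *dev)
    {
        return !dev->pci_atomics_ok;
    }

    int main(void)
    {
        struct gpu_dev dev;
        base_driver_probe(&dev, true);
        printf("fallback needed: %d\n", compute_driver_needs_fallback(&dev));
        return 0;
    }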
-
- June 28, 2019 (1 commit)
-
-
Submitted by shaoyunl
In an XGMI configuration, more than one asic can be reset at the same time. kfd is able to handle this, so there is no need to trigger the warning.
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- June 22, 2019 (2 commits)
-
-
Submitted by Oak Zeng
Previously kfd did not use GWS, so this mask was set to 0. Set it to 64 one-bits because kfd can now use all 64 GWS resources.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
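A tiny C sketch of the 64-bit all-ones mask this commit describes; the macro name GWS_MASK_ALL is made up for illustration.

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    #define GWS_MASK_ALL UINT64_C(0xFFFFFFFFFFFFFFFF)   /* all 64 GWS entries usable */

    int main(void)
    {
        uint64_t gws_mask = GWS_MASK_ALL;               /* equivalently: ~UINT64_C(0) */
        printf("gws mask = 0x%016" PRIx64 "\n", gws_mask);
        return 0;
    }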
-
Submitted by Philip Cox
KFD (kernel fusion driver) is the kernel driver for the compute backend of the usermode compute stack.
v2: squash in updates (Alex)
v3: squash in rebase fixes (Alex)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- June 21, 2019 (1 commit)
-
-
Submitted by Kent Russell
Add a folder structure under /sys/class/kfd/kfd/ called proc, which contains one subfolder per active KFD process (named after the process' PID), each containing a single file: pasid.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
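A minimal userspace sketch that reads a process' PASID through the sysfs layout described in this commit (/sys/class/kfd/kfd/proc/<pid>/pasid). The assumption that the file contains a plain decimal number is mine.

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        char path[128];
        unsigned int pasid = 0;

        if (argc < 2) {
            fprintf(stderr, "usage: %s <pid>\n", argv[0]);
            return 1;
        }
        snprintf(path, sizeof(path), "/sys/class/kfd/kfd/proc/%s/pasid", argv[1]);

        FILE *f = fopen(path, "r");
        if (!f) {
            perror("fopen");
            return 1;
        }
        if (fscanf(f, "%u", &pasid) == 1)
            printf("PASID of process %s: %u\n", argv[1], pasid);
        fclose(f);
        return 0;
    }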
-
- June 18, 2019 (4 commits)
-
-
Submitted by Oak Zeng
SDMA queue allocation requires the dqm lock, as it modifies global dqm members. Enclose it in the dqm_lock.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
The idea to break the circular lock dependency is to temporarily drop dqm lock before calling allocate_mqd. See callstack #1 below. [ 59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0 [ 513.604034] ====================================================== [ 513.604205] WARNING: possible circular locking dependency detected [ 513.604375] 4.18.0-kfd-root #2 Tainted: G W [ 513.604530] ------------------------------------------------------ [ 513.604699] kswapd0/611 is trying to acquire lock: [ 513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.605150] but task is already holding lock: [ 513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.605540] which lock already depends on the new lock. [ 513.605747] the existing dependency chain (in reverse order) is: [ 513.605944] -> #4 (&anon_vma->rwsem){++++}: [ 513.606106] __vma_adjust+0x147/0x7f0 [ 513.606231] __split_vma+0x179/0x190 [ 513.606353] mprotect_fixup+0x217/0x260 [ 513.606553] do_mprotect_pkey+0x211/0x380 [ 513.606752] __x64_sys_mprotect+0x1b/0x20 [ 513.606954] do_syscall_64+0x50/0x1a0 [ 513.607149] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.607380] -> #3 (&mapping->i_mmap_rwsem){++++}: [ 513.607678] rmap_walk_file+0x1f0/0x280 [ 513.607887] page_referenced+0xdd/0x180 [ 513.608081] shrink_page_list+0x853/0xcb0 [ 513.608279] shrink_inactive_list+0x33b/0x700 [ 513.608483] shrink_node_memcg+0x37a/0x7f0 [ 513.608682] shrink_node+0xd8/0x490 [ 513.608869] balance_pgdat+0x18b/0x3b0 [ 513.609062] kswapd+0x203/0x5c0 [ 513.609241] kthread+0x100/0x140 [ 513.609420] ret_from_fork+0x24/0x30 [ 513.609607] -> #2 (fs_reclaim){+.+.}: [ 513.609883] kmem_cache_alloc_trace+0x34/0x2e0 [ 513.610093] reservation_object_reserve_shared+0x139/0x300 [ 513.610326] ttm_bo_init_reserved+0x291/0x480 [ttm] [ 513.610567] amdgpu_bo_do_create+0x1d2/0x650 [amdgpu] [ 513.610811] amdgpu_bo_create+0x40/0x1f0 [amdgpu] [ 513.611041] amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu] [ 513.611290] amdgpu_bo_create_kernel+0x12/0x70 [amdgpu] [ 513.611584] amdgpu_ttm_init+0x2cb/0x560 [amdgpu] [ 513.611823] gmc_v9_0_sw_init+0x400/0x750 [amdgpu] [ 513.612491] amdgpu_device_init+0x14eb/0x1990 [amdgpu] [ 513.612730] amdgpu_driver_load_kms+0x78/0x290 [amdgpu] [ 513.612958] drm_dev_register+0x111/0x1a0 [ 513.613171] amdgpu_pci_probe+0x11c/0x1e0 [amdgpu] [ 513.613389] local_pci_probe+0x3f/0x90 [ 513.613581] pci_device_probe+0x102/0x1c0 [ 513.613779] driver_probe_device+0x2a7/0x480 [ 513.613984] __driver_attach+0x10a/0x110 [ 513.614179] bus_for_each_dev+0x67/0xc0 [ 513.614372] bus_add_driver+0x1eb/0x260 [ 513.614565] driver_register+0x5b/0xe0 [ 513.614756] do_one_initcall+0xac/0x357 [ 513.614952] do_init_module+0x5b/0x213 [ 513.615145] load_module+0x2542/0x2d30 [ 513.615337] __do_sys_finit_module+0xd2/0x100 [ 513.615541] do_syscall_64+0x50/0x1a0 [ 513.615731] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.615963] -> #1 (reservation_ww_class_mutex){+.+.}: [ 513.616293] amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu] [ 513.616554] init_mqd+0x223/0x260 [amdgpu] [ 513.616779] create_queue_nocpsch+0x4d9/0x600 [amdgpu] [ 513.617031] pqm_create_queue+0x37c/0x520 [amdgpu] [ 513.617270] kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu] [ 513.617522] kfd_ioctl+0x202/0x350 [amdgpu] [ 513.617724] do_vfs_ioctl+0x9f/0x6c0 [ 513.617914] ksys_ioctl+0x66/0x70 [ 513.618095] __x64_sys_ioctl+0x16/0x20 [ 513.618286] do_syscall_64+0x50/0x1a0 [ 513.618476] 
entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.618695] -> #0 (&dqm->lock_hidden){+.+.}: [ 513.618984] __mutex_lock+0x98/0x970 [ 513.619197] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.619459] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.619710] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.620103] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.620363] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.620614] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.620851] try_to_unmap_one+0x7fc/0x8f0 [ 513.621049] rmap_walk_anon+0x121/0x290 [ 513.621242] try_to_unmap+0x93/0xf0 [ 513.621428] shrink_page_list+0x606/0xcb0 [ 513.621625] shrink_inactive_list+0x33b/0x700 [ 513.621835] shrink_node_memcg+0x37a/0x7f0 [ 513.622034] shrink_node+0xd8/0x490 [ 513.622219] balance_pgdat+0x18b/0x3b0 [ 513.622410] kswapd+0x203/0x5c0 [ 513.622589] kthread+0x100/0x140 [ 513.622769] ret_from_fork+0x24/0x30 [ 513.622957] other info that might help us debug this: [ 513.623354] Chain exists of: &dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem [ 513.623900] Possible unsafe locking scenario: [ 513.624189] CPU0 CPU1 [ 513.624397] ---- ---- [ 513.624594] lock(&anon_vma->rwsem); [ 513.624771] lock(&mapping->i_mmap_rwsem); [ 513.625020] lock(&anon_vma->rwsem); [ 513.625253] lock(&dqm->lock_hidden); [ 513.625433] *** DEADLOCK *** [ 513.625783] 3 locks held by kswapd0/611: [ 513.625967] #0: 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 [ 513.626309] #1: 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.626671] #2: 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0 [ 513.627037] stack backtrace: [ 513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G W 4.18.0-kfd-root #2 [ 513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 513.627990] Call Trace: [ 513.628143] dump_stack+0x7c/0xbb [ 513.628315] print_circular_bug.isra.37+0x21b/0x228 [ 513.628581] __lock_acquire+0xf7d/0x1470 [ 513.628782] ? unwind_next_frame+0x6c/0x4f0 [ 513.628974] ? lock_acquire+0xec/0x1e0 [ 513.629154] lock_acquire+0xec/0x1e0 [ 513.629357] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.629587] __mutex_lock+0x98/0x970 [ 513.629790] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630047] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630309] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630562] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630816] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.631057] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.631288] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.631536] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.632076] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.632299] try_to_unmap_one+0x7fc/0x8f0 [ 513.632487] ? page_lock_anon_vma_read+0x68/0x250 [ 513.632690] rmap_walk_anon+0x121/0x290 [ 513.632875] try_to_unmap+0x93/0xf0 [ 513.633050] ? page_remove_rmap+0x330/0x330 [ 513.633239] ? rcu_read_unlock+0x60/0x60 [ 513.633422] ? page_get_anon_vma+0x160/0x160 [ 513.633613] shrink_page_list+0x606/0xcb0 [ 513.633800] shrink_inactive_list+0x33b/0x700 [ 513.633997] shrink_node_memcg+0x37a/0x7f0 [ 513.634186] ? shrink_node+0xd8/0x490 [ 513.634363] shrink_node+0xd8/0x490 [ 513.634537] balance_pgdat+0x18b/0x3b0 [ 513.634718] kswapd+0x203/0x5c0 [ 513.634887] ? wait_woken+0xb0/0xb0 [ 513.635062] kthread+0x100/0x140 [ 513.635231] ? 
balance_pgdat+0x3b0/0x3b0 [ 513.635414] ? kthread_delayed_work_timer_fn+0x80/0x80 [ 513.635626] ret_from_fork+0x24/0x30 [ 513.636042] Evicting PASID 32768 queues [ 513.936236] Restoring PASID 32768 queues [ 524.708912] Evicting PASID 32768 queues [ 524.999875] Restoring PASID 32768 queues
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
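The commit above avoids a reclaim-related lock inversion by releasing the dqm lock around an allocation that may sleep, then retaking it. A minimal pthread-based sketch of that general pattern follows; the names (dqm_lock, allocate_mqd, dqm_generation) mirror the commit text but the code is illustrative, not the actual kfd implementation.

    /* Sketch: drop the lock around a blocking allocation, retake it, re-validate. */
    #include <pthread.h>
    #include <stdlib.h>

    static pthread_mutex_t dqm_lock = PTHREAD_MUTEX_INITIALIZER;
    static int dqm_generation;             /* bumped whenever dqm state changes */

    static void *allocate_mqd(size_t size)
    {
        /* May block / trigger reclaim; must not be called under dqm_lock. */
        return malloc(size);
    }

    static int create_queue(size_t mqd_size)
    {
        pthread_mutex_lock(&dqm_lock);
        int seen = dqm_generation;          /* remember state before dropping the lock */
        pthread_mutex_unlock(&dqm_lock);

        void *mqd = allocate_mqd(mqd_size); /* safe: no dqm_lock held here */
        if (!mqd)
            return -1;

        pthread_mutex_lock(&dqm_lock);
        if (seen != dqm_generation) {
            /* dqm state changed while unlocked; caller must retry or bail out */
            pthread_mutex_unlock(&dqm_lock);
            free(mqd);
            return -1;
        }
        /* ... register the mqd with the queue manager under the lock ... */
        pthread_mutex_unlock(&dqm_lock);
        free(mqd);
        return 0;
    }

    int main(void) { return create_queue(256); }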
-
Submitted by Oak Zeng
This reverts commit 06b89b38. The fix was not correct: allocate_mqd can't be moved before allocate_sdma_queue, because it depends on q->properties->sdma_id, which is set in the latter.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
This reverts commit f77dac6c. The fix was not correct: allocate_mqd can't be moved before allocate_sdma_queue, because it depends on q->properties->sdma_id, which is set in the latter.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- June 14, 2019 (1 commit)
-
-
Submitted by Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this.
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
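A hedged kernel-style sketch of this guidance using the standard debugfs_create_dir()/debugfs_create_file() helpers: the caller simply ignores the return values rather than branching on them. The directory/file names and my_fops here are made up; this is not the actual kfd debugfs code.

    #include <linux/debugfs.h>

    static struct dentry *kfd_debugfs_root;

    void my_debugfs_init(const struct file_operations *my_fops)
    {
        kfd_debugfs_root = debugfs_create_dir("kfd", NULL);

        /* No error checking: if debugfs is unavailable the calls degrade
         * gracefully and the driver keeps working. */
        debugfs_create_file("mqds", 0444, kfd_debugfs_root, NULL, my_fops);
    }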
-
- June 12, 2019 (11 commits)
-
-
Submitted by Oak Zeng
We can't have devices that are not completely initialized in the kfd topology; otherwise there is a race condition when user space accesses a not-completely-initialized device. This also addresses an issue where kfd_topology_add_device accessed a NULL dqm pointer.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Move the HSA_CAP_ATS_PRESENT initialization logic from the kfd iommu code to the kfd topology code. This removes kfd_iommu_device_init's dependency on kfd_topology_add_device. Also remove duplicate code that sets the same flag.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
SDMA queue allocation requires the dqm lock, as it modifies global dqm members. Move up the dqm_lock so SDMA queue allocation is enclosed in the critical section. Move mqd allocation out of the critical section to avoid a circular lock dependency.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
The idea to break the circular lock dependency is to move allocate_mqd out of dqm lock protection. See callstack #1 below. [ 59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0 [ 513.604034] ====================================================== [ 513.604205] WARNING: possible circular locking dependency detected [ 513.604375] 4.18.0-kfd-root #2 Tainted: G W [ 513.604530] ------------------------------------------------------ [ 513.604699] kswapd0/611 is trying to acquire lock: [ 513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.605150] but task is already holding lock: [ 513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.605540] which lock already depends on the new lock. [ 513.605747] the existing dependency chain (in reverse order) is: [ 513.605944] -> #4 (&anon_vma->rwsem){++++}: [ 513.606106] __vma_adjust+0x147/0x7f0 [ 513.606231] __split_vma+0x179/0x190 [ 513.606353] mprotect_fixup+0x217/0x260 [ 513.606553] do_mprotect_pkey+0x211/0x380 [ 513.606752] __x64_sys_mprotect+0x1b/0x20 [ 513.606954] do_syscall_64+0x50/0x1a0 [ 513.607149] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.607380] -> #3 (&mapping->i_mmap_rwsem){++++}: [ 513.607678] rmap_walk_file+0x1f0/0x280 [ 513.607887] page_referenced+0xdd/0x180 [ 513.608081] shrink_page_list+0x853/0xcb0 [ 513.608279] shrink_inactive_list+0x33b/0x700 [ 513.608483] shrink_node_memcg+0x37a/0x7f0 [ 513.608682] shrink_node+0xd8/0x490 [ 513.608869] balance_pgdat+0x18b/0x3b0 [ 513.609062] kswapd+0x203/0x5c0 [ 513.609241] kthread+0x100/0x140 [ 513.609420] ret_from_fork+0x24/0x30 [ 513.609607] -> #2 (fs_reclaim){+.+.}: [ 513.609883] kmem_cache_alloc_trace+0x34/0x2e0 [ 513.610093] reservation_object_reserve_shared+0x139/0x300 [ 513.610326] ttm_bo_init_reserved+0x291/0x480 [ttm] [ 513.610567] amdgpu_bo_do_create+0x1d2/0x650 [amdgpu] [ 513.610811] amdgpu_bo_create+0x40/0x1f0 [amdgpu] [ 513.611041] amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu] [ 513.611290] amdgpu_bo_create_kernel+0x12/0x70 [amdgpu] [ 513.611584] amdgpu_ttm_init+0x2cb/0x560 [amdgpu] [ 513.611823] gmc_v9_0_sw_init+0x400/0x750 [amdgpu] [ 513.612491] amdgpu_device_init+0x14eb/0x1990 [amdgpu] [ 513.612730] amdgpu_driver_load_kms+0x78/0x290 [amdgpu] [ 513.612958] drm_dev_register+0x111/0x1a0 [ 513.613171] amdgpu_pci_probe+0x11c/0x1e0 [amdgpu] [ 513.613389] local_pci_probe+0x3f/0x90 [ 513.613581] pci_device_probe+0x102/0x1c0 [ 513.613779] driver_probe_device+0x2a7/0x480 [ 513.613984] __driver_attach+0x10a/0x110 [ 513.614179] bus_for_each_dev+0x67/0xc0 [ 513.614372] bus_add_driver+0x1eb/0x260 [ 513.614565] driver_register+0x5b/0xe0 [ 513.614756] do_one_initcall+0xac/0x357 [ 513.614952] do_init_module+0x5b/0x213 [ 513.615145] load_module+0x2542/0x2d30 [ 513.615337] __do_sys_finit_module+0xd2/0x100 [ 513.615541] do_syscall_64+0x50/0x1a0 [ 513.615731] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.615963] -> #1 (reservation_ww_class_mutex){+.+.}: [ 513.616293] amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu] [ 513.616554] init_mqd+0x223/0x260 [amdgpu] [ 513.616779] create_queue_nocpsch+0x4d9/0x600 [amdgpu] [ 513.617031] pqm_create_queue+0x37c/0x520 [amdgpu] [ 513.617270] kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu] [ 513.617522] kfd_ioctl+0x202/0x350 [amdgpu] [ 513.617724] do_vfs_ioctl+0x9f/0x6c0 [ 513.617914] ksys_ioctl+0x66/0x70 [ 513.618095] __x64_sys_ioctl+0x16/0x20 [ 513.618286] do_syscall_64+0x50/0x1a0 [ 513.618476] 
entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 513.618695] -> #0 (&dqm->lock_hidden){+.+.}: [ 513.618984] __mutex_lock+0x98/0x970 [ 513.619197] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.619459] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.619710] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.620103] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.620363] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.620614] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.620851] try_to_unmap_one+0x7fc/0x8f0 [ 513.621049] rmap_walk_anon+0x121/0x290 [ 513.621242] try_to_unmap+0x93/0xf0 [ 513.621428] shrink_page_list+0x606/0xcb0 [ 513.621625] shrink_inactive_list+0x33b/0x700 [ 513.621835] shrink_node_memcg+0x37a/0x7f0 [ 513.622034] shrink_node+0xd8/0x490 [ 513.622219] balance_pgdat+0x18b/0x3b0 [ 513.622410] kswapd+0x203/0x5c0 [ 513.622589] kthread+0x100/0x140 [ 513.622769] ret_from_fork+0x24/0x30 [ 513.622957] other info that might help us debug this: [ 513.623354] Chain exists of: &dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem [ 513.623900] Possible unsafe locking scenario: [ 513.624189] CPU0 CPU1 [ 513.624397] ---- ---- [ 513.624594] lock(&anon_vma->rwsem); [ 513.624771] lock(&mapping->i_mmap_rwsem); [ 513.625020] lock(&anon_vma->rwsem); [ 513.625253] lock(&dqm->lock_hidden); [ 513.625433] *** DEADLOCK *** [ 513.625783] 3 locks held by kswapd0/611: [ 513.625967] #0: 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30 [ 513.626309] #1: 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250 [ 513.626671] #2: 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0 [ 513.627037] stack backtrace: [ 513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G W 4.18.0-kfd-root #2 [ 513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 513.627990] Call Trace: [ 513.628143] dump_stack+0x7c/0xbb [ 513.628315] print_circular_bug.isra.37+0x21b/0x228 [ 513.628581] __lock_acquire+0xf7d/0x1470 [ 513.628782] ? unwind_next_frame+0x6c/0x4f0 [ 513.628974] ? lock_acquire+0xec/0x1e0 [ 513.629154] lock_acquire+0xec/0x1e0 [ 513.629357] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.629587] __mutex_lock+0x98/0x970 [ 513.629790] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630047] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630309] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630562] evict_process_queues_nocpsch+0x26/0x140 [amdgpu] [ 513.630816] kfd_process_evict_queues+0x3b/0xb0 [amdgpu] [ 513.631057] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu] [ 513.631288] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu] [ 513.631536] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu] [ 513.632076] __mmu_notifier_invalidate_range_start+0x70/0xb0 [ 513.632299] try_to_unmap_one+0x7fc/0x8f0 [ 513.632487] ? page_lock_anon_vma_read+0x68/0x250 [ 513.632690] rmap_walk_anon+0x121/0x290 [ 513.632875] try_to_unmap+0x93/0xf0 [ 513.633050] ? page_remove_rmap+0x330/0x330 [ 513.633239] ? rcu_read_unlock+0x60/0x60 [ 513.633422] ? page_get_anon_vma+0x160/0x160 [ 513.633613] shrink_page_list+0x606/0xcb0 [ 513.633800] shrink_inactive_list+0x33b/0x700 [ 513.633997] shrink_node_memcg+0x37a/0x7f0 [ 513.634186] ? shrink_node+0xd8/0x490 [ 513.634363] shrink_node+0xd8/0x490 [ 513.634537] balance_pgdat+0x18b/0x3b0 [ 513.634718] kswapd+0x203/0x5c0 [ 513.634887] ? wait_woken+0xb0/0xb0 [ 513.635062] kthread+0x100/0x140 [ 513.635231] ? 
balance_pgdat+0x3b0/0x3b0 [ 513.635414] ? kthread_delayed_work_timer_fn+0x80/0x80 [ 513.635626] ret_from_fork+0x24/0x30 [ 513.636042] Evicting PASID 32768 queues [ 513.936236] Restoring PASID 32768 queues [ 524.708912] Evicting PASID 32768 queues [ 524.999875] Restoring PASID 32768 queues
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Introduce a new mqd allocation interface and split the original init_mqd function into two functions: allocate_mqd and init_mqd. Also rename uninit_mqd to free_mqd. This is preparation work to fix a circular lock dependency.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
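A hedged sketch of the allocate/init/free split this commit describes, expressed as a small ops table. The struct layout and field names are hypothetical; only the shape of the interface (allocation separated from initialization, so the blocking part can run outside the lock) follows the commit text.

    #include <stdlib.h>
    #include <string.h>

    struct mqd_buf {
        void  *cpu_ptr;
        size_t size;
    };

    struct mqd_manager_ops {
        int  (*allocate_mqd)(struct mqd_buf *mqd, size_t size);   /* may sleep */
        void (*init_mqd)(struct mqd_buf *mqd);                    /* fills in state */
        void (*free_mqd)(struct mqd_buf *mqd);                    /* was "uninit_mqd" */
    };

    static int generic_allocate_mqd(struct mqd_buf *mqd, size_t size)
    {
        mqd->cpu_ptr = malloc(size);
        mqd->size = size;
        return mqd->cpu_ptr ? 0 : -1;
    }

    static void generic_init_mqd(struct mqd_buf *mqd)
    {
        memset(mqd->cpu_ptr, 0, mqd->size);   /* default-initialize the descriptor */
    }

    static void generic_free_mqd(struct mqd_buf *mqd)
    {
        free(mqd->cpu_ptr);
        mqd->cpu_ptr = NULL;
    }

    static const struct mqd_manager_ops generic_ops = {
        .allocate_mqd = generic_allocate_mqd,
        .init_mqd     = generic_init_mqd,
        .free_mqd     = generic_free_mqd,
    };

    int main(void)
    {
        struct mqd_buf mqd;
        if (generic_ops.allocate_mqd(&mqd, 256))   /* can run outside any lock */
            return 1;
        generic_ops.init_mqd(&mqd);                /* can run under the dqm lock */
        generic_ops.free_mqd(&mqd);
        return 0;
    }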
-
Submitted by Oak Zeng
This is preparation work to fix a circular lock dependency. No logic change.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Also call load_mqd with the current->mm struct. The mm struct is used to read back the user wptr of the queue.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Don't do the same for compute queues.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Jay Cornwall
Ported from gfx8; no changes in register setup.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Translate the queue priority into a pipe priority and write it to the MQDs. The priority values are used to perform queue and pipe arbitration.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
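A hedged sketch of mapping a fine-grained per-queue priority onto a coarser pipe priority before it is written into the MQD. The enum values, the 0..15 priority range, and the thresholds below are invented for illustration; the real driver uses its own mapping table.

    #include <stdio.h>

    enum pipe_priority { PIPE_PRIORITY_LOW, PIPE_PRIORITY_MEDIUM, PIPE_PRIORITY_HIGH };

    /* Assume queue priorities run 0..15, higher numbers being more important. */
    static enum pipe_priority queue_to_pipe_priority(unsigned int queue_priority)
    {
        if (queue_priority <= 5)
            return PIPE_PRIORITY_LOW;
        if (queue_priority <= 10)
            return PIPE_PRIORITY_MEDIUM;
        return PIPE_PRIORITY_HIGH;
    }

    int main(void)
    {
        for (unsigned int p = 0; p <= 15; p++)
            printf("queue priority %2u -> pipe priority %d\n",
                   p, queue_to_pipe_priority(p));
        return 0;
    }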
-
Submitted by Felix Kuehling
Always mark evicted queues with q->properties.is_evicted = true, even queues that are inactive for other reasons. This simplifies maintaining the eviction state, as it no longer requires updating is_evicted when other queue-activation conditions change. On the other hand, we now need to check those other activation conditions whenever an evicted queue is restored. To minimize code duplication, move the queue activation check into a macro so it can be maintained in one central place.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Cox <Philip.Cox@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
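A hedged sketch of centralizing the "is this queue active?" check in a single macro, as this commit describes. The macro name and the specific conditions are illustrative, not the real kfd definition.

    #include <stdbool.h>
    #include <stdio.h>

    struct queue_properties {
        bool is_evicted;
        bool is_mapped_to_gpu;
        unsigned int queue_size;
    };

    /* One central place that every caller (creation, restore, update) uses. */
    #define QUEUE_IS_ACTIVE(props) \
        (!(props).is_evicted && (props).is_mapped_to_gpu && (props).queue_size > 0)

    int main(void)
    {
        struct queue_properties q = { .is_evicted = false,
                                      .is_mapped_to_gpu = true,
                                      .queue_size = 4096 };
        printf("active: %d\n", QUEUE_IS_ACTIVE(q));
        q.is_evicted = true;                 /* eviction always flips the flag */
        printf("active after evict: %d\n", QUEUE_IS_ACTIVE(q));
        return 0;
    }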
-
- May 31, 2019 (1 commit)
-
-
Submitted by Oak Zeng
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- May 30, 2019 (1 commit)
-
-
Submitted by Colin Ian King
The pointer dev is set to NULL yet it is dereferenced when checking dev->dqm->sched_policy. Fix this by performing the check on dev->dqm->sched_policy after dev has been assigned and NULL-checked. Also remove the now-redundant NULL assignment to dev.
Addresses-Coverity: ("Explicit null dereference")
Fixes: 1a058c33 ("drm/amdkfd: New IOCTL to allocate queue GWS")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
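A generic sketch of the pattern this fix restores: dereference only after the pointer has been assigned and checked. The structs and the lookup helper are hypothetical stand-ins, not the kfd code touched by the commit.

    #include <stdio.h>

    struct dqm { int sched_policy; };
    struct dev { struct dqm *dqm; };

    /* Stand-in lookup; in the real code this would find the device by id. */
    static struct dev *lookup_device(unsigned int id)
    {
        (void)id;
        return NULL;   /* simulate "device not found" */
    }

    static int check_policy(unsigned int id, int wanted_policy)
    {
        struct dev *dev = lookup_device(id);

        /* The bug was evaluating dev->dqm->sched_policy before this check;
         * the fix is to test the pointer first. */
        if (!dev)
            return -1;

        return dev->dqm->sched_policy == wanted_policy ? 0 : -1;
    }

    int main(void)
    {
        printf("check_policy: %d\n", check_policy(0, 0));
        return 0;
    }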
-
- May 29, 2019 (6 commits)
-
-
Submitted by Oak Zeng
Add a field in the map_queues packet to indicate whether this is a GWS control queue. Only one queue per process can be the GWS control queue. Also change the num_gws field in the map_process packet to 7 bits.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Add a new kfd ioctl to allocate queue GWS. The queue's GWS is released on queue destroy.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Add functions in the process queue manager to set/unset queue GWS, and set the process's number of GWS used. Currently only one queue per process can use GWS, and it uses all of it.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
On device initialization, KFD allocates all (64) GWS, which are shared by all KFD processes.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Add an amdgpu_amdkfd interface to get num_gws, and add num_gws to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report num_gws if the MEC firmware supports GWS barriers. Currently this is determined by a module parameter, which will be replaced with an MEC firmware version check when the firmware is ready.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
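A small userspace sketch that scans the topology properties file mentioned above for the num_gws entry. The node index (0) and the "name value" line format are assumptions about the sysfs layout.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *path = "/sys/class/kfd/kfd/topology/nodes/0/properties";
        FILE *f = fopen(path, "r");
        char name[64];
        unsigned long long value;

        if (!f) {
            perror("fopen");
            return 1;
        }
        while (fscanf(f, "%63s %llu", name, &value) == 2) {
            if (strcmp(name, "num_gws") == 0)
                printf("num_gws = %llu\n", value);
        }
        fclose(f);
        return 0;
    }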
-
Submitted by Oak Zeng
TTM doesn't support CPU mapping of sg-type BOs (under which the mmio BO is created). Switch mapping of the mmio page to the kfd device file.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- May 25, 2019 (10 commits)
-
-
Submitted by Amber Lin
A multi-socket server can have multiple PCIe segments, so the BDF alone is not enough to distinguish each GPU. Also take the domain number into account when generating the gpu_id.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
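A hedged sketch of folding the PCI domain into a device identifier along with bus/device/function. The packing scheme below is invented for illustration only; the real driver derives gpu_id differently (the commit only says the domain is now taken into account).

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t make_location_id(uint16_t domain, uint8_t bus,
                                     uint8_t device, uint8_t function)
    {
        /* domain:16 | bus:8 | device:5 | function:3 packed into one value */
        return ((uint64_t)domain << 16) |
               ((uint64_t)bus    <<  8) |
               ((uint64_t)(device & 0x1f) << 3) |
               (uint64_t)(function & 0x7);
    }

    int main(void)
    {
        /* Two GPUs with an identical BDF on different segments now differ. */
        printf("0000:04:00.0 -> 0x%llx\n",
               (unsigned long long)make_location_id(0x0000, 0x04, 0x00, 0));
        printf("0001:04:00.0 -> 0x%llx\n",
               (unsigned long long)make_location_id(0x0001, 0x04, 0x00, 0));
        return 0;
    }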
-
Submitted by Kent Russell
Add the VegaM information to KFD.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Felix Kuehling
Fix a circular lock dependency exposed under userptr memory pressure. The DQM lock is the only one taken inside the MMU notifier. We need to make sure that no reclaim is done under this lock, and that no other locks are taken under which reclaim is possible.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
The alloc format was never really supported by the MEC firmware; the firmware always does one-per-pipe allocation.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Expose the available numbers of both SDMA queue types in the topology.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
The existing QUEUE_TYPE_SDMA means PCIe-optimized SDMA queues. Introduce a new QUEUE_TYPE_SDMA_XGMI, optimized for non-PCIe transfers such as XGMI.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
The previous code assumed there are two SDMA engines. This is not true for all ASICs; e.g., Raven only has 1 SDMA engine. Fix the issue by using the SDMA engine number info in device_info.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Yong Zhao
This avoids duplicated code.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
Instead of allocating the HIQ and SDMA MQDs from the sub-allocator, allocate them from an MQD trunk pool. This is done for all ASICs.
Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Submitted by Oak Zeng
This is preparation work to introduce more MQD allocation schemes.
Signed-off-by: Oak Zeng <ozeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-