1. 24 Jun, 2022 1 commit
  2. 23 Jun, 2022 2 commits
    • drm/amdkfd: Free queue after unmap queue success · ab8529b0
      Committed by Philip Yang
      After a queue is successfully unmapped or removed from MES, free the
      queue's sysfs entries and doorbell, and remove it from the queue list.
      Otherwise, the application may destroy the queue again, causing the
      kernel warnings or crash backtraces below.
      
      For outstanding queues that the application either forgot to destroy or
      failed to destroy, kfd_process_notifier_release will remove the queue
      sysfs entries and kfd_process_wq_release will free the queue doorbell.
      
      v2: decrement_queue_count for MES queue
      
       refcount_t: underflow; use-after-free.
       WARNING: CPU: 7 PID: 3053 at lib/refcount.c:28
        Call Trace:
         kobject_put+0xd6/0x1a0
         kfd_procfs_del_queue+0x27/0x30 [amdgpu]
         pqm_destroy_queue+0xeb/0x240 [amdgpu]
         kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
         kfd_ioctl+0x27d/0x500 [amdgpu]
         do_syscall_64+0x35/0x80
      
       WARNING: CPU: 2 PID: 3053 at drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device_queue_manager.c:400
        Call Trace:
         deallocate_doorbell.isra.0+0x39/0x40 [amdgpu]
         destroy_queue_cpsch+0xb3/0x270 [amdgpu]
         pqm_destroy_queue+0x108/0x240 [amdgpu]
         kfd_ioctl_destroy_queue+0x32/0x70 [amdgpu]
         kfd_ioctl+0x27d/0x500 [amdgpu]
      
       general protection fault, probably for non-canonical address
      0xdead000000000108:
       Call Trace:
        pqm_destroy_queue+0xf0/0x200 [amdgpu]
        kfd_ioctl_destroy_queue+0x2f/0x60 [amdgpu]
        kfd_ioctl+0x19b/0x600 [amdgpu]
      Signed-off-by: Philip Yang <Philip.Yang@amd.com>
      Reviewed-by: Graham Sider <Graham.Sider@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
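The release ordering described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_queue`, `model_unmap` and `model_destroy_queue` are hypothetical names, not the amdkfd functions, and the boolean fields merely stand in for the sysfs entry, the doorbell and queue-list membership.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a queue; not the amdkfd structures. */
struct model_queue {
    bool mapped;       /* still known to the scheduler */
    bool has_sysfs;    /* sysfs entry exists */
    bool has_doorbell; /* doorbell is allocated */
    bool on_list;      /* linked into the process queue list */
};

/* Stand-in for the unmap step; fails when fail_unmap is set. */
static int model_unmap(struct model_queue *q, bool fail_unmap)
{
    if (fail_unmap)
        return -1;
    q->mapped = false;
    return 0;
}

/* Release per-queue resources only after the unmap step succeeded. */
static int model_destroy_queue(struct model_queue *q, bool fail_unmap)
{
    assert(q != NULL);

    if (!q->on_list)
        return -2; /* already destroyed: nothing may be freed twice */

    int ret = model_unmap(q, fail_unmap);
    if (ret != 0)
        return ret; /* keep everything intact for a retry or process cleanup */

    q->has_sysfs = false;
    q->has_doorbell = false;
    q->on_list = false;
    return 0;
}
```

In this model a failed unmap leaves the queue fully intact, and a second destroy after a successful one is rejected instead of releasing the same resources again.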
    • drm/amdkfd: Add queue to MES if it becomes active · f4f9b827
      Committed by Philip Yang
      We remove the user queue from the MES scheduler to update its queue
      properties. If the queue becomes active after the update, add it back to
      the MES scheduler so that it can handle command packet submission.
      
      v2: don't break pqm_set_gws
      Signed-off-by: Philip Yang <Philip.Yang@amd.com>
      Reviewed-by: Graham Sider <Graham.Sider@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
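The remove, update, re-add sequence from this commit message might look roughly like the following user-space model. All names (`upd_queue`, `upd_is_active`, `upd_update_queue`) are hypothetical, and the rule that a queue is active when its ring size is non-zero is an assumption made only for this sketch, not the driver's actual criterion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a user queue; not the amdkfd structures. */
struct upd_queue {
    bool in_scheduler;      /* currently added to the scheduler */
    bool active;            /* able to accept command packets */
    unsigned int ring_size; /* example property being updated */
};

/* Assumed activity rule for this sketch: a queue with a ring is active. */
static bool upd_is_active(const struct upd_queue *q)
{
    return q->ring_size != 0;
}

/* Update a property while the queue is out of the scheduler. */
static void upd_update_queue(struct upd_queue *q, unsigned int new_ring_size)
{
    assert(q != NULL);

    q->in_scheduler = false; /* remove before changing properties */
    q->ring_size = new_ring_size;
    q->active = upd_is_active(q);

    if (q->active)
        q->in_scheduler = true; /* add back only if the queue is active */
}
```

The point of the sketch is the final check: a queue that was inactive before the update but is active afterwards ends up in the scheduler, while an inactive one stays out.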
  3. 16 May, 2022 1 commit
  4. 04 May, 2022 1 commit
  5. 28 Apr, 2022 1 commit
  6. 21 Apr, 2022 1 commit
  7. 10 Mar, 2022 1 commit
  8. 05 Mar, 2022 1 commit
    • drm/amdkfd: judge get_atc_vmid_pasid_mapping_info before call · c8b0507f
      Committed by Yifan Zhang
      Fix the NULL pointer dereference:
      
      [ 3076.255609] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [ 3076.255624] #PF: supervisor instruction fetch in kernel mode
      [ 3076.255637] #PF: error_code(0x0010) - not-present page
      [ 3076.255649] PGD 0 P4D 0
      [ 3076.255660] Oops: 0010 [#1] SMP NOPTI
      [ 3076.255669] CPU: 20 PID: 2415 Comm: kfdtest Tainted: G        W  OE     5.11.0-41-generic #45~20.04.1-Ubuntu
      [ 3076.255691] Hardware name: AMD Splinter/Splinter-RPL, BIOS VS2326337N.FD 02/07/2022
      [ 3076.255706] RIP: 0010:0x0
      [ 3076.255718] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
      [ 3076.255732] RSP: 0018:ffffb64283e3fc10 EFLAGS: 00010297
      [ 3076.255744] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000027
      [ 3076.255759] RDX: ffffb64283e3fc1e RSI: 0000000000000008 RDI: ffff8c7a87f60000
      [ 3076.255776] RBP: ffffb64283e3fc78 R08: ffff8c7d88518ac0 R09: ffffb64283e3fa60
      [ 3076.255791] R10: 0000000000000001 R11: 0000000000000001 R12: 000000000000000f
      [ 3076.255805] R13: ffff8c7bdcea5800 R14: ffff8c7a9f3f3000 R15: ffff8c7a8696bc00
      [ 3076.255820] FS:  0000000000000000(0000) GS:ffff8c7d88500000(0000) knlGS:0000000000000000
      [ 3076.255839] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 3076.255851] CR2: ffffffffffffffd6 CR3: 0000000109e3c000 CR4: 0000000000750ee0
      [ 3076.255866] PKRU: 55555554
      [ 3076.255873] Call Trace:
      [ 3076.255884]  dbgdev_wave_reset_wavefronts+0x72/0x160 [amdgpu]
      [ 3076.256025]  process_termination_cpsch.cold+0x26/0x2f [amdgpu]
      [ 3076.256182]  ? ktime_get_mono_fast_ns+0x4e/0xa0
      [ 3076.256196]  kfd_process_dequeue_from_all_devices+0x49/0x70 [amdgpu]
      [ 3076.256328]  kfd_process_notifier_release+0x187/0x2b0 [amdgpu]
      [ 3076.256451]  ? mn_itree_inv_end+0xdc/0x110
      [ 3076.256463]  __mmu_notifier_release+0x74/0x1f0
      [ 3076.256474]  exit_mmap+0x170/0x200
      [ 3076.256484]  ? __handle_mm_fault+0x677/0x920
      [ 3076.256496]  ? _cond_resched+0x19/0x30
      [ 3076.256507]  mmput+0x5d/0x130
      [ 3076.256518]  do_exit+0x332/0xaf0
      [ 3076.256526]  ? handle_mm_fault+0xd7/0x2b0
      [ 3076.256537]  do_group_exit+0x43/0xa0
      [ 3076.256548]  __x64_sys_exit_group+0x18/0x20
      [ 3076.256559]  do_syscall_64+0x38/0x90
      [ 3076.256569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
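The trace above faults on an instruction fetch at address 0, which is what calling through a NULL function pointer looks like. A minimal sketch of the guard the commit title describes follows, assuming an optional callback in an ops table. `model_ops`, `sample_get_mapping`, `model_find_vmid` and the VMID count of 16 are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_VMIDS 16 /* assumed range for this sketch */

/* Hypothetical ops table with an optional callback. */
struct model_ops {
    /* May be NULL when the backend does not provide it. */
    bool (*get_mapping)(unsigned int vmid, unsigned int *pasid);
};

/* Sample backend used only for illustration: VMID n maps to PASID 100 + n. */
static bool sample_get_mapping(unsigned int vmid, unsigned int *pasid)
{
    *pasid = 100 + vmid;
    return true;
}

/*
 * Return the VMID mapped to pasid, -1 if the callback is missing,
 * or -2 if no VMID maps to it.
 */
static int model_find_vmid(const struct model_ops *ops, unsigned int pasid)
{
    assert(ops != NULL);

    if (!ops->get_mapping)
        return -1; /* never call through a NULL function pointer */

    for (unsigned int vmid = 0; vmid < MODEL_NUM_VMIDS; vmid++) {
        unsigned int mapped;

        if (ops->get_mapping(vmid, &mapped) && mapped == pasid)
            return (int)vmid;
    }
    return -2;
}
```

With an ops table whose callback is NULL, `model_find_vmid` returns an error instead of jumping to address 0.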
  9. 15 Feb, 2022 3 commits
  10. 10 Feb, 2022 2 commits
  11. 08 Feb, 2022 4 commits
  12. 12 Jan, 2022 1 commit
  13. 29 Dec, 2021 2 commits
  14. 02 Dec, 2021 1 commit
  15. 23 Nov, 2021 2 commits
  16. 18 Nov, 2021 9 commits
  17. 06 Nov, 2021 1 commit
  18. 29 Oct, 2021 1 commit
  19. 12 Aug, 2021 1 commit
  20. 23 Jul, 2021 3 commits
  21. 19 Jun, 2021 1 commit