1. 03 Mar, 2022 1 commit
  2. 18 Feb, 2022 2 commits
  3. 15 Feb, 2022 1 commit
  4. 15 Jan, 2022 2 commits
  5. 05 Oct, 2021 2 commits
  6. 21 Aug, 2021 2 commits
    • M
      drm/amdgpu: Cancel delayed work when GFXOFF is disabled · 32bc8f83
      Committed by Michel Dänzer
      schedule_delayed_work does not push back the work if it was already
      scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
      after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
      was disabled and re-enabled again during those 100 ms.
      
      This resulted in frame drops / stutter with the upcoming mutter 41
      release on Navi 14, due to constantly enabling GFXOFF in the HW and
      disabling it again (for getting the GPU clock counter).
      
      To fix this, call cancel_delayed_work_sync when the disable count
      transitions from 0 to 1, and only schedule the delayed work on the
      reverse transition, not if the disable count was already 0. This makes
      sure the delayed work doesn't run at unexpected times, and allows it to
      be lock-free.
      
      v2:
      * Use cancel_delayed_work_sync & mutex_trylock instead of
        mod_delayed_work.
      v3:
      * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
      v4:
      * Fix race condition between amdgpu_gfx_off_ctrl incrementing
        adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
        checking for it to be 0 (Evan Quan)
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
      Acked-by: Christian König <christian.koenig@amd.com> # v3
      Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      32bc8f83
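The counting scheme described in this commit message can be sketched in user space. This is a minimal single-threaded simulation, not the driver code: `struct gfx_sim`, `sim_gfx_off_ctrl`, `sim_schedule`, `sim_cancel` and `sim_tick` are hypothetical names, the clock is a plain millisecond counter passed in by the caller, and the 100 ms delay is taken from the message above. It shows the two transitions the message names: cancel the pending work when the disable count goes from 0 to 1, and schedule it only when the count drops back to 0.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_GFX_OFF_DELAY_MS 100 /* delay quoted in the commit message */

/* Hypothetical stand-in for the relevant driver state. */
struct gfx_sim {
    unsigned req_count;     /* outstanding "keep GFXOFF disabled" requests */
    bool gfx_off_state;     /* true while GFXOFF is enabled in the "hardware" */
    bool work_pending;      /* simulated delayed work item */
    unsigned long work_due; /* simulated time (ms) at which the work runs */
};

/* Like schedule_delayed_work: does not push back work that is already pending. */
static void sim_schedule(struct gfx_sim *s, unsigned long now)
{
    if (s->work_pending)
        return;
    s->work_pending = true;
    s->work_due = now + SIM_GFX_OFF_DELAY_MS;
}

/* Stand-in for cancel_delayed_work_sync in this single-threaded model. */
static void sim_cancel(struct gfx_sim *s)
{
    s->work_pending = false;
}

/* enable == false takes a "disable GFXOFF" request, enable == true drops one. */
static void sim_gfx_off_ctrl(struct gfx_sim *s, bool enable, unsigned long now)
{
    if (enable) {
        if (s->req_count == 0)
            return; /* unbalanced enable, ignore */
        /* Schedule only on the 1 -> 0 transition. */
        if (--s->req_count == 0 && !s->gfx_off_state)
            sim_schedule(s, now);
    } else {
        /* Cancel only on the 0 -> 1 transition. */
        if (s->req_count == 0) {
            sim_cancel(s);
            s->gfx_off_state = false;
        }
        s->req_count++;
    }
}

/* Runs the delayed work if it is due. */
static void sim_tick(struct gfx_sim *s, unsigned long now)
{
    if (s->work_pending && now >= s->work_due) {
        /* The work can only be pending while no disable request is held. */
        assert(s->req_count == 0 && !s->gfx_off_state);
        s->work_pending = false;
        s->gfx_off_state = true;
    }
}
```

With this model, a disable/enable pair at 0 ms followed by another pair at 50 ms and 60 ms leaves the work due at 160 ms rather than 100 ms, which is the behaviour the fix is after.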
    • M
      drm/amdgpu: Cancel delayed work when GFXOFF is disabled · 90a92662
      Committed by Michel Dänzer
      schedule_delayed_work does not push back the work if it was already
      scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
      after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
      was disabled and re-enabled again during those 100 ms.
      
      This resulted in frame drops / stutter with the upcoming mutter 41
      release on Navi 14, due to constantly enabling GFXOFF in the HW and
      disabling it again (for getting the GPU clock counter).
      
      To fix this, call cancel_delayed_work_sync when the disable count
      transitions from 0 to 1, and only schedule the delayed work on the
      reverse transition, not if the disable count was already 0. This makes
      sure the delayed work doesn't run at unexpected times, and allows it to
      be lock-free.
      
      v2:
      * Use cancel_delayed_work_sync & mutex_trylock instead of
        mod_delayed_work.
      v3:
      * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
      v4:
      * Fix race condition between amdgpu_gfx_off_ctrl incrementing
        adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
        checking for it to be 0 (Evan Quan)
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
      Acked-by: Christian König <christian.koenig@amd.com> # v3
      Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      90a92662
  7. 17 Aug, 2021 1 commit
  8. 20 May, 2021 1 commit
  9. 10 Apr, 2021 5 commits
  10. 24 Mar, 2021 1 commit
  11. 10 Feb, 2021 1 commit
  12. 14 Nov, 2020 2 commits
  13. 13 Nov, 2020 1 commit
  14. 17 Oct, 2020 1 commit
  15. 16 Sep, 2020 1 commit
  16. 25 Aug, 2020 2 commits
  17. 15 Aug, 2020 2 commits
  18. 05 Aug, 2020 1 commit
  19. 28 Jul, 2020 1 commit
    • D
      drm/amdgpu: fix system hang issue during GPU reset · df9c8d1a
      Committed by Dennis Li
      When the GPU hangs, the driver has multiple paths into
      amdgpu_device_gpu_recover; the atomic adev->in_gpu_reset and
      hive->in_reset are used to avoid re-entering GPU recovery.
      
      During GPU reset and resume, it is unsafe for other threads to access
      the GPU, which may cause the reset to fail. Therefore a new rw_semaphore,
      adev->reset_sem, is introduced to protect the GPU from being accessed by
      external threads during recovery.
      
      v2:
      1. add rwlock for some ioctls, debugfs and the file-close function.
      2. change to use dqm->is_resetting and dqm_lock for protection in the
      kfd driver.
      3. remove try_lock and make adev->in_gpu_reset atomic, to avoid
      re-entering GPU recovery for the same GPU hang.
      
      v3:
      1. change back to using adev->reset_sem to protect kfd callback
      functions, because dqm_lock cannot protect all code paths; for example,
      free_mqd must be called outside of dqm_lock:
      
      [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
      [ 1230.177221] Call Trace:
      [ 1230.178249]  dump_stack+0x98/0xd5
      [ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
      [ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
      [ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
      [ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
      [ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
      [ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
      [ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
      [ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
      [ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
      [ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
      [ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
      [ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
      [ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
      [ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
      [ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
      [ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
      [ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
      [ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
      [ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
      [ 1230.202831]  ksys_ioctl+0x98/0xb0
      [ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
      [ 1230.205174]  do_syscall_64+0x5f/0x250
      [ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      2. remove try_lock and introduce atomic hive->in_reset, to avoid
      re-entering GPU recovery.
      
      v4:
      1. remove an unnecessary whitespace change in kfd_chardev.c
      2. remove commented-out code in amdgpu_device.c
      3. add more detail to the commit message
      4. define a wrapper function amdgpu_in_reset
      
      v5:
      1. Fix some style issues.
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
      Suggested-by: Christian König <christian.koenig@amd.com>
      Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
      Suggested-by: Luben Tuikov <luben.tuikov@amd.com>
      Signed-off-by: Dennis Li <Dennis.Li@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      df9c8d1a
  20. 22 Jul, 2020 1 commit
  21. 02 May, 2020 2 commits
  22. 24 Apr, 2020 2 commits
  23. 14 Apr, 2020 1 commit
  24. 09 Apr, 2020 1 commit
    • N
      drm/amdgpu: rework sched_list generation · 1c6d567b
      Committed by Nirmoy Das
      Generate HW IP's sched_list in amdgpu_ring_init() instead of
      amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
      ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
      This patch also stores sched_list for all HW IPs in one big
      array in struct amdgpu_device, which makes amdgpu_ctx_init_entity()
      much leaner.
      
      v2:
      fix a coding style issue
      do not use drm hw_ip const to populate amdgpu_ring_type enum
      
      v3:
      remove ctx reference and move sched array and num_sched to a struct
      use num_scheds to detect uninitialized scheduler list
      
      v4:
      use array_index_nospec for user space controlled variables
      fix possible checkpatch.pl warnings
      Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      1c6d567b
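As a rough illustration of the layout this commit message describes, the sketch below keeps one scheduler list per (HW IP, priority) pair in a single device-wide array, fills it at ring-init time, and treats a zero `num_scheds` as "not initialized". All names (`sim_device`, `sim_sched_list`, `sim_ring_init`, `sim_ctx_lookup`) and the array sizes are hypothetical, and plain integer ring ids stand in for scheduler pointers. The lookup validates indices with an ordinary bounds check; the kernel patch additionally clamps user-controlled indices with `array_index_nospec`, which has no direct user-space equivalent and is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for the example. */
enum { SIM_HW_IP_NUM = 3, SIM_PRIO_MAX = 2, SIM_MAX_RINGS = 4 };

struct sim_sched_list {
    unsigned num_scheds;      /* 0 means no ring registered for this slot */
    int sched[SIM_MAX_RINGS]; /* ring ids stand in for scheduler pointers */
};

struct sim_device {
    /* One list per HW IP and priority, all in one array on the device. */
    struct sim_sched_list gpu_sched[SIM_HW_IP_NUM][SIM_PRIO_MAX];
};

/* Called once per ring: registers the ring in the matching list. */
static int sim_ring_init(struct sim_device *dev, unsigned hw_ip,
                         unsigned prio, int ring_id)
{
    struct sim_sched_list *list;

    if (hw_ip >= SIM_HW_IP_NUM || prio >= SIM_PRIO_MAX)
        return -1;
    list = &dev->gpu_sched[hw_ip][prio];
    if (list->num_scheds >= SIM_MAX_RINGS)
        return -1;
    list->sched[list->num_scheds++] = ring_id;
    return 0;
}

/*
 * Context-side lookup: the caller only indexes the prebuilt array.
 * Returns NULL for out-of-range indices or an empty (uninitialized) list.
 */
static const struct sim_sched_list *
sim_ctx_lookup(const struct sim_device *dev, unsigned hw_ip, unsigned prio)
{
    const struct sim_sched_list *list;

    if (hw_ip >= SIM_HW_IP_NUM || prio >= SIM_PRIO_MAX)
        return NULL;
    list = &dev->gpu_sched[hw_ip][prio];
    return list->num_scheds ? list : NULL;
}
```

Because the lists are built where rings are created, the context code needs no per-IP setup of its own, which is the simplification the message refers to.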
  25. 11 Mar, 2020 1 commit
  26. 10 Mar, 2020 1 commit
  27. 27 Feb, 2020 1 commit