1. 26 Sep, 2020 1 commit
  2. 16 Sep, 2020 1 commit
  3. 27 Aug, 2020 5 commits
  4. 25 Aug, 2020 1 commit
  5. 15 Aug, 2020 1 commit
  6. 07 Aug, 2020 1 commit
  7. 06 Aug, 2020 2 commits
  8. 31 Jul, 2020 1 commit
  9. 30 Jul, 2020 1 commit
  10. 28 Jul, 2020 1 commit
    • drm/amdgpu: fix system hang issue during GPU reset · df9c8d1a
      Committed by Dennis Li
      When the GPU hangs, the driver has multiple paths into
      amdgpu_device_gpu_recover; the atomics adev->in_gpu_reset and
      hive->in_reset are used to avoid re-entering GPU recovery.
      
      During GPU reset and resume, it is unsafe for other threads to access
      the GPU, which may cause the GPU reset to fail. Therefore a new
      rw_semaphore, adev->reset_sem, is introduced, which protects the GPU
      from being accessed by external threads during recovery.
      
      v2:
      1. add the rwlock to some ioctls, debugfs and the file-close function.
      2. change to use dqm->is_resetting and dqm_lock for protection in the
      kfd driver.
      3. remove try_lock and make adev->in_gpu_reset atomic, to avoid
      re-entering GPU recovery for the same GPU hang.
      
      v3:
      1. change back to using adev->reset_sem to protect the kfd callback
      functions, because dqm_lock cannot protect all of the code; for
      example, free_mqd must be called outside of dqm_lock:
      
      [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
      [ 1230.177221] Call Trace:
      [ 1230.178249]  dump_stack+0x98/0xd5
      [ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
      [ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
      [ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
      [ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
      [ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
      [ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
      [ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
      [ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
      [ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
      [ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
      [ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
      [ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
      [ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
      [ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
      [ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
      [ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
      [ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
      [ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
      [ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
      [ 1230.202831]  ksys_ioctl+0x98/0xb0
      [ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
      [ 1230.205174]  do_syscall_64+0x5f/0x250
      [ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      2. remove try_lock and introduce the atomic hive->in_reset, to avoid
      re-entering GPU recovery.
      
      v4:
      1. remove an unnecessary whitespace change in kfd_chardev.c
      2. remove commented-out code in amdgpu_device.c
      3. add more detailed comments to the commit message
      4. define a wrapper function, amdgpu_in_reset
      
      v5:
      1. Fix some style issues.
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
      Suggested-by: Christian König <christian.koenig@amd.com>
      Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
      Suggested-by: Luben Tukov <luben.tuikov@amd.com>
      Signed-off-by: Dennis Li <Dennis.Li@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  11. 01 Jul, 2020 4 commits
  12. 29 May, 2020 1 commit
  13. 09 May, 2020 2 commits
  14. 25 Apr, 2020 1 commit
  15. 09 Apr, 2020 1 commit
  16. 11 Mar, 2020 1 commit
  17. 05 Mar, 2020 1 commit
  18. 13 Feb, 2020 2 commits
  19. 12 Dec, 2019 1 commit
    • drm/amd/powerplay: enable pp one vf mode for vega10 · c9ffa427
      Committed by Yintian Tao
      Originally, due to restrictions in the PSP and SMU, a VF had to
      send a message to the hypervisor driver to handle powerplay
      changes, which was complicated and redundant. The SMU and PSP
      now allow a VF to handle powerplay changes directly.
      Therefore, the old handshake code between VF and PF for
      powerplay is removed, and the VF uses the new registers
      below to handshake with the SMU.
      mmMP1_SMN_C2PMSG_101: register to handle SMU message
      mmMP1_SMN_C2PMSG_102: register to handle SMU parameter
      mmMP1_SMN_C2PMSG_103: register to handle SMU response
      
      v2: remove module parameter pp_one_vf
      v3: fix the parens
      v4: forbid the VF from changing SMU features
      v5: use hwmon_attributes_visible to skip specified hwmon attributes
      v6: change skip condition at vega10_copy_table_to_smc
      Signed-off-by: Yintian Tao <yttao@amd.com>
      Acked-by: Evan Quan <evan.quan@amd.com>
      Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
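The three-register handshake named in the commit message above (message, parameter, response) can be sketched against a simulated register file. Everything here is an assumption made for illustration: the `fake_smu` struct, the register indices, the `FAKE_RESP_OK` code, and the ordering (clear the response, write the parameter, write the message, then poll) follow a common mailbox pattern and are not taken from the vega10 sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical indices into a simulated register file. The real
 * mmMP1_SMN_C2PMSG_101/102/103 offsets live in the hardware headers. */
enum {
    REG_MSG = 0,    /* role of C2PMSG_101: SMU message */
    REG_PARAM = 1,  /* role of C2PMSG_102: SMU parameter */
    REG_RESP = 2,   /* role of C2PMSG_103: SMU response */
    REG_COUNT = 3
};

#define FAKE_RESP_OK 0x1u  /* assumed "success" response code */

struct fake_smu {
    uint32_t regs[REG_COUNT];
    uint32_t last_msg;    /* what the simulated firmware last consumed */
    uint32_t last_param;
};

/* Simulated firmware: a cleared response register means a request is
 * pending, so consume it and post a response. */
static void fake_smu_firmware_step(struct fake_smu *s)
{
    if (s->regs[REG_RESP] != 0)
        return;
    s->last_msg = s->regs[REG_MSG];
    s->last_param = s->regs[REG_PARAM];
    s->regs[REG_RESP] = FAKE_RESP_OK;
}

/* Driver side: returns true if the firmware acknowledged the message
 * within max_polls polling attempts. */
static bool fake_smu_send_msg(struct fake_smu *s, uint32_t msg,
                              uint32_t param, int max_polls)
{
    assert(max_polls >= 0);

    s->regs[REG_RESP] = 0;       /* clear any stale response */
    s->regs[REG_PARAM] = param;  /* the parameter must be in place first */
    s->regs[REG_MSG] = msg;      /* writing the message starts the request */

    for (int i = 0; i < max_polls; i++) {
        fake_smu_firmware_step(s);  /* stands in for real firmware progress */
        if (s->regs[REG_RESP] != 0)
            return s->regs[REG_RESP] == FAKE_RESP_OK;
    }
    return false;  /* timed out waiting for a response */
}
```

A caller would zero-initialize a `struct fake_smu` and call `fake_smu_send_msg(&smu, msg, param, 10)`; real hardware would replace the in-loop firmware step with a register read and a delay.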
  20. 23 Nov, 2019 1 commit
  21. 21 Nov, 2019 1 commit
  22. 20 Nov, 2019 3 commits
  23. 19 Nov, 2019 3 commits
  24. 14 Nov, 2019 1 commit
  25. 07 Nov, 2019 2 commits