1. 19 Oct, 2022 3 commits
  2. 11 Oct, 2022 2 commits
  3. 29 Sep, 2022 2 commits
  4. 14 Sep, 2022 2 commits
  5. 17 Aug, 2022 1 commit
  6. 13 Jul, 2022 1 commit
  7. 06 Jul, 2022 2 commits
  8. 11 Jun, 2022 2 commits
  9. 02 Jun, 2022 2 commits
  10. 27 May, 2022 1 commit
  11. 11 May, 2022 2 commits
  12. 26 Apr, 2022 1 commit
  13. 23 Apr, 2022 3 commits
  14. 29 Mar, 2022 1 commit
  15. 16 Mar, 2022 1 commit
  16. 03 Mar, 2022 2 commits
  17. 24 Feb, 2022 2 commits
  18. 18 Feb, 2022 3 commits
  19. 15 Feb, 2022 4 commits
  20. 08 Feb, 2022 3 commits
    • Y
      Revert "drm/amdgpu: Add judgement to avoid infinite loop" · a50b0482
      Committed by yipechai
      Commit d5e8ff5f ("drm/amdgpu: Fixed the defect of soft lock caused by infinite loop")
      has fixed this defect.
      
      Revert the workaround
      commit a2170b4a ("drm/amdgpu: Add judgement to avoid infinite loop").
      Signed-off-by: yipechai <YiPeng.Chai@amd.com>
      Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • Y
      drm/amdgpu: Fixed the defect of soft lock caused by infinite loop · d5e8ff5f
      Committed by yipechai
      1. The infinite loop only occurs when multiple cards support
         RAS functions.
      2. For an explanation of the root cause, refer to commit 76641cbbf196
         ("drm/amdgpu: Add judgement to avoid infinite loop").
      3. Create a new node to manage each unique RAS instance, guaranteeing
         that each device's .ras_list is completely independent.
      4. Fixes: commit 7a6b8ab3231b51 ("drm/amdgpu: Unify ras block
         interface for each ras block").
      5. The soft lockup logs are as follows:
      [  262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G           OE     5.13.0-27-generic #29~20.04.1-Ubuntu
      [  262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
      [  262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
      [  262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
      [  262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
      [  262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202
      [  262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248
      [  262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000
      [  262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001
      [  262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e
      [  262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003
      [  262.166258] FS:  0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000
      [  262.166261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0
      [  262.166267] Call Trace:
      [  262.166272]  amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
      [  262.166529]  ? psi_task_switch+0xd2/0x250
      [  262.166537]  ? __switch_to+0x11d/0x460
      [  262.166542]  ? __switch_to_asm+0x36/0x70
      [  262.166549]  process_one_work+0x220/0x3c0
      [  262.166556]  worker_thread+0x4d/0x3f0
      [  262.166560]  ? process_one_work+0x3c0/0x3c0
      [  262.166563]  kthread+0x12b/0x150
      [  262.166568]  ? set_kthread_struct+0x40/0x40
      [  262.166571]  ret_from_fork+0x22/0x30
      Signed-off-by: yipechai <YiPeng.Chai@amd.com>
      Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
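The approach in commit d5e8ff5f (one node per RAS instance, so that each device's .ras_list is independent) can be illustrated with a small userspace sketch. This is not the amdgpu code: `struct ras_block`, `struct ras_node`, `struct device` and the three functions are hypothetical names chosen for illustration. It also assumes the blocks are shared objects that several devices need to list, which the commit text implies but does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a RAS block object shared by several devices. */
struct ras_block {
    int block_id;
};

/* One node per (device, block) pair: the list link lives in the node,
 * not in the shared block, so no link is ever part of two lists. */
struct ras_node {
    struct ras_block *block;
    struct ras_node *next;
};

/* Hypothetical stand-in for a device that owns a private list head. */
struct device {
    struct ras_node *ras_list;
};

/* Register a shared block on one device by allocating a private node.
 * Returns 0 on success, -1 if the allocation fails. */
int device_register_block(struct device *dev, struct ras_block *block)
{
    assert(dev != NULL && block != NULL);

    struct ras_node *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;

    node->block = block;
    node->next = dev->ras_list;   /* push onto this device's list only */
    dev->ras_list = node;
    return 0;
}

/* Walk only this device's nodes; the walk always ends at NULL. */
struct ras_block *device_find_block(const struct device *dev, int block_id)
{
    assert(dev != NULL);

    for (const struct ras_node *n = dev->ras_list; n != NULL; n = n->next) {
        if (n->block->block_id == block_id)
            return n->block;
    }
    return NULL;
}

/* Free this device's nodes; the shared blocks themselves are untouched. */
void device_release_blocks(struct device *dev)
{
    assert(dev != NULL);

    while (dev->ras_list != NULL) {
        struct ras_node *next = dev->ras_list->next;
        free(dev->ras_list);
        dev->ras_list = next;
    }
}
```

Registering the same block on two devices creates two separate nodes, so a lookup on one device never wanders into the other device's list. The cost is one small allocation per registration.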
    • L
      drm/amdgpu: Print once if RAS unsupported · afa37315
      Committed by Luben Tuikov
      MESA polls for errors every 2-3 seconds. Printing with dev_info() causes
      the dmesg log to fill up with the same message, e.g.,
      
      [18028.206676] amdgpu 0000:0b:00.0: amdgpu: df doesn't config ras function.
      
      Make it dev_dbg_once(), as it isn't something correctable during boot or
      thereafter, so printing just once is sufficient. Also sanitize the message.
      
      Cc: Alex Deucher <Alexander.Deucher@amd.com>
      Cc: Hawking Zhang <Hawking.Zhang@amd.com>
      Cc: John Clements <john.clements@amd.com>
      Cc: Tao Zhou <tao.zhou1@amd.com>
      Cc: yipechai <YiPeng.Chai@amd.com>
      Fixes: 8b0fb0e9 ("drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops")
      Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
      Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
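The print-once behaviour that commit afa37315 relies on can be sketched in plain userspace C. This is only an illustration of the idea, not the kernel's dev_dbg_once() implementation: `report_unsupported_once` is a hypothetical helper, it keeps a single flag for the whole function rather than one per call site, and it is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes msg to out on the first call only.
 * Returns true if the message was printed, false if it was suppressed. */
bool report_unsupported_once(FILE *out, const char *msg)
{
    static bool printed = false;   /* survives across calls */

    assert(out != NULL && msg != NULL);

    if (printed)
        return false;              /* later polls stay silent */

    printed = true;
    fprintf(out, "%s\n", msg);
    return true;
}
```

A caller that polls every few seconds can invoke the helper on each poll; only the first call produces output, so the log is not flooded with the same line.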