1. 10 Apr, 2021 5 commits
  2. 24 Mar, 2021 6 commits
  3. 27 Feb, 2021 2 commits
  4. 19 Feb, 2021 2 commits
  5. 16 Jan, 2021 1 commit
  6. 07 Jan, 2021 1 commit
  7. 06 Jan, 2021 1 commit
  8. 09 Dec, 2020 2 commits
  9. 14 Nov, 2020 2 commits
  10. 03 Nov, 2020 1 commit
  11. 30 Oct, 2020 3 commits
  12. 22 Oct, 2020 2 commits
  13. 26 Sep, 2020 3 commits
  14. 23 Sep, 2020 1 commit
    • drm/amdgpu: update athub interrupt harvesting handle · 3f975d0f
      Stanley.Yang committed
      A GCEA/MMHUB EA error should not result in a DF freeze. This is
      fixed in the next generation, but for some reason a GCEA/MMHUB
      EA error does result in a DF freeze in the previous generation.
      The driver should read the GCEA/MMHUB err status registers so that
      it avoids reporting a GCEA/MMHUB EA error as a hw fatal error in
      the kernel message.
      
      Changed from V1:
          make the query_ras_error_status function more general
          make reading the mmhub err status register more friendly
      
      Changed from V2:
          move the ras error status query function into the do_recovery workqueue
      
      Changed from V3:
          remove useless code from V2, print the GCEA error status
          instance number
      Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  15. 27 Aug, 2020 1 commit
  16. 25 Aug, 2020 3 commits
  17. 19 Aug, 2020 1 commit
    • drm/amdgpu: fix NULL pointer access issue when unloading driver · 1a68d96f
      Guchun Chen committed
      When unloading the driver with "modprobe -r amdgpu", a NULL pointer
      dereference occurs while releasing the ras debugfs entries. The cause
      is a duplicated debugfs_remove: the drm debugfs_root dir has already
      been cleaned up by drm_minor_unregister.
      
      BUG: kernel NULL pointer dereference, address: 00000000000000a0
      PGD 0 P4D 0
      Oops: 0002 [#1] SMP PTI
      CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
      Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
      RIP: 0010:down_write+0x15/0x40
      Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
      d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
      RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
      RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
      R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
      R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
      FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       simple_recursive_removal+0x63/0x370
       ? debugfs_remove+0x60/0x60
       debugfs_remove+0x40/0x60
       amdgpu_ras_fini+0x82/0x230 [amdgpu]
       ? __kernfs_remove.part.17+0x101/0x1f0
       ? kernfs_name_hash+0x12/0x80
       amdgpu_device_fini+0x1c0/0x580 [amdgpu]
       amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
       amdgpu_pci_remove+0x36/0x60 [amdgpu]
       pci_device_remove+0x3b/0xb0
       device_release_driver_internal+0xe5/0x1c0
       driver_detach+0x46/0x90
       bus_remove_driver+0x58/0xd0
       pci_unregister_driver+0x29/0x90
       amdgpu_exit+0x11/0x25 [amdgpu]
       __x64_sys_delete_module+0x13d/0x210
       do_syscall_64+0x5f/0x250
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Guchun Chen <guchun.chen@amd.com>
      Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
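The failure mode in the commit above (a second removal of something a parent teardown already freed) can be modelled outside the kernel. What follows is a minimal user-space sketch, not the amdgpu code: `struct node`, `make_node`, `remove_tree`, `struct child_owner`, and `owner_fini` are all hypothetical names, and the sketch only assumes that removing a parent recursively frees its children. Under that assumption, the owner of a cached child pointer must forget the pointer rather than remove it again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a debugfs directory entry. */
struct node {
    struct node *child;   /* at most one child, to keep the sketch small */
};

static int live_nodes;    /* number of nodes currently allocated */

static struct node *make_node(struct node *parent)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    if (parent)
        parent->child = n;
    live_nodes++;
    return n;
}

/* Recursive removal: frees the node and everything below it. */
static void remove_tree(struct node *n)
{
    if (!n)
        return;
    remove_tree(n->child);
    free(n);
    live_nodes--;
}

/* Hypothetical subsystem that caches a pointer into the parent's tree. */
struct child_owner {
    struct node *dir;
};

/*
 * Teardown that runs after the parent has been removed. It must not call
 * remove_tree(o->dir): that memory is already gone. It only drops the
 * stale pointer.
 */
static void owner_fini(struct child_owner *o)
{
    o->dir = NULL;
}
```

Calling `remove_tree` on the cached pointer after the parent is gone would be the user-space analogue of the duplicated `debugfs_remove` in the trace; here it would be a use-after-free.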
  18. 15 Aug, 2020 3 commits
    • drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS · ae2bf61f
      Guchun Chen committed
      This avoids a potential build warning or error when
      CONFIG_DEBUG_FS is not set.
      Signed-off-by: Guchun Chen <guchun.chen@amd.com>
      Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
      Reviewed-by: Dennis Li <Dennis.Li@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
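A common way to guard optional code on a config symbol is to compile either the real helpers or empty stubs, so callers need no conditionals of their own. The sketch below illustrates that pattern in user space and is not taken from the commit: `struct ras_state`, `have_debugfs`, and the `ras_debugfs_*` and `ras_state_*` helpers are hypothetical, and `CONFIG_DEBUG_FS` is assumed to be supplied on the compiler command line (for example with `-DCONFIG_DEBUG_FS`) rather than by Kconfig.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver state; debugfs_ready is true while nodes exist. */
struct ras_state {
    bool debugfs_ready;
};

#if defined(CONFIG_DEBUG_FS)
/* Symbol set: build the real create/remove helpers. */
static const bool have_debugfs = true;
static void ras_debugfs_create(struct ras_state *s) { s->debugfs_ready = true; }
static void ras_debugfs_remove(struct ras_state *s) { s->debugfs_ready = false; }
#else
/* Symbol not set: empty stubs, so callers compile unchanged. */
static const bool have_debugfs = false;
static void ras_debugfs_create(struct ras_state *s) { (void)s; }
static void ras_debugfs_remove(struct ras_state *s) { (void)s; }
#endif

/* Callers use the helpers unconditionally in both configurations. */
static void ras_state_init(struct ras_state *s)
{
    s->debugfs_ready = false;
    ras_debugfs_create(s);
}

static void ras_state_fini(struct ras_state *s)
{
    ras_debugfs_remove(s);
}
```

Both configurations build without unused-variable or missing-symbol diagnostics in the callers, which is the kind of build warning or error the commit message refers to.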
    • drm/amdgpu: fix NULL pointer access issue when unloading driver · 2e2f5dd5
      Guchun Chen committed
      When unloading the driver with "modprobe -r amdgpu", a NULL pointer
      dereference occurs while releasing the ras debugfs entries. The cause
      is a duplicated debugfs_remove: the drm debugfs_root dir has already
      been cleaned up by drm_minor_unregister.
      
      BUG: kernel NULL pointer dereference, address: 00000000000000a0
      PGD 0 P4D 0
      Oops: 0002 [#1] SMP PTI
      CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
      Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
      RIP: 0010:down_write+0x15/0x40
      Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
      d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
      RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
      RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
      R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
      R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
      FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       simple_recursive_removal+0x63/0x370
       ? debugfs_remove+0x60/0x60
       debugfs_remove+0x40/0x60
       amdgpu_ras_fini+0x82/0x230 [amdgpu]
       ? __kernfs_remove.part.17+0x101/0x1f0
       ? kernfs_name_hash+0x12/0x80
       amdgpu_device_fini+0x1c0/0x580 [amdgpu]
       amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
       amdgpu_pci_remove+0x36/0x60 [amdgpu]
       pci_device_remove+0x3b/0xb0
       device_release_driver_internal+0xe5/0x1c0
       driver_detach+0x46/0x90
       bus_remove_driver+0x58/0xd0
       pci_unregister_driver+0x29/0x90
       amdgpu_exit+0x11/0x25 [amdgpu]
       __x64_sys_delete_module+0x13d/0x210
       do_syscall_64+0x5f/0x250
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Guchun Chen <guchun.chen@amd.com>
      Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu: revert "fix system hang issue during GPU reset" · f1403342
      Christian König committed
      The whole approach wasn't thought through to the end.
      
      We already had a reset lock like this in the past, and it caused the same problems as this one.
      
      Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.
      
      This reverts commit df9c8d1a.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
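The "individual trylock protection" mentioned in the revert can be pictured as each hardware access helper trying to take the reset lock and backing off if a reset holds it, instead of blocking. The following is a simplified user-space sketch of that idea using a C11 `atomic_flag`. It is an assumption-laden model rather than the driver's code: `struct dev`, `DEV_INIT`, `reset_trylock`, `reset_unlock`, and `dev_write_reg` are hypothetical names, and a real driver would use a kernel locking primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device: the flag is set while a reset is in progress. */
struct dev {
    atomic_flag in_reset;
    unsigned int reg;       /* stand-in for a hardware register */
};

#define DEV_INIT { ATOMIC_FLAG_INIT, 0 }

/* Returns true if the lock was free and is now held by the caller. */
static bool reset_trylock(struct dev *d)
{
    return !atomic_flag_test_and_set_explicit(&d->in_reset,
                                              memory_order_acquire);
}

static void reset_unlock(struct dev *d)
{
    atomic_flag_clear_explicit(&d->in_reset, memory_order_release);
}

/*
 * Hardware access helper: if a reset currently holds the lock, skip the
 * access and report failure instead of waiting, so the caller cannot hang
 * behind the reset path.
 */
static bool dev_write_reg(struct dev *d, unsigned int val)
{
    if (!reset_trylock(d))
        return false;
    d->reg = val;
    reset_unlock(d);
    return true;
}
```

The design choice is that only the helpers that touch hardware take the lock, and they never sleep on it, which is narrower than one lock wrapped around every entry point.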