- 24 3月, 2021 3 次提交
-
-
由 Hawking Zhang 提交于
Add psp v13 ip block to soc ip init list for aldebaran Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NLe Ma <Le.Ma@amd.com> Reviewed-by: NKevin Wang <kevin1.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
PMFW should be loaded before any operation that may toggling DF-Cstate. otherwsie, tOS has no choice but to locally toggle DF Cstate (i.e. disable DF-Cstate even it already enabled by VBIOS) Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NLe Ma <Le.Ma@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
Initialze psp ip function for aldebaran Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NLe Ma <Le.Ma@amd.com> Reviewed-by: NKevin Wang <kevin1.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 1月, 2021 1 次提交
-
-
由 Jinzhou Su 提交于
Add interface to load, unload, invoke command for secure display TA. v2: Add debugfs interface for secure display TA v3: fix warning in copy_from_user (Alex) Signed-off-by: NJinzhou.Su <Jinzhou.Su@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 1月, 2021 2 次提交
-
-
由 Jiawei Gu 提交于
Navi12 HDCP & DTM deinitialization needs continue to free bo if already created though initialized flag is not set. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NJiawei Gu <Jiawei.Gu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
ASD and TA share the same firmware in SIENNA_CICHLID and only TA firmware is requested during boot, so only need release TA firmware when remove device. [ 83.877150] general protection fault, probably for non-canonical address 0x1269f97e6ed04095: 0000 [#1] SMP PTI [ 83.888076] CPU: 0 PID: 1312 Comm: modprobe Tainted: G W OE 5.9.0-rc5-deli-amd-vangogh-0.0.6.6-114-gdd99d5669a96-dirty #2 [ 83.901160] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 83.912353] RIP: 0010:free_fw_priv+0xc/0x120 [ 83.917531] Code: e8 99 cd b0 ff b8 a1 ff ff ff eb 9f 4c 89 f7 e8 8a cd b0 ff b8 f4 ff ff ff eb 90 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <4c> 8b 67 18 48 89 fb 4c 89 e7 e8 45 94 41 00 b8 ff ff ff ff f0 0f [ 83.937576] RSP: 0018:ffffbc34c13a3ce0 EFLAGS: 00010206 [ 83.943699] RAX: ffffffffbb681850 RBX: ffffa047f117eb60 RCX: 0000000080800055 [ 83.951879] RDX: ffffbc34c1d5f000 RSI: 0000000080800055 RDI: 1269f97e6ed04095 [ 83.959955] RBP: ffffbc34c13a3cf0 R08: 0000000000000000 R09: 0000000000000001 [ 83.968107] R10: ffffbc34c13a3cc8 R11: 00000000ffffff00 R12: ffffa047d6b23378 [ 83.976166] R13: ffffa047d6b23338 R14: ffffa047d6b240c8 R15: 0000000000000000 [ 83.984295] FS: 00007f74f6712540(0000) GS:ffffa047fbe00000(0000) knlGS:0000000000000000 [ 83.993323] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 84.000056] CR2: 0000556a1cca4e18 CR3: 000000021faa8004 CR4: 00000000003706f0 [ 84.008128] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 84.016155] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 84.024174] Call Trace: [ 84.027514] release_firmware.part.11+0x4b/0x70 [ 84.033017] release_firmware+0x13/0x20 [ 84.037803] psp_sw_fini+0x77/0xb0 [amdgpu] [ 84.042857] amdgpu_device_fini+0x38c/0x5d0 [amdgpu] [ 84.048815] amdgpu_driver_unload_kms+0x43/0x70 [amdgpu] [ 84.055055] drm_dev_unregister+0x73/0xb0 [drm] [ 84.060499] drm_dev_unplug+0x28/0x30 [drm] [ 84.065598] amdgpu_dev_uninit+0x1b/0x40 [amdgpu] [ 84.071223] amdgpu_pci_remove+0x4e/0x70 [amdgpu] [ 84.076835] pci_device_remove+0x3e/0xc0 [ 84.081609] device_release_driver_internal+0xfb/0x1c0 [ 84.087558] driver_detach+0x4d/0xa0 [ 84.092041] bus_remove_driver+0x5f/0xe0 [ 84.096854] driver_unregister+0x2f/0x50 [ 84.101594] pci_unregister_driver+0x22/0xa0 [ 84.106806] amdgpu_exit+0x15/0x2b [amdgpu] Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 1月, 2021 3 次提交
-
-
由 Jiawei Gu 提交于
Navi12 HDCP & DTM deinitialization needs continue to free bo if already created though initialized flag is not set. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NJiawei Gu <Jiawei.Gu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 pengzhou 提交于
In the psp command submit logic, the function msleep(1) delayed too long, Changing it to usleep_range(10, 100) to have a better performance. Signed-off-by: NPeng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: NEmily.Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
ASD and TA share the same firmware in SIENNA_CICHLID and only TA firmware is requested during boot, so only need release TA firmware when remove device. [ 83.877150] general protection fault, probably for non-canonical address 0x1269f97e6ed04095: 0000 [#1] SMP PTI [ 83.888076] CPU: 0 PID: 1312 Comm: modprobe Tainted: G W OE 5.9.0-rc5-deli-amd-vangogh-0.0.6.6-114-gdd99d5669a96-dirty #2 [ 83.901160] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 83.912353] RIP: 0010:free_fw_priv+0xc/0x120 [ 83.917531] Code: e8 99 cd b0 ff b8 a1 ff ff ff eb 9f 4c 89 f7 e8 8a cd b0 ff b8 f4 ff ff ff eb 90 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <4c> 8b 67 18 48 89 fb 4c 89 e7 e8 45 94 41 00 b8 ff ff ff ff f0 0f [ 83.937576] RSP: 0018:ffffbc34c13a3ce0 EFLAGS: 00010206 [ 83.943699] RAX: ffffffffbb681850 RBX: ffffa047f117eb60 RCX: 0000000080800055 [ 83.951879] RDX: ffffbc34c1d5f000 RSI: 0000000080800055 RDI: 1269f97e6ed04095 [ 83.959955] RBP: ffffbc34c13a3cf0 R08: 0000000000000000 R09: 0000000000000001 [ 83.968107] R10: ffffbc34c13a3cc8 R11: 00000000ffffff00 R12: ffffa047d6b23378 [ 83.976166] R13: ffffa047d6b23338 R14: ffffa047d6b240c8 R15: 0000000000000000 [ 83.984295] FS: 00007f74f6712540(0000) GS:ffffa047fbe00000(0000) knlGS:0000000000000000 [ 83.993323] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 84.000056] CR2: 0000556a1cca4e18 CR3: 000000021faa8004 CR4: 00000000003706f0 [ 84.008128] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 84.016155] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 84.024174] Call Trace: [ 84.027514] release_firmware.part.11+0x4b/0x70 [ 84.033017] release_firmware+0x13/0x20 [ 84.037803] psp_sw_fini+0x77/0xb0 [amdgpu] [ 84.042857] amdgpu_device_fini+0x38c/0x5d0 [amdgpu] [ 84.048815] amdgpu_driver_unload_kms+0x43/0x70 [amdgpu] [ 84.055055] drm_dev_unregister+0x73/0xb0 [drm] [ 84.060499] drm_dev_unplug+0x28/0x30 [drm] [ 84.065598] amdgpu_dev_uninit+0x1b/0x40 [amdgpu] [ 84.071223] amdgpu_pci_remove+0x4e/0x70 [amdgpu] [ 84.076835] pci_device_remove+0x3e/0xc0 [ 84.081609] device_release_driver_internal+0xfb/0x1c0 [ 84.087558] driver_detach+0x4d/0xa0 [ 84.092041] bus_remove_driver+0x5f/0xe0 [ 84.096854] driver_unregister+0x2f/0x50 [ 84.101594] pci_unregister_driver+0x22/0xa0 [ 84.106806] amdgpu_exit+0x15/0x2b [amdgpu] Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 12月, 2020 1 次提交
-
-
由 Lee Jones 提交于
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:2576:5: warning: no previous prototype for ‘parse_ta_bin_descriptor’ [-Wmissing-prototypes] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: NLee Jones <lee.jones@linaro.org> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 11月, 2020 1 次提交
-
-
由 Kenneth Feng 提交于
This has been confirmed that unload message is not needed from SIENNA_CICHLID in reset. Otherwise it will cause the fw wrong state after reset and no response for any messages. Signed-off-by: NKenneth Feng <kenneth.feng@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 11月, 2020 1 次提交
-
-
由 Bhawanpreet Lakha 提交于
Signed-off-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 11月, 2020 1 次提交
-
-
由 John Clements 提交于
updated fw header v2 parser to set asd fw memory Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 04 11月, 2020 1 次提交
-
-
由 John Clements 提交于
updated fw header v2 parser to set asd fw memory Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.9.x
-
- 03 11月, 2020 2 次提交
-
-
由 Deepak R Varma 提交于
When declaring pointer data, the "*" symbol should be used adjacent to the data name as per the coding standards. This resolves following issues reported by checkpatch script: ERROR: "foo * bar" should be "foo *bar" ERROR: "foo * bar" should be "foo *bar" ERROR: "foo* bar" should be "foo *bar" ERROR: "(foo*)" should be "(foo *)" Signed-off-by: NDeepak R Varma <mh12gx2825@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Deepak R Varma 提交于
General code indentation and alignment changes such as replace spaces by tabs or align function arguments as per the coding style guidelines. The patch corrects issues for various amdgpu_*.c files for this driver. Issue reported by checkpatch script. Signed-off-by: NDeepak R Varma <mh12gx2825@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 10月, 2020 1 次提交
-
-
由 John Clements 提交于
loaded fw can be queried from sys fs interface Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 22 10月, 2020 2 次提交
-
-
由 Andrey Grodzovsky 提交于
psp sysfs not cleaned up on driver unload for sienna_cichlid Fixes: ce87c98d ("drm/amdgpu: Include sienna_cichlid in USBC PD FW support.") Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.9.x
-
由 Likun Gao 提交于
Support to load RLC iram and dram ucode when RLC firmware struct use v2.2 Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 10月, 2020 1 次提交
-
-
由 Andrey Grodzovsky 提交于
psp sysfs not cleaned up on driver unload for sienna_cichlid Fixes: ce87c98d ("drm/amdgpu: Include sienna_cichlid in USBC PD FW support.") Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 10月, 2020 1 次提交
-
-
由 Likun Gao 提交于
Support to load RLC iram and dram ucode when RLC firmware struct use v2.2 Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 10月, 2020 2 次提交
-
-
由 Tao Zhou 提交于
General psp support for dimgrey_cavefish. v2: remove the checks for asd load and reroute ih. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Increase fw_name string size so longer chip name can be stored. v2: define macro for the length of psp fw name. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 10月, 2020 2 次提交
-
-
由 Huang Rui 提交于
This patch is to enable psp support for vangogh Signed-off-by: NHuang Rui <ray.huang@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
APU needs load toc firmware for gfx10 series on psp front door loading. v2: rebase against latest code v3: clarify error message Signed-off-by: NHuang Rui <ray.huang@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 9月, 2020 1 次提交
-
-
由 Jingwen Chen 提交于
smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to accelerate sw_init for navi12. v2: skip above fw in SRIOV for vega10 and sienna_cichlid v3: directly skip psp fw loading in SRIOV Signed-off-by: NJingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: NEmily.Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 9月, 2020 4 次提交
-
-
由 Andrey Grodzovsky 提交于
Create sysfs interface also for sienna_cichlid. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Create sysfs interface also for sienna_cichlid. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Clements 提交于
Output RAS init status If RAS init fails, teardown RAS context Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some blocks might be even gated and so best is to avoid touching it. v2: Rename in_dpc to more meaningful name Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 8月, 2020 2 次提交
-
-
由 Tao Zhou 提交于
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NJiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NJiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 8月, 2020 2 次提交
-
-
由 Luben Tuikov 提交于
Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: NLuben Tuikov <luben.tuikov@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dennis Li 提交于
if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 8月, 2020 2 次提交
-
-
由 Christian König 提交于
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1a. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Wenhui Sheng 提交于
Enable the RAP TA loading path and add RAP test trigger interface. v2: fix potential mem leak issue Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: NGuchun Chen <Guchun.Chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 8月, 2020 2 次提交
-
-
由 Liu ChengZhe 提交于
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: NLiu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Colin Ian King 提交于
There is a spelling mistake in a DRM_ERROR error message. Fix it. This got lost in a merge, restore the fix. Signed-off-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4afaa61d)
-
- 31 7月, 2020 1 次提交
-
-
由 Liu ChengZhe 提交于
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: NLiu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 7月, 2020 1 次提交
-
-
由 Dennis Li 提交于
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Suggested-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: NChristian König <christian.koenig@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NLijo Lazar <Lijo.Lazar@amd.com> Suggested-by: NLuben Tukov <luben.tuikov@amd.com> Signed-off-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-