- 15 8月, 2020 4 次提交
-
-
由 Christian König 提交于
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1a. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nirmoy Das 提交于
Fixes: c030f2e4 ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)") Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Before ras recovery is issued, user could operate this debugfs node to enable/disable the harvest of all RAS IPs' ras error count registers, which will help keep hardware's registers' status instead of cleaning up them. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Once ras recovery is issued by ras sync flood interrupt or ras controller interrupt, add this guard to bypass or execute ras error count register harvest of all IPs. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 8月, 2020 9 次提交
-
-
由 John Clements 提交于
enabled GECC error injection and query support Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
When amdgpu_bad_page_threshold = 0, bad page reservation stuffs are skipped in either UMC ECC irq or page retirement calling of sync flood isr. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Bad page information should not be exposed by sysfs when bad page retirement is disabled, so decouple it from ras sysfs group creating, and add one guard before creating. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Add one definition for the RAS module's FS name. It's used in both debugfs and sysfs cases. v2: Use static variable instead of macro definition. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
RAS flags needs to be cleaned as well when user requires one clean eeprom. v2: RAS flags shall be restored after eeprom reset succeeds. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Once the ras recovery is issued from eeprom write itself, bad page reservation should be ignored, otherwise, recursive calling of writting to eeprom would happen. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
Bad page threshold value should be valid in the range between -1 and max records length of eeprom. It could determine when saved bad pages exceed threshold value, and proceed corresponding actions. v2: When using the default typical value, it should be min value between typical value and eeprom max records length. v3: drop the case of setting bad_page_cnt_threshold to be 0xFFFFFFFF, as it confuses user. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 7月, 2020 1 次提交
-
-
由 Dennis Li 提交于
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Suggested-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: NChristian König <christian.koenig@amd.com> Suggested-by: NFelix Kuehling <Felix.Kuehling@amd.com> Suggested-by: NLijo Lazar <Lijo.Lazar@amd.com> Suggested-by: NLuben Tukov <luben.tuikov@amd.com> Signed-off-by: NDennis Li <Dennis.Li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 7月, 2020 1 次提交
-
-
由 Guchun Chen 提交于
This will tell users if the faulty page has been written to external eeprom device in dmesg log. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 7月, 2020 1 次提交
-
-
由 Wenhui Sheng 提交于
If we are in RAS triggered situation and BACO isn't support, emergency restart is needed, and this code is only needed for some specific cases(vega20 with given smu fw version). After we add smu mode1 reset for sienna cichlid, we need to share AMD_RESET_METHOD_MODE1 with psp mode1 reset, so in amdgpu_device_gpu_recover, we need differentiate which mode1 reset we are using, then decide if it's a full reset and then decide if emergency restart is needed, the logic will become much more complex. After discussion with Hawking, move emergency restart logic to an independent function. Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 7月, 2020 1 次提交
-
-
由 Nirmoy Das 提交于
Used sparse(make C=1) to find these loose ends. v2: removed unwanted extra line Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 6月, 2020 2 次提交
-
-
由 Guchun Chen 提交于
Module parameter amdgpu_ras_mask has been involved in the calculation of ras support capability, so drop this redundant code. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
RAS context memory needs to freed in failure case. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 5月, 2020 1 次提交
-
-
由 Guchun Chen 提交于
This will assist debug in error injection case. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 5月, 2020 1 次提交
-
-
由 John Clements 提交于
Disable XGMI link power down prior to issuing a XGMI RAS error Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 5月, 2020 1 次提交
-
-
由 Arnd Bergmann 提交于
After the structure was padded to 1024 bytes, it is no longer suitable for being a local variable, as the function surpasses the warning limit for 32-bit architectures: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:587:5: error: stack frame size of 1072 bytes in function 'amdgpu_ras_feature_enable' [-Werror,-Wframe-larger-than=] int amdgpu_ras_feature_enable(struct amdgpu_device *adev, ^ Use kzalloc() instead to get it from the heap. Fixes: a0d25482 ("drm/amdgpu: update RAS TA to Host interface") Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 5月, 2020 1 次提交
-
-
由 John Clements 提交于
Parse return status from TA to determine error severity Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 4月, 2020 2 次提交
-
-
由 Dennis Li 提交于
If set error query ready in amdgpu_ras_late_init, which will cause some IP blocks aren't initialized, but their error query is ready. Signed-off-by: NDennis Li <Dennis.Li@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
When running ras uncorrectable error injection and triggering GPU reset on sGPU, below issue is observed. It's caused by the list uninitialized when accessing. [ 80.047227] BUG: unable to handle page fault for address: ffffffffc0f4f750 [ 80.047300] #PF: supervisor write access in kernel mode [ 80.047351] #PF: error_code(0x0003) - permissions violation [ 80.047404] PGD 12c20e067 P4D 12c20e067 PUD 12c210067 PMD 41c4ee067 PTE 404316061 [ 80.047477] Oops: 0003 [#1] SMP PTI [ 80.047516] CPU: 7 PID: 377 Comm: kworker/7:2 Tainted: G OE 5.4.0-rc7-guchchen #1 [ 80.047594] Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018 [ 80.047888] Workqueue: events amdgpu_ras_do_recovery [amdgpu] Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NJohn Clements <John.Clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 4月, 2020 1 次提交
-
-
由 Guchun Chen 提交于
Prefix ras related kernel message logging with PCI device info by replacing DRM_INFO/WARN/ERROR with dev_info/warn/err. This can clearly tell user about GPU device information where ras is. And add some other ras message printing to make it more clear and friendly as well. Suggested-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2020 1 次提交
-
-
由 John Clements 提交于
upon receiving uncorrectable error, query every GPU node for ras errors Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 4月, 2020 1 次提交
-
-
由 John Clements 提交于
upon receiving uncorrectable error, query every GPU node for ras errors Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 4月, 2020 2 次提交
-
-
由 Evan Quan 提交于
Backtrace on gpu recover test on Navi10. [ 1324.516681] RIP: 0010:amdgpu_ras_set_error_query_ready+0x15/0x20 [amdgpu] [ 1324.523778] Code: 4c 89 f7 e8 cd a2 a0 d8 e9 99 fe ff ff 45 31 ff e9 91 fe ff ff 0f 1f 44 00 00 55 48 85 ff 48 89 e5 74 0e 48 8b 87 d8 2b 01 00 <40> 88 b0 38 01 00 00 5d c3 66 90 0f 1f 44 00 00 55 31 c0 48 85 ff [ 1324.543452] RSP: 0018:ffffaa1040e4bd28 EFLAGS: 00010286 [ 1324.549025] RAX: 0000000000000000 RBX: ffff911198b20000 RCX: 0000000000000000 [ 1324.556217] RDX: 00000000000c0a01 RSI: 0000000000000000 RDI: ffff911198b20000 [ 1324.563514] RBP: ffffaa1040e4bd28 R08: 0000000000001000 R09: ffff91119d0028c0 [ 1324.570804] R10: ffffffff9a606b40 R11: 0000000000000000 R12: 0000000000000000 [ 1324.578413] R13: ffffaa1040e4bd70 R14: ffff911198b20000 R15: 0000000000000000 [ 1324.586464] FS: 00007f4441cbf540(0000) GS:ffff91119ed80000(0000) knlGS:0000000000000000 [ 1324.595434] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1324.601345] CR2: 0000000000000138 CR3: 00000003fcdf8004 CR4: 00000000003606e0 [ 1324.608694] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1324.616303] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1324.623678] Call Trace: [ 1324.626270] amdgpu_device_gpu_recover+0x6e7/0xc50 [amdgpu] [ 1324.632018] ? seq_printf+0x4e/0x70 [ 1324.636652] amdgpu_debugfs_gpu_recover+0x50/0x80 [amdgpu] [ 1324.643371] seq_read+0xda/0x420 [ 1324.647601] full_proxy_read+0x5c/0x90 [ 1324.652426] __vfs_read+0x1b/0x40 [ 1324.656734] vfs_read+0x8e/0x130 [ 1324.660981] ksys_read+0xa7/0xe0 [ 1324.665201] __x64_sys_read+0x1a/0x20 [ 1324.669907] do_syscall_64+0x57/0x1c0 [ 1324.674517] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1324.680654] RIP: 0033:0x7f44417cf081 Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NJohn Clements <John.Clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Clements 提交于
added flag to ras context to indicate if ras query functionality is ready Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 20 3月, 2020 1 次提交
-
-
由 John Clements 提交于
MMHub EDC becomes dirty after BACO reset EDC registers should be cleared early on in reset phase Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 3月, 2020 2 次提交
-
-
由 Stanley.Yang 提交于
Fix the warning "warn: variable dereferenced before check 'obj' (see line 1131)" by removing unnecessary checks as amdgpu_ras_debugfs_create_all() is only called from amdgpu_debugfs_init() where obj member in con->head list is not NULL. Use list_for_each_entry() instead list_for_each_entry_safe() as obj do not to be freeing or removing from list during this process. Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Guchun Chen 提交于
RAS support capability needs to be updated on top of different memeory ECC enablement, and remove redundant memory ecc check in gmc module for vega20 and arcturus. v2: check HBM ECC enablement and set ras mask accordingly. v3: avoid to invoke atomfirmware interface to query twice. Suggested-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 3月, 2020 2 次提交
-
-
由 Tao Zhou 提交于
and remove each ras IP's own debugfs creation this is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
centralize all debugfs creation in one place for ras this is required to fix ras when the driver does not use the drm load and unload callbacks due to ordering issues with the drm device node. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 3月, 2020 1 次提交
-
-
由 Hawking Zhang 提交于
Now driver will report XGMI/WAFL PCS error through sysfs xgmi_wafl_err_count node on Vega20 Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 2月, 2020 1 次提交
-
-
由 Hawking Zhang 提交于
centralize all the xgmi related function to amdgpu_xgmi.c Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Acked-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 2月, 2020 1 次提交
-
-
由 Guchun Chen 提交于
Once sync flood interrupt is triggered by RAS error, before actual GPU recovery job, it's necessary to log on and print non-zero error counter, this will help user knows where the RAS error source is from quickly. Signed-off-by: NGuchun Chen <guchun.chen@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 1月, 2020 1 次提交
-
-
由 John Clements 提交于
resolves issue with RAS error injection in mGPU configuration Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 1月, 2020 1 次提交
-
-
由 Hawking Zhang 提交于
To allow the flexibilty for user to disable gpu recovery in RAS recovery path by module parameter amdgpu_gpu_recovery Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NGuchun Chen <guchun.chen@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-