- 16 12月, 2022 1 次提交
-
-
由 Tao Zhou 提交于
Injection on guest is not allowed. v2: return directly in SRIOV environment. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 12月, 2022 1 次提交
-
-
由 ye xingchen 提交于
Replace the open-code with sysfs_emit() to simplify the code. Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com> Signed-off-by: Nye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 11月, 2022 2 次提交
-
-
由 Tao Zhou 提交于
Set support flag for VCN/JPEG 4.0. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 YiPeng Chai 提交于
The patch is enabling mode-1 reset for RAS recovery in fatal error mode. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 10月, 2022 2 次提交
-
-
由 Tao Zhou 提交于
Make the code simpler. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Make the code more readable. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 10月, 2022 3 次提交
-
-
由 YiPeng Chai 提交于
V2: Add sriov vf ras support in amdgpu_ras_asic_supported. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 YiPeng Chai 提交于
V1: Enable ras support for CHIP_IP_DISCOVERY asic type. V2: 1. Change commit comment. 2. Enable ras support for mp0 v13_0_0 and v13_0_10. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Victor Zhao 提交于
This reverts commit dac6b808. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. Fixes: dac6b808 ("drm/amdgpu: let mode2 reset fallback to default when failure") Signed-off-by: NVictor Zhao <Victor.Zhao@amd.com> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 10月, 2022 4 次提交
-
-
由 Hawking Zhang 提交于
RAS error address translation algorithm is common across dGPU and A + A platform as along as the SOC integrates the same generation of UMC IP. UMC RAS is managed by x86 MCA on A + A platform, umc_ras in GPU driver is not initialized at all on A + A platform. In such case, any umc_ras callback implemented for dGPU config shouldn't be invoked from A + A specific callback. The change moves convert_error_address out of dGPU umc_ras structure and makes it share between A + A and dGPU config. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NStanley Yang <Stanley.Yang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 YiPeng Chai 提交于
V2: Add sriov vf ras support in amdgpu_ras_asic_supported. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 YiPeng Chai 提交于
V1: Enable ras support for CHIP_IP_DISCOVERY asic type. V2: 1. Change commit comment. 2. Enable ras support for mp0 v13_0_0 and v13_0_10. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Victor Zhao 提交于
This reverts commit dac6b808. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. Fixes: dac6b808 ("drm/amdgpu: let mode2 reset fallback to default when failure") Signed-off-by: NVictor Zhao <Victor.Zhao@amd.com> Reviewed-by: NLijo Lazar <lijo.lazar@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 10月, 2022 2 次提交
-
-
由 Tao Zhou 提交于
Fix some issues found by checkpatch script. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Make the code reusable and remove redundant code. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 9月, 2022 2 次提交
-
-
由 Tao Zhou 提交于
Use the convert interface to simplify code. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 YiPeng Chai 提交于
For the asic using smu v13_0_2, there is the following warning when uninstalling amdgpu: amdgpu: ras disable gfx failed poison:1 ret:-22. [Why]: For the asic using smu v13_0_2, the psp .suspend and mode1reset is called before amdgpu_ras_pre_fini during amdgpu uninstall, it has disabled all ras features and reset the psp. Since the psp is reset, calling amdgpu_ras_disable_all_features in amdgpu_ras_pre_fini to disable ras features will fail. [How]: If all ras features are disabled, amdgpu_ras_disable_all_features will not be called to disable all ras features again. Signed-off-by: NYiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 9月, 2022 2 次提交
-
-
由 Candice Li 提交于
No need to reset error status since only umc ras supported on psp v13_0_0. Signed-off-by: NCandice Li <candice.li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Candice Li 提交于
No need to reset error status since only umc ras supported on psp v13_0_0. Signed-off-by: NCandice Li <candice.li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 8月, 2022 1 次提交
-
-
由 Victor Zhao 提交于
- introduce AMDGPU_SKIP_MODE2_RESET flag - let mode2 reset fallback to default reset method if failed v2: move this part out from the asic specific part Signed-off-by: NVictor Zhao <Victor.Zhao@amd.com> Acked-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 7月, 2022 1 次提交
-
-
由 Likun Gao 提交于
Move reset_context out of gpu recover function to make it configurable for different reset purpose. For the reset way of call gpu_recovery sysfs, force to use full reset method. Otherwise, try soft reset by default if the related ASIC supportted, if soft reset failed, will use full reset. Signed-off-by: NLikun Gao <Likun.Gao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 7月, 2022 2 次提交
-
-
由 Stanley.Yang 提交于
It should not init whole ras bad page framework on sriov guest side due to it is handled on host side. Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Stanley.Yang 提交于
GFX is the only IP block that RAS TA needs to program the hardware when receiving enable_feature command. Changed from V1: remove amdgpu_ras_need_send_ras_feature inline function, use GFX RAS block check directly. Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 6月, 2022 2 次提交
-
-
由 Andrey Grodzovsky 提交于
We removed the wrapper that was queueing the recover function into reset domain queue who was using this name. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Save the extra usless work schedule. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 6月, 2022 2 次提交
-
-
由 Candice Li 提交于
Adjust the sequence for ras late init and separate ras reset error status from query status. v2: squash in fix from Candice Signed-off-by: NCandice Li <candice.li@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Stanley.Yang 提交于
Fix aldebaran ras supported check on SRIOV guest side, the previous check conditicon block all ras feature on baremetal Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 5月, 2022 1 次提交
-
-
由 Stanley.Yang 提交于
support umc/gfx/sdma ras on guest side Changed from V1: move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 5月, 2022 2 次提交
-
-
由 Tao Zhou 提交于
Qeury ras status before ras poison consumption handling, add more comment and log. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-and-tested-by: NMohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Enable RAS IH if poison consumption handler is implemented. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NMohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 4月, 2022 1 次提交
-
-
由 Haowen Bai 提交于
After alloc fail, we do not need to kfree. Signed-off-by: NHaowen Bai <baihaowen@meizu.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 4月, 2022 3 次提交
-
-
由 Tao Zhou 提交于
The fatal error handler is independent from general ras interrupt handler since there is no related IH ring. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Add support for general RAS poison consumption handler. v2: remove callback function for poison consumption. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tao Zhou 提交于
Prepare for the implementation of poison consumption handler. v2: separate umc handler from poison creation. Signed-off-by: NTao Zhou <tao.zhou1@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 3月, 2022 1 次提交
-
-
由 Mohammad Zafar Ziya 提交于
Add vcn and jpeg ras support options V2: vcn and jpeg ras flag enabled for aldebaran asic only V3: vcn and jpeg ras flag disabled for error counter query Generic poison query interface added VCN and JPEG ras enabled based on IP version check V4: vcn and jpeg ras flag moved under ecc flag for dGPU Signed-off-by: NMohammad Zafar Ziya <Mohammadzafar.ziya@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 3月, 2022 1 次提交
-
-
由 Stanley.Yang 提交于
It should notice SMU to update bad channel info when detected uncorrectable error in UMC block Signed-off-by: NStanley.Yang <Stanley.Yang@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 03 3月, 2022 2 次提交
-
-
由 yipechai 提交于
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as .ras_fini common function, which is called when .ras_fini of ras block isn't initialized. 2. Remove the code of using amdgpu_ras_block_late_fini to initialize .ras_fini in ras blocks. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 yipechai 提交于
centrally calls the .ras_fini function of all ras blocks. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 2月, 2022 2 次提交
-
-
由 yipechai 提交于
Fixed warning reported by kernel test robot: 1.warning: variable 'ras_obj' is used uninitialized whenever '||' condition is true. Signed-off-by: Nyipechai <YiPeng.Chai@amd.com> Reviewed-by: NTao Zhou <tao.zhou1@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Maíra Canal 提交于
Turn previously global function into a static function to avoid the following Clang warning: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:5: warning: no previous prototype for function 'amdgpu_ras_block_late_init_default' [-Wmissing-prototypes] int amdgpu_ras_block_late_init_default(struct amdgpu_device *adev, ^ drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int amdgpu_ras_block_late_init_default(struct amdgpu_device *adev, ^ static Signed-off-by: NMaíra Canal <maira.canal@usp.br> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-