Commits · 059fe8296e0fb4b89d997ea0aa75996911b8f3aa · openeuler / Kernel

09 Dec, 2020 1 commit

drm/amdgpu: fix debugfs creation/removal, again · 2343e9d2

Authored by Arnd Bergmann on Dec 04, 2020

There is still a warning when CONFIG_DEBUG_FS is disabled:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1145:13: error: 'amdgpu_ras_debugfs_create_ctrl_node' defined but not used [-Werror=unused-function]
1145 | static void amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev)

Change the code again to make the compiler actually drop
this code but not warn about it.

Fixes: ae2bf61f ("drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS")
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2343e9d2
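
Illustration for 2343e9d2: one common kernel idiom for this kind of warning is to replace an #ifdef around the caller with a constant condition such as IS_ENABLED(CONFIG_DEBUG_FS), so the static helper stays referenced in C code while the optimizer drops it. The log does not show the patch itself, so the sketch below is a userspace approximation under that assumption; DEMO_DEBUG_FS_ENABLED and every demo_* name are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's IS_ENABLED(CONFIG_DEBUG_FS). */
#define DEMO_DEBUG_FS_ENABLED 0

static int demo_nodes_created;

/* Static helper: it would trigger -Wunused-function if its only caller
 * were removed by the preprocessor. */
static void demo_create_ctrl_node(void)
{
	demo_nodes_created++;
}

/*
 * The helper is referenced from ordinary C code, so the compiler treats it
 * as used. Because the condition is a compile-time constant, the call is
 * dead code when the feature is off and can be discarded by the optimizer.
 */
void demo_debugfs_create_all(bool have_context)
{
	if (!DEMO_DEBUG_FS_ENABLED || !have_context)
		return;

	demo_create_ctrl_node();
}
```

With DEMO_DEBUG_FS_ENABLED set to 0, calling demo_debugfs_create_all() never reaches the helper, yet the file still builds cleanly with -Wall.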

15 Aug, 2020 1 commit

drm/amdgpu: bypass querying ras error count registers · f75e94d8

Authored by Guchun Chen on Aug 04, 2020

Once ras recovery is issued by a ras sync flood interrupt or
a ras controller interrupt, add this guard to bypass or execute
the ras error count register harvest of all IPs.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f75e94d8

05 Aug, 2020 3 commits

drm/amdgpu: break GPU recovery once it's in bad state(v4) · e8fbaf03

Authored by Guchun Chen on Jul 23, 2020

When the GPU executes recovery and retrieves a bad GPU tag
from the external EEPROM device, the recovery is aborted
and an error message is printed for the user's awareness.

v2: Refine warning message in threshold reaching case, and
    fix spelling typo.

v3: Fix explicit calling of bad gpu.

v4: Rename function names.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e8fbaf03

drm/amdgpu: skip bad page reservation once issuing from eeprom write · 35cd2cda

Authored by Guchun Chen on Jul 23, 2020

Once the ras recovery is issued from the eeprom write itself,
bad page reservation should be skipped; otherwise, a recursive
call of writing to the eeprom would happen.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

35cd2cda

drm/amdgpu: validate bad page threshold in ras(v3) · c84d4670

Authored by Guchun Chen on Jul 22, 2020

The bad page threshold value must lie in the range between
-1 and the max records length of the eeprom. It determines when
the saved bad pages exceed the threshold value, so that the
corresponding actions can proceed.

v2: When using the default typical value, it should be the min
value between the typical value and the eeprom max records length.

v3: drop the case of setting bad_page_cnt_threshold to be
    0xFFFFFFFF, as it confuses users.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c84d4670
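
Illustration for c84d4670: the range rule and the v2 default rule can be sketched as a small pure function. This is not the driver's code. The function name, the idea of passing the typical value as a parameter, and the choice to clamp out-of-range input (rather than reject it) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: resolve a requested bad page threshold.
 *   requested <= -1 : use the default, min(typical, max_records)
 *                     (values below -1 are treated like -1; an assumption)
 *   otherwise       : the requested value, capped at max_records
 */
uint32_t demo_validate_threshold(int requested, uint32_t typical,
				 uint32_t max_records)
{
	if (requested <= -1)
		return typical < max_records ? typical : max_records;

	/* Explicit values may not exceed the eeprom record capacity. */
	if ((uint32_t)requested > max_records)
		return max_records;

	return (uint32_t)requested;
}
```

For example, with max_records of 50, a request of -1 and a typical value of 100 resolves to 50, while an explicit request of 10 is kept as 10.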

16 Jul, 2020 1 commit

drm/amdgpu: RAS emergency restart logic refine · bb5c7235

Authored by Wenhui Sheng on Jul 13, 2020

If we are in a RAS triggered situation and
BACO isn't supported, an emergency restart is needed,
and this code is only needed for some specific
cases (vega20 with a given smu fw version).

After we add smu mode1 reset for sienna cichlid, we
need to share AMD_RESET_METHOD_MODE1 with psp mode1 reset,
so in amdgpu_device_gpu_recover, we need to differentiate
which mode1 reset we are using, then decide if it's
a full reset and then decide if an emergency restart is needed;
the logic would become much more complex.

After discussion with Hawking, move the emergency restart logic
to an independent function.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Wenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bb5c7235

02 Apr, 2020 1 commit

drm/amdgpu: disable ras query and iject during gpu reset · 61380faa

Authored by John Clements on Mar 25, 2020

Add a flag to the ras context to indicate whether ras query functionality is ready.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

61380faa

11 Mar, 2020 1 commit

drm/amdgpu: add function to creat all ras debugfs node · f9317014

Authored by Tao Zhou on Mar 06, 2020

Centralize all debugfs creation in one place for ras.

This is required to fix ras when the driver does not use the drm load
and unload callbacks due to ordering issues with the drm device node.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f9317014

19 Dec, 2019 1 commit

drm/amdgpu: drop useless BACO arg in amdgpu_ras_reset_gpu · 61934624

Authored by Guchun Chen on Dec 13, 2019

The BACO reset mode strategy is determined by a later function when
calling amdgpu_ras_reset_gpu. To avoid confusing readers, drop
the argument.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

61934624

06 Dec, 2019 2 commits

drm/amdgpu: clear err_event_athub flag after reset exit · 00eaa571

Authored by Le Ma on Oct 25, 2019

Otherwise the next err_event_athub error cannot call gpu reset. The following
resume sequence will also not be affected by this flag.

v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

00eaa571

drm/amdgpu: export amdgpu_ras_find_obj to use externally · f2a79be1

Authored by Le Ma on Nov 25, 2019

Change it to an external interface.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f2a79be1

03 Oct, 2019 1 commit

drm/amdgpu: update parameter of ras_ih_cb · f5f06e21

Authored by Tao Zhou on Sep 12, 2019

Change struct ras_err_data *err_data to void *err_data to align with
the umc code, so that the callback's declaration in each ras block
does not need to care about the structure type.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f5f06e21
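
Illustration for f5f06e21: passing the error payload as void * lets a generic dispatcher stay unaware of the payload layout, while each handler casts it back to the type it expects. The sketch below is a userspace approximation only; the struct fields, the typedef, and all demo_* names are hypothetical and are not the driver's definitions.

```c
#include <assert.h>

/* Hypothetical error payload, known only to the handler that uses it. */
struct demo_err_data {
	unsigned long ue_count;
	unsigned long ce_count;
};

/* Generic callback type: the payload is opaque to the dispatcher. */
typedef int (*demo_ih_cb)(void *err_data);

/* A block-specific handler converts the opaque pointer back. */
int demo_umc_cb(void *err_data)
{
	struct demo_err_data *data = err_data;

	data->ue_count++;
	return 0;
}

/* The dispatcher forwards the payload without inspecting it. */
int demo_dispatch(demo_ih_cb cb, void *err_data)
{
	return cb ? cb(err_data) : -1;
}
```

The trade-off is the usual one for void * interfaces: declarations become uniform across blocks, but the compiler can no longer check that a handler receives the payload type it assumes.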

17 Sep, 2019 1 commit

drm/amdgpu: fix ras ctrl debugfs node leak · 012dd14d

Authored by Guchun Chen on Sep 16, 2019

Use debugfs_remove_recursive to remove the whole debugfs
directory instead of removing the nodes one by one.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

012dd14d

16 Sep, 2019 1 commit

drm/amdgpu: Fix mutex lock from atomic context. · 708901a6

Authored by Andrey Grodzovsky on Sep 10, 2019

Problem:
amdgpu_ras_reserve_bad_pages was moved to amdgpu_ras_reset_gpu
because writing to EEPROM during ASIC reset was unstable.
But for ERREVENT_ATHUB_INTERRUPT amdgpu_ras_reset_gpu is called
directly from ISR context and so locking is not allowed. It is also
irrelevant for this particular interrupt, as this is a generic RAS
interrupt and not specific to memory errors.

Fix:
Avoid calling amdgpu_ras_reserve_bad_pages if not in task context.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

708901a6

14 Sep, 2019 6 commits

drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place · 1a6fc071

Authored by Tao Zhou on Aug 30, 2019

ras recovery_init should be called after ttm init;
bad page reserve should be put in front of gpu reset since i2c
may be unstable during gpu reset.
Add cleanup for recovery_init and recovery_fini.

v2: add more comment and print.
    remove cancel_work_sync in recovery_init.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1a6fc071

drm/amdgpu: save umc error records · 87d2b92f

Authored by Tao Zhou on Aug 15, 2019

Save umc error records to the ras bad page array.

v2: add bad pages before gpu reset
v3: add NULL check for adev->umc.funcs
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

87d2b92f

drm/amdgpu: change ras bps type to eeprom table record structure · 9dc23a63

Authored by Tao Zhou on Aug 13, 2019

Change the bps type from retired page to eeprom table record, to prepare for
saving umc error records to eeprom.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9dc23a63

dmr/amdgpu: Add system auto reboot to RAS. · d5ea093e

Authored by Andrey Grodzovsky on Aug 22, 2019

In case of a RAS error, allow the user to configure automatic system
reboot through ras_ctrl.
This is also part of the temporary workaround for the RAS
hang problem.

v4: Use latest kernel API for disk sync.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d5ea093e

drm/amdgpu: Avoid HW GPU reset for RAS. · 7c6e68c7

Authored by Andrey Grodzovsky on Sep 13, 2019

Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until a proper solution in PSP/SMU is ready:
When an uncorrectable error happens, the DF will unconditionally
broadcast error event packets to all its clients/slaves upon
receiving the fatal error event and freeze all its outbound queues;
the err_event_athub interrupt will be triggered.
In such a case we use this interrupt
to issue a GPU reset. The GPU reset code is modified for such a case to avoid HW
reset; it only stops schedulers, detaches all in-progress and not yet scheduled
jobs' fences, sets an error code on them and signals them.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on previous bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive members.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7c6e68c7

drm/amdgpu: add helper function to do common ras_late_init/fini (v3) · b293e891

Authored by Hawking Zhang on Aug 30, 2019

In late_init for ras, the helper function will be used to
1). disable the ras feature if the IP block is masked as disabled
2). send the enable feature command if the IP block was masked as enabled
3). create a debugfs/sysfs node per IP block
4). register the interrupt handler

v2: check ih_info.cb to decide whether to add an interrupt handler or not

v3: add ras_late_fini to clean up all the ras fs nodes and remove
the interrupt handler
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b293e891

27 Aug, 2019 1 commit

drm/amdgpu: Add RAS EEPROM table. · 64f55e62

Authored by Andrey Grodzovsky on May 30, 2019

Add a RAS EEPROM table manager to enable RAS errors to be stored
upon appearance and retrieved on driver load.

v2: Fix some prints.

v3:
Fix checksum calculation.
Make table record and header structs packed to do correct byte value sum.
Fix record crossing EEPROM page boundary.

v4:
Fix byte sum val calculation for record - look at sizeof(record).
Fix some style comments.

v5: Add description to EEPROM_TABLE_RECORD_SIZE and syntax fixes.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <Luben.Tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

64f55e62
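
Illustration for 64f55e62: the v3 and v4 notes mention packed structs and a byte sum taken over sizeof(record). A minimal sketch of why packing matters for a byte-sum checksum follows. The record layout, the demo_* names, and the convention that the checksum makes the total wrap to zero are assumptions for this example; the log does not show the driver's actual layout or convention. The packed attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout. Packing removes padding bytes, so
 * sizeof() covers exactly the bytes that are stored and summed. */
struct demo_record {
	uint8_t  err_type;
	uint64_t offset;
	uint32_t ts;
} __attribute__((packed));

/* Sum every byte of a buffer, wrapping modulo 256. */
uint8_t demo_byte_sum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint8_t sum = 0;

	while (len--)
		sum += *p++;
	return sum;
}

/* Assumed convention: byte sum plus checksum wraps to zero. */
uint8_t demo_checksum(const struct demo_record *rec)
{
	return (uint8_t)(0u - demo_byte_sum(rec, sizeof(*rec)));
}
```

Without the packed attribute the compiler would typically insert padding after err_type, and those indeterminate padding bytes would be included in a sum taken over sizeof(*rec).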

24 Aug, 2019 1 commit

drm/amdgpu: correct ras error count type · 64cc5414

Authored by Guchun Chen on Aug 16, 2019

Use the unsigned long type for the same ras count variable.
This will avoid overflow on 64-bit systems.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

64cc5414

01 Aug, 2019 4 commits

drm/amdgpu: add define for gfx ras subblock · dc23a08f

Authored by Dennis Li on Jul 19, 2019

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dc23a08f

drm/amdgpu: allow ras interrupt callback to return error data · cf04dfd0

Authored by Tao Zhou on Jul 22, 2019

Add error data as a parameter for the ras interrupt cb and process it.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cf04dfd0

drm/amdgpu: add support for recording ras error address · 6f102dba

Authored by Tao Zhou on Jul 22, 2019

More than one error address may be recorded in one query.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6f102dba

drm/amdgpu: move some ras data structure to amdgpu_ras.h · 7af25d5b

Authored by Hawking Zhang on Jul 17, 2019

These are common structures that can be included by IP specific
source files.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7af25d5b

14 Jun, 2019 1 commit

amdgpu: no need to check return value of debugfs_create functions · 450f30ea

Authored by Greg Kroah-Hartman on Jun 13, 2019

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: xinhui pan <xinhui.pan@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

450f30ea

13 Jun, 2019 1 commit

drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported() · 99f304be

Authored by Dan Carpenter on Jun 08, 2019

The "block" variable can be set by the user through debugfs, so it can
be quite large which leads to shift wrapping here.  This means we report
a "block" as supported when it's not, and that leads to array overflows
later on.

This bug is not really a security issue in real life, because debugfs is
generally root only.

Fixes: 36ea1bd2 ("drm/amdgpu: add debugfs ctrl node")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

99f304be
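
Illustration for 99f304be: the hazard described is a shift by a user-controlled index. If the index is not range-checked first, the shift can wrap (or is undefined) and a block may be reported as supported when it is not. A minimal sketch of checking the bound before shifting follows; DEMO_BLOCK_COUNT and the function name are hypothetical, and the mask is passed in directly instead of being read from a device context.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical number of valid block indices. */
#define DEMO_BLOCK_COUNT 14u

/*
 * Return 1 if the bit for 'block' is set in 'supported_mask'.
 * The range check comes first, so an oversized index can never
 * reach the shift.
 */
int demo_is_supported(uint32_t supported_mask, unsigned int block)
{
	if (block >= DEMO_BLOCK_COUNT)
		return 0;

	return (supported_mask & (1u << block)) != 0;
}
```

With this ordering an index of 32 is rejected outright instead of being shifted, so it can no longer alias block 0 on hardware where the shift count wraps.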

12 Jun, 2019 1 commit

drm/amdgpu: Fix bounds checking in amdgpu_ras_is_supported() · 8252562d

Authored by Dan Carpenter on Jun 08, 2019

The "block" variable can be set by the user through debugfs, so it can
be quite large which leads to shift wrapping here.  This means we report
a "block" as supported when it's not, and that leads to array overflows
later on.

This bug is not really a security issue in real life, because debugfs is
generally root only.

Fixes: 36ea1bd2 ("drm/amdgpu: add debugfs ctrl node")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8252562d

25 May, 2019 4 commits

drm/amdgpu: ras support suspend/resume · 511fdbc3

Authored by xinhui pan on May 09, 2019

Add a ras suspend function. Rename ras_post_init to amdgpu_ras_resume.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

511fdbc3

drm/amdgpu: add badpages sysfs interafce · 466b1793

Authored by xinhui pan on May 07, 2019

add badpages node.
it will output badpages list in format
gpu pfn : gpu page size : flags

example
0x00000000 : 0x00001000 : R
0x00000001 : 0x00001000 : R
0x00000002 : 0x00001000 : R
0x00000003 : 0x00001000 : R
0x00000004 : 0x00001000 : R
0x00000005 : 0x00001000 : R
0x00000006 : 0x00001000 : R
0x00000007 : 0x00001000 : P
0x00000008 : 0x00001000 : P
0x00000009 : 0x00001000 : P

flags can be one of below characters
R: reserved.
P: pending for reserve.
F: failed to reserve for some reasons.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

466b1793
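
Illustration for 466b1793: the listed line format, "gpu pfn : gpu page size : flags", can be reproduced with a single formatted print. The sketch below is a userspace approximation; the function name and the use of snprintf into a caller-supplied buffer are assumptions, and the format string is inferred from the example output in the commit message rather than taken from the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one bad page entry as "pfn : size : flag" followed by a newline.
 * 'flag' is expected to be 'R' (reserved), 'P' (pending) or 'F' (failed).
 * Returns what snprintf returns: the length the full line needs.
 */
int demo_format_badpage(char *buf, size_t len, unsigned int pfn,
			unsigned int page_size, char flag)
{
	return snprintf(buf, len, "0x%08x : 0x%08x : %c\n",
			pfn, page_size, flag);
}
```

Called with pfn 7, page size 0x1000 and flag 'P', this yields the line "0x00000007 : 0x00001000 : P", matching the eighth row of the example above.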

drm/amdgpu: handle ras reset · a564808e

Authored by xinhui pan on May 08, 2019

Add another flag to allow an IP to do a gpu reset after device init.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a564808e

drm/amdgpu: Revert "drm/amdgpu: skip gpu reset when ras error occured" · b152e8e1

Authored by xinhui pan on May 08, 2019

Enable this now to reset the GPU on RAS errors.

This reverts commit 138352e5.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b152e8e1

11 Apr, 2019 1 commit

drm/amdgpu: Introduce another ras enable function · 77de502b

Authored by xinhui pan on Apr 08, 2019

Many parts of the whole SW stack can program the ras enablement state
during boot. Now we handle that case by adding one function which
checks the ras flags and chooses a different code path.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

77de502b

28 Mar, 2019 1 commit

drm/amdgpu: Fix amdgpu ras to ta enums conversion · 828cfa29

Authored by xinhui pan on Mar 21, 2019

Add helpers to translate between the two enums. This also makes bugs
easy to catch.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

828cfa29
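
Illustration for 828cfa29: translating between two enums through an explicit switch, rather than a cast, is one way such helpers can catch bugs, because compilers can warn (for example with -Wswitch) when an enumerator is left unhandled. The enums and the helper below are entirely hypothetical and only mirror the idea; they are not the driver's or the TA's definitions.

```c
#include <assert.h>

/* Hypothetical driver-side block enum (plain indices). */
enum demo_ras_block {
	DEMO_RAS_BLOCK_UMC,
	DEMO_RAS_BLOCK_SDMA,
	DEMO_RAS_BLOCK_GFX
};

/* Hypothetical firmware-side block enum (bit flags). */
enum demo_ta_block {
	DEMO_TA_BLOCK_INVALID = 0,
	DEMO_TA_BLOCK_UMC     = 1 << 0,
	DEMO_TA_BLOCK_SDMA    = 1 << 1,
	DEMO_TA_BLOCK_GFX     = 1 << 2
};

/*
 * Explicit translation. There is deliberately no default case, so a
 * newly added demo_ras_block value produces a compiler warning here.
 */
enum demo_ta_block demo_ras_to_ta(enum demo_ras_block block)
{
	switch (block) {
	case DEMO_RAS_BLOCK_UMC:
		return DEMO_TA_BLOCK_UMC;
	case DEMO_RAS_BLOCK_SDMA:
		return DEMO_TA_BLOCK_SDMA;
	case DEMO_RAS_BLOCK_GFX:
		return DEMO_TA_BLOCK_GFX;
	}

	/* Reached only for values outside the enum. */
	return DEMO_TA_BLOCK_INVALID;
}
```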

20 Mar, 2019 5 commits

drm/amdgpu: add new ras workflow control flags · 108c6a63

Authored by xinhui pan on Mar 11, 2019

Add a ras post init function.
Do some initialization after all IPs have finished their late init.

Add a new member, flags, which will control the ras workflow.
For now, vbios enables ras for us on boot. That might change in the
future.
So there should be a flag from vbios to tell us if ras is enabled or not
on boot. It looks like there is no such info now.

Other bits of the flags are reserved to control other parts of ras.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

108c6a63

drm/amdgpu: add new member hw_supported · 5caf466a

Authored by xinhui pan on Mar 11, 2019

Currently, it is not clear how ras is supported. Both software and
hardware can set the supported member. That is confusing.

Fix it by adding a new member, hw_supported.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5caf466a

drm/amdgpu: skip gpu reset when ras error occured · 138352e5

Authored by xinhui pan on Mar 01, 2019

GPU reset is not stable on vega20 A1.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

138352e5

drm/amdgpu: add debugfs ctrl node · 36ea1bd2

Authored by xinhui pan on Jan 31, 2019

Allow userspace to enable/disable ras.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

36ea1bd2

drm/amdgpu: add amdgpu_ras.c to support ras (v2) · c030f2e4

Authored by xinhui pan on Oct 31, 2018

add obj management.
add feature control.
add debugfs infrastructure.
add sysfs infrastructure.
add IH infrastructure.
add recovery infrastructure.

It is a framework. Other IPs need to call amdgpu_ras_xxx functions instead of
psp_ras_xxx functions.

v2: squash in warning fixes
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c030f2e4

openeuler / Kernel synced successfully more than 1 year ago
