Commits · b3ac17667f115e64c67ea6101fc814f47134b530 · openeuler / Kernel

30 Oct, 2019 · 1 commit

drm/amdgpu: Fix SDMA hang when performing VKexample test · e5574f61

Committed by chen gong on Oct 23, 2019

The VKexample test hangs during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in register SDMA0_UTCL1_WATERMK to fix this issue.
Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
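
Clearing one field of a register while preserving the other bits is typically a mask operation. The sketch below illustrates the idea only; the `EXAMPLE_` shift and mask values and the helper name are hypothetical and are not taken from the amdgpu register headers or from this commit.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. The real position and
 * width of XNACK_WATERMK are defined in the amdgpu register headers. */
#define EXAMPLE_XNACK_WATERMK_SHIFT 26
#define EXAMPLE_XNACK_WATERMK_MASK  (0x3Fu << EXAMPLE_XNACK_WATERMK_SHIFT)

/* Return the register value with the example field cleared and every
 * other bit left unchanged. */
static uint32_t example_clear_xnack_watermk(uint32_t reg)
{
    return reg & ~EXAMPLE_XNACK_WATERMK_MASK;
}
```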

26 Oct, 2019 · 1 commit

drm/amdgpu: Fix SDMA hang when performing VKexample test · 5aed95bb

Committed by chen gong on Oct 23, 2019

The VKexample test hangs during Occlusion/SDMA/Varia runs.
Clear XNACK_WATERMK in register SDMA0_UTCL1_WATERMK to fix this issue.
Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11 Oct, 2019 · 1 commit

drm/amdgpu: Do not implement power-on for SDMA after do mode2 reset on Renoir · 6696b8ad

Committed by chen gong on Sep 29, 2019

The sdma0 ring test fails if SDMA powergating is turned on after a
mode2 reset.

Perhaps the mode2 reset does not reset the SDMA PG state; SDMA is
already powered up, so there is no need to ask the SMU to power it up
again. Skip this function for now.
Signed-off-by: chen gong <curry.gong@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

03 Oct, 2019 · 5 commits

drm/amdgpu: add comments in ras interrupt callback · 3d8361b1

Committed by Tao Zhou on Sep 23, 2019

Add comments to clarify why GFX IP BLOCK is checked for each ras interrupt callback.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add common sdma_ras_fini function · e536c818

Committed by Tao Zhou on Sep 12, 2019

sdma_ras_fini can be shared among all generations of sdma
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: refine sdma4 ras_data_cb · fc04e6b4

Committed by Tao Zhou on Sep 17, 2019

simplify code logic and refine return value

v2: remove unused error source code
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move sdma ecc functions to generic sdma file · 4c65dd10

Committed by Tao Zhou on Sep 12, 2019

sdma ras ecc functions can be reused among all sdma generations
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update parameter of ras_ih_cb · f5f06e21

Committed by Tao Zhou on Sep 12, 2019

Change struct ras_err_data *err_data to void *err_data to align with
the umc code, so that the callback declaration in each ras block
need not depend on the structure type.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
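
The general pattern of passing callback data through a `void *` can be sketched as follows. This is a minimal illustration and not the driver's code: every `example_` type and function is hypothetical, and the real amdgpu structures and callback signature take more parameters.

```c
#include <stddef.h>

/* Hypothetical stand-in for an error-count structure. */
struct example_err_data {
    unsigned long ue_count;
    unsigned long ce_count;
};

/* Generic callback type: the payload is opaque to the dispatcher, so the
 * declaration does not depend on any particular structure type. */
typedef int (*example_ras_ih_cb)(void *err_data);

/* A block-specific callback converts the payload back to the type it expects. */
static int example_sdma_cb(void *err_data)
{
    struct example_err_data *data = err_data;

    if (data == NULL)
        return -1;
    data->ue_count++;
    return 0;
}

/* The dispatcher forwards the opaque pointer without inspecting it. */
static int example_dispatch(example_ras_ih_cb cb, void *err_data)
{
    return cb ? cb(err_data) : -1;
}
```

The trade-off is that the compiler no longer checks the payload type, so each callback has to know which structure it was registered with.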

18 Sep, 2019 · 1 commit

drm/amd/amdgpu: power up sdma engine when S3 resume back · 4f3a2c10

Committed by Prike Liang on Sep 11, 2019

The sdma_v4 block should be ungated when the IP resumes;
otherwise it will hang and the resume will fail with a timeout error.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

16 Sep, 2019 · 1 commit

drm/amd/amdgpu: power up sdma engine when S3 resume back · a90a24d5

Committed by Prike Liang on Sep 11, 2019

The sdma_v4 block should be ungated when the IP resumes;
otherwise it will hang and the resume will fail with a timeout error.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

14 Sep, 2019 · 6 commits

drm/amdgpu/sdma: switch to amdgpu_sdma_ras_late_init helper function · bfcf62c2

Committed by Hawking Zhang on Sep 03, 2019

amdgpu_sdma_ras_late_init is used to init the sdma specific
ras debugfs/sysfs node and the sdma specific interrupt handler.
It can be shared among sdma generations.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: set ip specific ras interface pointer to NULL after free it · d094aea3

Committed by Hawking Zhang on Sep 03, 2019

to prevent access to dangling pointers
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
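
The free-then-NULL idiom named in the commit title can be sketched in plain C as below. The `example_` structures and functions are hypothetical, and user-space `calloc`/`free` stand in for the kernel allocator; this is not the driver's code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for an IP block and its RAS interface object. */
struct example_ras_if {
    int block;
};

struct example_ip {
    struct example_ras_if *ras_if;
};

/* Allocate the interface object; returns 0 on success, -1 on failure. */
static int example_ip_init(struct example_ip *ip)
{
    ip->ras_if = calloc(1, sizeof(*ip->ras_if));
    return ip->ras_if ? 0 : -1;
}

/* Free the object and reset the pointer, so later code sees NULL instead
 * of a dangling address and a second call is harmless. */
static void example_ip_fini(struct example_ip *ip)
{
    free(ip->ras_if);   /* free(NULL) is a no-op */
    ip->ras_if = NULL;
}
```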

drm/amdgpu: Avoid HW GPU reset for RAS. · 7c6e68c7

Committed by Andrey Grodzovsky on Sep 13, 2019

Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until a proper solution in PSP/SMU is ready:
When an uncorrectable error happens, the DF unconditionally
broadcasts error event packets to all its clients/slaves upon
receiving the fatal error event and freezes all its outbound queues;
the err_event_athub interrupt is then triggered.
In that case we use this interrupt to issue a GPU reset.
The GPU reset code is modified for this case to avoid a HW reset:
it only stops the schedulers, detaches the fences of all in-progress
and not-yet-scheduled jobs, sets an error code on them, and signals them.
It also rejects any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on the previous bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive members.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix memory leak when ras is not supported on specific ip block · 8bf2485a

Committed by Hawking Zhang on Aug 31, 2019

free ras_if if ras is not supported
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: switch to amdgpu_ras_late_init for sdma v4 block (v2) · 7d0a31e8

Committed by Hawking Zhang on Aug 29, 2019

call helper function in late init phase to handle ras init
for sdma ip block

v2: call ras_late_fini to clean up when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: switch to new amdgpu_nbio structure · bebc0762

Committed by Hawking Zhang on Aug 23, 2019

no functional change, just switch to new structures
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

30 Aug, 2019 · 1 commit

drm/amdgpu: Initialize and update SDMA power gating · 334ffd0d

Committed by Prike Liang on Aug 26, 2019

Init SDMA HW base configuration and enable idle INT for rn.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

27 Aug, 2019 · 1 commit

Revert "drm/amdgpu: free up the first paging queue v2" · 250af743

Committed by Gang Ba on Aug 23, 2019

This reverts commit 4f8bc72f.

It turned out that a single reserved queue wouldn't be
sufficient for page fault handling.
Signed-off-by: Gang Ba <gaba@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

23 Aug, 2019 · 2 commits

drm/amdgpu: update gc/sdma goldensetting for rn · f13580a9

Committed by Aaron Liu on Aug 07, 2019

This patch updates gc/sdma goldensetting for renoir
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sdma4: set sdma clock gating for rn · 91c5b6b3

Committed by Prike Liang on Aug 12, 2019

Add support for SDMA clockgating on RN.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

13 Aug, 2019 · 4 commits

drm/amdgpu: add sdma golden settings for renoir · a46e1716

Committed by Huang Rui on Jul 24, 2019

This patch adds sdma golden settings for renoir asic.
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add sdma support for renoir · 2d49738a

Committed by Huang Rui on Aug 08, 2019

Add renoir checks to appropriate places.
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add sdma clock gating for Arcturus · 8dc7e07c

Committed by Le Ma on Aug 07, 2019

Add ARCTURUS case in sdma set clockgating function
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: support sdma clock gating for more instances · 78864760

Committed by Le Ma on Aug 07, 2019

Shorten the code with RREG32_SDMA/WREG32_SDMA macro in CG part.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

02 Aug, 2019 · 4 commits

drm/amdgpu: update SDMA V4 microcode init · 8c2ef8ca

Committed by John Clements on Aug 01, 2019

Remove loading of duplicate instances of SDMA FW for Arcturus.
We use a single image for all instances.
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS · bd2280da

Committed by Tao Zhou on Aug 01, 2019

A ce can also trigger an interrupt, and both ce and ue errors can be
found in one ras query, so distinguishing between ce and ue in the
interrupt handler is unnecessary.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Suggested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix unsigned variable instance compared to less than zero · ac4bf4a1

Committed by Colin Ian King on Aug 01, 2019

Currently the error check on variable instance is always false because
it is a uint32_t type and this is never less than zero. Fix this by
making it an int type.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 7d0e6329 ("drm/amdgpu: update more sdma instances irq support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
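
The class of bug described above can be reproduced in a few lines. This is a sketch under stated assumptions: `example_find_instance` and both checkers are hypothetical names, and the lookup logic is invented for illustration rather than copied from the driver.

```c
#include <stdint.h>

/* Hypothetical lookup: returns an instance index, or -1 for an unknown id. */
static int example_find_instance(int client_id)
{
    return (client_id >= 0 && client_id < 8) ? client_id : -1;
}

/* Buggy pattern: storing the result in a uint32_t turns -1 into UINT32_MAX,
 * so the `< 0` test can never be true and the error goes unnoticed.
 * Some compilers warn about this comparison when extra warnings are on. */
static int example_check_unsigned(int client_id)
{
    uint32_t instance = (uint32_t)example_find_instance(client_id);

    return (instance < 0) ? -1 : 0;
}

/* Fixed pattern: a signed int keeps the -1, so the error check works. */
static int example_check_signed(int client_id)
{
    int instance = example_find_instance(client_id);

    return (instance < 0) ? -1 : 0;
}
```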

drm/amdgpu: cleanup vega10 SRIOV code path · 4cd4c5c0

Committed by Monk Liu on Jul 30, 2019

We can simplify all those unnecessary functions under
SRIOV for vega10 since:
1) PSP L1 policy is forcibly enabled in SRIOV
2) the original logic always sets all flags, which makes it
   a dummy step

Besides,
1) setting the ih_doorbell_range should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

01 Aug, 2019 · 1 commit

drm/amdgpu: update interrupt callback for all ras clients · 81e02619

Committed by Tao Zhou on Jul 22, 2019

add err_data parameter in interrupt cb for ras clients
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

31 Jul, 2019 · 2 commits

drm/amdgpu: correct irq type used for sdma ecc · 86132498

Committed by Hawking Zhang on Jul 25, 2019

We should pass the irq type, instead of the irq client id,
to the irq_get/put interface.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update more sdma instances irq support · 7d0e6329

Committed by Le Ma on Jul 16, 2019

Update for sdma ras ecc_irq and other minors.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

23 Jul, 2019 · 1 commit

drm/amdgpu: set sdma irq src num according to sdma instances · d52d6de2

Committed by Hawking Zhang on Jul 19, 2019

Otherwise, the driver will access non-existent sdma registers
in the gpu reset code path.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19 Jul, 2019 · 8 commits

drm/amdgpu: enable all 8 sdma instances for Arcturus silicon · 69d4de94

Committed by Le Ma on Jul 02, 2019

The remaining 6 sdma instances now work fine with the DF fix in vbios:
  * mmDF_PIE_AON_MiscClientsEnable(0x1c728)=0x3fe(DF_ALL_INSTANCE)
       [9:4]MmhubsEnable=3f (change from 0)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: limit sdma instances to 2 for Arcturus in BU phase · fc1e272e

Committed by Le Ma on Jun 30, 2019

The other 6 sdma instances do not work at present. Disable them as a
workaround to unblock KFD for silicon bringup.
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add arct sdma golden settings · ca1961a2

Committed by Hawking Zhang on Jun 27, 2019

Golden SDMA register settings from the hw team.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: declare sdma firmware binary files for Arcturus · eec28ef0

Committed by Le Ma on May 21, 2019

So that they are properly picked up as a driver dependency.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add paging queue support for 8 SDMA instances on Arcturus · f864e3e6

Committed by Le Ma on Jul 16, 2019

Properly enable all 8 instances for paging queue.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add Arcturus chip_name for init sdma microcode · 5ce40fd8

Committed by Le Ma on Nov 15, 2018

So we load the proper firmware for arcturus.
Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable 8 SDMA instances for Arcturus · 121d8599

Committed by Le Ma on Nov 20, 2018

All the 8 SDMA instances work fine on the latest Gopher build model.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Snow Zhang <Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: correct Arcturus SDMA address space base index · 5cd54ab8

Committed by Le Ma on Nov 15, 2018

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel · synced successfully more than 1 year ago