Commits · f83f5a1e115c8dc382a5abaaf0c10374fbcf1038 · openeuler / Kernel

06 Dec, 2019 1 commit

drm/amdgpu/gfx: Improvement on EDC GPR workarounds · f83f5a1e

Committed by James Zhu on Dec 03, 2019

SPI limits the total CS waves in flight per SE to no more than 32 * num_cu,
but we need to stuff 40 waves onto a CU to completely clean its SGPRs. The
workaround accomplishes this by cleaning the SE in two steps, half of the CUs
per step.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f83f5a1e
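
The arithmetic behind the two-step cleaning can be sketched as follows. This is only one reading of the commit message, not code from the patch: the constants 32 and 40 come from the message, while the helper names and the assumption that the per-SE wave budget is spread evenly over the targeted CUs are hypothetical.

```c
#include <stdbool.h>

/* From the commit message: SPI caps CS waves in flight per SE at 32 * num_cu. */
#define DEMO_SPI_WAVES_PER_CU_LIMIT 32u
/* From the commit message: 40 waves must land on a CU to fully clean its SGPRs. */
#define DEMO_WAVES_TO_CLEAN_CU 40u

/*
 * Hypothetical helper: waves available per targeted CU when one step
 * dispatches to only cus_targeted of the cus_per_se CUs in a shader engine.
 * Assumes the per-SE budget is shared evenly by the targeted CUs.
 */
static unsigned int demo_waves_per_targeted_cu(unsigned int cus_per_se,
                                               unsigned int cus_targeted)
{
    if (cus_targeted == 0)
        return 0;
    return (DEMO_SPI_WAVES_PER_CU_LIMIT * cus_per_se) / cus_targeted;
}

/* Hypothetical helper: can one step fully clean every CU it targets? */
static bool demo_step_can_clean(unsigned int cus_per_se,
                                unsigned int cus_targeted)
{
    return demo_waves_per_targeted_cu(cus_per_se, cus_targeted) >=
           DEMO_WAVES_TO_CLEAN_CU;
}
```

Under these assumptions, targeting every CU at once leaves 32 waves per CU, which is short of 40, while targeting half of them leaves 64 per CU, which is why two half-sized steps are enough.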

03 Dec, 2019 3 commits

drm/amdgpu: fix calltrace during kmd unload(v3) · 82a829dc

Committed by Monk Liu on Nov 26, 2019

issue:
the kernel reports a warning from a double unpin
of the CSB bo during driver unloading

why:
we unpin it during hw_fini, and there is another
unpin of the CSB bo in sw_fini.

fix:
we don't actually need to pin/unpin it during
hw_init/fini since it is created kernel-pinned;
we only need to fill the CSB again during hw_init
to prevent CSB/VRAM loss after S3

v2:
get_csb in init_rlc so hw_init() restores the CSIB content
even after reset or S3

v3:
use bo_create_kernel instead of bo_create_reserved for the CSB,
otherwise the bo_free_kernel() on the CSB is not aligned and
would leave its internal reserve pending forever

take care of gfx7/8 as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

82a829dc
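
The double unpin described above can be illustrated with a toy pin counter. This is a simplified model, not the amdgpu buffer-object API: the demo_bo type and both functions are hypothetical, and the error return stands in for the warning the kernel would print.

```c
/* Toy buffer object: only tracks how many pins are outstanding. */
struct demo_bo {
    int pin_count;
};

/* Models a buffer that is created already pinned by the kernel. */
static void demo_bo_create_pinned(struct demo_bo *bo)
{
    bo->pin_count = 1;
}

/*
 * Drops one pin. Returns 0 on success, or -1 when there is no pin left
 * to drop, which is the situation the real driver warns about.
 */
static int demo_bo_unpin(struct demo_bo *bo)
{
    if (bo->pin_count <= 0)
        return -1;
    bo->pin_count--;
    return 0;
}
```

In this model an unpin in a hw_fini-like path followed by another in a sw_fini-like path underflows the counter on the second call. Leaving the buffer pinned for its whole lifetime and dropping the pin only once at teardown, as the fix describes, avoids that.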

drm/amdgpu/gfx: Increase dispatch packet number · 45317d5f

Committed by James Zhu on Nov 26, 2019

For Arcturus, increase the dispatch packet number to stress the scheduler.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

45317d5f

drm/amdgpu/gfx: Clear more EDC cnt · 2255d7f3

Committed by James Zhu on Nov 26, 2019

Clear the SDMA and HDP EDC counters in the GPR workarounds.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2255d7f3

27 Nov, 2019 1 commit

drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC mode · be3e73ea

Committed by Hawking Zhang on Nov 20, 2019

gfx memory should be initialized before enabling
the DED and FUE fields in mmGB_EDC_MODE
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

be3e73ea

23 Nov, 2019 2 commits

drm/amdgpu: Update Arcturus golden registers · 6e04b224

Committed by Jay Cornwall on Nov 20, 2019

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6e04b224

drm/amdgpu: define soc15_ras_field_entry for reuse · 46f71969

Committed by Dennis Li on Nov 19, 2019

The struct soc15_ras_field_entry will be reused by
other IPs, such as mmhub and gc

v2: rename ras_subblock_regs to gc_ras_fields_vg20,
because future ASICs may have a different table.
Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

46f71969

19 Nov, 2019 1 commit

drm/amdgpu: disable gfxoff on original raven · 3f2a06ac

Committed by Alex Deucher on Nov 15, 2019

There are still combinations of sbios and firmware that
are not stable.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3f2a06ac

09 Nov, 2019 1 commit

drm/amdgpu: allow direct upload save restore list for raven2 · eebc7f4d

Committed by changzhu on Nov 07, 2019

Disallowing direct upload of the save restore list from the gfx driver
causes a modprobe atombios stuck problem on raven2.
So direct upload of the save restore list needs to be allowed for raven2
temporarily.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

eebc7f4d

07 Nov, 2019 4 commits

drm/amdgpu/renoir: move gfxoff handling into gfx9 module · ad4d81dc

Committed by Alex Deucher on Oct 29, 2019

To properly handle the option parsing ordering.
Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ad4d81dc

drm/amdgpu: change read of GPU clock counter on Vega10 VF · f88e2d1f

Committed by Eric Huang on Nov 05, 2019

Using the unified VBIOS causes a performance drop in the sriov environment.
The fix is to switch to another register instead.
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f88e2d1f

drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9 · 11c61089

Committed by changzhu on Nov 05, 2019

Add a warning in gfx9 asking for a firmware update
in case the firmware is too old to implement
the dummy read in cp firmware.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11c61089

drm/amdgpu: disallow direct upload save restore list from gfx driver · 58f46d4b

Committed by Hawking Zhang on Nov 04, 2019

Directly uploading the save/restore list via mmio register writes breaks the security
policy. Instead, the driver should pass the s&r list to psp.

For all the ASICs that use rlc v2_1 headers, the driver actually uploads the s&r list
twice, in the non-psp ucode front door loading phase and in the gfx pg initialization phase.
The latter is not allowed.

VG12 is the only exception where the driver still keeps the legacy approach for S&R
list uploading. In theory, this can be eliminated if we have valid srcntl ucode
for VG12.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Candice Li <Candice.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

58f46d4b

30 Oct, 2019 2 commits

drm/amdgpu: fix no ACK from LDS read during stress test for Arcturus · 361d66ed

Committed by Le Ma on Oct 30, 2019

Set mmSQ_CONFIG.DISABLE_SMEM_SOFT_CLAUSE as a workaround.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

361d66ed

drm/amdgpu: bypass some cleanup work after err_event_athub (v2) · bff77e86

Committed by Le Ma on Oct 25, 2019

The PSP loses its connection when err_event_athub occurs. This cleanup work can be
skipped in BACO reset.

v2: squash in missing include (Alex)
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bff77e86

26 Oct, 2019 1 commit

drm/amdgpu: remove unused parameter in amdgpu_gfx_kiq_free_ring · 9f0256da

Committed by Nirmoy Das on Oct 23, 2019

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9f0256da

18 Oct, 2019 2 commits

drm/amdgpu: fix S3 failed as RLC safe mode entry stucked in polloing gfx acq · f8391101

Committed by Prike Liang on Oct 15, 2019

Fix the gfx cgpg setting sequence to avoid an RLC deadlock at safe mode entry while polling for the gfx response.
The patch also fixes the VCN IB test failure and the DAL get display count failure.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f8391101

drm/amdgpu: add GFX_PIPELINE capacity check for updating gfx cgpg · c8486eef

Committed by Prike Liang on Oct 15, 2019

Before disabling gfx pipeline power gating, the flag AMD_PG_SUPPORT_GFX_PIPELINE needs to be checked.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c8486eef

16 Oct, 2019 2 commits

drm/amdgpu: add RAS support for VML2 and ATCL2 · 82092474

Committed by Dennis Li on Sep 29, 2019

v1: Add code to query the EDC count of VML2 & ATCL2
v2: Rename VML2/ATCL2 registers and drop their mask defines
v3: Add back the ECC mask for VML2 registers
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

82092474

drm/amdgpu: change to query the actual EDC counter · 13ba0344

Committed by Dennis Li on Oct 12, 2019

To accommodate potential future requests, change to
query the actual EDC counter.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

13ba0344

03 Oct, 2019 6 commits

drm/amdgpu: remove ih_info parameter of gfx_ras_late_init · 41190cd7

Committed by Tao Zhou on Sep 19, 2019

gfx_ras_late_init can get the info by itself
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

41190cd7

drm/amdgpu: add common gfx_ras_fini function · 3b7b7647

Committed by Tao Zhou on Sep 12, 2019

gfx_ras_fini can be shared among all generations of gfx
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3b7b7647

drm/amdgpu: move gfx ecc functions to generic gfx file · 725253ab

Committed by Tao Zhou on Sep 12, 2019

gfx ras ecc common functions could be reused among all gfx generations
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

725253ab

drm/amdgpu: update parameter of ras_ih_cb · f5f06e21

Committed by Tao Zhou on Sep 12, 2019

change struct ras_err_data *err_data to void *err_data to align with
the umc code, so that the callback's declaration in each ras block
need not care about the structure type
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f5f06e21
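
The idea of passing the error data as void * can be sketched like this. The names below are hypothetical stand-ins and do not reproduce the actual amdgpu RAS declarations. They only show how a callback type that takes an opaque pointer lets each block cast to whatever structure it uses.

```c
/* Hypothetical error record used by one block in this sketch. */
struct demo_err_data {
    unsigned long ue_count; /* uncorrectable errors */
    unsigned long ce_count; /* correctable errors */
};

/* Callback type that takes opaque data, so its declaration is block-agnostic. */
typedef int (*demo_ras_cb)(void *err_data);

/* A block-specific callback casts the opaque pointer back to its own type. */
static int demo_gfx_cb(void *err_data)
{
    struct demo_err_data *data = err_data;

    data->ue_count++;
    return 0;
}

/* Generic dispatcher that never needs to know the concrete structure. */
static int demo_dispatch(demo_ras_cb cb, void *err_data)
{
    return cb(err_data);
}
```

The trade-off of this design is that the compiler no longer checks the pointed-to type, so each callback has to agree with its caller on what the pointer really refers to.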

drm/amdgpu: remove gfx9 NGG · 6de088a0

Committed by Marek Olšák on Sep 19, 2019

Never used.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6de088a0

drm/amdgpu: do not init mec2 jt for renoir · fec6a08a

Committed by Hawking Zhang on Sep 18, 2019

For ASICs like renoir/arct, the driver doesn't need to load the mec2 jt.
When the mec1 jt is loaded, the mec2 jt will be loaded automatically
since the write is actually broadcast to both.

We need more time to test other gfx9 ASICs, but for now we should
be able to conclude that the mec2 jt is not needed for renoir and
arct.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fec6a08a

18 Sep, 2019 1 commit

drm/amdgpu: remove program of lbpw for renoir · df794f67

Committed by Aaron Liu on Sep 16, 2019

There is no LBPW on Renoir, so remove the programming of lbpw for renoir.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

df794f67

17 Sep, 2019 1 commit

drm/amdgpu: remove program of lbpw for renoir · 28faa17e

Committed by Aaron Liu on Sep 16, 2019

There is no LBPW on Renoir, so remove the programming of lbpw for renoir.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28faa17e

16 Sep, 2019 1 commit

drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10 · ff9d0971

Committed by Tianci.Yin on Sep 10, 2019

add and_mask since the programming logic of the golden settings changed
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ff9d0971

14 Sep, 2019 7 commits

drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function · 6caeee7a

Committed by Hawking Zhang on Sep 03, 2019

amdgpu_gfx_ras_late_init is used to init the gfx specific
ras debugfs/sysfs node and the gfx specific interrupt handler.
It can be shared among gfx generations
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6caeee7a

drm/amdgpu: set ip specific ras interface pointer to NULL after free it · d094aea3

Committed by Hawking Zhang on Sep 03, 2019

to prevent access to dangling pointers
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d094aea3
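
A minimal userspace sketch of the free-then-NULL pattern this commit describes. The structures and the function are hypothetical and use malloc/free rather than the kernel allocator. The point is only that clearing the pointer makes a later check or a repeated teardown safe.

```c
#include <stdlib.h>

/* Hypothetical per-IP RAS interface record. */
struct demo_ras_if {
    int block;
};

/* Hypothetical owner that keeps a pointer to the record. */
struct demo_gfx {
    struct demo_ras_if *ras_if;
};

/* Frees the record and clears the pointer so nothing dangles afterwards. */
static void demo_ras_fini(struct demo_gfx *gfx)
{
    free(gfx->ras_if);   /* free(NULL) is a no-op, so repeated calls are safe */
    gfx->ras_if = NULL;  /* later "if (gfx->ras_if)" checks now behave correctly */
}
```

Without the assignment to NULL, a second teardown call would free the same memory twice, and any code testing the pointer would wrongly see it as still valid.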

drm/amdgpu: Avoid HW GPU reset for RAS. · 7c6e68c7

Committed by Andrey Grodzovsky on Sep 13, 2019

Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until a proper solution in PSP/SMU is ready:
When an uncorrectable error happens, the DF will unconditionally
broadcast error event packets to all its clients/slaves upon
receiving the fatal error event and freeze all its outbound queues;
the err_event_athub interrupt will be triggered.
In that case we use this interrupt
to issue a GPU reset. The GPU reset code is modified for this case to avoid a HW
reset: it only stops the schedulers, detaches the fences of all in-progress and
not-yet-scheduled jobs, sets an error code on them and signals them.
It also rejects any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on the previous bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive members.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7c6e68c7

drm/amdgpu: only apply gds clearing workaround when ras is supported · 39857252

Committed by Hawking Zhang on Aug 31, 2019

the gds clearing workaround should only be applied on asics that support gfx ras
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

39857252

drm/amdgpu: fix memory leak when ras is not supported on specific ip block · 8bf2485a

Committed by Hawking Zhang on Aug 31, 2019

free ras_if if ras is not supported
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8bf2485a

drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2) · 63fa48db

Committed by Hawking Zhang on Aug 29, 2019

call the helper function in the late init phase to handle ras init
for the gfx ip block

v2: call ras_late_fini to clean up when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

63fa48db

drm/amdgpu: switch to new amdgpu_nbio structure · bebc0762

Committed by Hawking Zhang on Aug 23, 2019

no functional change, just switch to new structures
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bebc0762

28 Aug, 2019 1 commit

drm/amdgpu: fix GFXOFF on Picasso and Raven2 · 41940ff5

Committed by Aaron Liu on Aug 27, 2019

For picasso (adev->pdev->device == 0x15d8) and raven2 (adev->rev_id >= 0x8),
the firmware is sufficient to support gfxoff.
Commit 98f58ada returns directly for picasso and raven2,
which leaves gfxoff disabled.

Fixes: 98f58ada ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

41940ff5
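
The two identification conditions quoted in the message can be written as a small standalone check. This is a sketch, not the driver code: the demo_adev structure and function names are hypothetical, the "&" in the message is read as "and also" (either chip qualifies), and only the device ID 0x15d8 and the rev_id threshold 0x8 come from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for the fields the message refers to. */
struct demo_adev {
    uint16_t pci_device; /* corresponds to adev->pdev->device */
    uint8_t rev_id;      /* corresponds to adev->rev_id */
};

/* Picasso is identified by PCI device ID 0x15d8, per the commit message. */
static bool demo_is_picasso(const struct demo_adev *adev)
{
    return adev->pci_device == 0x15d8;
}

/* Raven2 is identified by rev_id >= 0x8, per the commit message. */
static bool demo_is_raven2(const struct demo_adev *adev)
{
    return adev->rev_id >= 0x8;
}

/* True when the message says the firmware is sufficient for gfxoff. */
static bool demo_fw_known_sufficient(const struct demo_adev *adev)
{
    return demo_is_picasso(adev) || demo_is_raven2(adev);
}
```

A caller would use such a predicate to skip an early return that would otherwise leave gfxoff disabled on these two chips.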

27 Aug, 2019 1 commit

drm/amdgpu: fix GFXOFF on Picasso and Raven2 · c072b0c2

Committed by Aaron Liu on Aug 27, 2019

For picasso (adev->pdev->device == 0x15d8) and raven2 (adev->rev_id >= 0x8),
the firmware is sufficient to support gfxoff.
Commit 98f58ada returns directly for picasso and raven2,
which leaves gfxoff disabled.

Fixes: 98f58ada ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c072b0c2

23 Aug, 2019 2 commits

drm/amdgpu: update gc/sdma goldensetting for rn · f13580a9

Committed by Aaron Liu on Aug 07, 2019

This patch updates the gc/sdma golden settings for renoir
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f13580a9

drm/amdgpu: add set_gfx_cgpg implement (v2) · 12687955

Committed by Aaron Liu on Jul 16, 2019

add the set_gfx_cgpg implementation

v2: check if using sw_smu (Alex)
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12687955

openeuler / Kernel synced successfully more than 1 year ago
