Commits · 6de088a08ddc4876947e3319c98df116257e6ea5 · openeuler / Kernel

03 Oct, 2019 2 commits

Committed by Marek Olšák on Sep 19, 2019

Never used.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6de088a0

drm/amdgpu: do not init mec2 jt for renoir · fec6a08a

Committed by Hawking Zhang on Sep 18, 2019

For ASICs like renoir/arct, the driver doesn't need to load mec2 jt.
When mec1 jt is loaded, mec2 jt is loaded automatically
since the write is actually broadcast to both.

We need more time to test other gfx9 ASICs, but for now we should
be able to draw the conclusion that mec2 jt is not needed for renoir and
arct.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fec6a08a

17 Sep, 2019 1 commit

drm/amdgpu: remove program of lbpw for renoir · 28faa17e

Committed by Aaron Liu on Sep 16, 2019

There is no LBPW on Renoir, so remove the LBPW programming for Renoir.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28faa17e

16 Sep, 2019 1 commit

drm/amdgpu: fix CPDMA hang in PRT mode for VEGA10 · ff9d0971

Committed by Tianci.Yin on Sep 10, 2019

Add and_mask, since the programming logic of the golden settings changed.
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ff9d0971

14 Sep, 2019 7 commits

drm/amdgpu/gfx: switch to amdgpu_gfx_ras_late_init helper function · 6caeee7a

Committed by Hawking Zhang on Sep 03, 2019

amdgpu_gfx_ras_late_init is used to init the gfx specific
ras debugfs/sysfs node and the gfx specific interrupt handler.
It can be shared among gfx generations.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6caeee7a

drm/amdgpu: set ip specific ras interface pointer to NULL after free it · d094aea3

Committed by Hawking Zhang on Sep 03, 2019

to prevent access to dangling pointers
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d094aea3
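
The free-then-NULL pattern named in this commit can be sketched in user-space C as below. This is only an illustration under stated assumptions: `struct ras_if`, `struct gfx_state` and `gfx_ras_fini` are hypothetical names, and `free` stands in for the kernel allocator, which this log does not show.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures (not from the commit). */
struct ras_if { int block; };
struct gfx_state { struct ras_if *ras_if; };

/*
 * Release the RAS interface and clear the pointer so later code sees NULL
 * instead of a dangling pointer. Calling this twice is harmless because
 * free(NULL) is a no-op.
 */
static void gfx_ras_fini(struct gfx_state *gfx)
{
    free(gfx->ras_if);
    gfx->ras_if = NULL;
}
```

With the pointer cleared, a second teardown or a later NULL check behaves predictably instead of touching freed memory.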

drm/amdgpu: Avoid HW GPU reset for RAS. · 7c6e68c7

Committed by Andrey Grodzovsky on Sep 13, 2019

Problem:
Under certain conditions, when some IP blocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until a proper solution in PSP/SMU is ready:
When an uncorrectable error happens, the DF unconditionally
broadcasts error event packets to all its clients/slaves upon
receiving the fatal error event and freezes all its outbound queues;
the err_event_athub interrupt is then triggered.
In that case we use this interrupt to issue a GPU reset. The GPU reset
code is modified for this case to avoid a HW reset: it only stops the
schedulers, detaches the fences of all in-progress and not-yet-scheduled
jobs, sets an error code on them and signals them.
It also rejects any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on the previous bug-fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive members.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7c6e68c7

drm/amdgpu: only apply gds clearing workaround when ras is supported · 39857252

Committed by Hawking Zhang on Aug 31, 2019

The gds clearing workaround should only be applied on ASICs that support gfx ras.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

39857252

drm/amdgpu: fix memory leak when ras is not supported on specific ip block · 8bf2485a

Committed by Hawking Zhang on Aug 31, 2019

free ras_if if ras is not supported
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8bf2485a

drm/amdgpu: switch to amdgpu_ras_late_init for gfx v9 block (v2) · 63fa48db

Committed by Hawking Zhang on Aug 29, 2019

Call the helper function in the late init phase to handle ras init
for the gfx ip block.

v2: call ras_late_fini to clean up when enabling the interrupt fails
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

63fa48db

drm/amdgpu: switch to new amdgpu_nbio structure · bebc0762

Committed by Hawking Zhang on Aug 23, 2019

no functional change, just switch to new structures
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bebc0762

27 Aug, 2019 1 commit

drm/amdgpu: fix GFXOFF on Picasso and Raven2 · c072b0c2

Committed by Aaron Liu on Aug 27, 2019

For Picasso (adev->pdev->device == 0x15d8) and Raven2 (adev->rev_id >= 0x8),
the firmware is sufficient to support gfxoff.
Commit 98f58ada returned directly for Picasso and Raven2,
which left gfxoff disabled.

Fixes: 98f58ada ("drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible")
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c072b0c2
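
The two device checks quoted in this message can be written as a small predicate. The sketch below is an assumption-laden illustration, not the driver code: `struct fake_adev`, `is_picasso_or_raven2` and `gfxoff_allowed` are hypothetical names, and the firmware check for other Raven parts is reduced to a boolean supplied by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened view of the two fields the commit message names. */
struct fake_adev {
    uint16_t pci_device; /* stands in for adev->pdev->device */
    uint32_t rev_id;     /* stands in for adev->rev_id */
};

/* Picasso is identified by PCI device 0x15d8, Raven2 by rev_id >= 0x8. */
static bool is_picasso_or_raven2(const struct fake_adev *adev)
{
    return adev->pci_device == 0x15d8 || adev->rev_id >= 0x8;
}

/*
 * Per the message, firmware on Picasso and Raven2 is sufficient for gfxoff.
 * For any other part, defer to a firmware check result passed in by the
 * caller (assumption: the real check compares firmware versions).
 */
static bool gfxoff_allowed(const struct fake_adev *adev, bool fw_sufficient)
{
    if (is_picasso_or_raven2(adev))
        return true;
    return fw_sufficient;
}
```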

23 Aug, 2019 2 commits

drm/amdgpu: update gc/sdma goldensetting for rn · f13580a9

Committed by Aaron Liu on Aug 07, 2019

This patch updates the gc/sdma golden settings for renoir.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f13580a9

drm/amdgpu: add set_gfx_cgpg implement (v2) · 12687955

Committed by Aaron Liu on Jul 16, 2019

Add the set_gfx_cgpg implementation.

v2: check if using sw_smu (Alex)
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12687955

22 Aug, 2019 2 commits

drm/amdgpu: remove duplicated include from gfx_v9_0.c · 252d2a52

Committed by YueHaibing on Jul 10, 2019

Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

252d2a52

drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible · b05f65d7

Committed by Alex Deucher on Aug 15, 2019

We need to set certain power gating flags after we determine
if the firmware version is sufficient to support gfxoff.
Previously we set the pg flags in early init, but later
we might have disabled gfxoff if the firmware versions didn't
support it.  Move adding the additional pg flags after we
determine whether or not to support gfxoff.

Fixes: 00544006 ("drm/amdgpu: enable gfxoff again on raven series (v2)")
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>

b05f65d7

13 Aug, 2019 10 commits

drm/amdgpu: update lbpw for renoir · 40c8a329

Committed by Aaron Liu on Jul 16, 2019

enable gfx_v9_0_init_lbpw for renoir
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

40c8a329

drm/amdgpu: enable power gating for renoir · 95f9e74c

Committed by Aaron Liu on Jul 16, 2019

enable gfx power gating for renoir
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

95f9e74c

drm/amdgpu: enable clock gating for renoir · f78e007f

Committed by Aaron Liu on Aug 12, 2019

enable gfx and common clock gating for renoir
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f78e007f

drm/amdgpu: add gfx golden settings for renoir (v2) · 33294eb8

Committed by Huang Rui on Jun 23, 2019

This patch adds gfx golden settings for the renoir real ASIC.

v2: update settings (Alex)
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

33294eb8

drm/amdgpu: set rlc funcs for renoir · 6b3ad3b2

Committed by Aaron Liu on Jul 24, 2019

add gfx_v9_0_rlc_funcs for renoir
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6b3ad3b2

drm/amdgpu: add gfx support for renoir · 1aafd447

Committed by Huang Rui on Jul 24, 2019

Add Renoir checks to gfx9 code.
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1aafd447

drm/amdgpu: fix gfx9 soft recovery · 62cfcb9e

Committed by Pierre-Eric Pelloux-Prayer on Aug 06, 2019

The SOC15_REG_OFFSET() macro wasn't used, making the soft recovery fail.

v2: use WREG32_SOC15 instead of WREG32 + SOC15_REG_OFFSET
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

62cfcb9e
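
Why a missing offset macro breaks a register write can be illustrated with a toy model: a register named by a relative offset only becomes a usable address once the base of its segment is added. Everything below is hypothetical (`struct ip_block`, `reg_absolute`, the base values) and is not the definition of `SOC15_REG_OFFSET()`.

```c
#include <stdint.h>

#define MAX_SEGMENTS 5

/* Hypothetical per-IP table of segment base addresses. */
struct ip_block {
    uint32_t seg_base[MAX_SEGMENTS];
};

/*
 * Turn a segment-relative register offset into an absolute one.
 * Writing to `rel` directly (skipping this step) would target the
 * wrong location, which is the class of bug the commit describes.
 */
static uint32_t reg_absolute(const struct ip_block *ip,
                             unsigned base_idx, uint32_t rel)
{
    return ip->seg_base[base_idx] + rel;
}
```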

drm/amdgpu: increase CGCG gfx idle threshold for Arcturus · 15e2f43a

Committed by Le Ma on Aug 09, 2019

Follow the hw spec; there is no need to consider gfxoff on Arcturus.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

15e2f43a

drm/amdgpu: add gfx clock gating for Arcturus · f60481a9

Committed by Le Ma on Aug 07, 2019

Add ARCTURUS case in gfx set clockgating function. No 3d clock on Arcturus.
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f60481a9

drm/amdgpu: add check to avoid array bound issue · a2b45994

Committed by Guchun Chen on Aug 08, 2019

Sub_block_index can be passed in from user level, so
add a check before accessing the array to
prevent an out-of-bounds array index.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a2b45994
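
The check described here follows a common pattern: validate a user-supplied index against the array size before using it. A minimal sketch, assuming a hypothetical table `subblocks` and lookup helper `lookup_subblock` (neither name comes from the commit):

```c
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical sub-block descriptor table. */
struct subblock_desc { const char *name; };

static const struct subblock_desc subblocks[] = {
    { "sub0" }, { "sub1" }, { "sub2" },
};

/*
 * Return the descriptor for a user-supplied index, or NULL when the
 * index is outside the table. The index is unsigned, so a single
 * upper-bound comparison covers every invalid value.
 */
static const struct subblock_desc *lookup_subblock(uint32_t sub_block_index)
{
    if (sub_block_index >= ARRAY_SIZE(subblocks))
        return NULL;
    return &subblocks[sub_block_index];
}
```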

02 Aug, 2019 6 commits

drm/amdgpu: disable MEC2 JT context init for Arcturus · 8fda90e8

Committed by John Clements on Jul 31, 2019

We don't need to handle it like other ASICs.
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8fda90e8

drm/amdgpu: removed duplicate line · c0dac3c9

Committed by John Clements on Jul 31, 2019

Remove duplicate break.
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c0dac3c9

drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS · bd2280da

Committed by Tao Zhou on Aug 01, 2019

CE can also trigger an interrupt, and both CE and UE errors can even be
found in one ras query, so distinguishing between CE and UE in the interrupt
handler is unnecessary.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Suggested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bd2280da

drm/amdkfd: Extend CU mask to 8 SEs (v3) · 5145d57e

Committed by Jay Cornwall on Jul 18, 2019

Following bitmap layout logic introduced by:
"drm/amdgpu: support get_cu_info for Arcturus".

v2: squash in fixup for gfx_v9_0.c (Alex)
v3: squash in debug print output fix
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5145d57e

drm/amdgpu: support get_cu_info for Arcturus · 857b82d0

Committed by Le Ma on Jul 08, 2019

This change is needed because the SE/SH layout on Arcturus is 8*1, different
from 4*2 (or 4*1) on Vega ASICs.

Currently the cu bitmap array is 4x4 in size, and the bitmap is used widely
across the SW stack. To limit the scale of impact, we make the cu bitmap
array compatible with the SE/SH layout on Arcturus. The cu bits of
each shader array for Arcturus are then stored as below:
    SE0,SH0 --> bitmap[0][0]
    SE1,SH0 --> bitmap[1][0]
    SE2,SH0 --> bitmap[2][0]
    SE3,SH0 --> bitmap[3][0]
    SE4,SH0 --> bitmap[0][1]
    SE5,SH0 --> bitmap[1][1]
    SE6,SH0 --> bitmap[2][1]
    SE7,SH0 --> bitmap[3][1]
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

857b82d0
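
The SE-to-bitmap table in this message follows a simple rule: the row is the SE index modulo 4 and the column is the SE index divided by 4. The sketch below reproduces only that table for SH0; `arcturus_cu_slot` is a hypothetical helper name, and the 8-SE limit is taken from the 8*1 layout stated in the message.

```c
#define BITMAP_DIM 4   /* the cu bitmap array is 4x4 */
#define ARCTURUS_SES 8 /* 8*1 SE/SH layout per the commit message */

/*
 * Compute which bitmap[row][col] slot holds the CU bits of a given
 * shader engine (SH0 only). Returns 0 on success, -1 if the SE index
 * is outside the 8 engines the table covers.
 */
static int arcturus_cu_slot(unsigned se, unsigned *row, unsigned *col)
{
    if (se >= ARCTURUS_SES)
        return -1;
    *row = se % BITMAP_DIM; /* SE0..3 and SE4..7 reuse rows 0..3 */
    *col = se / BITMAP_DIM; /* SE0..3 -> column 0, SE4..7 -> column 1 */
    return 0;
}
```

For example, SE5 lands in `bitmap[1][1]`, matching the sixth line of the table.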

drm/amdgpu: cleanup vega10 SRIOV code path · 4cd4c5c0

Committed by Monk Liu on Jul 30, 2019

We can simplify all those unnecessary functions under
SRIOV for vega10 since:
1) PSP L1 policy is forcibly enabled in SRIOV
2) the original logic always sets all flags, which makes it
   a dummy step

Besides,
1) setting the ih_doorbell_range should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4cd4c5c0

01 Aug, 2019 5 commits

drm/amdgpu: disable inject for failed subblocks of gfx · dc4d716d

Committed by Dennis Li on Jul 23, 2019

Some subblocks of gfx fail in the inject test; disable them.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dc4d716d

drm/amdgpu: support gfx ras error injection and err_cnt query · 83b0582c

Committed by Dennis Li on Jul 31, 2019

Check the gfx error count in both the ras query function and
the ras interrupt handler.

gfx ras is still disabled by default due to a known stability
issue found in gpu reset.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

83b0582c

drm/amdgpu: add RAS callback for gfx · 2c960ea0

Committed by Dennis Li on Jul 31, 2019

Add functions for RAS error injection and for querying the error counter.
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2c960ea0

drm/amdgpu: add define for gfx ras subblock · dc23a08f

Committed by Dennis Li on Jul 19, 2019

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dc23a08f

drm/amdgpu: update interrupt callback for all ras clients · 81e02619

Committed by Tao Zhou on Jul 22, 2019

add err_data parameter in interrupt cb for ras clients
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

81e02619

31 Jul, 2019 1 commit

drm/amdgpu: Default disable GDS for compute+gfx · 2c897318

Committed by Joseph Greathouse on Jul 26, 2019

Units in the GDS block default to allowing all VMIDs access to all
entries. Disable shader access to the GDS, GWS, and OA blocks from all
compute and gfx VMIDs by default. For compute, HWS firmware will set
up the access bits for the appropriate VMID when a compute queue
requires access to these blocks.
The driver will handle enabling access on-demand for graphics VMIDs.

VMID0 is left with full access because otherwise HWS cannot save or
restore values during task switch.

v2: Fixed code and comment styling.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2c897318

19 Jul, 2019 2 commits

drm/amdgpu: Default disable GDS for compute VMIDs · fbdc5d8d

Committed by Joseph Greathouse on Jul 17, 2019

The GDS and GWS blocks default to allowing all VMIDs to
access all entries. Graphics VMIDs can handle setting
these limits when the driver launches work. However,
compute workloads under HWS control don't go through the
kernel driver. Instead, HWS firmware should set these
limits when a process is put into a VMID slot.

Disable access to these devices by default by turning off
all mask bits (for OA) and setting BASE=SIZE=0 (for GDS
and GWS) for all compute VMIDs. If a process wants to use
these resources, it can request this from the HWS
firmware (when such capabilities are enabled). HWS will
then handle setting the base and limit for the process when
it is assigned to a VMID.

This will also prevent user kernels from getting 'stuck' in
GWS by accident if they write GWS-using code but HWS
firmware is not set up to handle GWS reset. Until HWS is
enabled to handle GWS properly, all GWS accesses will
MEM_VIOL fault the kernel.

v2: Move initialization outside of SRBM mutex
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fbdc5d8d
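
The default-deny setup described in this message (BASE=SIZE=0 for GDS and GWS, all OA mask bits off, compute VMIDs only) can be modelled on a plain array standing in for the per-VMID registers. This is a sketch under assumptions: the struct, the function name, and the VMID range 8..15 for compute are illustrative choices, not values taken from the commit.

```c
#include <stdint.h>

#define NUM_VMIDS 16
#define FIRST_COMPUTE_VMID 8 /* assumption: compute uses VMIDs 8..15 */

/* Hypothetical software model of the per-VMID GDS/GWS/OA registers. */
struct vmid_gds_regs {
    uint32_t gds_base, gds_size;
    uint32_t gws_base, gws_size;
    uint32_t oa_mask;
};

/*
 * Remove access for every compute VMID: zero base and size for GDS and
 * GWS, and clear all OA mask bits. Graphics VMIDs are left untouched,
 * matching the message's note that the driver sets their limits when
 * it launches work.
 */
static void disable_gds_for_compute(struct vmid_gds_regs regs[NUM_VMIDS])
{
    for (unsigned vmid = FIRST_COMPUTE_VMID; vmid < NUM_VMIDS; vmid++) {
        regs[vmid].gds_base = 0;
        regs[vmid].gds_size = 0;
        regs[vmid].gws_base = 0;
        regs[vmid].gws_size = 0;
        regs[vmid].oa_mask = 0;
    }
}
```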

drm/amdgpu: skip gfx 9 common golden settings for arct · f9cf36fc

Committed by Hawking Zhang on Jun 29, 2019

They are not needed by arct.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f9cf36fc

openeuler / Kernel synced successfully more than 1 year ago
