Commits · 9b54d2017687df9fa827faf9e4022973b87fc0ff · openeuler / Kernel

Mar 20, 2019 · 40 commits

drm/amdkfd: add RAS ECC event support (v3) · 9b54d201

Committed by Eric Huang on Jan 11, 2019

RAS ECC events are combined with the GPU reset event, because
ECC interrupts are caused by uncorrectable errors that trigger
a GPU reset.

v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add human readable debugfs control support (v2) · 96ebb307

Committed by xinhui pan on Mar 1, 2019

Currently, the debugfs control node can't parse bash-like commands.
Add that support for testers who use scripts.

v2: squash in fixes for input validation

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
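
To make the idea of a "bash-like" control command concrete, here is a minimal userspace sketch of parsing one such line. The grammar (`<op> <block> [<error> [<address> <value>]]`), the operation names, the block and error spellings, and every type and function name below are assumptions made for illustration; the commit message does not specify them, and this is not the driver's parser.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration; not the driver's actual layout. */
enum ras_op { RAS_OP_INVALID, RAS_OP_DISABLE, RAS_OP_ENABLE, RAS_OP_INJECT };

struct ras_cmd {
    enum ras_op op;
    char block[16];              /* e.g. "umc", "sdma", "gfx" (assumed names) */
    char error[8];               /* "ue" or "ce" (assumed spelling) */
    unsigned long long address;  /* used by inject only */
    unsigned long long value;    /* used by inject only */
};

/* Return 1 if the error type is one of the two assumed spellings. */
static int ras_error_ok(const char *e)
{
    return strcmp(e, "ue") == 0 || strcmp(e, "ce") == 0;
}

/*
 * Parse "<op> <block> [<error> [<address> <value>]]".
 * Returns 0 and fills *cmd on success, -1 on malformed input.
 */
static int ras_parse_cmd(const char *line, struct ras_cmd *cmd)
{
    char op[16] = "";
    int n;

    memset(cmd, 0, sizeof(*cmd));   /* op starts as RAS_OP_INVALID */
    n = sscanf(line, "%15s %15s %7s %llx %llx",
               op, cmd->block, cmd->error, &cmd->address, &cmd->value);
    if (n < 2)
        return -1;                  /* need at least an op and a block */

    if (strcmp(op, "disable") == 0) {
        cmd->op = RAS_OP_DISABLE;
        return 0;
    }
    if (strcmp(op, "enable") == 0 && n >= 3 && ras_error_ok(cmd->error)) {
        cmd->op = RAS_OP_ENABLE;
        return 0;
    }
    if (strcmp(op, "inject") == 0 && n == 5 && ras_error_ok(cmd->error)) {
        cmd->op = RAS_OP_INJECT;
        return 0;
    }
    return -1;                      /* unknown op or missing arguments */
}
```

Rejecting anything that does not match the expected argument count is in the spirit of the "input validation" fixes mentioned in v2, though the actual checks in the driver may differ.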

drm/amdgpu: skip gpu reset when ras error occured · 138352e5

Committed by xinhui pan on Mar 1, 2019

GPU reset is not stable on Vega20 A1.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ioctl query for enabled ras features (v2) · 5cb77114

Committed by xinhui pan on Dec 17, 2018

Add a query for userspace to check which RAS features
are enabled.

v2: squash in warning fix

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2 · ae363a21

Committed by xinhui pan on Dec 17, 2018

Add AMDGPU_CTX_QUERY2_FLAGS_RAS_CE/UE, which indicate whether any error
happened between the previous query and this one.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
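
The "changed since the previous query" semantics can be sketched as a per-context snapshot of error counters that is compared, then refreshed, on every query. The flag values, the struct, and the function name below are hypothetical stand-ins; the commit message only states what the flags mean, not how they are computed.

```c
#include <stdint.h>

/* Hypothetical flag bits; the real AMDGPU_CTX_QUERY2_FLAGS_* values are
 * defined by the kernel UAPI header and are not reproduced here. */
#define EXAMPLE_FLAG_RAS_CE (1u << 0)   /* correctable error seen */
#define EXAMPLE_FLAG_RAS_UE (1u << 1)   /* uncorrectable error seen */

/* Per-context snapshot of the counters as of the previous query. */
struct ctx_ras_snapshot {
    uint64_t ce_seen;
    uint64_t ue_seen;
};

/*
 * Compare the device-wide counters against the snapshot, report which
 * kinds of error occurred since the last call, and update the snapshot
 * so the next query starts from a clean slate.
 */
static uint32_t ctx_query_ras_flags(struct ctx_ras_snapshot *snap,
                                    uint64_t ce_count, uint64_t ue_count)
{
    uint32_t flags = 0;

    if (ce_count != snap->ce_seen) {
        flags |= EXAMPLE_FLAG_RAS_CE;
        snap->ce_seen = ce_count;
    }
    if (ue_count != snap->ue_seen) {
        flags |= EXAMPLE_FLAG_RAS_UE;
        snap->ue_seen = ue_count;
    }
    return flags;
}
```

Because the snapshot is refreshed on every call, two back-to-back queries with no new errors in between return no flags on the second call.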

drm/amdgpu: enable ras on gmc9 · 791c4769

Committed by xinhui pan on Jan 23, 2019

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable ras on gfx9 (v2) · 760a1d55

Committed by Feifei Xu on Dec 7, 2018

Register ECC interrupts and the ECC interrupt handler on gfx9.
Add RAS support on gfx9.

v2: squash in warning fix

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable ras on sdma4 · 8cf12507

Committed by xinhui pan on Nov 28, 2018

Register the IH and enable RAS features on SDMA.
Create sysfs and debugfs files for SDMA.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: reserve bad pages during recovery · 2be4c4a9

Committed by xinhui pan on Jan 21, 2019

Mark VRAM pages with errors as bad and prevent the driver
from using them.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add debugfs ctrl node · 36ea1bd2

Committed by xinhui pan on Jan 31, 2019

Allow userspace to enable/disable RAS.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add amdgpu_ras.c to support ras (v2) · c030f2e4

Committed by xinhui pan on Oct 31, 2018

Add obj management.
Add feature control.
Add debugfs infrastructure.
Add sysfs infrastructure.
Add IH infrastructure.
Add recovery infrastructure.

This is a framework. Other IPs need to call the amdgpu_ras_xxx functions
instead of the psp_ras_xxx functions.

v2: squash in warning fixes

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add psp cmd submit timeout · ea114213

Committed by xinhui pan on Jan 23, 2019

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add psp v11 ras callback · 3ea8fb8c

Committed by xinhui pan on Nov 14, 2018

Add trigger_error and cure_posion.

Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add psp ras subsystem infrastructure (v2) · 5e5d3154

Committed by xinhui pan on Nov 21, 2018

Add RAS firmware loading, init, and terminate.
Add a RAS command submit helper.
Add a common function for enabling/disabling RAS features.

v2: squash in unused variable warning fix

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add psp ras callback func and macro · 7da67453

Committed by xinhui pan on Oct 30, 2018

Define the driver-side interface for the RAS TA.

Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ta_ras_if.h · 58b22e0b

Committed by xinhui pan on Oct 30, 2018

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add module parameters for ras · 1218252f

Committed by xinhui pan on Oct 25, 2018

Allow RAS feature enable/disable via boot parameter.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: export ta fw info · 9b9ca62d

Committed by xinhui pan on Nov 20, 2018

Output the TA firmware info (XGMI/RAS) via debugfs.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add ta ras fw info (v2) · c6eec902

Committed by xinhui pan on Nov 20, 2018

Add the RAS firmware part; the XGMI and RAS firmware are combined in the
TA binary. Reading the data from the info is not implemented yet.

v2: squash in "drm/amdgpu: fix NULL pointer when ta is missing"

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Cosmetic change for calling func amdgpu_gmc_vram_location · 83afe835

Committed by Oak Zeng on Mar 7, 2019

Use the function parameter mc as the second parameter of
amdgpu_gmc_vram_location, so the code looks more consistent.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Move IB pool init and fini v2 · 533aed27

Committed by Andrey Grodzovsky on Mar 6, 2019

Problem:
Using SDMA for TLB invalidation on certain ASICs exposed a problem:
the IB pool was not yet ready while SDMA was already up on init, and
was already shut down while SDMA was still running on fini. This caused
IB allocation failures. A temporary fix was committed into a
bringup branch, but this is the generic fix.

Fix:
Init the IB pool right after GMC is ready but before SDMA is ready.
Do the opposite for fini.

v2: Remove restriction on SDMA early init and move amdgpu_ib_pool_fini

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: apply Vega20 BACO workaround · 0c5ccf14

Committed by Evan Quan on Mar 7, 2019

Apply the vdci flush workaround for Vega20 BACO.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: XGMI pstate switch initial support · 9b638f97

Committed by shaoyunl on Feb 21, 2019

The driver votes for a low-to-high pstate switch whenever there is an
outstanding XGMI mapping request, and votes for a high-to-low pstate
switch when all outstanding XGMI mappings are terminated.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
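
The voting rule described above amounts to reference counting: the first outstanding mapping raises the pstate, and releasing the last one lowers it. The sketch below illustrates that pattern only; the struct, enum, and function names are hypothetical, and the real driver additionally has to handle locking and the actual request to the power-management firmware.

```c
/* Hypothetical pstate levels and hive bookkeeping; names are illustrative. */
enum xgmi_pstate { XGMI_PSTATE_LOW = 0, XGMI_PSTATE_HIGH = 1 };

struct xgmi_hive {
    unsigned int mappings;      /* number of outstanding XGMI mappings */
    enum xgmi_pstate pstate;    /* pstate currently voted for */
};

/* A new XGMI mapping was requested: the first one votes low -> high. */
static void xgmi_mapping_get(struct xgmi_hive *h)
{
    if (h->mappings++ == 0)
        h->pstate = XGMI_PSTATE_HIGH;
}

/* An XGMI mapping was terminated: the last one votes high -> low. */
static void xgmi_mapping_put(struct xgmi_hive *h)
{
    if (h->mappings == 0)
        return;                 /* unbalanced put; nothing to release */
    if (--h->mappings == 0)
        h->pstate = XGMI_PSTATE_LOW;
}
```

Only the transitions 0 -> 1 and 1 -> 0 trigger a vote, so any number of overlapping mappings costs a single pstate switch in each direction.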

drm/amdgpu: Enable XGMI mapping for peer device · a690aa0f

Committed by shaoyunl on Feb 22, 2019

Adjust the VRAM base offset for XGMI mappings when updating the PT entry,
so the address falls into the correct XGMI aperture for the peer device.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add sysfs entries for xgmi hive v2. · b1fa8c89

Committed by Andrey Grodzovsky on Mar 5, 2019

For each device a file xgmi_device_id is created.
On the first device a subdirectory named xgmi_hive_info is created.
It contains a file named hive_id and symlinks named node 1-4 linking
to each device in the hive.

v2: Return error codes instead of '-1' and fix a few misspellings.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: allow huge invalid mappings on GMC8 · 8ce1f7e7

Committed by Christian König on Feb 4, 2019

Only GMC9 supports true huge pages, but we can still free invalid mappings
on GMC8.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: drop the huge page flag · adc7bfe5

Committed by Christian König on Feb 1, 2019

Not needed any more since we now free PDs/PTs on demand.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: free PDs/PTs on demand · e35fb064

Committed by Christian König on Feb 1, 2019

When something is unmapped we now free the affected PDs/PTs again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: allocate VM PDs/PTs on demand · 0ce15d6f

Committed by Christian König on Jan 30, 2019

Let's start to allocate VM PDs/PTs on demand instead of pre-allocating
them during mapping.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: let amdgpu_vm_clear_bo figure out ats status v2 · 780637cb

Committed by Christian König on Aug 16, 2018

Instead of providing it from outside, figure out the ATS status in the
function itself from the data structures.

v2: simplify finding the right level
v3: partially revert changes from v2, more cleanup and split code
    into more functions.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rework shadow handling during PD clear v3 · 83cd8397

Committed by Christian König on Aug 16, 2018

This way we only deal with the real BO in here.

v2: use a do { ... } while loop instead
v3: fix NULL pointer in v2

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix missing assignment of error return code to variable ret · d38ca8f0

Committed by Colin Ian King on Mar 2, 2019

An earlier commit replaced ttm_bo_wait with amdgpu_bo_sync_wait and
removed the error return assignment to variable ret. Fix this by adding
the assignment back. Also break the line to clean up checkpatch's
overly-long-line warning.

Detected by CoverityScan, CID#1477327 ("Logically dead code")

Fixes: c60cd590 ("drm/amdgpu: Replace ttm_bo_wait with amdgpu_bo_sync_wait")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: also reroute VMC and UMD to IH ring 1 on Vega 20 · b849aaa4

Committed by Christian König on Mar 4, 2019

Same patch we already did for Vega10. Just re-route page faults to a separate
ring to avoid drowning in interrupts.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: reroute VMC and UMD to IH ring 1 · 516bc3d8

Committed by Christian König on Nov 2, 2018

Page faults can easily overwhelm the interrupt handler.

So to make sure that we never lose valuable interrupts on the primary ring
we re-route page faults to IH ring 1.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add thick tile mode settings for Oland of gfx6 · a427a886

Committed by Tao Zhou on Mar 1, 2019

Add thick tile mode for Oland to prevent the UMD from getting mode value 0.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Tested-by: Hui.Deng <hui.deng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx_v8_0: Mark expected switch fall-through · a7dc289b

Committed by Gustavo A. R. Silva on Mar 1, 2019

In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.

This patch fixes the following warning:

drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c: In function ‘gfx_v8_0_tiling_mode_table_init’:
./include/linux/device.h:1487:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
  _dev_warn(dev, dev_fmt(fmt), ##__VA_ARGS__)
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3236:3: note: in expansion of macro ‘dev_warn’
   dev_warn(adev->dev,
   ^~~~~~~~
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c:3240:2: note: here
  case CHIP_CARRIZO:
  ^~~~

Warning level 3 was used: -Wimplicit-fallthrough=3

This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
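
For readers unfamiliar with the annotation, the following sketch shows the general shape of such a fix: a `default:` branch that warns and then deliberately continues into the next case, marked with a comment that GCC recognizes at -Wimplicit-fallthrough=3. The enum, function, and return values are hypothetical and are not taken from gfx_v8_0.c.

```c
/* Hypothetical chip IDs; the real driver uses its own CHIP_* enum. */
enum chip_id { CHIP_EXAMPLE_A, CHIP_EXAMPLE_B, CHIP_EXAMPLE_UNKNOWN };

/*
 * Pick a table index for a chip. Unknown chips set *warned (standing in
 * for a dev_warn() call) and then intentionally share CHIP_EXAMPLE_A's
 * handling.
 */
static int pick_table(enum chip_id chip, int *warned)
{
    *warned = 0;
    switch (chip) {
    case CHIP_EXAMPLE_B:
        return 2;
    default:
        *warned = 1;
        /* fall through */
    case CHIP_EXAMPLE_A:
        return 1;
    }
}
```

Without the comment, the compiler cannot tell an intended fall-through from a forgotten `break`, which is exactly what the warning is meant to surface.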

drm/amdgpu: Bump amdgpu version for per-flip plane tiling updates · df8368be

Committed by Nicholas Kazlauskas on Feb 27, 2019

To help xf86-video-amdgpu and mesa know that DC supports updating the
tiling attributes for a framebuffer per-flip.

Cc: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add sysfs files for returning VRAM/GTT info v2 · 55c374e9

Committed by Kent Russell on Feb 28, 2019

Add 6 files that return (in bytes):
The total amount of VRAM/visible VRAM/GTT
and the current total used VRAM/visible VRAM/GTT

v2: Split used and total into separate files

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: add limit of pp_feature for smu (v3) · 3b94fb10

Committed by Likun Gao on Jan 31, 2019

Move pp_feature from the struct of amd_powerplay to amdgpu_device.
Add pp_feature limit for overdrive interface.

v2: put pp_feature into struct amdgpu_pm.
v3: merge feature_mask with pp_feature.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: support sysfs to set socclk, fclk, dcefclk · 4b77faaf

Committed by Likun Gao on Feb 20, 2019

Add a sysfs interface to set socclk, fclk and dcefclk for smu.
Add a feature_mask parameter to smu_upload_dpm_level: socclk, fclk and
dcefclk depend on one another, so without a feature_mask identifying the
specific clock, setting some clocks fails.
Fix smu_unforce_dpm_levels.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Gui Chengming <Jack.Gui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
