提交 · c9977dffcc7e0f728e82209c140991c5bd27c491 · openeuler / Kernel

19 8月, 2020 6 次提交

drm/amdgpu/jpeg: remove redundant check when it returns · cdab4211

由 Leo Liu 提交于 8月 14, 2020

Fix warning from kernel test robot
v2: remove the local variable as well
Signed-off-by: NLeo Liu <leo.liu@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

cdab4211

drm/amdgpu: Limit the error info print rate · 8e1d88f9

由 jqdeng 提交于 7月 30, 2020

Use function printk_ratelimit to limit the print rate.
Signed-off-by: Njqdeng <Emily.Deng@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8e1d88f9

drm/amdgpu: Fix repeatly flr issue · 9a1cddd6

由 jqdeng 提交于 8月 07, 2020

Only for no job running test case need to do recover in
flr notification.
For having job in mirror list, then let guest driver to
hit job timeout, and then do recover.
Signed-off-by: Njqdeng <Emily.Deng@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9a1cddd6

Revert "drm/amdgpu: disable gfxoff for navy_flounder" · 332d7903

由 Jiansong Chen 提交于 8月 17, 2020

This reverts commit ba4e049e.
Newly released sdma fw (51.52) provides a fix for the issue.
Signed-off-by: NJiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: NKenneth Feng <kenneth.feng@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

332d7903

drm/scheduler: Remove priority macro INVALID (v2) · 9af5e21d

由 Luben Tuikov 提交于 8月 11, 2020

Remove DRM_SCHED_PRIORITY_INVALID. We no longer
carry around an invalid priority and cut it off
at the source.

Backwards compatibility behaviour of AMDGPU CTX
IOCTL passing in garbage for context priority
from user space and then mapping that to
DRM_SCHED_PRIORITY_NORMAL is preserved.

v2: Revert "res"  --> "r" and
           "prio" --> "priority".
Signed-off-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9af5e21d

drm/scheduler: Scheduler priority fixes (v2) · e2d732fd

由 Luben Tuikov 提交于 8月 11, 2020

Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.
Signed-off-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

e2d732fd

18 8月, 2020 3 次提交

drm/amdgpu: add condition check for trace_amdgpu_cs() · 44444574

由 Kevin Wang 提交于 8月 17, 2020

v1:
add trace event enabled check to avoid nop loop when submit multi ibs
in amdgpu_cs_ioctl() function.

v2:
add a new wrapper function to trace all amdgpu cs ibs.
Signed-off-by: NKevin Wang <kevin1.wang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

44444574

drm/amdgpu: fix amdgpu_bo_release_notify() comment error · 736b1729

由 Kevin Wang 提交于 8月 17, 2020

fix amdgpu_bo_release_notify() comment error.
Signed-off-by: NKevin Wang <kevin1.wang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NDennis Li <Dennis.Li@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

736b1729

drm/amdkfd: fix the wrong sdma instance query for renoir · d95c42a1

由 Huang Rui 提交于 8月 11, 2020

Renoir only has one sdma instance, it will get failed once query the
sdma1 registers. So use switch-case instead of static register array.
Signed-off-by: NHuang Rui <ray.huang@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d95c42a1

15 8月, 2020 25 次提交

drm/amdgpu: note what type of reset we are using · 11043b7a

由 Alex Deucher 提交于 8月 11, 2020

When we reset the GPU, note what type of reset will be
used.  This makes debugging different reset scenarios
more clear as the driver may use different reset
methods depending on conditions on the system.
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

11043b7a

drm/amdgpu: print where we get the vbios image from · bddbacc9

由 Alex Deucher 提交于 8月 10, 2020

ACPI, ROM, PCI BAR, etc.
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

bddbacc9

drm/amdgpu: parse ta firmware for navy_flounder · 31e726ca

由 Bhawanpreet Lakha 提交于 7月 28, 2020

Use the same case as sienna_cichlid
Signed-off-by: NBhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

31e726ca

drm/amdgpu/vcn3.0: only SIENNA_CICHLID need specify instance for dec/enc · ac1128c9

由 James Zhu 提交于 8月 13, 2020

Only SIENNA_CICHLID(VCN3) has 2 unsymmetrical instances, there're less
codecs on instance 1, we use 0 for decode and 1 for encode.
Signed-off-by: NJames Zhu <James.Zhu@amd.com>
Reviewed-by: NLeo Liu <leo.liu@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

ac1128c9

drm/amd/pm: optimize the power related source code layout · e098bc96

由 Evan Quan 提交于 8月 13, 2020

The target is to provide a clear entry point(for power routines).
Also this can help to maintain a clear view about the frameworks
used on different ASICs. Hopefully all these can make power part
more friendly to play with.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

e098bc96

drm/amd/powerplay: put those exposed power interfaces in amdgpu_dpm.c · e9372d23

由 Evan Quan 提交于 8月 13, 2020

As other power interfaces.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

e9372d23

drm/amd/powerplay: optimize i2c bus access implementation · 20d3c28c

由 Evan Quan 提交于 8月 13, 2020

The caller needs not care about the internal details how the powerplay
API implemented.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

20d3c28c

drm/amd/powerplay: drop unnecessary pp_funcs checker · 70bdb6ed

由 Evan Quan 提交于 8月 12, 2020

It's redundant. Also, the callers should not care about
the implementation details.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

70bdb6ed

drm/amd/powerplay: optimize amdgpu_dpm_set_clockgating_by_smu() implementation · b89e9eb6

由 Evan Quan 提交于 8月 12, 2020

Cover the implementation details from outside(of power part).
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

b89e9eb6

drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS · ae2bf61f

由 Guchun Chen 提交于 8月 13, 2020

It can avoid potential build warn/error when
CONFIG_DEBUG_FS is not set.
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Reviewed-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

ae2bf61f

drm/amdgpu: fix NULL pointer access issue when unloading driver · 2e2f5dd5

由 Guchun Chen 提交于 8月 13, 2020

When unloading driver by "modprobe -r amdgpu", one NULL pointer
dereference bug occurs in ras debugfs releasing. The cause is the
duplicated debugfs_remove, as drm debugfs_root dir has been cleaned
up already by drm_minor_unregister.

BUG: kernel NULL pointer dereference, address: 00000000000000a0
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 11 PID: 1526 Comm: modprobe Tainted: G           OE     5.6.0-guchchen #1
Hardware name: System manufacturer System Product Name/TUF Z370-PLUS GAMING II, BIOS 0411 09/21/2018
RIP: 0010:down_write+0x15/0x40
Code: eb de e8 7e 17 72 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 53 48 89 fb e8 92
d8 ff ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 43 08 5b c3
RSP: 0018:ffffb1590386fcd0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000000a0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff85b2fcc2 RDI: 00000000000000a0
RBP: ffffb1590386fd30 R08: ffffffff85b2fcc2 R09: 000000000002b3c0
R10: ffff97a330618c40 R11: 00000000000005f6 R12: ffff97a3481beb40
R13: 00000000000000a0 R14: ffff97a3481beb40 R15: 0000000000000000
FS:  00007fb11a717540(0000) GS:ffff97a376cc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000a0 CR3: 00000004066d6006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 simple_recursive_removal+0x63/0x370
 ? debugfs_remove+0x60/0x60
 debugfs_remove+0x40/0x60
 amdgpu_ras_fini+0x82/0x230 [amdgpu]
 ? __kernfs_remove.part.17+0x101/0x1f0
 ? kernfs_name_hash+0x12/0x80
 amdgpu_device_fini+0x1c0/0x580 [amdgpu]
 amdgpu_driver_unload_kms+0x3e/0x70 [amdgpu]
 amdgpu_pci_remove+0x36/0x60 [amdgpu]
 pci_device_remove+0x3b/0xb0
 device_release_driver_internal+0xe5/0x1c0
 driver_detach+0x46/0x90
 bus_remove_driver+0x58/0xd0
 pci_unregister_driver+0x29/0x90
 amdgpu_exit+0x11/0x25 [amdgpu]
 __x64_sys_delete_module+0x13d/0x210
 do_syscall_64+0x5f/0x250
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2e2f5dd5

drm/amdgpu: revert "fix system hang issue during GPU reset" · f1403342

由 Christian König 提交于 8月 12, 2020

The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1a.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

f1403342

drm/amd/powerplay: enable swSMU mgpu fan boost support · 9f979a49

由 Evan Quan 提交于 8月 12, 2020

Enable mgpu fan boost feature on swSMU routines.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9f979a49

drm/amd/powerplay: optimize the interface for mgpu fan boost enablement · f10bb940

由 Evan Quan 提交于 8月 12, 2020

Cover the implementation details from outside(of power). Also preparing
for expanding this to swSMU.
Signed-off-by: NEvan Quan <evan.quan@amd.com>
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

f10bb940

drm/amdgpu: disable gfxoff for navy_flounder · ba4e049e

由 Jiansong Chen 提交于 8月 12, 2020

gfxoff is temporarily disabled for navy_flounder,
since at present the feature has broken some basic
amdgpu test.
Signed-off-by: NJiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: NTao Zhou <tao.zhou1@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

ba4e049e

drm/amdgpu: Use function pointer for some mmhub functions · 9fb1506e

由 Oak Zeng 提交于 8月 06, 2020

Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC. Simplify the code by deleting duplicate functions
Signed-off-by: NOak Zeng <Oak.Zeng@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9fb1506e

drm/amdgpu: pass NULL pointer instead of 0 · 2f530724

由 Nirmoy Das 提交于 8月 11, 2020

Fixes: c030f2e4 ("drm/amdgpu: add amdgpu_ras.c to support ras (v2)")
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NGuchun Chen <guchun.chen@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2f530724

drm/amdgpu: annotate a false positive recursive locking · 72e14ebf

由 Dennis Li 提交于 8月 06, 2020

[  584.110304] ============================================
[  584.110590] WARNING: possible recursive locking detected
[  584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G           OE
[  584.111164] --------------------------------------------
[  584.111456] kworker/38:1/553 is trying to acquire lock:
[  584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.112112]
               but task is already holding lock:
[  584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.113068]
               other info that might help us debug this:
[  584.113689]  Possible unsafe locking scenario:

[  584.114350]        CPU0
[  584.114685]        ----
[  584.115014]   lock(&adev->reset_sem);
[  584.115349]   lock(&adev->reset_sem);
[  584.115678]
                *** DEADLOCK ***

[  584.116624]  May be due to missing lock nesting notation

[  584.117284] 4 locks held by kworker/38:1/553:
[  584.117616]  #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630
[  584.117967]  #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630
[  584.118358]  #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu]
[  584.118786]  #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.119222]
               stack backtrace:
[  584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G           OE     5.6.0-deli-v5.6-2848-g3f3109b0e75f #1
[  584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[  584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[  584.121638] Call Trace:
[  584.122050]  dump_stack+0x98/0xd5
[  584.122499]  __lock_acquire+0x1139/0x16e0
[  584.122931]  ? trace_hardirqs_on+0x3b/0xf0
[  584.123358]  ? cancel_delayed_work+0xa6/0xc0
[  584.123771]  lock_acquire+0xb8/0x1c0
[  584.124197]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.124599]  down_write+0x49/0x120
[  584.125032]  ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125472]  amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu]
[  584.125910]  ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[  584.126367]  amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[  584.126789]  process_one_work+0x29e/0x630
[  584.127208]  worker_thread+0x3c/0x3f0
[  584.127621]  ? __kthread_parkme+0x61/0x90
[  584.128014]  kthread+0x12f/0x150
[  584.128402]  ? process_one_work+0x630/0x630
[  584.128790]  ? kthread_park+0x90/0x90
[  584.129174]  ret_from_fork+0x3a/0x50

Each adev has owned lock_class_key to avoid false positive
recursive locking.

v2:
1. register adev->lock_key into lockdep, otherwise lockdep will
report the below warning

[ 1216.705820] BUG: key ffff890183b647d0 has not been registered!
[ 1216.705924] ------------[ cut here ]------------
[ 1216.705972] DEBUG_LOCKS_WARN_ON(1)
[ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210

v3:
change to use down_write_nest_lock to annotate the false dead-lock
warning.
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

72e14ebf

drm/amdgpu: add debugfs interface for RAP test · a4322e18

由 Wenhui Sheng 提交于 8月 11, 2020

After amdgpu driver loading successfully, we can use
RAP debugfs interface <debugfs_dir>/dri/xxx/rap_test
to trigger RAP test.

Currently only L0 validate test is supported.

v2: refine amdgpu_rap.h
Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: NGuchun Chen <Guchun.Chen@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a4322e18

drm/amdgpu: enable RAP TA load · 8602692b

由 Wenhui Sheng 提交于 7月 17, 2020

Enable the RAP TA loading path and add RAP test
trigger interface.

v2: fix potential mem leak issue
Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: NGuchun Chen <Guchun.Chen@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

8602692b

drm/amdgpu: add RAP TA header file · a189d0ae

由 Wenhui Sheng 提交于 7月 23, 2020

The RAP TA contains tests used to verify if
RAP(Register Access Policy), or otherwise known
as Security Policy is applied correctly
by PSP BL&TOS.

The RAP test is a measure to ensure that we reduce
the avenue of complexity and mistakes when dealing
with RAP in post-si execution, where debugging failures
related to RAP is quite difficult and expensive.

v2: add introduction for RAP TA
Signed-off-by: NWenhui Sheng <Wenhui.Sheng@amd.com>
Reviewed-by: NGuchun Chen <Guchun.Chen@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a189d0ae

drm/amdgpu: reconfigure spm golden settings on Navi1x after GFXOFF exit(v3) · 425a78f4

由 Tianci.Yin 提交于 7月 20, 2020

On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfigure the golden settings after GFXOFF
exit.
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NTianci.Yin <tianci.yin@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

425a78f4

drm/amdgpu: add interface amdgpu_gfx_init_spm_golden for Navi1x · d58fe3cf

由 Tianci.Yin 提交于 6月 19, 2020

On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfiguration is needed. Make the
configuration code as an interface for future use.
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NFeifei Xu <Feifei.Xu@amd.com>
Signed-off-by: NTianci.Yin <tianci.yin@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d58fe3cf

drm/amdgpu: add debugfs node to toggle ras error cnt harvest · 66459e1d

由 Guchun Chen 提交于 8月 04, 2020

Before ras recovery is issued, user could operate this debugfs
node to enable/disable the harvest of all RAS IPs' ras error
count registers, which will help keep hardware's registers'
status instead of cleaning up them.
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

66459e1d

drm/amdgpu: bypass querying ras error count registers · f75e94d8

由 Guchun Chen 提交于 8月 04, 2020

Once ras recovery is issued by ras sync flood interrupt or
ras controller interrupt, add this guard to bypass or execute
ras error count register harvest of all IPs.
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

f75e94d8

11 8月, 2020 5 次提交

drm/amdgpu: Enable P2P dmabuf over XGMI · 0cf0ee98

由 Arunpravin 提交于 8月 06, 2020

Access the exported P2P dmabuf over XGMI, if available.
Otherwise, fall back to the existing PCIe method.
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NArunpravin <apaneers@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

0cf0ee98

drm/amdgpu: fix reload KMD hang on GFX10 KIQ · bcca6298

由 Monk Liu 提交于 8月 10, 2020

GFX10 KIQ will hang if we try below steps:
modprobe amdgpu
rmmod amdgpu
modprobe amdgpu sched_hw_submission=4

Due to KIQ is always living there even after KMD unloaded
thus when doing the realod KIQ will crash upon its register
being programed by different values with the previous loading
(the config like HQD addr, ring size, is easily changed if we alter
the sched_hw_submission)

the fix is we must inactive KIQ first before touching any
of its registgers
Signed-off-by: NMonk Liu <Monk.Liu@amd.com>
Reviewed-by: NEmily Deng <Emily.Deng@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

bcca6298

drm/amdgpu: update gc golden register for arcturus · 5a58abf5

由 shiwu.zhang 提交于 8月 07, 2020

Update golden setting to improve performance on HPC
and ML apps
Signed-off-by: Nshiwu.zhang <shiwu.zhang@amd.com>
Tested-by: Ngang.long <gang.long@amd.com>
Reviewed-by: Nguchun.chen <guchun.chen@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

5a58abf5

drm/amdgpu: Skip some registers config for SRIOV · 1d447326

由 Liu ChengZhe 提交于 8月 06, 2020

Some registers are not accessible to virtual function setup, so
skip their initialization when in VF-SRIOV mode.

v2: move SRIOV VF check into specify functions;
modify commit description and comment.
Signed-off-by: NLiu ChengZhe <ChengZhe.Liu@amd.com>
Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

1d447326

drm: amdgpu: Use the correct size when allocating memory · 5068ed57

由 Christophe JAILLET 提交于 8月 09, 2020

When '*sgt' is allocated, we must allocated 'sizeof(**sgt)' bytes instead
of 'sizeof(*sg)'.

The sizeof(*sg) is bigger than sizeof(**sgt) so this wastes memory but
it won't lead to corruption.

Fixes: f44ffd67 ("drm/amdgpu: add support for exporting VRAM using DMA-buf v3")
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

5068ed57

08 8月, 2020 1 次提交

drm/amdgpu: unlock mutex on error · 94561899

由 Dennis Li 提交于 8月 04, 2020

Make sure to unlock the mutex when error happen

v2:
1. correct syntax error in the commit comments
2. remove change-Id
Acked-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NLuben Tuikov <luben.tuikov@amd.com>
Signed-off-by: NDennis Li <Dennis.Li@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

94561899

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功