Commits · 47bdd1db16e67ebfde6f77eaf7625b2292ae6d58 · openeuler / Kernel

22 Jun, 2021 4 commits

drm/amdgpu: wait for moving fence after pinning · 8ddf5b9b

Authored by Christian König on Jun 21, 2021

We actually need to wait for the moving fence after pinning
the BO to make sure that the pin is completed.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: https://lore.kernel.org/dri-devel/20210621151758.2347474-1-daniel.vetter@ffwll.ch/
CC: stable@kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20210622114506.106349-3-christian.koenig@amd.com

Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue." · ee5468b9

Authored by Yifan Zhang on Jun 19, 2021

This reverts commit 4cbbe348.

Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs to fail to enter gfxoff in certain use cases.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell." · baacf52a

Authored by Yifan Zhang on Jun 19, 2021

This reverts commit 1c0b0efd.

Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may
cause some APUs to fail to enter gfxoff in certain use cases.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: Call drm_framebuffer_init last for framebuffer init · 4c6a2318

Authored by Michel Dänzer on Jun 16, 2021

Once drm_framebuffer_init has returned 0, the framebuffer is hooked up
to the reference counting machinery and can no longer be destroyed with
a simple kfree. Therefore, it must be called last.

If drm_framebuffer_init returns 0 but its caller then returns non-0,
there will likely be memory corruption fireworks down the road.
The following led me to this fix:

[   12.891228] kernel BUG at lib/list_debug.c:25!
[...]
[   12.891263] RIP: 0010:__list_add_valid+0x4b/0x70
[...]
[   12.891324] Call Trace:
[   12.891330]  drm_framebuffer_init+0xb5/0x100 [drm]
[   12.891378]  amdgpu_display_gem_fb_verify_and_init+0x47/0x120 [amdgpu]
[   12.891592]  ? amdgpu_display_user_framebuffer_create+0x10d/0x1f0 [amdgpu]
[   12.891794]  amdgpu_display_user_framebuffer_create+0x126/0x1f0 [amdgpu]
[   12.891995]  drm_internal_framebuffer_create+0x378/0x3f0 [drm]
[   12.892036]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892075]  drm_mode_addfb2+0x34/0xd0 [drm]
[   12.892115]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892153]  drm_ioctl_kernel+0xe2/0x150 [drm]
[   12.892193]  drm_ioctl+0x3da/0x460 [drm]
[   12.892232]  ? drm_internal_framebuffer_create+0x3f0/0x3f0 [drm]
[   12.892274]  amdgpu_drm_ioctl+0x43/0x80 [amdgpu]
[   12.892475]  __se_sys_ioctl+0x72/0xc0
[   12.892483]  do_syscall_64+0x33/0x40
[   12.892491]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: f258907f "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
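
The ordering rule described in this commit message (run every fallible step first, publish the object last) can be illustrated with a small user-space sketch. All names here (`struct fb`, `fb_verify`, `fb_register`, `fb_create`) are hypothetical stand-ins, not the DRM or amdgpu API, and `fb_register` only imitates the role that drm_framebuffer_init plays.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the amdgpu or DRM API. */
struct fb {
    int registered;   /* set once the object has been published */
    int width;
};

static int registered_count;

/* Stand-in for a fallible validation step (for example a size check). */
static int fb_verify(const struct fb *fb)
{
    return fb->width > 0 ? 0 : -1;
}

/* Stand-in for the publishing step: once this succeeds, the object is
 * owned by shared machinery and must not be released with plain free(). */
static int fb_register(struct fb *fb)
{
    fb->registered = 1;
    registered_count++;
    return 0;
}

/* Every step that can fail runs before registration, so each error path
 * still owns the object exclusively and can simply free() it. */
struct fb *fb_create(int width)
{
    struct fb *fb = calloc(1, sizeof(*fb));

    if (!fb)
        return NULL;
    fb->width = width;

    if (fb_verify(fb) != 0) {
        free(fb);               /* safe: not published yet */
        return NULL;
    }
    if (fb_register(fb) != 0) {
        free(fb);               /* safe: publishing itself failed */
        return NULL;
    }
    return fb;                  /* nothing can fail after this point */
}
```

With this ordering, a rejected request such as `fb_create(0)` never reaches the registration step, so no half-initialized object is left visible to other code.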

17 Jun, 2021 2 commits

drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell. · 1c0b0efd

Authored by Yifan Zhang on Jun 10, 2021

If GC has entered CGPG, ringing doorbell > first page doesn't wake up GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to work around this issue.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue. · 4cbbe348

Authored by Yifan Zhang on Jun 10, 2021

If GC has entered CGPG, ringing doorbell > first page doesn't wake up GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to work around this issue.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

09 Jun, 2021 3 commits

drm/amdgpu: Fix incorrect register offsets for Sienna Cichlid · c247c021

Authored by Rohit Khaire on Jun 04, 2021

RLC_CP_SCHEDULERS and RLC_SPARE_INT0 have different
offsets for Sienna Cichlid.
Signed-off-by: Rohit Khaire <rohit.khaire@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Use drm_dbg_kms for reporting failure to get a GEM FB · b71a52f4

Authored by Michel Dänzer on Jun 02, 2021

drm_err meant broken user space could spam dmesg.

Fixes: f258907f "drm/amdgpu: Verify bo size can fit framebuffer size on init."
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create · 2a48b591

Authored by Changfeng on Jun 02, 2021

Allocating more than 128KB through kzalloc in amdgpu_bo_create
causes an error, so switch from kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]
Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

03 Jun, 2021 5 commits

drm/amdgpu: make sure we unpin the UVD BO · 07438603

Authored by Nirmoy Das on May 28, 2021

Releasing pinned BOs is illegal now. UVD 6 was missing from:
commit 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")

Fixes: 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu:save psp ring wptr to avoid attack · 2370eba9

Authored by Victor Zhao on Mar 18, 2021

[Why]
When some tools perform a psp mailbox attack, the readback value
of the register can be a random value, which may break psp.

[How]
Use a psp wptr cache mechanism to avoid the change made by the attack.

v2: unify change and add detailed reason
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Don't query CE and UE errors · dce3d8e1

Authored by Luben Tuikov on May 12, 2021

On QUERY2 IOCTL don't query counts of correctable
and uncorrectable errors, since when RAS is
enabled and supported on Vega20 server boards,
this takes an insurmountably long time, in O(n^3),
which slows the system down to the point of it
being unusable when we have a GUI up.

Fixes: ae363a21 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: refine amdgpu_fru_get_product_info · 5cfc9125

Authored by Jiansong Chen on May 25, 2021

1. Eliminate a potential out-of-bounds array index.
2. Return a meaningful value on failure.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add judgement for dc support · 147feb00

Authored by Asher Song on May 21, 2021

Drop DC initialization when DCN is harvested in VBIOS. This
doesn't affect virtual display IP initialization.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Asher Song <Asher.Song@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21 May, 2021 8 commits

drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate · 20ebbfd2

Authored by James Zhu on May 19, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate · 23f10a57

Authored by James Zhu on May 19, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate · ff48f6db

Authored by James Zhu on May 19, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate · 4a62542a

Authored by James Zhu on May 17, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate · 2fb536ea

Authored by James Zhu on May 19, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate · 0c601337

Authored by James Zhu on May 19, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate · b95f045e

Authored by James Zhu on May 18, 2021

Add cancel_delayed_work_sync before setting the power gating state
to avoid a race condition when power gating.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error · ba515a58

Authored by Kevin Wang on May 19, 2021

1. Correct KFD SDMA RLC queue register offset error.
(all SDMA RLC register offsets are based on SDMA0.RLC0_RLC0_RB_CNTL)
2. HQD_N_REGS (19+6+7+12)
  12: the 2 more registers than navi1x (SDMAx_RLCy_MIDCMD_DATA{9,10})

This patch also fixes a NULL pointer issue when reading
/sys/kernel/debug/kfd/hqds on sienna_cichlid chips.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

20 May, 2021 9 commits

drm/amdgpu: stop touching sched.ready in the backend · a2b4785f

Authored by Christian König on May 18, 2021

This unfortunately comes up at regular intervals and breaks
GPU reset for the engine in question.

The sched.ready flag controls if an engine can't get working
during hw_init, but should never be set to false during hw_fini.

v2: squash in unused variable fix (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: fix a potential deadlock in gpu reset · 9c2876d5

Authored by Lang Yu on May 17, 2021

When amdgpu_ib_ring_tests failed, the reset logic called
amdgpu_device_ip_suspend twice, and a deadlock occurred.
Deadlock log:

[  805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110).
[  806.290952] [drm] free PSP TMR buffer

[  806.319406] ============================================
[  806.320315] WARNING: possible recursive locking detected
[  806.321225] 5.11.0-custom #1 Tainted: G        W  OEL
[  806.322135] --------------------------------------------
[  806.323043] cat/2593 is trying to acquire lock:
[  806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.325668]
               but task is already holding lock:
[  806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.328430]
               other info that might help us debug this:
[  806.329539]  Possible unsafe locking scenario:

[  806.330549]        CPU0
[  806.330983]        ----
[  806.331416]   lock(&adev->dm.dc_lock);
[  806.332086]   lock(&adev->dm.dc_lock);
[  806.332738]
                *** DEADLOCK ***

[  806.333747]  May be due to missing lock nesting notation

[  806.334899] 3 locks held by cat/2593:
[  806.335537]  #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110
[  806.337009]  #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu]
[  806.339018]  #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.340869]
               stack backtrace:
[  806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G        W  OEL    5.11.0-custom #1
[  806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020
[  806.344413] Call Trace:
[  806.344849]  dump_stack+0x93/0xbd
[  806.345435]  __lock_acquire.cold+0x18a/0x2cf
[  806.346179]  lock_acquire+0xca/0x390
[  806.346807]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.347813]  __mutex_lock+0x9b/0x930
[  806.348454]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.349434]  ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu]
[  806.350581]  ? _raw_spin_unlock_irqrestore+0x47/0x50
[  806.351437]  ? dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.352437]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.353252]  ? rcu_read_lock_sched_held+0x4f/0x80
[  806.354064]  mutex_lock_nested+0x1b/0x20
[  806.354747]  ? mutex_lock_nested+0x1b/0x20
[  806.355457]  dm_suspend+0xb8/0x1d0 [amdgpu]
[  806.356427]  ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu]
[  806.357736]  amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu]
[  806.360394]  amdgpu_device_ip_suspend+0x21/0x70 [amdgpu]
[  806.362926]  amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu]
[  806.365560]  amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu]
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update sdma golden setting for Navi12 · 77194d86

Authored by Guchun Chen on May 17, 2021

Current golden setting is out of date.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: update gc golden setting for Navi12 · 99c45ba5

Authored by Guchun Chen on May 17, 2021

Current golden setting is out of date.
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: Fix a use-after-free · 1e5c3738

Authored by xinhui pan on May 18, 2021

Looks like we forgot to set ttm->sg to NULL.
Hit the panic below:

[ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[ 1235.989074] Call Trace:
[ 1235.991751]  sg_free_table+0x17/0x20
[ 1235.995667]  amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu]
[ 1236.002288]  amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu]
[ 1236.008464]  ttm_tt_destroy+0x1e/0x30 [ttm]
[ 1236.013066]  ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm]
[ 1236.018783]  ttm_bo_release+0x262/0xa50 [ttm]
[ 1236.023547]  ttm_bo_put+0x82/0xd0 [ttm]
[ 1236.027766]  amdgpu_bo_unref+0x26/0x50 [amdgpu]
[ 1236.032809]  amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu]
[ 1236.040400]  kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
[ 1236.046912]  kfd_ioctl+0x463/0x690 [amdgpu]
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
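
The general pattern behind this kind of fix (clear a pointer right after freeing what it points to, so a later teardown path cannot free it again) can be sketched in user-space C. This is only an illustration under assumptions: `struct backend`, `backend_unbind` and `backend_destroy` are hypothetical names, and the sketch does not reproduce the actual amdgpu TTM code or the exact location of the fix.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a backend object that may own a table. */
struct backend {
    int *sg;   /* owned table, or NULL when nothing is attached */
};

/* Counts real frees so the behavior can be checked from outside. */
int free_calls;

/* Release the table and forget the pointer. Without the final
 * assignment, a second call would pass a stale pointer to free(). */
void backend_unbind(struct backend *b)
{
    if (!b->sg)
        return;
    free(b->sg);
    free_calls++;
    b->sg = NULL;   /* the step the commit message says was forgotten */
}

/* Teardown goes through the same path, so calling it after an explicit
 * unbind is harmless. */
void backend_destroy(struct backend *b)
{
    backend_unbind(b);
}
```

In this sketch, calling `backend_unbind` and then `backend_destroy` on the same object frees the table exactly once.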

drm/amdgpu: add video_codecs query support for aldebaran · ab95cb3e

Authored by James Zhu on May 18, 2021

Add video_codecs query support for aldebaran.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/amdgpu: fix refcount leak · fa7e6abc

Authored by Jingwen Chen on May 17, 2021

[Why]
The GEM object rfb->base.obj[0] is taken (get) according to num_planes
in amdgpufb_create, but is not released (put) according to num_planes.

[How]
Put rfb->base.obj[0] in amdgpu_fbdev_destroy according to num_planes.
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang · dbd1003d

Authored by Changfeng on May 14, 2021

There is a problem with the 3DCGCG firmware that causes compute tests
to hang on picasso/raven1. Disable 3DCGCG in the driver to avoid the
compute hang.
Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: Fix GPU TLB update error when PAGE_SIZE > AMDGPU_PAGE_SIZE · d5375156

Authored by Yi Li on May 14, 2021

When PAGE_SIZE is larger than AMDGPU_PAGE_SIZE, the number of GPU TLB
entries that need to be updated in amdgpu_map_buffer() should be multiplied
by AMDGPU_GPU_PAGES_IN_CPU_PAGE (PAGE_SIZE / AMDGPU_PAGE_SIZE).
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yi Li <liyi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
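
The arithmetic in this commit message can be worked through with a short sketch. It assumes a 4 KiB GPU page size (the constant `GPU_PAGE_SIZE` below is an assumption made for the example), and the helper names are hypothetical rather than taken from the driver.

```c
#include <stdint.h>

#define GPU_PAGE_SIZE 4096u   /* assumed GPU page size for this sketch */

/* Number of GPU pages covered by one CPU page
 * (PAGE_SIZE / AMDGPU_PAGE_SIZE in the commit message). */
static uint64_t gpu_pages_in_cpu_page(uint64_t cpu_page_size)
{
    return cpu_page_size / GPU_PAGE_SIZE;
}

/* GPU TLB entries to update for a buffer spanning num_cpu_pages CPU
 * pages: each CPU page maps to several GPU pages when the CPU page
 * size is larger than the GPU page size. */
uint64_t gpu_entries_for(uint64_t num_cpu_pages, uint64_t cpu_page_size)
{
    return num_cpu_pages * gpu_pages_in_cpu_page(cpu_page_size);
}
```

With 4 KiB CPU pages the factor is 1 and nothing changes; with 16 KiB CPU pages, a buffer of 8 CPU pages needs 8 * 4 = 32 GPU entries, which is the case the unmultiplied count gets wrong.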

13 May, 2021 4 commits

drm/amdgpu: update vcn1.0 Non-DPG suspend sequence · 5c1efb5f

Authored by Sathishkumar S on May 03, 2021

Update suspend register settings in Non-DPG mode.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: set vcn mgcg flag for picasso · 3666f83a

Authored by Sathishkumar S on May 03, 2021

Enable vcn mgcg flag for picasso.
Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update the method for harvest IP for specific SKU · 5c1a3768

Authored by Likun Gao on May 07, 2021

Update the method of disabling the VCN IP for specific SKUs of navi1x
ASICs: amdgpu_device_ip_block_add() now decides whether the related IP
should be added.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add judgement when add ip blocks (v2) · 83a0b863

Authored by Likun GAO on Apr 29, 2021

Decide whether to add a SW IP according to the harvest info.

v2: fix indentation (Alex)
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06 May, 2021 2 commits

drm/amdgpu: Use device specific BO size & stride check. · 234055fd

Authored by Bas Nieuwenhuizen on May 04, 2021

The builtin size check isn't really the right thing for AMD
modifiers due to a couple of reasons:

1) In the format structs we don't set any of the tilesize / blocks
etc. to avoid having format arrays per modifier/GPU
2) The pitch on the main plane is pixel_pitch * bytes_per_pixel even
for tiled ...
3) The pitch for the DCC planes is really the pixel pitch of the main
surface that would be covered by it ...

Note that we only handle GFX9+ case but we do this after converting
the implicit modifier to an explicit modifier, so on GFX9+ all
framebuffers should be checked here.

There is a TODO about DCC alignment, but it isn't worse than before
and I'd need to dig a bunch into the specifics. Getting this out in
a reasonable timeframe to make sure it gets the appropriate testing
seemed more important.

Finally as I've found that debugging addfb2 failures is a pita I was
generous adding explicit error messages to every failure case.

Fixes: f258907f ("drm/amdgpu: Verify bo size can fit framebuffer size on init.")
Tested-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Init GFX10_ADDR_CONFIG for VCN v3 in DPG mode. · 8bf073ca

Authored by Bas Nieuwenhuizen on May 05, 2021

Otherwise tiling modes that require the values from this field
(in particular _*_X) would be corrupted upon video decode.

Copied from the VCN v2 code.

Fixes: 99541f39 ("drm/amdgpu: add mc resume DPG mode for VCN3.0")
Reviewed-and-Tested by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

05 May, 2021 1 commit

drm/amdgpu: add new MC firmware for Polaris12 32bit ASIC · c83c4e19

Authored by Evan Quan on Apr 28, 2021

Polaris12 32bit ASIC needs a special MC firmware.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

29 Apr, 2021 2 commits

amdgpu: fix GEM obj leak in amdgpu_display_user_framebuffer_create · e0c16eb4

Authored by Simon Ser on Apr 21, 2021

This error code-path is missing a drm_gem_object_put call. Other
error code-paths are fine.
Signed-off-by: Simon Ser <contact@emersion.fr>
Fixes: 1769152a ("drm/amdgpu: Fail fb creation from imported dma-bufs. (v2)")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: Register VGA clients after init can no longer fail · 8c3dd61c

Authored by Kai-Heng Feng on Apr 26, 2021

When an amdgpu device fails to init, it causes another VGA device to
trigger a kernel splat:
kernel: amdgpu 0000:08:00.0: amdgpu: amdgpu_device_ip_init failed
kernel: amdgpu 0000:08:00.0: amdgpu: Fatal error during GPU init
kernel: amdgpu: probe of 0000:08:00.0 failed with error -110
...
kernel: amdgpu 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
kernel: BUG: kernel NULL pointer dereference, address: 0000000000000018
kernel: #PF: supervisor read access in kernel mode
kernel: #PF: error_code(0x0000) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0000 [#1] SMP NOPTI
kernel: CPU: 6 PID: 1080 Comm: Xorg Tainted: G W 5.12.0-rc8+ #12
kernel: Hardware name: HP HP EliteDesk 805 G6/872B, BIOS S09 Ver. 02.02.00 12/30/2020
kernel: RIP: 0010:amdgpu_device_vga_set_decode+0x13/0x30 [amdgpu]
kernel: Code: 06 31 c0 c3 b8 ea ff ff ff 5d c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 8b 87 90 06 00 00 48 89 e5 53 89 f3 <48> 8b 40 18 40 0f b6 f6 e8 40 58 39 fd 80 fb 01 5b 5d 19 c0 83 e0
kernel: RSP: 0018:ffffae3c0246bd68 EFLAGS: 00010002
kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
kernel: RDX: ffff8dd1af5a8560 RSI: 0000000000000000 RDI: ffff8dce8c160000
kernel: RBP: ffffae3c0246bd70 R08: ffff8dd1af5985c0 R09: ffffae3c0246ba38
kernel: R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000246
kernel: R13: 0000000000000000 R14: 0000000000000003 R15: ffff8dce81490000
kernel: FS: 00007f9303d8fa40(0000) GS:ffff8dd1af580000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000000018 CR3: 0000000103cfa000 CR4: 0000000000350ee0
kernel: Call Trace:
kernel: vga_arbiter_notify_clients.part.0+0x4a/0x80
kernel: vga_get+0x17f/0x1c0
kernel: vga_arb_write+0x121/0x6a0
kernel: ? apparmor_file_permission+0x1c/0x20
kernel: ? security_file_permission+0x30/0x180
kernel: vfs_write+0xca/0x280
kernel: ksys_write+0x67/0xe0
kernel: __x64_sys_write+0x1a/0x20
kernel: do_syscall_64+0x38/0x90
kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f93041e02f7
kernel: Code: 75 05 48 83 c4 58 c3 e8 f7 33 ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
kernel: RSP: 002b:00007fff60e49b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
kernel: RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007f93041e02f7
kernel: RDX: 000000000000000b RSI: 00007fff60e49b40 RDI: 000000000000000f
kernel: RBP: 00007fff60e49b40 R08: 00000000ffffffff R09: 00007fff60e499d0
kernel: R10: 00007f93049350b5 R11: 0000000000000246 R12: 000056111d45e808
kernel: R13: 0000000000000000 R14: 000056111d45e7f8 R15: 000056111d46c980
kernel: Modules linked in: nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_seq input_leds snd_seq_device snd_timer snd soundcore joydev kvm_amd serio_raw k10temp mac_hid hp_wmi ccp kvm sparse_keymap wmi_bmof ucsi_acpi efi_pstore typec_ucsi rapl typec video wmi sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched i2c_algo_bit drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt crc32_pclmul fb_sys_fops ghash_clmulni_intel cec rc_core aesni_intel crypto_simd psmouse cryptd r8169 i2c_piix4 drm ahci xhci_pci realtek libahci xhci_pci_renesas gpio_amdpt gpio_generic
kernel: CR2: 0000000000000018
kernel: ---[ end trace 76d04313d4214c51 ]---

Commit 4192f7b5 ("drm/amdgpu: unmap register bar on device init
failure") makes amdgpu_driver_unload_kms() skip amdgpu_device_fini(),
so the VGA clients remain registered. When
vga_arbiter_notify_clients() then iterates over the registered clients,
it causes a NULL pointer dereference.

Since there's no reason to register VGA clients that early, solve
the issue by putting them after all the goto cleanups.

v2:
- Remove redundant vga_switcheroo cleanup in failed: label.

Fixes: 4192f7b5 ("drm/amdgpu: unmap register bar on device init failure")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel synced successfully almost 2 years ago