Commits · 1f1b4f343dda54233826e388709b908b6c394bcf · openeuler / Kernel

14 Jul, 2022 · 1 commit

drm/amdgpu: Fix recursive locking warning · 1f1b4f34

Authored by Rajneesh Bhardwaj on Jul 14, 2022

stable inclusion
from stable-v5.10.111
commit 6694b8643bde4b940f6b410507960793b922a77d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5GL1Z

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6694b8643bde4b940f6b410507960793b922a77d

--------------------------------

[ Upstream commit 447c7997 ]

Noticed the below warning while running a pytorch workload on vega10
GPUs. Change to trylock to avoid conflicts with already held reservation
locks.

[  +0.000003] WARNING: possible recursive locking detected
[  +0.000003] 5.13.0-kfd-rajneesh #1030 Not tainted
[  +0.000004] --------------------------------------------
[  +0.000002] python/4822 is trying to acquire lock:
[  +0.000004] ffff932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000203]
              but task is already holding lock:
[  +0.000003] ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.000017]
              other info that might help us debug this:
[  +0.000002]  Possible unsafe locking scenario:

[  +0.000003]        CPU0
[  +0.000002]        ----
[  +0.000002]   lock(reservation_ww_class_mutex);
[  +0.000004]   lock(reservation_ww_class_mutex);
[  +0.000003]
               *** DEADLOCK ***

[  +0.000002]  May be due to missing lock nesting notation

[  +0.000003] 7 locks held by python/4822:
[  +0.000003]  #0: ffff932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at:
kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu]
[  +0.000232]  #1: ffff932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu]
[  +0.000241]  #2: ffff932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu]
[  +0.000236]  #3: ffffb2b35606fd28
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu]
[  +0.000235]  #4: ffff932cbb7181f8
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.000015]  #5: ffffffffc045f700 (*(sspp++)){....}-{0:0}, at:
drm_dev_enter+0x5/0xa0 [drm]
[  +0.000038]  #6: ffff932c52da7078 (&vm->eviction_lock){+.+.}-{3:3},
at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu]
[  +0.000195]
              stack backtrace:
[  +0.000003] CPU: 11 PID: 4822 Comm: python Not tainted
5.13.0-kfd-rajneesh #1030
[  +0.000005] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02
08/29/2018
[  +0.000003] Call Trace:
[  +0.000003]  dump_stack+0x6d/0x89
[  +0.000010]  __lock_acquire+0xb93/0x1a90
[  +0.000009]  lock_acquire+0x25d/0x2d0
[  +0.000005]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  ? lock_is_held_type+0xa2/0x110
[  +0.000006]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  __ww_mutex_lock.constprop.17+0xca/0x1060
[  +0.000007]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  ? lock_release+0x13f/0x270
[  +0.000005]  ? lock_is_held_type+0xa2/0x110
[  +0.000006]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000185]  ttm_bo_release+0x4c6/0x580 [ttm]
[  +0.000010]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  +0.000183]  amdgpu_vm_free_table+0x76/0xa0 [amdgpu]
[  +0.000189]  amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu]
[  +0.000189]  amdgpu_vm_update_ptes+0x411/0x770 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update+0x251/0x610 [amdgpu]
[  +0.000191]  update_gpuvm_pte+0xcc/0x290 [amdgpu]
[  +0.000229]  ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu]
[  +0.000190]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60
[amdgpu]
[  +0.000234]  kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu]
[  +0.000218]  kfd_ioctl+0x2b9/0x600 [amdgpu]
[  +0.000216]  ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu]
[  +0.000216]  ? lock_release+0x13f/0x270
[  +0.000006]  ? __fget_files+0x107/0x1e0
[  +0.000007]  __x64_sys_ioctl+0x8b/0xd0
[  +0.000007]  do_syscall_64+0x36/0x70
[  +0.000004]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.000007] RIP: 0033:0x7fbff90a7317
[  +0.000004] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[  +0.000005] RSP: 002b:00007fbe301fe648 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  +0.000006] RAX: ffffffffffffffda RBX: 00007fbcc402d820 RCX:
00007fbff90a7317
[  +0.000003] RDX: 00007fbe301fe690 RSI: 00000000c0184b18 RDI:
0000000000000004
[  +0.000003] RBP: 00007fbe301fe690 R08: 0000000000000000 R09:
00007fbcc402d880
[  +0.000003] R10: 0000000002001000 R11: 0000000000000246 R12:
00000000c0184b18
[  +0.000003] R13: 0000000000000004 R14: 00007fbf689593a0 R15:
00007fbcc402d820

Cc: Christian König <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
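
The trylock idea described in this commit message can be illustrated with a small userspace sketch. This is not the kernel patch: `Buffer`, `release_notify`, and the fence list are hypothetical names, and a plain `threading.Lock` stands in for the reservation lock. The warning above involves two locks of the same class; the sketch simplifies this to a release hook that must never block on a lock that may already be held further up the call chain.

```python
import threading


class Buffer:
    """Hypothetical stand-in for a buffer object guarded by a reservation lock."""

    def __init__(self, name):
        self.name = name
        self.resv = threading.Lock()
        self.fences = ["eviction-fence"]


def release_notify(buf):
    """Run release-time cleanup without ever blocking on the lock.

    Returns True if cleanup ran, False if the lock was busy and
    cleanup was skipped.
    """
    # Non-blocking attempt: fail fast instead of waiting on a held lock.
    if not buf.resv.acquire(blocking=False):
        return False
    try:
        buf.fences.clear()
        return True
    finally:
        buf.resv.release()


free_buf = Buffer("page-table")
busy_buf = Buffer("mapped-bo")

with busy_buf.resv:
    # The lock is already held here, so the hook returns at once.
    skipped = release_notify(busy_buf)

cleaned = release_notify(free_buf)
```

The trade-off is that cleanup may be skipped when the lock is contended, so a caller using this pattern has to tolerate that outcome.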

1f1b4f34

21 Oct, 2021 · 1 commit

drm/amdgpu: Fix BUG_ON assert · 0e542c48

Authored by Andrey Grodzovsky on Oct 21, 2021

stable inclusion
from stable-5.10.67
commit 7b1abace16a9dff6804d4eb94750beb60d9502b4
bugzilla: 182619 https://gitee.com/openeuler/kernel/issues/I4EWO7

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7b1abace16a9dff6804d4eb94750beb60d9502b4

--------------------------------

commit ea7acd7c upstream.

With the CPU domain added to placement, you can now have
3 placements at once.

CC: stable@kernel.org
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-5-andrey.grodzovsky@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
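
The commit message does not spell out the assert condition, so the following is only one plausible reading: a bound check on the number of placements has to allow the array to be completely full once a third domain exists. The sketch below uses hypothetical domain bits and a hypothetical `MAX_PLACEMENTS` capacity to show a check that accepts exactly three placements.

```python
MAX_PLACEMENTS = 3  # hypothetical capacity of the placement array

# Hypothetical domain bits, for illustration only.
DOMAIN_CPU, DOMAIN_GTT, DOMAIN_VRAM = 0x1, 0x2, 0x4


def placements_from_domain(domain):
    """Collect one placement per domain bit set in `domain`."""
    placements = []
    for bit, name in ((DOMAIN_VRAM, "vram"), (DOMAIN_GTT, "gtt"), (DOMAIN_CPU, "cpu")):
        if domain & bit:
            placements.append(name)
    # A full array is legal; only exceeding the capacity is a bug.
    assert len(placements) <= MAX_PLACEMENTS
    return placements


all_three = placements_from_domain(DOMAIN_VRAM | DOMAIN_GTT | DOMAIN_CPU)
```

A stricter comparison (`<` instead of `<=`) would fire on the legitimate three-placement case, which is the kind of off-by-one this sketch is meant to illustrate.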

0e542c48

11 Sep, 2020 · 1 commit

drm/ttm: nuke memory type flags · 48e07c23

Authored by Christian König on Sep 10, 2020

It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.

Nuke the flags and specify directly which memory type to use.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
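
The design point can be sketched outside the kernel. In the hypothetical model below, `PlacementFlag` mimics a bitmask where only one bit is ever valid, and `MemType` names the memory type directly. None of these names or values come from TTM; they exist only to show why a plain value is a better fit than a flag when combinations are not allowed.

```python
from enum import IntEnum, IntFlag


class PlacementFlag(IntFlag):
    """Hypothetical old style: one bit per memory type."""
    SYSTEM = 1
    TT = 2
    VRAM = 4


class MemType(IntEnum):
    """Hypothetical new style: the memory type is named directly."""
    SYSTEM = 0
    TT = 1
    VRAM = 2


def mem_type_from_flags(flags):
    """Translate a flag mask to a memory type, rejecting combinations."""
    chosen = [f for f in (PlacementFlag.SYSTEM, PlacementFlag.TT, PlacementFlag.VRAM)
              if flags & f]
    if len(chosen) != 1:
        raise ValueError("exactly one memory type flag must be set")
    return MemType[chosen[0].name]
```

With the direct `MemType` value, the invalid "two types at once" state cannot be expressed at all, so the validation step disappears.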

48e07c23

25 Aug, 2020 · 1 commit

drm/amdgpu: Get DRM dev from adev by inline-f · 4a580877

Authored by Luben Tuikov on Aug 24, 2020

Add a static inline adev_to_drm() to obtain
the DRM device pointer from an amdgpu_device pointer.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a580877

18 Aug, 2020 · 1 commit

drm/amdgpu: fix amdgpu_bo_release_notify() comment error · 736b1729

Authored by Kevin Wang on Aug 17, 2020

Fix amdgpu_bo_release_notify() comment error.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

736b1729

12 Aug, 2020 · 1 commit

drm/ttm: give resource functions their own [ch] files · b2458726

Authored by Christian König on Aug 03, 2020

This is a separate object we work within TTM.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/384338/?series=80346&rev=1

b2458726

06 Aug, 2020 · 3 commits

drm/ttm: rename ttm_mem_reg to ttm_resource. · 2966141a

Authored by Dave Airlie on Aug 04, 2020

This name better reflects what the object does. I didn't rename
all the pointers; it seemed too messy.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-60-airlied@gmail.com

2966141a

drm/ttm: rename ttm_mem_type_manager -> ttm_resource_manager. · 9de59bc2

Authored by Dave Airlie on Aug 04, 2020

This name makes a lot more sense, since these are about managing
driver resources rather than just memory ranges.

Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-59-airlied@gmail.com

9de59bc2

drm/amdgfx/ttm: use wrapper to get ttm memory managers · 6c28aed6

Authored by Dave Airlie on Aug 04, 2020

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804025632.3868079-38-airlied@gmail.com

6c28aed6

05 Aug, 2020 · 1 commit

drm/amdgpu: handle bo size 0 in amdgpu_bo_create_kernel_at (v2) · 37912e96

Authored by Alex Deucher on Jul 28, 2020

Just return early to match other bo_create functions.

v2: check if the bo_ptr is NULL rather than checking the size.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

37912e96

25 Jun, 2020 · 1 commit

drm/amdgpu: move ttm bo->offset to amdgpu_bo · b1a8ef95

Authored by Nirmoy Das on Jun 24, 2020

The GPU address should belong to the driver, not to memory management.
This patch moves the ttm bo.offset and gpu_offset calculation to the amdgpu driver.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/372930/

b1a8ef95

29 Apr, 2020 · 1 commit

drm/amdgpu: expand amdgpu_copy_buffer interface with tmz parameter · c9dc9cfe

Authored by Aaron Liu on Oct 15, 2019

This patch expands the amdgpu_copy_buffer interface with a tmz parameter.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c9dc9cfe

11 Mar, 2020 · 1 commit

drm/amdgpu: Correct the condition of warning while bo release · 9fe58d0b

Authored by xinhui pan on Mar 09, 2020

Only kernel BOs have the KFD eviction fence.
This warning gives notice that KFD only removes the eviction fence on
individual BOs.

Tested-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9fe58d0b

27 Feb, 2020 · 3 commits

drm/amdgpu: implement amdgpu_gem_prime_move_notify v2 · a448cb00

Authored by Christian König on Jun 07, 2018

Implement the importer side of unpinned DMA-buf handling.

v2: update page tables immediately

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/353998/?series=73646&rev=1

a448cb00

drm/amdgpu: add amdgpu_dma_buf_pin/unpin v2 · 2d4dad27

Authored by Christian König on May 30, 2018

This implements the exporter side of unpinned DMA-buf handling.

v2: fix minor coding style issues

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/353999/?series=73646&rev=1

2d4dad27

drm/amdgpu: Remove kfd eviction fence before release bo (v2) · f4a3c42b

Authored by xinhui pan on Feb 11, 2020

No need to trigger eviction as the memory mapping will not be used
anymore.

All PT/PD BOs share the same resv, hence the same shared eviction fence.
Every time a page table is freed, the fence is signaled, and that causes
unexpected KFD evictions.

v2: squash in 32 bit fix

CC: Christian König <christian.koenig@amd.com>
CC: Felix Kuehling <felix.kuehling@amd.com>
CC: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f4a3c42b

05 Feb, 2020 · 2 commits

drm/amdgpu: rework synchronization of VM updates v4 · 9f3cc18d

Authored by Christian König on Jan 23, 2020

If provided, we only sync to the BO's reservation
object and no longer to the root PD.

v2: update comment, cleanup amdgpu_bo_sync_wait_resv
v3: use correct reservation object while clearing
v4: fix typo in amdgpu_bo_sync_wait_resv

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9f3cc18d

drm/amdgpu: rework job synchronization v2 · 5d319660

Authored by Christian König on Dec 16, 2019

For unlocked page table updates we need to be able
to sync to fences of a specific VM.

v2: use SYNC_ALWAYS in the UVD code

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5d319660

18 Oct, 2019 · 1 commit

drm/amdgpu: fix potential VM faults · 3122051e

Authored by Christian König on Sep 19, 2019

When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3122051e

17 Oct, 2019 · 1 commit

drm/ttm: rename ttm_fbdev_mmap · 12067e0e

Authored by Gerd Hoffmann on Oct 16, 2019

Rename ttm_fbdev_mmap to ttm_bo_mmap_obj.  Move the vm_pgoff sanity
check to amdgpu_bo_fbdev_mmap (only ttm_fbdev_mmap user in tree).

The ttm_bo_mmap_obj function can now be used to map any buffer object.
This allows implementing &drm_gem_object_funcs.mmap in gem ttm helpers.

v3: patch added to series

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20191016115203.20095-8-kraxel@redhat.com

12067e0e

16 Oct, 2019 · 1 commit

drm/amdgpu: fix potential VM faults · b2c18f0a

Authored by Christian König on Sep 19, 2019

When we allocate new page tables under memory
pressure we should not evict old ones.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b2c18f0a

03 Oct, 2019 · 1 commit

drm/amdgpu: once more fix amdgpu_bo_create_kernel_at · 4a246528

Authored by Christian König on Sep 24, 2019

When CPU access is needed we should tell that to
amdgpu_bo_create_reserved() or otherwise the access is denied later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a246528

17 Sep, 2019 · 1 commit

drm/amdgpu: cleanup creating BOs at fixed location (v2) · de7b45ba

Authored by Christian König on Sep 13, 2019

The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.

Add a helper to create a BO at a fixed location and use that instead.

v2: squash in fixes (Alex)

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

de7b45ba

16 Sep, 2019 · 1 commit

drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page fault · 061468c4

Authored by Christian König on Sep 16, 2019

While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

061468c4

22 Aug, 2019 · 1 commit

drm/amdgpu/psp: move TMR to cpu invisible vram region · 828d6fde

Authored by Tianci.Yin on Aug 19, 2019

so that more visible vram can be available for umd.

Reviewed-by: Christian König <christian.koenig@amd.com>.
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

828d6fde

13 Aug, 2019 · 1 commit

dma-buf: rename reservation_object to dma_resv · 52791eee

Authored by Christian König on Aug 11, 2019

Be more consistent with the naming of the other DMA-buf objects.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/323401/

52791eee

06 Aug, 2019 · 2 commits

drm/amdgpu: switch driver from bo->resv to bo->base.resv · 5a5011a7

Authored by Gerd Hoffmann on Aug 05, 2019

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-14-kraxel@redhat.com

5a5011a7

drm/amdgpu: use embedded gem object · c105de28

Authored by Gerd Hoffmann on Aug 05, 2019

Drop drm_gem_object from amdgpu_bo, use the
ttm_buffer_object.base instead.

Build tested only.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-6-kraxel@redhat.com

c105de28

05 Aug, 2019 · 1 commit

dma-buf: add more reservation object locking wrappers · 0dbd555a

Authored by Christian König on Jul 31, 2019

Complete the abstraction of the ww_mutex inside the reservation object.

This allows us to add more handling and debugging to the reservation
object in the future.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/320761/

0dbd555a

02 Aug, 2019 · 1 commit

drm/amdgpu: Implement VRAM wipe on release · ab2f7a5c

Authored by Felix Kuehling on Jul 09, 2019

Wipe VRAM memory containing sensitive data when moving or releasing
BOs. Clearing the memory is pipelined to minimize any impact on
subsequent memory allocation latency. Use of a poison value should
help debug future use-after-free bugs.

When moving BOs, the existing ttm_bo_pipelined_move ensures that the
memory won't be reused before being wiped.

When releasing BOs, the BO is fenced with the memory fill operation,
which results in queuing the BO for a delayed delete.

v2: Move amdgpu_amdkfd_unreserve_memory_limit into
amdgpu_bo_release_notify so that KFD can use memory that's still
being cleared in the background

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ab2f7a5c

31 Jul, 2019 · 2 commits

drm/amdgpu: Fill out gem_object->resv · b2ad978f

Authored by Daniel Vetter on Jul 25, 2019

That way we can ditch our gem_prime_res_obj implementation. Since ttm
absolutely needs the right reservation object all the boilerplate is
already there and we just have to wire it up correctly.

Note that gem/prime doesn't care when we do this, as long as we do it
before the bo is registered and someone can call the handle2fd ioctl
on it.

Aside: ttm_buffer_object.ttm_resv could probably be ditched in favour
of always passing a non-NULL resv to ttm_bo_init(). At least for gem
drivers that would avoid having two of these, one in ttm_buffer_object
and the other in drm_gem_object, one just there for confusion.

Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Michel Dänzer" <michel.daenzer@amd.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Amber Lin <Amber.Lin@amd.com>
Cc: "Michał Mirosław" <mirq-linux@rere.qmqm.pl>
Cc: Junwei Zhang <Jerry.Zhang@amd.com>
Cc: Thomas Zimmermann <contact@tzimmermann.org>
Cc: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190725132655.11951-4-daniel.vetter@ffwll.ch

b2ad978f

drm/amdgpu: Create helper to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC · 3d1b8ec7

Authored by Andrey Grodzovsky on Jul 24, 2019

Move the logic to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC in
amdgpu_bo_do_create into a standalone helper so it can be reused
in other functions.

v4:
Switch to return bool.

v5: Fix typos.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3d1b8ec7

21 Jun, 2019 · 1 commit

drm/amdgpu: Add GDDR6 in vram_name arrary · 5228fe30

Authored by Hawking Zhang on May 02, 2018

For printing vram type.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <Tao.Zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5228fe30

12 Jun, 2019 · 1 commit

drm/amdgpu: create GDS, GWS and OA in system domain · 94de7349

Authored by Christian König on May 16, 2019

And only move them in on validation. This allows for better control
when multiple processes are fighting over those resources.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

94de7349

11 Jun, 2019 · 1 commit

drm/amd: drop use of drmP.h in amdgpu/amdgpu* · fdf2f6c5

Authored by Sam Ravnborg on Jun 10, 2019

Drop use of drmP.h in all files named amdgpu*
in drm/amd/amdgpu/

Fix fallout.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-10-sam@ravnborg.org

fdf2f6c5

20 Apr, 2019 · 1 commit

drm/amdgpu: amdgpu_device_recover_vram got NULL of shadow->parent · 36e499b2

Authored by wentalou on Apr 16, 2019

amdgpu_bo_destroy had a bug: it called amdgpu_bo_unref outside mutex_lock.
If amdgpu_device_recover_vram ran between amdgpu_bo_unref and list_del_init,
it would see a NULL shadow->parent, which caused a call trace and a failed GPU reset.

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

36e499b2

14 Feb, 2019 · 1 commit

drm/amdgpu: Add helper to wait for BO fences using a sync object · e8e32426

Authored by Felix Kuehling on Feb 04, 2019

Creates a temporary sync object to wait for the BO reservation. This
generalizes amdgpu_vm_wait_pd.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e8e32426

01 Feb, 2019 · 1 commit

drm/amdgpu: clean up memory/GDS/GWS/OA alignment code · fe57085a

Authored by Marek Olšák on Jan 22, 2019

- move all adjustments into one place
- specify GDS/GWS/OA alignment in basic units of the heaps
- it looks like GDS alignment was 1 instead of 4

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fe57085a

15 Dec, 2018 · 1 commit

drm/amdgpu: WARN once if amdgpu_bo_unpin is called for an unpinned BO · a3a0ebd1

Authored by Michel Dänzer on Dec 13, 2018

It indicates a pin/unpin imbalance bug somewhere. While the bug isn't
necessarily in the call chain hitting this, it's at least one part
involved.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
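
A pin/unpin imbalance check of this kind can be sketched with a simple counter. The class and method names below are hypothetical and the warning uses Python's `warnings` module rather than the kernel's WARN machinery; the point is only that an unpin on a zero count is reported and otherwise ignored, instead of driving the counter negative.

```python
import warnings


class PinnedBuffer:
    """Hypothetical buffer with a pin counter."""

    def __init__(self):
        self.pin_count = 0

    def pin(self):
        self.pin_count += 1

    def unpin(self):
        """Drop one pin; warn and do nothing if the buffer is not pinned."""
        if self.pin_count == 0:
            warnings.warn("unpin called on an unpinned buffer",
                          RuntimeWarning, stacklevel=2)
            return False
        self.pin_count -= 1
        return True
```

As the commit message notes, the caller that trips the warning is not necessarily the one that caused the imbalance; the check only shows where the mismatch surfaced.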

a3a0ebd1

08 Dec, 2018 · 1 commit

drm/amdgpu: Add KFD VRAM limit checking · 611736d8

Authored by Felix Kuehling on Nov 19, 2018

We don't want KFD processes evicting each other over VRAM usage.
Therefore prevent overcommitting VRAM among KFD applications with
a per-GPU limit. Also leave enough room for page tables on top
of the application memory usage.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
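
The accounting described here can be sketched as a per-GPU budget that is checked before each allocation. Everything in the block below is an assumption made for illustration: the class name, the byte-based interface, and the fixed `page_table_reserve` are hypothetical, and the commit message does not say how the page table headroom is actually computed.

```python
class VramBudget:
    """Hypothetical per-GPU VRAM accounting with headroom for page tables."""

    def __init__(self, vram_size, page_table_reserve):
        # Application memory may use whatever is left after the reserve.
        self.limit = vram_size - page_table_reserve
        self.used = 0

    def reserve(self, size):
        """Account for an allocation; refuse it if it would overcommit."""
        if self.used + size > self.limit:
            return False
        self.used += size
        return True

    def unreserve(self, size):
        """Return previously reserved memory to the budget."""
        self.used = max(0, self.used - size)


budget = VramBudget(vram_size=1024, page_table_reserve=64)
first = budget.reserve(900)    # fits within the 960-byte limit
second = budget.reserve(100)   # would exceed the limit, so it is refused
```

Refusing the second request up front is what keeps one process from forcing evictions on another later.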

611736d8

openeuler / Kernel · synced successfully about 2 years ago