提交 · 5504eb164eecdcc1fcb7d7a3d05c29b3bbbcfa78 · openeuler / Kernel

15 12月, 2022 3 次提交

drm/amdgpu: revert "generally allow over-commit during BO allocation" · 47722220

由 Christian König 提交于 12月 12, 2022

This reverts commit f9d00a4a.

This causes problem for KFD because when we overcommit we accidentially
bind the BO to GTT for moving it into VRAM. We also need to make sure
that this is done only as fallback after trying to evict first.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

47722220

drm/amdgpu: Remove unnecessary domain argument · 3273f116

由 Luben Tuikov 提交于 12月 14, 2022

Remove the "domain" argument to amdgpu_bo_create_kernel_at() since this
function takes an "offset" argument which is the offset off of VRAM, and as
such allocation always takes place in VRAM. Thus, the "domain" argument is
unnecessary.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

3273f116

drm/amdgpu: Fix size validation for non-exclusive domains (v4) · 7554886d

由 Luben Tuikov 提交于 12月 10, 2022

Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the
requested memory exists, else we get a kernel oops when dereferencing "man".

v2: Make the patch standalone, i.e. not dependent on local patches.
v3: Preserve old behaviour and just check that the manager pointer is not
    NULL.
v4: Complain if GTT domain requested and it is uninitialized--most likely a
    bug.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: AMD Graphics <amd-gfx@lists.freedesktop.org>
Signed-off-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

7554886d

14 12月, 2022 1 次提交

drm/amdgpu: WARN when freeing kernel memory during suspend · 4d2ccd96

由 Christian König 提交于 11月 16, 2022

When buffers are freed during suspend there is no guarantee that
they can be re-allocated during resume.

The PSP subsystem seems to be quite buggy regarding this, so add
a WARN_ON() to point out those bugs.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexdeucher@amd.com>
Tested-by: NGuilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4d2ccd96

10 12月, 2022 1 次提交

drm/amdgpu: make display pinning more flexible (v2) · 81d0bcf9

由 Alex Deucher 提交于 12月 07, 2022

Only apply the static threshold for Stoney and Carrizo.
This hardware has certain requirements that don't allow
mixing of GTT and VRAM.  Newer asics do not have these
requirements so we should be able to be more flexible
with where buffers end up.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255Acked-by: NLuben Tuikov <luben.tuikov@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

81d0bcf9

06 12月, 2022 1 次提交

drm/amdgpu: generally allow over-commit during BO allocation · f9d00a4a

由 Christian König 提交于 11月 24, 2022

We already fallback to a dummy BO with no backing store when we
allocate GDS,GWS and OA resources and to GTT when we allocate VRAM.

Drop all those workarounds and generalize this for GTT as well. This
fixes ENOMEM issues with runaway applications which try to allocate/free
GTT in a loop and are otherwise only limited by the CPU speed.

The CS will wait for the cleanup of freed up BOs to satisfy the
various domain specific limits and so effectively throttle those
buggy applications down to a sane allocation behavior again.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NArunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

f9d00a4a

27 10月, 2022 1 次提交

drm/ttm: rework on ttm_resource to use size_t type · e3c92eb4

由 Somalapuram Amaranath 提交于 10月 27, 2022

Change ttm_resource structure from num_pages to size_t size in bytes.
v1 -> v2: change PFN_UP(dst_mem->size) to ttm->num_pages
v1 -> v2: change bo->resource->size to bo->base.size at some places
v1 -> v2: remove the local variable
v1 -> v2: cleanup cmp_size_smaller_first()
v2 -> v3: adding missing PFN_UP in ttm_bo_vm_fault_reserved
Signed-off-by: NSomalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221027091237.983582-1-Amaranath.Somalapuram@amd.comReviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>

e3c92eb4

07 10月, 2022 1 次提交

drm/amdgpu: Set vmbo destroy after pt bo is created · 9a3c6067

由 Philip Yang 提交于 10月 03, 2022

Under VRAM usage pression, map to GPU may fail to create pt bo and
vmbo->shadow_list is not initialized, then ttm_bo_release calling
amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below
dmesg and NULL pointer access backtrace:

Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt
bo successfully, otherwise use default callback amdgpu_bo_destroy.

amdgpu: amdgpu_vm_bo_update failed
amdgpu: update_gpuvm_pte() failed
amdgpu: Failed to map bo to gpuvm
amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2
BUG: kernel NULL pointer dereference, address:
 RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu]
 Call Trace:
  <TASK>
  ttm_bo_release+0x207/0x320 [amdttm]
  amdttm_bo_init_reserved+0x1d6/0x210 [amdttm]
  amdgpu_bo_create+0x1ba/0x520 [amdgpu]
  amdgpu_bo_create_vm+0x3a/0x80 [amdgpu]
  amdgpu_vm_pt_create+0xde/0x270 [amdgpu]
  amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu]
  amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu]
  amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu]
  update_gpuvm_pte+0x160/0x420 [amdgpu]
  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu]
  kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu]
  kfd_ioctl+0x24a/0x5b0 [amdgpu]
Signed-off-by: NPhilip Yang <Philip.Yang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

9a3c6067

14 7月, 2022 1 次提交

drm/amdgpu: Check BO's requested pinning domains against its preferred_domains · f5ba1404

由 Leo Li 提交于 7月 12, 2022

When pinning a buffer, we should check to see if there are any
additional restrictions imposed by bo->preferred_domains. This will
prevent the BO from being moved to an invalid domain when pinning.

For example, this can happen if the user requests to create a BO in GTT
domain for display scanout. amdgpu_dm will allow pinning to either VRAM
or GTT domains, since DCN can scanout from either or. However, in
amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is
adequate carveout. This can lead to pinning to VRAM despite the user
requesting GTT placement for the BO.

v2: Allow the kernel to override the domain, which can happen when
    exporting a BO to a V4L camera (for example).
Signed-off-by: NLeo Li <sunpeng.li@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

f5ba1404

11 7月, 2022 2 次提交

drm/amdgpu: audit bo->resource usage · 63af82cf

由 Christian König 提交于 12月 17, 2021

Make sure we can at least move and release BOs without backing store.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707102453.3633-3-christian.koenig@amd.com

63af82cf

drm/ttm: rename and cleanup ttm_bo_init · 347987a2

由 Christian König 提交于 2月 18, 2022

Rename ttm_bo_init to ttm_bo_init_validate since that better matches
what the function is actually doing.

Remove the unused size parameter, move the function's kerneldoc to the
implementation and cleanup the whole error handling.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220707102453.3633-2-christian.koenig@amd.comReviewed-by: NMichael J. Ruhl <michael.j.ruhl@intel.com>

347987a2

27 5月, 2022 2 次提交

drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE · fab2cc83

由 Christian König 提交于 5月 06, 2022

Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
doesn't needs to be preserved during eviction.

KFD was already using a similar functionality for SVM BOs so replace the
internal flag with the new UAPI.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Reviewed-by: NMarek Olšák <marek.olsak@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fab2cc83

drm/amdgpu: differentiate between LP and non-LP DDR memory · d534ca71

由 Alex Deucher 提交于 5月 23, 2022

Some applications want to know whether the memory is LP or
not.
Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d534ca71

07 4月, 2022 4 次提交

drm/ttm: remove bo->moving · 8bb31587

由 Christian König 提交于 11月 23, 2021

This is now handled by the DMA-buf framework in the dma_resv obj.

Also remove the workaround inside VMWGFX to update the moving fence.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-14-christian.koenig@amd.com

8bb31587

drm/amdgpu: use DMA_RESV_USAGE_KERNEL · c35fcfa3

由 Christian König 提交于 4月 04, 2022

Wait only for kernel fences before kmap or UVD direct submission.

This also makes sure that we always wait in amdgpu_bo_kmap() even when
returning a cached pointer.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-6-christian.koenig@amd.com

c35fcfa3

dma-buf: specify usage while adding fences to dma_resv obj v7 · 73511edf

由 Christian König 提交于 11月 09, 2021

Instead of distingting between shared and exclusive fences specify
the fence usage while adding fences.

Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids to
    disable warning temporary, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com

73511edf

dma-buf: add enum dma_resv_usage v4 · 7bc80a54

由 Christian König 提交于 11月 09, 2021

This change adds the dma_resv_usage enum and allows us to specify why a
dma_resv object is queried for its containing fences.

Additional to that a dma_resv_usage_rw() helper function is added to aid
retrieving the fences for a read or write userspace submission.

This is then deployed to the different query functions of the dma_resv
object and all of their users. When the write paratermer was previously
true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise.

v2: add KERNEL/OTHER in separate patch
v3: some kerneldoc suggestions by Daniel
v4: some more kerneldoc suggestions by Daniel, fix missing cases lost in
the rebase pointed out by Bas.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-2-christian.koenig@amd.com

7bc80a54

06 4月, 2022 1 次提交

dma-buf/drivers: make reserving a shared slot mandatory v4 · c8d4c18b

由 Christian König 提交于 11月 16, 2021

Audit all the users of dma_resv_add_excl_fence() and make sure they
reserve a shared slot also when only trying to add an exclusive fence.

This is the next step towards handling the exclusive fence like a
shared one.

v2: fix missed case in amdgpu
v3: and two more radeon, rename function
v4: add one more case to TTM, fix i915 after rebase
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com

c8d4c18b

01 4月, 2022 1 次提交

drm/amdgpu: fix incorrect size printing in error msg · 4499c90e

由 Christian König 提交于 3月 11, 2022

That are bytes not pages.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Acked-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4499c90e

26 3月, 2022 1 次提交

drm/amdgpu: prevent memory wipe in suspend/shutdown stage · 32f90e65

由 Guchun Chen 提交于 3月 15, 2022

On GPUs with RAS enabled, below call trace is observed when
suspending or shutting down device. The cause is we have enabled
memory wipe flag for BOs on such GPUs by default, and such BOs
will go to memory wipe by amdgpu_fill_buffer, however, because
ring is off already, it fails to clean up the memory and throw
this error message. So add a suspend/shutdown check before
wipping memory.

[drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.

v2: fix coding style issue

Fixes: fc6ea4be ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled")
Signed-off-by: NGuchun Chen <guchun.chen@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

32f90e65

14 2月, 2022 2 次提交

drm/amdgpu: remove VRAM accounting v2 · 7db47b83

由 Christian König 提交于 7月 12, 2021

This is provided by TTM now.

Also switch man->size to bytes instead of pages and fix the double
printing of size and usage in debugfs.

v2: fix size checking as well
Signed-off-by: NChristian König <christian.koenig@amd.com>
Tested-by: NBas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-8-christian.koenig@amd.com

7db47b83

drm/amdgpu: remove GTT accounting v2 · dfa714b8

由 Christian König 提交于 7月 12, 2021

This is provided by TTM now.

Also switch man->size to bytes instead of pages and fix the double
printing of size and usage in debugfs.

v2: fix size checking as well
Signed-off-by: NChristian König <christian.koenig@amd.com>
Tested-by: NBas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: NMatthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220214093439.2989-6-christian.koenig@amd.com

dfa714b8

08 2月, 2022 1 次提交

drm/amdgpu: Fix recursive locking warning · 447c7997

由 Rajneesh Bhardwaj 提交于 2月 03, 2022

Noticed the below warning while running a pytorch workload on vega10
GPUs. Change to trylock to avoid conflicts with already held reservation
locks.

[  +0.000003] WARNING: possible recursive locking detected
[  +0.000003] 5.13.0-kfd-rajneesh #1030 Not tainted
[  +0.000004] --------------------------------------------
[  +0.000002] python/4822 is trying to acquire lock:
[  +0.000004] ffff932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000203]
              but task is already holding lock:
[  +0.000003] ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3},
at: ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.000017]
              other info that might help us debug this:
[  +0.000002]  Possible unsafe locking scenario:

[  +0.000003]        CPU0
[  +0.000002]        ----
[  +0.000002]   lock(reservation_ww_class_mutex);
[  +0.000004]   lock(reservation_ww_class_mutex);
[  +0.000003]
               *** DEADLOCK ***

[  +0.000002]  May be due to missing lock nesting notation

[  +0.000003] 7 locks held by python/4822:
[  +0.000003]  #0: ffff932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at:
kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu]
[  +0.000232]  #1: ffff932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu]
[  +0.000241]  #2: ffff932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu]
[  +0.000236]  #3: ffffb2b35606fd28
(reservation_ww_class_acquire){+.+.}-{0:0}, at:
amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu]
[  +0.000235]  #4: ffff932cbb7181f8
(reservation_ww_class_mutex){+.+.}-{3:3}, at:
ttm_eu_reserve_buffers+0x270/0x470 [ttm]
[  +0.000015]  #5: ffffffffc045f700 (*(sspp++)){....}-{0:0}, at:
drm_dev_enter+0x5/0xa0 [drm]
[  +0.000038]  #6: ffff932c52da7078 (&vm->eviction_lock){+.+.}-{3:3},
at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu]
[  +0.000195]
              stack backtrace:
[  +0.000003] CPU: 11 PID: 4822 Comm: python Not tainted
5.13.0-kfd-rajneesh #1030
[  +0.000005] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02
08/29/2018
[  +0.000003] Call Trace:
[  +0.000003]  dump_stack+0x6d/0x89
[  +0.000010]  __lock_acquire+0xb93/0x1a90
[  +0.000009]  lock_acquire+0x25d/0x2d0
[  +0.000005]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  ? lock_is_held_type+0xa2/0x110
[  +0.000006]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000184]  __ww_mutex_lock.constprop.17+0xca/0x1060
[  +0.000007]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  ? lock_release+0x13f/0x270
[  +0.000005]  ? lock_is_held_type+0xa2/0x110
[  +0.000006]  ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000183]  amdgpu_bo_release_notify+0xc4/0x160 [amdgpu]
[  +0.000185]  ttm_bo_release+0x4c6/0x580 [ttm]
[  +0.000010]  amdgpu_bo_unref+0x1a/0x30 [amdgpu]
[  +0.000183]  amdgpu_vm_free_table+0x76/0xa0 [amdgpu]
[  +0.000189]  amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu]
[  +0.000189]  amdgpu_vm_update_ptes+0x411/0x770 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu]
[  +0.000191]  amdgpu_vm_bo_update+0x251/0x610 [amdgpu]
[  +0.000191]  update_gpuvm_pte+0xcc/0x290 [amdgpu]
[  +0.000229]  ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu]
[  +0.000190]  amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60
[amdgpu]
[  +0.000234]  kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu]
[  +0.000218]  kfd_ioctl+0x2b9/0x600 [amdgpu]
[  +0.000216]  ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu]
[  +0.000216]  ? lock_release+0x13f/0x270
[  +0.000006]  ? __fget_files+0x107/0x1e0
[  +0.000007]  __x64_sys_ioctl+0x8b/0xd0
[  +0.000007]  do_syscall_64+0x36/0x70
[  +0.000004]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.000007] RIP: 0033:0x7fbff90a7317
[  +0.000004] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[  +0.000005] RSP: 002b:00007fbe301fe648 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[  +0.000006] RAX: ffffffffffffffda RBX: 00007fbcc402d820 RCX:
00007fbff90a7317
[  +0.000003] RDX: 00007fbe301fe690 RSI: 00000000c0184b18 RDI:
0000000000000004
[  +0.000003] RBP: 00007fbe301fe690 R08: 0000000000000000 R09:
00007fbcc402d880
[  +0.000003] R10: 0000000002001000 R11: 0000000000000246 R12:
00000000c0184b18
[  +0.000003] R13: 0000000000000004 R14: 00007fbf689593a0 R15:
00007fbcc402d820

Cc: Christian König <christian.koenig@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: NRajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

447c7997

28 1月, 2022 1 次提交

drm/amdgpu: Wipe all VRAM on free when RAS is enabled · fc6ea4be

由 Felix Kuehling 提交于 1月 25, 2022

On GPUs with RAS, poison can propagate between processes if VRAM is not
cleared when it is freed or allocated. The reason is, that not all write
accesses clear RAS poison. 32-byte writes by the SDMA engine do clear RAS
poison. Clearing memory in the background when it is freed should avoid
major performance impact. KFD has been doing this already for a long time.
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

fc6ea4be

12 1月, 2022 1 次提交

drm/amdgpu: Unmap MMIO mappings when device is not unplugged · 62d5f9f7

由 Leslie Shi 提交于 1月 05, 2022

Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device
is unplugged to prevent crash in GPU initialization failure") makes call to
amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps
MMIO mappings even when device is not unplugged.

v2: Add condition of drm_dev_enter() to deleted unmaps in patch
"drm/amdgpu: Unmap all MMIO mappings"
Signed-off-by: NLeslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

62d5f9f7

18 11月, 2021 1 次提交

drm/amdgpu: return early on error while setting bar0 memtype · 26db557e

由 Nirmoy Das 提交于 10月 29, 2021

We set WC memtype for aper_base but don't check return value
of arch_io_reserve_memtype_wc(). Be more defensive and return
early on error.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

26db557e

06 11月, 2021 1 次提交

drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs · 5702d052

由 Felix Kuehling 提交于 11月 04, 2021

If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback then the amdgpu_bo is freed.
Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com>
Reviewed-By: NRamesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

5702d052

07 10月, 2021 1 次提交

drm/amdgpu: unify BO evicting method in amdgpu_ttm · 58144d28

由 Nirmoy Das 提交于 10月 06, 2021

Unify BO evicting functionality for possible memory
types in amdgpu_ttm.c.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

58144d28

08 9月, 2021 1 次提交

drm/amdgpu: remove unused amdgpu_bo_validate · a7181b52

由 Christian König 提交于 9月 07, 2021

Just drop some dead code.
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NNirmoy Das <nirmoy.das@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

a7181b52

26 8月, 2021 1 次提交

drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain · d035f84d

由 Yifan Zhang 提交于 8月 25, 2021

amdgpu_bo_get_preferred_pin_domain is used for page tables
creation, which is not involved with page pinning. And it is used in
more cases than display scanout, modify its documentation as well.
Signed-off-by: NYifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d035f84d

21 8月, 2021 2 次提交

drm/amdgpu: use the preferred pin domain after the check · 2a7b9a84

由 Christian König 提交于 8月 18, 2021

For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.
Reviewed-by: NShashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

2a7b9a84

drm/amdgpu: use the preferred pin domain after the check · 9deb0b3d

由 Christian König 提交于 8月 18, 2021

For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.

Handle that case gracefully as well.
Reviewed-by: NShashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

9deb0b3d

23 7月, 2021 1 次提交

drm/amdgpu: Fix documentaion for amdgpu_bo_add_to_shadow_list · d8c33180

由 Anson Jacob 提交于 7月 19, 2021

make htmldocs complaints about parameter for amdgpu_bo_add_to_shadow_list

./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Function parameter or member 'vmbo' not described in 'amdgpu_bo_add_to_shadow_list'
./drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:739: warning: Excess function parameter 'bo' description in 'amdgpu_bo_add_to_shadow_list'
Signed-off-by: NAnson Jacob <Anson.Jacob@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

d8c33180

24 6月, 2021 1 次提交

drm/amdgpu: Fix BUG_ON assert · ea7acd7c

由 Andrey Grodzovsky 提交于 6月 22, 2021

With added CPU domain to placement you can have
now 3 placemnts at once.

CC: stable@kernel.org
Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-5-andrey.grodzovsky@amd.com

ea7acd7c

22 6月, 2021 1 次提交

drm/amdgpu: rework dma_resv handling v3 · 8c505bdc

由 Christian König 提交于 6月 09, 2021

Drop the workaround and instead implement a better solution.

Basically we are now chaining all submissions using a dma_fence_chain
container and adding them as exclusive fence to the dma_resv object.

This way other drivers can still sync to the single exclusive fence
while amdgpu only sync to fences from different processes.

v3: add the shared fence first before the exclusive one
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NAlex Deucher <alexander.deucher@amd.com>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210614174536.5188-2-christian.koenig@amd.com

8c505bdc

16 6月, 2021 2 次提交

drm/amdgpu: move shadow_list to amdgpu_bo_vm · e18aaea7

由 Nirmoy Das 提交于 6月 15, 2021

Move shadow_list to struct amdgpu_bo_vm as shadow BOs
are part of PT/PD BOs.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

e18aaea7

drm/amdgpu: parameterize ttm BO destroy callback · 23e24fbb

由 Nirmoy Das 提交于 6月 14, 2021

Make provision to pass different ttm BO destroy callback
while creating a amdgpu_bo.
Signed-off-by: NNirmoy Das <nirmoy.das@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

23e24fbb

09 6月, 2021 1 次提交

drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create · 2a48b591

由 Changfeng 提交于 6月 02, 2021

It will cause error when alloc memory larger than 128KB in
amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]
Signed-off-by: NChangfeng <Changfeng.Zhu@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

2a48b591

06 6月, 2021 1 次提交

dma-buf: drop the _rcu postfix on function names v3 · d3fae3b3

由 Christian König 提交于 6月 02, 2021

The functions can be called both in _rcu context as well
as while holding the lock.

v2: add some kerneldoc as suggested by Daniel
v3: fix indentation
Signed-off-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NJason Ekstrand <jason@jlekstrand.net>
Acked-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com

d3fae3b3

05 6月, 2021 1 次提交

drm/amdgpu: switch kzalloc to kvzalloc in amdgpu_bo_create · 31c759bb

由 Changfeng 提交于 6月 02, 2021

It will cause error when alloc memory larger than 128KB in
amdgpu_bo_create->kzalloc. So it needs to switch kzalloc to kvzalloc.

Call Trace:
   alloc_pages_current+0x6a/0xe0
   kmalloc_order+0x32/0xb0
   kmalloc_order_trace+0x1e/0x80
   __kmalloc+0x249/0x2d0
   amdgpu_bo_create+0x102/0x500 [amdgpu]
   ? xas_create+0x264/0x3e0
   amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
   amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
   amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]
Signed-off-by: NChangfeng <Changfeng.Zhu@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

31c759bb

openeuler / Kernel 大约 2 年 前同步成功

openeuler / Kernel
大约 2 年前同步成功