Commits · 742b48aed44b7f347e82d2d5c48f68110731ac78 · openeuler / Kernel

25 May, 2019 1 commit

drm/amdgpu: check no_user_fence flag for engines · 742b48ae

Authored by Leo Liu on May 08, 2019

To replace checking ring type and make them generic
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

742b48ae

13 Apr, 2019 1 commit

drm/amdgpu: add timeline support in amdgpu CS v3 · 2624dd15

Authored by Chunming Zhou on Apr 01, 2019

syncobj wait/signal operation is appending in command submission.
v2: separate to two kinds in/out_deps functions
v3: fix checking for timeline syncobj
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Cc: Tobias Hector <Tobias.Hector@amd.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2624dd15

28 Mar, 2019 1 commit

Revert "drm/amdgpu: replace get_user_pages with HMM mirror helpers" · 318c3f4b

Authored by Alex Deucher on Mar 28, 2019

This reverts commit 915d3eec.

This depends on an HMM fix which is not upstream yet.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

318c3f4b

20 Mar, 2019 1 commit

drm/amdgpu: replace get_user_pages with HMM mirror helpers · 915d3eec

Authored by Philip Yang on Dec 13, 2018

Use HMM helper function hmm_vma_fault() to get physical pages backing
userptr and start CPU page table update track of those pages. Then use
hmm_vma_range_done() to check if those pages are updated before
amdgpu_cs_submit for gfx or before user queues are resumed for kfd.

If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart
from scratch, for kfd, restore worker is rescheduled to retry.

HMM simplify the CPU page table concurrent update check, so remove
guptasklock, mmu_invalidations, last_set_pages fields from
amdgpu_ttm_tt struct.

HMM does not pin the page (increase page ref count), so remove related
operations like release_pages(), put_page(), mark_page_dirty().
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

915d3eec

06 Feb, 2019 1 commit

drm/amdgpu: Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES · 67dd1a36

Authored by Andrey Grodzovsky on Jan 31, 2019

New chunk for dependency on start of job's execution instead on
the end. This is used for GPU deadlock prevention when
userspace uses mid-IB fences to wait for mid-IB work on other rings.

v2: Fix typo in AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES
v3: Bump KMS version
v4: put old fence AFTER acquiring the scheduled fence.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

67dd1a36

15 Dec, 2018 1 commit

drm/amdgpu: fix NULL fence handling in amdgpu_cs_fence_to_handle_ioctl · 4e917713

Authored by Christian König on Dec 03, 2018

When the fence is already signaled it is perfectly normal to get a NULL
fence here. But since we can't export that we need to use a stub fence.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4e917713

12 Dec, 2018 1 commit

drm/amdgpu: Fix DEBUG_LOCKS_WARN_ON(depth <= 0) in amdgpu_ctx.lock · c5542060

Authored by Andrey Grodzovsky on Dec 06, 2018

If CS is submitted using guilty ctx, we terminate amdgpu_cs_parser_init
before locking ctx->lock, later in amdgpu_cs_parser_fini we still are
trying to release the lock just because parser->ctx != NULL.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c5542060
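
The lock imbalance described in c5542060 can be illustrated with a small stand-alone sketch. This is not the code from the commit: `mock_lock`, `parser`, `parser_init`, and `parser_fini` are hypothetical names, and the lock is modelled as a plain depth counter. It shows one possible way to avoid releasing a lock that the init path never took, by recording whether the lock was actually acquired instead of inferring it from a non-NULL pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex: depth counts outstanding acquisitions. */
struct mock_lock {
    int depth;
};

/* Hypothetical parser state; 'locked' records whether init took the lock. */
struct parser {
    struct mock_lock *ctx_lock;
    bool locked;
};

/* Returns 0 on success, -1 when the context is guilty and init bails out
 * before the lock is taken. */
static int parser_init(struct parser *p, struct mock_lock *lock, bool guilty)
{
    p->ctx_lock = lock;
    p->locked = false;

    if (guilty)
        return -1;          /* early exit: lock was never acquired */

    lock->depth++;          /* "lock" */
    p->locked = true;
    return 0;
}

/* Releases the lock only if init actually acquired it. Checking
 * p->ctx_lock != NULL alone would underflow depth on the guilty path. */
static void parser_fini(struct parser *p)
{
    if (p->ctx_lock != NULL && p->locked) {
        p->ctx_lock->depth--;   /* "unlock" */
        p->locked = false;
    }
}
```

Calling `parser_fini` after a failed `parser_init` leaves `depth` at zero, which is the invariant that the lockdep warning in the commit title checks for.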

08 Dec, 2018 4 commits

drm/amdgpu: always reserve one more shared slot for pipelined BO moves · 07daa8a0

Authored by Christian König on Sep 24, 2018

This allows us to drop the extra reserve in TTM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

07daa8a0

drm/amdgpu: always reserve two slots for the VM · 0aa7aa24

Authored by Christian König on Sep 21, 2018

And drop the now superfluous extra reservations.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0aa7aa24

drm/amdgpu: fix using shared fence for exported BOs v2 · 049aca43

Authored by Christian König on Sep 19, 2018

It is perfectly possible that the BO list is created before the BO is
exported. While at it clean up setting shared to one instead of true.

v2: add comment and simplify logic
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

049aca43

drm/ttm: allow reserving more than one shared slot v3 · a9f34c70

Authored by Christian König on Sep 19, 2018

Let's support simultaneous submissions to multiple engines.

v2: rename the field to num_shared and fix up all users
v3: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a9f34c70

05 Dec, 2018 1 commit

drm: revert "expand replace_fence to support timeline point v2" · 0b258ed1

Authored by Christian König on Nov 14, 2018

This reverts commit 9a09a423.

The whole interface isn't thought through. Since this function can't
fail we actually can't allocate an object to store the sync point.

Sorry, I should have taken the lead on this from the very beginning and
reviewed it more thoughtfully. Going to propose a new interface as a
follow up change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/265580/

0b258ed1

06 Nov, 2018 2 commits

drm/amdgpu: print an error when the parser can't be initialized · e0519696

Authored by Samuel Pitoiset on Oct 29, 2018

Similar to other error messages, might help for tracking down
issues.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e0519696

drm/scheduler: Add drm_sched_job_cleanup · 26efecf9

Authored by Sharat Masetty on Oct 29, 2018

This patch adds a new API to clean up the scheduler job resources. This
is primarily needed in cases the job was created but was not queued to
the scheduler queue. Additionally with this change, the layer which
creates the scheduler job also gets to free up the job's resources and
this entails moving the dma_fence_put(finished_fence) to the drivers
ops free handler routines.
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

26efecf9

25 Oct, 2018 1 commit

dma-buf: allow reserving more than one shared fence slot · ca05359f

Authored by Christian König on Sep 19, 2018

Let's support simultaneous submissions to multiple engines.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/

ca05359f

16 Oct, 2018 1 commit

drm: add flags to drm_syncobj_find_fence · 649fdce2

Authored by Chunming Zhou on Oct 15, 2018

flags can be used by driver to decide whether need to block wait submission.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.kernel.org/patch/10641339/

649fdce2

20 Sep, 2018 1 commit

drm/amdgpu: fix up GDS/GWS/OA shifting · 77a2faa5

Authored by Christian König on Sep 14, 2018

That only worked by pure coincidence. Completely remove the shifting and
always apply correct PAGE_SHIFT.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

77a2faa5

14 Sep, 2018 1 commit

drm/amdgpu: remove amdgpu_bo_list_entry.robj (v2) · e83dfe4d

Authored by Christian König on Sep 10, 2018

We can get that just by casting tv.bo.

v2: squash in kfd fix (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e83dfe4d

13 Sep, 2018 1 commit

drm/amdgpu: move cs dependencies front a bit · 7e7bf8de

Authored by Chunming Zhou on Sep 11, 2018

cs dependencies handling doesn't need in vm resv
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7e7bf8de

12 Sep, 2018 2 commits

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk · 0165de98

Authored by Christian König on Sep 10, 2018

Slowly leaking memory one page at a time :)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0165de98

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk · 7893499e

Authored by Christian König on Sep 10, 2018

Slowly leaking memory one page at a time :)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7893499e

11 Sep, 2018 3 commits

drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path · b463d4e5

Authored by Christian König on Sep 03, 2018

Avoid unlocking a lock we never locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b463d4e5

drm/amdgpu: correctly sign extend 48bit addresses v3 · ad9a5b78

Authored by Christian König on Aug 27, 2018

Correctly sign extend the GMC addresses to 48bit.

v2: sign extending turned out easier than thought.
v3: clean up the defines and move them into amdgpu_gmc.h as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ad9a5b78
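
As an illustration of what sign extending a 48-bit address means, here is a minimal stand-alone sketch. The helper name `sign_extend_48` is hypothetical and is not taken from the commit or from amdgpu_gmc.h; it only shows the arithmetic: bit 47 is copied into bits 48 to 63, so addresses in the upper half of the 48-bit space become canonical 64-bit values.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low 48 bits of addr to 64 bits. */
static inline uint64_t sign_extend_48(uint64_t addr)
{
    const uint64_t sign_bit = 1ULL << 47;   /* top bit of a 48-bit value */

    addr &= (1ULL << 48) - 1;               /* keep only the low 48 bits */

    /* Flipping the sign bit and subtracting it again propagates bit 47
     * into the upper 16 bits (standard branch-free sign extension). */
    return (addr ^ sign_bit) - sign_bit;
}
```

With this helper, `0x00007FFFFFFFFFFF` is returned unchanged, while `0x0000800000000000` becomes `0xFFFF800000000000`.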

drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path · 0a53b69c

Authored by Christian König on Sep 03, 2018

Avoid unlocking a lock we never locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0a53b69c

06 Sep, 2018 2 commits

drm: expand replace_fence to support timeline point v2 · 9a09a423

Authored by Chunming Zhou on Aug 30, 2018

we can place a fence to a timeline point after expanded.
v2: change func parameter order
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246543/

9a09a423

drm: expand drm_syncobj_find_fence to support timeline point v2 · 0a6730ea

Authored by Chunming Zhou on Aug 30, 2018

we can fetch timeline point fence after expanded.
v2: The parameter fence is the result of the function and should come last.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246541/

0a6730ea

02 Sep, 2018 1 commit

drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2 · b995795b

Authored by Christian König on Aug 30, 2018

First step to fix the LRU corruption, we accidentally tried to move things
on the LRU after dropping the lock.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b995795b

28 Aug, 2018 10 commits

drm/amdgpu: fix holding mn_lock while allocating memory · 4f9ea1d0

Authored by Christian König on Aug 24, 2018

We can't hold the mn_lock while allocating memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4f9ea1d0

drm/amdgpu: amdgpu_ctx_add_fence can't fail · 85eff200

Authored by Christian König on Aug 24, 2018

No more waiting for a fence done here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

85eff200

drm/amdgpu: fix holding mn_lock while allocating memory · 4a2de54d

Authored by Christian König on Aug 24, 2018

We can't hold the mn_lock while allocating memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a2de54d

drm/amdgpu: use bulk moves for efficient VM LRU handling (v6) · f921661b

Authored by Huang Rui on Aug 06, 2018

I continue to work for bulk moving that based on the proposal by Christian.

Background:
amdgpu driver will move all PD/PT and PerVM BOs into idle list. Then move all of
them on the end of LRU list one by one. Thus, that cause so many BOs moved to
the end of the LRU, and impact performance seriously.

Then Christian provided a workaround to not move PD/PT BOs on LRU with below
patch:
Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid
validating VM PTs")

However, the final solution should bulk move all PD/PT and PerVM BOs on the LRU
instead of one by one.

Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be
validated we move all BOs together to the end of the LRU without dropping the
lock for the LRU.

While doing so we note the beginning and end of this block in the LRU list.

Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do,
we don't move every BO one by one, but instead cut the LRU list into pieces so
that we bulk move everything to the end in just one operation.

Test data:
+--------------+-----------------+-----------+---------------------------------------+
|              |The Talos        |Clpeak(OCL)|BusSpeedReadback(OCL)                  |
|              |Principle(Vulkan)|           |                                       |
+------------------------------------------------------------------------------------+
|              |                 |           |0.319 ms(1k) 0.314 ms(2K) 0.308 ms(4K) |
| Original     |  147.7 FPS      |  76.86 us |0.307 ms(8K) 0.310 ms(16K)             |
+------------------------------------------------------------------------------------+
| Original + WA|                 |           |0.254 ms(1K) 0.241 ms(2K)              |
|(don't move   |  162.1 FPS      |  42.15 us |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)|
|PT BOs on LRU)|                 |           |                                       |
+------------------------------------------------------------------------------------+
| Bulk move    |  163.1 FPS      |  40.52 us |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) |
|              |                 |           |0.214 ms(8K) 0.225 ms(16K)             |
+--------------+-----------------+-----------+---------------------------------------+

After test them with above three benchmarks include vulkan and opencl. We can
see the visible improvement than original, and even better than original with
workaround.

v2: move all BOs include idle, relocated, and moved list to the end of LRU and
put them together.
v3: remove unused parameter and use list_for_each_entry instead of the one with
save entry.
v4: move the amdgpu_vm_move_to_lru_tail after command submission, at that time,
all bo will be back on idle list.
v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instead of
validated, and move ttm_bo_bulk_move_lru_tail() also into
amdgpu_vm_move_to_lru_tail().
v6: clean up and fix return value.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f921661b
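
The bulk-move idea in f921661b (remember the first and last entry of a contiguous block on the LRU, then relocate the whole block to the tail in one step) can be sketched on a plain circular doubly linked list. This is a simplified illustration, not the TTM or amdgpu implementation: `struct node`, `list_init`, `list_add_tail`, and `bulk_move_tail` are hypothetical names, and locking is omitted entirely.

```c
#include <stddef.h>

/* Hypothetical list node; 'head' is a sentinel in a circular list. */
struct node {
    struct node *prev, *next;
    int id;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Move the contiguous block [first..last] to the tail of the list in O(1),
 * regardless of how many nodes the block contains. */
static void bulk_move_tail(struct node *head, struct node *first,
                           struct node *last)
{
    if (last->next == head)
        return;                     /* block already sits at the tail */

    /* Cut the block out of its current position. */
    first->prev->next = last->next;
    last->next->prev = first->prev;

    /* Splice the block back in just before the sentinel. */
    first->prev = head->prev;
    last->next = head;
    head->prev->next = first;
    head->prev = last;
}
```

Moving N entries one by one costs N list operations, whereas the splice above touches a fixed number of pointers, which is the effect the commit message attributes to cutting the LRU list into pieces.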

drm/amdgpu: add amdgpu_gmc_pd_addr helper · 11c3a249

Authored by Christian König on Aug 22, 2018

Add a helper to get the root PD address and remove the workarounds from
the GMC9 code for that.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11c3a249

drm/amdgpu: cleanup VM handling in the CS a bit · 9a02ece4

Authored by Christian König on Aug 17, 2018

Add a helper function for getting the root PD addr, join the
two VM related functions and clean up the function name.

No functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9a02ece4

drm/amdgpu: use entity instead of ring for CS · 0d346a14

Authored by Christian König on Jul 19, 2018

Further demangle ring from entity handling.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0d346a14

drm/amdgpu: remove the queue manager · 869a53d4

Authored by Christian König on Jul 16, 2018

Not needed any more since that is now done by the scheduler.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

869a53d4

drm/amdgpu: move gem definitions into amdgpu_gem header · 2cddc50e

Authored by Huang Rui on Aug 13, 2018

Demangle amdgpu.h.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2cddc50e

drm/amdgpu: fix preamble handling · 92964357

Authored by Christian König on Aug 21, 2018

At this point the command submission can still be interrupted.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

92964357

23 Aug, 2018 1 commit

drm/amdgpu: fix preamble handling · d98ff24e

Authored by Christian König on Aug 21, 2018

At this point the command submission can still be interrupted.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d98ff24e

01 Aug, 2018 2 commits

drm/amdgpu: create an empty bo_list if no handle is provided · 4a102ad4

Authored by Christian König on Jul 30, 2018

Instead of having extra handling just create an empty bo_list when no
handle is provided.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a102ad4

drm/amdgpu: add bo_list iterators · 39f7f69a

Authored by Christian König on Jul 30, 2018

Add helpers to iterate over all entries in a bo_list.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

39f7f69a
