Commits · 0aa7aa24cc11720a05b4492345f0adba8373c226 · openeuler / Kernel

08 Dec, 2018 · 3 commits

drm/amdgpu: always reserve two slots for the VM · 0aa7aa24

Authored by Christian König on Sep 21, 2018

And drop the now superfluous extra reservations.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix using shared fence for exported BOs v2 · 049aca43

Authored by Christian König on Sep 19, 2018

It is perfectly possible that the BO list is created before the BO is
exported. While at it clean up setting shared to one instead of true.

v2: add comment and simplify logic
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/ttm: allow reserving more than one shared slot v3 · a9f34c70

Authored by Christian König on Sep 19, 2018

Let's support simultaneous submissions to multiple engines.

v2: rename the field to num_shared and fix up all users
v3: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06 Nov, 2018 · 2 commits

drm/amdgpu: print an error when the parser can't be initialized · e0519696

Authored by Samuel Pitoiset on Oct 29, 2018

Similar to other error messages, might help for tracking down
issues.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/scheduler: Add drm_sched_job_cleanup · 26efecf9

Authored by Sharat Masetty on Oct 29, 2018

This patch adds a new API to clean up the scheduler job resources. This
is primarily needed in cases where the job was created but was not queued
to the scheduler queue. Additionally, with this change, the layer which
creates the scheduler job also gets to free up the job's resources, and
this entails moving the dma_fence_put(finished_fence) to the drivers'
ops free handler routines.
Signed-off-by: Sharat Masetty <smasetty@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

25 Oct, 2018 · 1 commit

dma-buf: allow reserving more than one shared fence slot · ca05359f

Authored by Christian König on Sep 19, 2018

Let's support simultaneous submissions to multiple engines.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Link: https://patchwork.kernel.org/patch/10626149/

16 Oct, 2018 · 1 commit

drm: add flags to drm_syncobj_find_fence · 649fdce2

Authored by Chunming Zhou on Oct 15, 2018

The flags can be used by the driver to decide whether it needs to block waiting for the submission.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.kernel.org/patch/10641339/

20 Sep, 2018 · 1 commit

drm/amdgpu: fix up GDS/GWS/OA shifting · 77a2faa5

Authored by Christian König on Sep 14, 2018

That only worked by pure coincidence. Completely remove the shifting and
always apply the correct PAGE_SHIFT.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

14 Sep, 2018 · 1 commit

drm/amdgpu: remove amdgpu_bo_list_entry.robj (v2) · e83dfe4d

Authored by Christian König on Sep 10, 2018

We can get that just by casting tv.bo.

v2: squash in kfd fix (Alex)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

13 Sep, 2018 · 1 commit

drm/amdgpu: move cs dependencies front a bit · 7e7bf8de

Authored by Chunming Zhou on Sep 11, 2018

CS dependency handling doesn't need to happen under the VM reservation.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12 Sep, 2018 · 2 commits

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk · 0165de98

Authored by Christian König on Sep 10, 2018

Slowly leaking memory one page at a time :)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk · 7893499e

Authored by Christian König on Sep 10, 2018

Slowly leaking memory one page at a time :)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11 Sep, 2018 · 3 commits

drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path · b463d4e5

Authored by Christian König on Sep 03, 2018

Avoid unlocking a lock we never locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: correctly sign extend 48bit addresses v3 · ad9a5b78

Authored by Christian König on Aug 27, 2018

Correctly sign extend the GMC addresses to 48bit.

v2: sign extending turned out easier than thought.
v3: clean up the defines and move them into amdgpu_gmc.h as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
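
For readers unfamiliar with the idea, sign extending a 48-bit address means copying bit 47 into bits 48 to 63, so that addresses in the upper half of the range become canonical 64-bit values. The following is a minimal sketch of that arithmetic only. The macro and function names are hypothetical and chosen for illustration; they are not the defines that the commit moves into amdgpu_gmc.h.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only (not the driver's defines). */
#define EXAMPLE_ADDR_MASK 0x0000ffffffffffffULL /* low 48 bits */
#define EXAMPLE_SIGN_BIT  0x0000800000000000ULL /* bit 47 */

/* Sign extend a 48-bit address to 64 bits by replicating bit 47 upward. */
static inline uint64_t example_sign_extend_48(uint64_t addr)
{
	addr &= EXAMPLE_ADDR_MASK;          /* drop anything above bit 47 */
	if (addr & EXAMPLE_SIGN_BIT)
		addr |= ~EXAMPLE_ADDR_MASK; /* set bits 48..63 */
	return addr;
}
```

Addresses with bit 47 clear pass through unchanged, while addresses with bit 47 set gain a 0xffff prefix.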

drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path · 0a53b69c

Authored by Christian König on Sep 03, 2018

Avoid unlocking a lock we never locked.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

06 Sep, 2018 · 2 commits

drm: expand replace_fence to support timeline point v2 · 9a09a423

Authored by Chunming Zhou on Aug 30, 2018

After the expansion, we can place a fence at a timeline point.
v2: change func parameter order
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246543/

drm: expand drm_syncobj_find_fence to support timeline point v2 · 0a6730ea

Authored by Chunming Zhou on Aug 30, 2018

After the expansion, we can fetch the fence of a timeline point.
v2: The parameter fence is the result of the function and should come last.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/246541/

02 Sep, 2018 · 1 commit

drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2 · b995795b

Authored by Christian König on Aug 30, 2018

First step to fix the LRU corruption, we accidentally tried to move things
on the LRU after dropping the lock.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28 Aug, 2018 · 10 commits

drm/amdgpu: fix holding mn_lock while allocating memory · 4f9ea1d0

Authored by Christian König on Aug 24, 2018

We can't hold the mn_lock while allocating memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: amdgpu_ctx_add_fence can't fail · 85eff200

Authored by Christian König on Aug 24, 2018

No more waiting for a fence done here.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix holding mn_lock while allocating memory · 4a2de54d

Authored by Christian König on Aug 24, 2018

We can't hold the mn_lock while allocating memory.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use bulk moves for efficient VM LRU handling (v6) · f921661b

Authored by Huang Rui on Aug 06, 2018

I continue to work on bulk moving, based on the proposal by Christian.

Background:
The amdgpu driver moves all PD/PT and PerVM BOs into the idle list, then moves
all of them to the end of the LRU list one by one. That causes so many BOs to be
moved to the end of the LRU that it seriously impacts performance.

Christian then provided a workaround that avoids moving PD/PT BOs on the LRU,
with the patch below:
Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid
validating VM PTs")

However, the final solution should bulk move all PD/PT and PerVM BOs on the LRU
instead of moving them one by one.

Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be
validated we move all BOs together to the end of the LRU without dropping the
lock for the LRU.

While doing so we note the beginning and end of this block in the LRU list.

Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do,
we don't move every BO one by one, but instead cut the LRU list into pieces so
that we bulk move everything to the end in just one operation.

Test data:
+--------------+-----------------+-----------+---------------------------------------+
|              |The Talos        |Clpeak(OCL)|BusSpeedReadback(OCL)                  |
|              |Principle(Vulkan)|           |                                       |
+--------------+-----------------+-----------+---------------------------------------+
|              |                 |           |0.319 ms(1K) 0.314 ms(2K) 0.308 ms(4K) |
| Original     |  147.7 FPS      |  76.86 us |0.307 ms(8K) 0.310 ms(16K)             |
+--------------+-----------------+-----------+---------------------------------------+
| Original + WA|                 |           |0.254 ms(1K) 0.241 ms(2K)              |
|(don't move   |  162.1 FPS      |  42.15 us |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)|
|PT BOs on LRU)|                 |           |                                       |
+--------------+-----------------+-----------+---------------------------------------+
| Bulk move    |  163.1 FPS      |  40.52 us |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) |
|              |                 |           |0.214 ms(8K) 0.225 ms(16K)             |
+--------------+-----------------+-----------+---------------------------------------+

After testing with the three benchmarks above, which cover both Vulkan and
OpenCL, we can see a visible improvement over the original, and results even
better than the original with the workaround.

v2: move all BOs, including the idle, relocated, and moved lists, to the end of
the LRU and put them together.
v3: remove unused parameter and use list_for_each_entry instead of the one with
save entry.
v4: move the amdgpu_vm_move_to_lru_tail after command submission; at that time,
all BOs will be back on the idle list.
v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instead of
validated, and move ttm_bo_bulk_move_lru_tail() also into
amdgpu_vm_move_to_lru_tail().
v6: clean up and fix return value.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
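
The core idea in this commit message, remembering the first and last element of a contiguous block and moving the whole block to the LRU tail in one operation, can be illustrated with a plain circular doubly linked list. The following is only a sketch under that assumption. The `lru_*` names are hypothetical, and this is not the implementation of ttm_bo_bulk_move_lru_tail() or amdgpu_vm_move_to_lru_tail().

```c
/* Minimal circular doubly linked list; all names here are hypothetical. */
struct lru_node {
	struct lru_node *prev, *next;
};

static void lru_init(struct lru_node *head)
{
	head->prev = head->next = head;
}

static void lru_add_tail(struct lru_node *n, struct lru_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Move the contiguous block [first, last] to the tail of the list in O(1),
 * independent of how many nodes the block contains.
 */
static void lru_bulk_move_tail(struct lru_node *head,
			       struct lru_node *first,
			       struct lru_node *last)
{
	if (last->next == head)
		return; /* the block already ends at the tail */

	/* Cut the block out of its current position. */
	first->prev->next = last->next;
	last->next->prev = first->prev;

	/* Splice the block in right before the list head (the tail). */
	first->prev = head->prev;
	head->prev->next = first;
	last->next = head;
	head->prev = last;
}
```

Only four pointers outside the block are touched, which is why the cost no longer grows with the number of BOs being moved.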

drm/amdgpu: add amdgpu_gmc_pd_addr helper · 11c3a249

Authored by Christian König on Aug 22, 2018

Add a helper to get the root PD address and remove the workarounds from
the GMC9 code for that.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cleanup VM handling in the CS a bit · 9a02ece4

Authored by Christian König on Aug 17, 2018

Add a helper function for getting the root PD addr, join the two VM
related functions and clean up the function name.

No functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use entity instead of ring for CS · 0d346a14

Authored by Christian König on Jul 19, 2018

Further demangle ring from entity handling.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove the queue manager · 869a53d4

Authored by Christian König on Jul 16, 2018

Not needed any more since that is now done by the scheduler.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move gem definitions into amdgpu_gem header · 2cddc50e

Authored by Huang Rui on Aug 13, 2018

Demangle amdgpu.h.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix preamble handling · 92964357

Authored by Christian König on Aug 21, 2018

At this point the command submission can still be interrupted.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

23 Aug, 2018 · 1 commit

drm/amdgpu: fix preamble handling · d98ff24e

Authored by Christian König on Aug 21, 2018

At this point the command submission can still be interrupted.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

01 Aug, 2018 · 6 commits

drm/amdgpu: create an empty bo_list if no handle is provided · 4a102ad4

Authored by Christian König on Jul 30, 2018

Instead of having extra handling just create an empty bo_list when no
handle is provided.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add bo_list iterators · 39f7f69a

Authored by Christian König on Jul 30, 2018

Add helpers to iterate over all entries in a bo_list.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: always recreate bo_list · 81c6dabc

Authored by Christian König on Jul 30, 2018

The bo_list handle is allocated by OP_CREATE, so in OP_UPDATE here we just
re-create the bo_list object and replace the handle. This way we don't
need locking to protect the bo_list because it's always re-created when
changed.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2 · 8ab19ea6

Authored by Christian König on Jul 27, 2018

This allows us to trace all VM ranges which should be valid inside a CS.

v2: dump mappings without BO as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> (v1)
Reviewed-by: Huang Rui <ray.huang@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: return error if both BOs and bo_list handle is given · 0cb7c1f0

Authored by Christian König on Jul 30, 2018

Return -EINVAL when both the BOs and a list handle are provided in
the IOCTL.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add proper error handling to amdgpu_bo_list_get · 52c054ca

Authored by Christian König on Jul 27, 2018

Otherwise we silently don't use a BO list when the handle is invalid.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

27 Jul, 2018 · 1 commit

drm/amdgpu: add support for inplace IB patching for MM engines v2 · 9d248517

Authored by Christian König on Jul 23, 2018

We are going to need that for the second UVD instance on Vega20.

v2: rename to patch_cs_in_place
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

26 Jul, 2018 · 3 commits

drm/scheduler: remove sched field from the entity · 068c3304

Authored by Nayan Deshmukh on Jul 20, 2018

The scheduler of the entity is decided by the run queue on which
it is queued. This patch saves us the effort required to keep the
rq and sched fields in sync when we start shifting entities
among different rqs.
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/scheduler: modify API to avoid redundancy · cdc50176

Authored by Nayan Deshmukh on Jul 20, 2018

The entity has a scheduler field, so we don't need the sched argument
in any of the functions where the entity is provided.
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: consistenly name amdgpu_bo_ functions · c704ab18

Authored by Christian König on Jul 16, 2018

Just rename functions, no functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19 Jul, 2018 · 1 commit

drm/amdgpu: change ring priority after pushing the job (v2) · b5286801

Authored by Christian König on Jul 16, 2018

Pushing a job can change the ring assignment of an entity.

v2: squash in:
"drm/amdgpu: fix job priority handling" (Christian)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel · synced successfully more than 1 year ago
