- 25 5月, 2019 1 次提交
-
-
由 Leo Liu 提交于
To replace checking ring type and make them generic Signed-off-by: NLeo Liu <leo.liu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 4月, 2019 1 次提交
-
-
由 Chunming Zhou 提交于
syncobj wait/signal operation is appending in command submission. v2: separate to two kinds in/out_deps functions v3: fix checking for timeline syncobj Signed-off-by: NChunming Zhou <david1.zhou@amd.com> Cc: Tobias Hector <Tobias.Hector@amd.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Dave Airlie <airlied@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 3月, 2019 1 次提交
-
-
由 Alex Deucher 提交于
This reverts commit 915d3eec. This depends on an HMM fix which is not upstream yet. Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 20 3月, 2019 1 次提交
-
-
由 Philip Yang 提交于
Use HMM helper function hmm_vma_fault() to get physical pages backing userptr and start CPU page table update track of those pages. Then use hmm_vma_range_done() to check if those pages are updated before amdgpu_cs_submit for gfx or before user queues are resumed for kfd. If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart from scratch, for kfd, restore worker is rescheduled to retry. HMM simplify the CPU page table concurrent update check, so remove guptasklock, mmu_invalidations, last_set_pages fields from amdgpu_ttm_tt struct. HMM does not pin the page (increase page ref count), so remove related operations like release_pages(), put_page(), mark_page_dirty(). Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 2月, 2019 1 次提交
-
-
由 Andrey Grodzovsky 提交于
New chunk for dependency on start of job's execution instead on the end. This is used for GPU deadlock prevention when userspace uses mid-IB fences to wait for mid-IB work on other rings. v2: Fix typo in AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES v3: Bump KMS version v4: put old fence AFTER acquiring the scheduled fence. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: NChristian Koenig <Christian.Koenig@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 12月, 2018 1 次提交
-
-
由 Christian König 提交于
When the fence is already signaled it is perfectly normal to get a NULL fence here. But since we can't export that we need to use a stub fence. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 12月, 2018 1 次提交
-
-
由 Andrey Grodzovsky 提交于
If CS is submitted using guilty ctx, we terminate amdgpu_cs_parser_init before locking ctx->lock, latter in amdgpu_cs_parser_fini we still are trying to release the lock just becase parser->ctx != NULL. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 12月, 2018 4 次提交
-
-
由 Christian König 提交于
This allows us to drop the extra reserve in TTM. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
And drop the now superflous extra reservations. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
It is perfectly possible that the BO list is created before the BO is exported. While at it clean up setting shared to one instead of true. v2: add comment and simplify logic Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Acked-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Let's support simultaneous submissions to multiple engines. v2: rename the field to num_shared and fix up all users v3: rebased Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 12月, 2018 1 次提交
-
-
由 Christian König 提交于
This reverts commit 9a09a423. The whole interface isn't thought through. Since this function can't fail we actually can't allocate an object to store the sync point. Sorry, I should have taken the lead on this from the very beginning and reviewed it more thoughtfully. Going to propose a new interface as a follow up change. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Link: https://patchwork.freedesktop.org/patch/265580/
-
- 06 11月, 2018 2 次提交
-
-
由 Samuel Pitoiset 提交于
Similar to other error messages, might help for tracking down issues. Signed-off-by: NSamuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Sharat Masetty 提交于
This patch adds a new API to clean up the scheduler job resources. This is primarliy needed in cases the job was created but was not queued to the scheduler queue. Additionally with this change, the layer which creates the scheduler job also gets to free up the job's resources and this entails moving the dma_fence_put(finished_fence) to the drivers ops free handler routines. Signed-off-by: NSharat Masetty <smasetty@codeaurora.org> Reviewed-by: NChristian König <christian.koenig@amd.com> Acked-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 10月, 2018 1 次提交
-
-
由 Christian König 提交于
Let's support simultaneous submissions to multiple engines. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Link: https://patchwork.kernel.org/patch/10626149/
-
- 16 10月, 2018 1 次提交
-
-
由 Chunming Zhou 提交于
flags can be used by driver to decide whether need to block wait submission. Signed-off-by: NChunming Zhou <david1.zhou@amd.com> SIgned-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Link: https://patchwork.kernel.org/patch/10641339/
-
- 20 9月, 2018 1 次提交
-
-
由 Christian König 提交于
That only worked by pure coincident. Completely remove the shifting and always apply correct PAGE_SHIFT. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 9月, 2018 1 次提交
-
-
由 Christian König 提交于
We can get that just by casting tv.bo. v2: squash in kfd fix (Alex) Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 9月, 2018 1 次提交
-
-
由 Chunming Zhou 提交于
cs dependencies handling doesn't need in vm resv Signed-off-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 9月, 2018 2 次提交
-
-
由 Christian König 提交于
Slowly leaking memory one page at a time :) Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Slowly leaking memory one page at a time :) Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 11 9月, 2018 3 次提交
-
-
由 Christian König 提交于
Avoid unlocking a lock we never locked. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Correct sign extend the GMC addresses to 48bit. v2: sign extending turned out easier than thought. v3: clean up the defines and move them into amdgpu_gmc.h as well Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Avoid unlocking a lock we never locked. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 9月, 2018 2 次提交
-
-
由 Chunming Zhou 提交于
we can place a fence to a timeline point after expanded. v2: change func parameter order Signed-off-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/246543/
-
由 Chunming Zhou 提交于
we can fetch timeline point fence after expanded. v2: The parameter fence is the result of the function and should come last. Signed-off-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NChristian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/246541/
-
- 02 9月, 2018 1 次提交
-
-
由 Christian König 提交于
First step to fix the LRU corruption, we accidentially tried to move things on the LRU after dropping the lock. Signed-off-by: NChristian König <christian.koenig@amd.com> Tested-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 8月, 2018 10 次提交
-
-
由 Christian König 提交于
We can't hold the mn_lock while allocating memory. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
No more waiting for a fence done here. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We can't hold the mn_lock while allocating memory. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
I continue to work for bulk moving that based on the proposal by Christian. Background: amdgpu driver will move all PD/PT and PerVM BOs into idle list. Then move all of them on the end of LRU list one by one. Thus, that cause so many BOs moved to the end of the LRU, and impact performance seriously. Then Christian provided a workaround to not move PD/PT BOs on LRU with below patch: Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid validating VM PTs") However, the final solution should bulk move all PD/PT and PerVM BOs on the LRU instead of one by one. Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be validated we move all BOs together to the end of the LRU without dropping the lock for the LRU. While doing so we note the beginning and end of this block in the LRU list. Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do, we don't move every BO one by one, but instead cut the LRU list into pieces so that we bulk move everything to the end in just one operation. Test data: +--------------+-----------------+-----------+---------------------------------------+ | |The Talos |Clpeak(OCL)|BusSpeedReadback(OCL) | | |Principle(Vulkan)| | | +------------------------------------------------------------------------------------+ | | | |0.319 ms(1k) 0.314 ms(2K) 0.308 ms(4K) | | Original | 147.7 FPS | 76.86 us |0.307 ms(8K) 0.310 ms(16K) | +------------------------------------------------------------------------------------+ | Orignial + WA| | |0.254 ms(1K) 0.241 ms(2K) | |(don't move | 162.1 FPS | 42.15 us |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)| |PT BOs on LRU)| | | | +------------------------------------------------------------------------------------+ | Bulk move | 163.1 FPS | 40.52 us |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) | | | | |0.214 ms(8K) 0.225 ms(16K) | +--------------+-----------------+-----------+---------------------------------------+ After test them with above three benchmarks include vulkan and opencl. We can see the visible improvement than original, and even better than original with workaround. v2: move all BOs include idle, relocated, and moved list to the end of LRU and put them together. v3: remove unused parameter and use list_for_each_entry instead of the one with save entry. v4: move the amdgpu_vm_move_to_lru_tail after command submission, at that time, all bo will be back on idle list. v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instread of validated, and move ttm_bo_bulk_move_lru_tail() also into amdgpu_vm_move_to_lru_tail(). v6: clean up and fix return value. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NHuang Rui <ray.huang@amd.com> Tested-by: NMike Lothian <mike@fireburn.co.uk> Tested-by: NDieter Nützel <Dieter@nuetzel-hh.de> Acked-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Add a helper to get the root PD address and remove the workarounds from the GMC9 code for that. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Add a helper function for getting the root PD addr and cleanup join the two VM related functions and cleanup the function name. No functional change. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Further demangle ring from entity handling. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Not needed any more since that is now done by the scheduler. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
Demangle amdgpu.h. Signed-off-by: NHuang Rui <ray.huang@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
At this point the command submission can still be interrupted. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 8月, 2018 1 次提交
-
-
由 Christian König 提交于
At this point the command submission can still be interrupted. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 8月, 2018 2 次提交
-
-
由 Christian König 提交于
Instead of having extra handling just create an empty bo_list when no handle is provided. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Add helpers to iterate over all entries in a bo_list. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Acked-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-