- 13 9月, 2017 6 次提交
-
-
由 Christian König 提交于
Instead of moving them in the MMU notifier move them during CS. v2: still mark pages as accessed/dirty Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1) Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead use a counter to figure out if we need to set new pages or not. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This didn't helped as intended, just simplify the code. v2: unlock mmap_sem in the error path as well Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This reverts commit 10e709cb. The patch doesn't work at all: 1. The CS can still be blocked because of amdgpu_ctx_add_fence(). 2. The order of submission isn't correct any more. 3. We could end up using freed up memory because we now drop the ctx reference to early. This needs to be fixed cleanly by doing the context handling after the BO handling, but this is a larger task just avoid the obvious crashes for now. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: Monk Liu monk.liu@amd.com Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Move calling put_page into the unpopulate callback. Otherwise we mess up the pages reference count when it is unbound multiple times. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
first is incorrect if hit NULL/signaled fence Signed-off-by: NMonk Liu <monk.liu@amd.com> Reviewed-by: NChunming Zhou <David1.Zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 9月, 2017 1 次提交
-
-
由 Christian König 提交于
Per VM BOs are handled like VM PDs and PTs. They are always valid and don't need to be specified in the BO lists. v2: validate PDs/PTs first Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 8月, 2017 2 次提交
-
-
由 Christian König 提交于
Instead of validating all page tables when one was evicted, track which one needs a validation. v2: simplify amdgpu_vm_ready as well Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christophe JAILLET 提交于
Check memory allocation failure and return -ENOMEM in such a case. 'num_post_dep_syncobjs' still has to be set to 0 before the test in order to have it initialized if 'amdgpu_cs_parser_fini()' is called to free resources. The calling graph would be, in such a case! failure in amdgpu_cs_process_syncobj_out_dep() ---> error code returned by amdgpu_cs_dependencies() --> amdgpu_cs_parser_fini() is called Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 8月, 2017 1 次提交
-
-
由 Jason Ekstrand 提交于
The function has far more in common with drm_syncobj_find than with any in the get/put functions. Signed-off-by: NJason Ekstrand <jason@jlekstrand.net> Acked-by: Christian König <christian.koenig@amd.com> (v1) Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 18 8月, 2017 5 次提交
-
-
由 Christian König 提交于
That better describes what happens here with the BO. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Split that into vm_bo_base and bo_va to allow other uses as well. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Move the CSA bo_va from the VM to the fpriv structure. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Looks like a better place for this. v2: use atomic64_t members instead Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This should save us a bunch of command submission overhead. v2: move the LRU move to the right place to avoid the move for the root BO and handle the shadow BOs as well. This turned out to be a bug fix because the move needs to happen before the kmap. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Acked-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 8月, 2017 3 次提交
-
-
由 Kent Russell 提交于
Change "prefered" to "preferred" Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Cihangir Akturk 提交于
drm_*_reference() and drm_*_unreference() functions are just compatibility alias for drm_*_get() and drm_*_put() and should not be used by new code. So convert all users of compatibility functions to use the new APIs. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NCihangir Akturk <cakturk@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead of open coding the conversion from u64 to pointers. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 7月, 2017 1 次提交
-
-
由 John Brooks 提交于
The BO move throttling code is designed to allow VRAM to fill quickly if it is relatively empty. However, this does not take into account situations where the visible VRAM is smaller than total VRAM, and total VRAM may not be close to full but the visible VRAM segment is under pressure. In such situations, visible VRAM would experience unrestricted swapping and performance would drop. Add a separate counter specifically for moves involving visible VRAM, and check it before moving BOs there. v2: Only perform calculations for separate counter if visible VRAM is smaller than total VRAM. (Michel Dänzer) v3: [Michel Dänzer] * Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag to determine whether to account a move for visible VRAM in most cases. * Use a single if (adev->mc.visible_vram_size < adev->mc.real_vram_size) { block in amdgpu_cs_get_threshold_for_moves. Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2)) Signed-off-by: NJohn Brooks <john@fastquake.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 7月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
the drm_file parameter is unused, so remove it. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: NJason Ekstrand <jason@jlekstrand.net> Signed-off-by: NDave Airlie <airlied@redhat.com>
-
- 30 6月, 2017 2 次提交
-
-
由 Alex Xie 提交于
The function is called only once inside the .c file. v2: update the commit message (Michel) Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Xie 提交于
Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 6月, 2017 2 次提交
-
-
由 Dave Airlie 提交于
This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls. The command submission interface is enhanced with two new chunks, one for syncobj pre submission dependencies, and one for post submission sync obj signalling, and just takes a list of handles for each. This is based on work originally done by David Zhou at AMD, with input from Christian Konig on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well. NOTE: this interface addition needs a version bump to expose it to userspace. TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian) v1.1: keep file reference on import. v2: move to using syncobjs v2.1: change some APIs to just use p pointer. v3: make more robust against CS failures, we now add the wait sems but only remove them once the CS job has been submitted. v4: rewrite names of API and base on new syncobj code. v5: move post deps earlier, rename some apis v6: lookup post deps earlier, and just replace fences in post deps stage (Christian) Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Dave Airlie 提交于
This just splits out the fence depenency checking into it's own function to make it easier to add semaphore dependencies. v2: rebase onto other changes. v1-Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 6月, 2017 1 次提交
-
-
由 Alex Xie 提交于
Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 6月, 2017 1 次提交
-
-
由 Andres Rodriguez 提交于
Add amdgpu_queue_mgr, a mechanism that allows disjointing usermode's ring ids from the kernel's ring ids. The queue manager maintains a per-file descriptor map of user ring ids to amdgpu_ring pointers. Once a map is created it is permanent (this is required to maintain FIFO execution guarantees for a context's ring). Different queue map policies can be configured for each HW IP. Currently all HW IPs use the identity mapper, i.e. kernel ring id is equal to the user ring id. The purpose of this mechanism is to distribute the load across multiple queues more effectively for HW IPs that support multiple rings. Userspace clients are unable to check whether a specific resource is in use by a different client. Therefore, it is up to the kernel driver to make the optimal choice. v2: remove amdgpu_queue_mapper_funcs v3: made amdgpu_queue_mgr per context instead of per-fd v4: add context_put on error paths v5: rebase and include new IPs UVD_ENC & VCN_* v6: drop unused amdgpu_ring_is_valid_index (Alex) Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAndres Rodriguez <andresx7@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 5月, 2017 3 次提交
-
-
由 Chunming Zhou 提交于
below ioctl will return -ENODEV: amdgpu_cs_ioctl amdgpu_cs_wait_ioctl amdgpu_cs_wait_fences_ioctl amdgpu_gem_va_ioctl amdgpu_info_ioctl v2: only for map and replace cases in amdgpu_gem_va_ioctl Signed-off-by: NChunming Zhou <David1.Zhou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leo Liu 提交于
Signed-off-by: NLeo Liu <leo.liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Leo Liu 提交于
Signed-off-by: NLeo Liu <leo.liu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 5月, 2017 1 次提交
-
-
由 Michal Hocko 提交于
Now that drm_[cm]alloc* helpers are simple one line wrappers around kvmalloc_array and drm_free_large is just kvfree alias we can drop them and replace by their native forms. This shouldn't introduce any functional change. Changes since v1 - fix typo in drivers/gpu//drm/etnaviv/etnaviv_gem.c - noticed by 0day build robot Suggested-by: NDaniel Vetter <daniel@ffwll.ch> Signed-off-by: Michal Hocko <mhocko@suse.com>drm: drop drm_[cm]alloc* helpers [danvet: Fixup vgem which grew another user very recently.] Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Acked-by: NChristian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517122312.GK18247@dhcp22.suse.cz
-
- 29 4月, 2017 1 次提交
-
-
由 Chunming Zhou 提交于
the case could happen when gpu reset: 1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation. 2. gpu_reset routine will also need pd reservation to restore page table from their shadow. 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation. v2: handle amdgpu_cs_submit error path. Signed-off-by: NChunming Zhou <David1.Zhou@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 4月, 2017 1 次提交
-
-
由 Chunming Zhou 提交于
V2: remove **array method, directly fence_put after fence wait. Signed-off-by: NChunming Zhou <David1.Zhou@amd.com> Reviewed-by: NChristian König <chrstian.koenig@amd.com> Reviewed-by: NKen Wang <Qingqing.Wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 07 4月, 2017 1 次提交
-
-
由 Alex Xie 提交于
Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 4月, 2017 1 次提交
-
-
由 Christian König 提交于
This only makes a difference for 32-bit systems. The idea is to have a fixed virtual address space size with 4-level page tables and to minimize differences between 32 and 64-bit systems. v2: Update commit message. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 3月, 2017 6 次提交
-
-
由 Harry Wentland 提交于
Signed-off-by: NHarry Wentland <harry.wentland@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
1) Adapt to vulkan: Now use double SWITCH BUFFER to replace the 128 nops w/a, because when vulkan introduced, umd can insert 7 ~ 16 IBs per submit which makes 256 DW size cannot hold the whole DMAframe (if we still insert those 128 nops), CP team suggests use double SWITCH_BUFFERs, instead of tricky 128 NOPs w/a. 2) To fix the CE VM fault issue when MCBP introduced: Need one more COND_EXEC wrapping IB part (original one us for VM switch part). this change can fix vm fault issue caused by below scenario without this change: >CE passed original COND_EXEC (no MCBP issued this moment), proceed as normal. >DE catch up to this COND_EXEC, but this time MCBP issued, thus DE treats all following packages as NOP. The following VM switch packages now looks just as NOP to DE, so DE dosen't do VM flush at all. >Now CE proceeds to the first IBc, and triggers VM fault, because DE didn't do VM flush for this DMAframe. 3) change estimated alloc size for gfx9. with new DMAframe scheme, we need modify emit_frame_size for gfx9 4) No need to insert 128 nops after gfx8 vm flush anymore because there was double SWITCH_BUFFER append to vm flush, and for gfx7 we already use double SWITCH_BUFFER following after vm_flush so no change needed for it. 5) Change emit_frame_size for gfx8 v2: squash in BUG removal from Monk Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
1,the check is only appliable for SRIOV GFX engine. 2,use chunk_ib instead of ib. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NKen Wang <Qingqing.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
to prevent submit two or more IBs with PREEMPT flags. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
should use chunk_ib instead of ib, otherwise the logic is incorrect. Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NKen Wang <Qingqing.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Update all levels of the page directory. V2: a. sub level pdes always are written to incorrect place. b. sub levels need to update regardless of parent updates. Signed-off-by: Christian König <christian.koenig@amd.com> (V1) Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (V1) Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> (V2) Acked-by: Alex Deucher <alexander.deucher@amd.com> (V2) Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-