- 11 9月, 2018 2 次提交
-
-
由 Christian König 提交于
Allows us to avoid taking the spinlock in more places. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
amdgpu_vm_bo_* functions should come much later. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 9月, 2018 1 次提交
-
-
由 Christian König 提交于
Add BOs to the idle state again and correctly clear the flag when new BOs are added. Signed-off-by: NChristian König <christian.koenig@amd.com> Tested-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 8月, 2018 5 次提交
-
-
由 Christian König 提交于
This reverts commit a7f91061. Felix pointed out that we need to have the BOs mapped even before amdgpu_vm_update_directories is called. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Philip Yang 提交于
kvmalloc_array uses __GFP_ZERO flag ensures that the returned address is zeroed already, memset it to zero again afterwards is unnecessary, and in this case buggy because we only clear the first entry. Signed-off-by: NPhilip Yang <Philip.Yang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Michel Dänzer 提交于
This reverts commit 31625ccae4464b61ec8cdb9740df848bbc857a5b. It triggered various badness on my development machine when running the piglit gpu profile with radeonsi on Bonaire, looks like memory corruption due to insufficiently protected list manipulations. Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
For compute vm acquired from amdgpu, vm.pasid is managed by kfd. Decouple pasid from such vm on process destroy to avoid duplicate pasid release. Signed-off-by: NOak Zeng <Oak.Zeng@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Oak Zeng 提交于
To make a amdgpu vm to a compute vm, the old pasid will be freed and replaced with a pasid managed by kfd. Kfd can't reuse original pasid allocated by amdgpu because kfd uses different pasid policy with amdgpu. For example, all graphic devices share one same pasid in a process. v2: rebase (Alex) Signed-off-by: NOak Zeng <Oak.Zeng@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 8月, 2018 1 次提交
-
-
由 Emily Deng 提交于
Fix the VMC page fault when the running sequence is as below: 1.amdgpu_gem_create_ioctl 2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called amdgpu_vm_bo_base_init, so won't called list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted, it won't set the bo_base->moved. 3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called list_move_tail(&base->vm_status, &vm->evicted), but not set the bo_base->moved. 4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is not set true, the function amdgpu_vm_bo_insert_map will call list_move(&bo_va->base.vm_status, &vm->moved) 5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the moved list, not in the evict list. So VMC page fault occurs. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 8月, 2018 11 次提交
-
-
由 Christian König 提交于
Should work on Vega10 as well, but with an obvious performance hit. Older APUs can be enabled as well, but will probably be more work. v2: fix error checking v3: use more general check Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Helper to get the PDE for a PD/PT. v2: improve documentation Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Add a helper function to figure them out only once. v2: fix typo with memset v3: rebase on kfd changes (Alex) Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Just another leftover from radeon. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Acked-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
Set the VM size based on system memory size between the ASIC-specific limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their default VM size of 256TB (48 bit). Only older GPUs will adjust VM size depending on system memory size. This makes more VM space available for ROCm applications on GFXv8 GPUs that want to map all available VRAM and system memory in their SVM address space. v2: * Clarify comment * Round up memory size before >> 30 * Round up automatic vm_size to power of two Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Acked-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
The new bulk moving functionality is ready, the overhead of moving PD/PT bos to LRU is fixed. So move them on LRU again. Signed-off-by: NHuang Rui <ray.huang@amd.com> Tested-by: NMike Lothian <mike@fireburn.co.uk> Tested-by: NDieter Nützel <Dieter@nuetzel-hh.de> Acked-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Huang Rui 提交于
I continue to work for bulk moving that based on the proposal by Christian. Background: amdgpu driver will move all PD/PT and PerVM BOs into idle list. Then move all of them on the end of LRU list one by one. Thus, that cause so many BOs moved to the end of the LRU, and impact performance seriously. Then Christian provided a workaround to not move PD/PT BOs on LRU with below patch: Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid validating VM PTs") However, the final solution should bulk move all PD/PT and PerVM BOs on the LRU instead of one by one. Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be validated we move all BOs together to the end of the LRU without dropping the lock for the LRU. While doing so we note the beginning and end of this block in the LRU list. Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do, we don't move every BO one by one, but instead cut the LRU list into pieces so that we bulk move everything to the end in just one operation. Test data: +--------------+-----------------+-----------+---------------------------------------+ | |The Talos |Clpeak(OCL)|BusSpeedReadback(OCL) | | |Principle(Vulkan)| | | +------------------------------------------------------------------------------------+ | | | |0.319 ms(1k) 0.314 ms(2K) 0.308 ms(4K) | | Original | 147.7 FPS | 76.86 us |0.307 ms(8K) 0.310 ms(16K) | +------------------------------------------------------------------------------------+ | Orignial + WA| | |0.254 ms(1K) 0.241 ms(2K) | |(don't move | 162.1 FPS | 42.15 us |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)| |PT BOs on LRU)| | | | +------------------------------------------------------------------------------------+ | Bulk move | 163.1 FPS | 40.52 us |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) | | | | |0.214 ms(8K) 0.225 ms(16K) | +--------------+-----------------+-----------+---------------------------------------+ After test them with above three benchmarks include vulkan and opencl. We can see the visible improvement than original, and even better than original with workaround. v2: move all BOs include idle, relocated, and moved list to the end of LRU and put them together. v3: remove unused parameter and use list_for_each_entry instead of the one with save entry. v4: move the amdgpu_vm_move_to_lru_tail after command submission, at that time, all bo will be back on idle list. v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instread of validated, and move ttm_bo_bulk_move_lru_tail() also into amdgpu_vm_move_to_lru_tail(). v6: clean up and fix return value. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NHuang Rui <ray.huang@amd.com> Tested-by: NMike Lothian <mike@fireburn.co.uk> Tested-by: NDieter Nützel <Dieter@nuetzel-hh.de> Acked-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
When move a BO to the end of LRU, it need remember the BO positions. Make sure all moved bo in between "first" and "last". And they will be bulk moving together. Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NHuang Rui <ray.huang@amd.com> Tested-by: NMike Lothian <mike@fireburn.co.uk> Tested-by: NDieter Nützel <Dieter@nuetzel-hh.de> Acked-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Preparation for following changes. This validates the root PD twice, but the overhead of that should be minimal. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead of the fixed round robin use let the scheduler balance the load of page table updates. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
We need to figure out the address after validating the BO, not before. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 8月, 2018 1 次提交
-
-
由 Christian König 提交于
This allows us to trace all VM ranges which should be valid inside a CS. v2: dump mappings without BO as well Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> (v1) Reviewed-by: Huang Rui <ray.huang@amd.com> (v1) Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 7月, 2018 2 次提交
-
-
由 Nayan Deshmukh 提交于
The scheduler of the entity is decided by the run queue on which it is queued. This patch avoids us the effort required to maintain a sync between rq and sched field when we start shifting entites among different rqs. Signed-off-by: NNayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NEric Anholt <eric@anholt.net> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nayan Deshmukh 提交于
entity has a scheduler field and we don't need the sched argument in any of the functions where entity is provided. Signed-off-by: NNayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NEric Anholt <eric@anholt.net> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 19 7月, 2018 1 次提交
-
-
由 Huang Rui 提交于
BO ptr already be initialized at definition, we needn't use the complicated reference. v2: fix typo at subject line Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 7月, 2018 1 次提交
-
-
由 Christian König 提交于
We know the ring through the entity anyway. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Acked-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 7月, 2018 1 次提交
-
-
由 Nayan Deshmukh 提交于
replace run queue by a list of run queues and remove the sched arg as that is part of run queue itself Signed-off-by: NNayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Acked-by: NEric Anholt <eric@anholt.net> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 12 7月, 2018 1 次提交
-
-
由 Shaoyun Liu 提交于
Compute contexts cannot keep going after a GPU reset. Currently the process must terminate. In the future a process may be able recreate its context from scratch. Either way, there is no need to restore the GPUVM page table from shadow BOs. Signed-off-by: NShaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NOded Gabbay <oded.gabbay@gmail.com>
-
- 11 7月, 2018 3 次提交
-
-
由 Andrey Grodzovsky 提交于
Problem: When PD/PT update made by CPU root PD was not yet mapped causing page fault. Fix: Verify root PD is mapped into CPU address space. v2: Make sure that we add the root PD to the relocated list since then it's get mapped into CPU address space bt default in amdgpu_vm_update_directories. v3: Drop change to not move kernel type BOs to evicted list. v4: Remove redundant bo move to relocated list. Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
由 Andrey Grodzovsky 提交于
Problem: When PD/PT update made by CPU root PD was not yet mapped causing page fault. Fix: Verify root PD is mapped into CPU address space. v2: Make sure that we add the root PD to the relocated list since then it's get mapped into CPU address space bt default in amdgpu_vm_update_directories. v3: Drop change to not move kernel type BOs to evicted list. v4: Remove redundant bo move to relocated list. Link: https://bugs.freedesktop.org/show_bug.cgi?id=107065Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Add process and thread names and pids and a function to extract this info from relevant amdgpu_vm. v2: Add documentation and fix identation. v3: Add getter and setter functions for amdgpu_task_info. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: NJim Qu <Jim.Qu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 06 7月, 2018 3 次提交
-
-
由 Michel Dänzer 提交于
To hopefully make the code dealing with GPU vs CPU pages a little clearer. Suggested-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Everything in the flush code path (i.e. waiting for SW queue to become empty) names with *_flush() and everything in the release code path names *_fini() This patch also effect the amdgpu and etnaviv drivers which use those functions. v2: Also pplay the change to vd3. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: NChristian König <christian.koenig@amd.com> Acked-by: NLucas Stach <l.stach@pengutronix.de> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Always validating the VM PTs takes to much time. Only always validate the per VM BOs for now. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NJunwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 23 6月, 2018 1 次提交
-
-
由 Michel Dänzer 提交于
start / last / max_entries are numbers of GPU pages, pfn / count are numbers of CPU pages. Convert between them accordingly. Fixes badness on systems with > 4K page size. Cc: stable@vger.kernel.org Bugzilla: https://bugs.freedesktop.org/106258Reported-by: NMatt Corallo <freedesktop@bluematt.me> Tested-by: foxbat@ruin.net Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 6月, 2018 4 次提交
-
-
由 Andrey Grodzovsky 提交于
Add documentation for missed parameters. Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Move all instnaces of this check into a function in amdgpu_gmc.h Rename the original function to a more proper name. v2: Add more places to cleanup. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Andrey Grodzovsky 提交于
Add/update function level documentation and add reference to amdgpu_vm.c in amdgpu.rst v2: Fix reference in rst file. Fix compilation warnings. Add space between function names and params list where it's missing. v3: Fix some funtion comments. Add formatted documentation to structs. Signed-off-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Emily Deng 提交于
For buffer object that has shadow buffer, need twice commands. Signed-off-by: NEmily Deng <Emily.Deng@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 6月, 2018 1 次提交
-
-
由 Junwei Zhang 提交于
v2: assign bo_va as well We need to put the lose ends on the invalid list because it is possible that we need to split up huge pages for them. Cc: stable@vger.kernel.org Signed-off-by: NChristian König <christian.koenig@amd.com> Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com> (v2) Reviewed-by: NDavid Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 24 5月, 2018 1 次提交
-
-
由 Christian König 提交于
Move all BOs belonging to a VM on the LRU with every submission. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NChunming Zhou <david1.zhou@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-