- 13 9月, 2017 5 次提交
-
-
由 Christian König 提交于
When we need to find the mapping we need sysvm access anyway. Signed-off-by: NChristian König <christian.koenig@amd.com> Acked-by: NLeo Liu <leo.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead take the callback lock during the final parts of CS. This should solve the last remaining locking order problems with BO reservations. v2: rebase, make dummy functions static inline v3: add one more missing inline and comments Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead of moving them in the MMU notifier move them during CS. v2: still mark pages as accessed/dirty Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1) Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Instead use a counter to figure out if we need to set new pages or not. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Move calling put_page into the unpopulate callback. Otherwise we mess up the pages reference count when it is unbound multiple times. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 01 9月, 2017 1 次提交
-
-
由 Christian König 提交于
Add the IOCTL interface so that applications can allocate per VM BOs. Still WIP since not all corner cases are tested yet, but this reduces average CS overhead for 10K BOs from 21ms down to 48us. v2: add some extra checks, remove the WIP tag v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 8月, 2017 1 次提交
-
-
由 Alex Deucher 提交于
We need a larger gart for asics that do not support GPUVM on all engines (e.g., MM) to make sure we have enough space for all gtt buffers in physical mode. Change the default size based on the asic type. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 8月, 2017 4 次提交
-
-
由 Roger He 提交于
Allow overrides on the command line. v2: agd: sqaush in spelling fix and bogus default value warning Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NRoger He <Hongbo.He@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Move the CSA bo_va from the VM to the fpriv structure. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Looks like a better place for this. v2: use atomic64_t members instead Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
It doesn't make much sense to count those numbers twice. v2: use and atomic64_t instead Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 16 8月, 2017 8 次提交
-
-
由 Kent Russell 提交于
Change "stollen" to "stolen" Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
We already allocate this as part of the ring structure, use that instead. Cc: Frank Min <Frank.Min@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Use a lower case b to be consistent with the other wb functions. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Move amdgpu_bo and related structures into amdgpu_object.h. Move amdgpu_bo_list structures to the amdgpu_bo_list functions. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Don't keep around the same pointer twice. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Frank Min 提交于
While doing flr on VFs, there is possibility to lost the doorbell writing for sdma, so enable poll mem for sdma, then sdma fw would check the pollmem holding wptr. Signed-off-by: NFrank Min <Frank.Min@amd.com> Signed-off-by: NXiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Frank Min 提交于
Now uvd doorbell is from 0xf8-0xfb and vce doorbell is from 0xfc-0xff Signed-off-by: NFrank Min <Frank.Min@amd.com> Signed-off-by: NXiangliang.Yu <Xiangliang.Yu@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 26 7月, 2017 1 次提交
-
-
由 Monk Liu 提交于
1, for sriov, we need 8dw for the gfx fence due to CP behaviour 2, cleanup wrong logic in wptr/rptr wb alloc and free Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Signed-off-by: NXiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 14 7月, 2017 15 次提交
-
-
由 Felix Kuehling 提交于
Set a configurable SDMA phase quantum when enabling SDMA context switching. The default value significantly reduces SDMA latency in page table updates when user-mode SDMA queues have concurrent activity, compared to the initial HW setting. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NAndres Rodriguez <andres.rodriguez@amd.com> Reviewed-by: NShaoyun Liu <shaoyun.liu@amd.com> Acked-by: NChunming Zhou <david1.zhou@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Brooks 提交于
The BO move throttling code is designed to allow VRAM to fill quickly if it is relatively empty. However, this does not take into account situations where the visible VRAM is smaller than total VRAM, and total VRAM may not be close to full but the visible VRAM segment is under pressure. In such situations, visible VRAM would experience unrestricted swapping and performance would drop. Add a separate counter specifically for moves involving visible VRAM, and check it before moving BOs there. v2: Only perform calculations for separate counter if visible VRAM is smaller than total VRAM. (Michel Dänzer) v3: [Michel Dänzer] * Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag to determine whether to account a move for visible VRAM in most cases. * Use a single if (adev->mc.visible_vram_size < adev->mc.real_vram_size) { block in amdgpu_cs_get_threshold_for_moves. Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2)) Signed-off-by: NJohn Brooks <john@fastquake.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Brooks 提交于
Allow specifying a limit on visible VRAM via a module parameter. This is helpful for testing performance under visible VRAM pressure. v2: Add cast to 64-bit (Christian König) Signed-off-by: NJohn Brooks <john@fastquake.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NMichel Dänzer <michel.daenzer@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Limit the default GART size and save a lot of VRAM. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This allows setting the gtt size independent of the gart size. v2: fix copy and paste typo Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Rename symbols from gtt_ to gart_ as appropriate. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Not used any more. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
No functional change, just cleanup. v2: rebased, keep gart name. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
- v2: added missing spinlock init Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
This allows us to write the mapped PTEs into an IB instead of the table directly. v2: fix build with debugfs enabled, remove unused assignment Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Ken Wang 提交于
Certain MC registers need a delay after writing them to properly update in the init sequence. Signed-off-by: NKen Wang <Ken.Wang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Keep them where they belong. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com>
-
由 Christian König 提交于
Stop spreading the code over all GMC generations. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
These are no longer needed now that we use the fb_location programmed by the vbios. Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Not used. Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 30 6月, 2017 2 次提交
-
-
由 Alex Xie 提交于
The function is called only once inside the .c file. v2: update the commit message (Michel) Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Flora Cui 提交于
Newer asics with 4 SEs are not able to fit the entire bitmask in the original field, use an array instead. v2: keep cu_ao_mask for backward compatibility. Signed-off-by: NFlora Cui <Flora.Cui@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 20 6月, 2017 1 次提交
-
-
由 Alex Xie 提交于
In original function amdgpu_bo_list_get, the waiting for result->lock can be quite long while mutex bo_list_lock was holding. It can make other tasks waiting for bo_list_lock for long period. Secondly, this patch allows several tasks(readers of idr) to proceed at the same time. v2: use rcu and kref (Dave Airlie and Christian König) v3: update v1 commit message (Michel Dänzer) v4: rebase on upstream (Alex Deucher) Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com>
-
- 17 6月, 2017 1 次提交
-
-
由 Dave Airlie 提交于
This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls. The command submission interface is enhanced with two new chunks, one for syncobj pre submission dependencies, and one for post submission sync obj signalling, and just takes a list of handles for each. This is based on work originally done by David Zhou at AMD, with input from Christian Konig on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well. NOTE: this interface addition needs a version bump to expose it to userspace. TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian) v1.1: keep file reference on import. v2: move to using syncobjs v2.1: change some APIs to just use p pointer. v3: make more robust against CS failures, we now add the wait sems but only remove them once the CS job has been submitted. v4: rewrite names of API and base on new syncobj code. v5: move post deps earlier, rename some apis v6: lookup post deps earlier, and just replace fences in post deps stage (Christian) Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NDave Airlie <airlied@redhat.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 15 6月, 2017 1 次提交
-
-
由 Huang Rui 提交于
gpu_info firmware is released after data is used. But when system enters into suspend, upper class driver will cache all firmware names. At that time, gpu_info will be failing to load. It seems an upper class issue, that we should not release gpu_info firmware until device finished. [ 903.236589] cache_firmware: amdgpu/vega10_sdma1.bin [ 903.236590] fw_set_page_data: fw-amdgpu/vega10_sdma1.bin buf=ffff88041eee10c0 data=ffffc90002561000 size=17408 [ 903.236591] cache_firmware: amdgpu/vega10_sdma1.bin ret=0 [ 903.464160] __allocate_fw_buf: fw-amdgpu/vega10_gpu_info.bin buf=ffff88041eee2c00 [ 903.471815] (NULL device *): loading /lib/firmware/updates/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2 [ 903.482870] (NULL device *): loading /lib/firmware/updates/amdgpu/vega10_gpu_info.bin failed with error -2 [ 903.492716] (NULL device *): loading /lib/firmware/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2 [ 903.503156] (NULL device *): direct-loading amdgpu/vega10_gpu_info.bin Signed-off-by: NHuang Rui <ray.huang@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-