1. 14 Jul, 2017 1 commit
    • drm/amdgpu: Throttle visible VRAM moves separately · 00f06b24
      Committed by John Brooks
      The BO move throttling code is designed to allow VRAM to fill quickly if it
      is relatively empty. However, this does not take into account situations
      where the visible VRAM is smaller than total VRAM, and total VRAM may not
      be close to full but the visible VRAM segment is under pressure. In such
      situations, visible VRAM would experience unrestricted swapping and
      performance would drop.
      
      Add a separate counter specifically for moves involving visible VRAM, and
      check it before moving BOs there.
      
      v2: Only perform calculations for separate counter if visible VRAM is
          smaller than total VRAM. (Michel Dänzer)
      v3: [Michel Dänzer]
      * Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
        flag to determine whether to account a move for visible VRAM in most
        cases.
      * Use a single
      
      	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
      
        block in amdgpu_cs_get_threshold_for_moves.
      
      Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
      Signed-off-by: John Brooks <john@fastquake.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  2. 06 Jul, 2017 1 commit
  3. 30 Jun, 2017 2 commits
  4. 17 Jun, 2017 2 commits
    • amdgpu: use drm sync objects for shared semaphores (v6) · 660e8558
      Committed by Dave Airlie
      This creates a new command submission chunk for amdgpu
      to add in and out sync objects around the submission.
      
      Sync objects are managed via the drm syncobj ioctls.
      
      The command submission interface is enhanced with two new
      chunks, one for syncobj pre submission dependencies,
      and one for post submission sync obj signalling,
      and just takes a list of handles for each.
      
      This is based on work originally done by David Zhou at AMD,
      with input from Christian König on what things should look like.
      
      In theory VkFences could be backed with sync objects and
      just get passed into the cs as syncobj handles as well.
      
      NOTE: this interface addition needs a version bump to expose
      it to userspace.
      
      TODO: update to dep_sync when rebasing onto amdgpu master.
      (with this - r-b from Christian)
      
      v1.1: keep file reference on import.
      v2: move to using syncobjs
      v2.1: change some APIs to just use p pointer.
      v3: make more robust against CS failures, we now add the
      wait sems but only remove them once the CS job has been
      submitted.
      v4: rewrite names of API and base on new syncobj code.
      v5: move post deps earlier, rename some apis
      v6: lookup post deps earlier, and just replace fences
      in post deps stage (Christian)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
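The ordering that the v3 and v6 notes of the commit above describe (collect wait dependencies, look up the signal objects early, and only replace their fences once the submission goes ahead) can be modelled with a toy sketch. Everything here is an assumption made for illustration: `struct syncobj_table`, `lookup` and `submit` are hypothetical, a fence is reduced to a plain integer id, and none of this is the kernel or DRM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SYNCOBJS 8

/* Toy model: each sync object slot holds its current fence id (0 = none). */
struct syncobj_table {
    uint64_t fence[MAX_SYNCOBJS];
    int valid[MAX_SYNCOBJS];
};

/* Returns the slot for a handle, or -1 if the handle is not valid. */
static int lookup(const struct syncobj_table *t, uint32_t handle)
{
    if (handle >= MAX_SYNCOBJS || !t->valid[handle])
        return -1;
    return (int)handle;
}

/*
 * Submit a job with a list of "in" handles (wait before the job) and
 * "out" handles (signalled by the job). deps must have room for n_in
 * entries. Returns 0 on success, -1 on failure; on failure the table
 * is left untouched.
 */
static int submit(struct syncobj_table *t,
                  const uint32_t *in, size_t n_in,
                  const uint32_t *out, size_t n_out,
                  uint64_t job_fence,
                  uint64_t *deps, size_t *n_deps)
{
    size_t count = 0;

    assert(t && deps && n_deps);

    /* Pre-submission dependencies: gather fences the job must wait on. */
    for (size_t i = 0; i < n_in; i++) {
        int slot = lookup(t, in[i]);
        if (slot < 0)
            return -1;
        if (t->fence[slot] != 0)
            deps[count++] = t->fence[slot];
    }

    /* Look up post-submission objects early so a bad handle fails here. */
    for (size_t i = 0; i < n_out; i++)
        if (lookup(t, out[i]) < 0)
            return -1;

    /* Point of no return: replace the fences in the out objects. */
    for (size_t i = 0; i < n_out; i++)
        t->fence[out[i]] = job_fence;

    *n_deps = count;
    return 0;
}
```

Validating every handle before touching any state is what keeps a failed submission from leaving half-updated sync objects behind.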
    • amdgpu/cs: split out fence dependency checking (v2) · 6f0308eb
      Committed by Dave Airlie
      This just splits out the fence dependency checking into its
      own function to make it easier to add semaphore dependencies.
      
      v2: rebase onto other changes.
      
      v1-Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  5. 09 Jun, 2017 1 commit
  6. 01 Jun, 2017 1 commit
    • drm/amdgpu: untie user ring ids from kernel ring ids v6 · effd924d
      Committed by Andres Rodriguez
      Add amdgpu_queue_mgr, a mechanism that decouples usermode's ring ids
      from the kernel's ring ids.
      
      The queue manager maintains a per-file descriptor map of user ring ids
      to amdgpu_ring pointers. Once a map is created it is permanent (this is
      required to maintain FIFO execution guarantees for a context's ring).
      
      Different queue map policies can be configured for each HW IP.
      Currently all HW IPs use the identity mapper, i.e. kernel ring id is
      equal to the user ring id.
      
      The purpose of this mechanism is to distribute the load across multiple
      queues more effectively for HW IPs that support multiple rings.
      Userspace clients are unable to check whether a specific resource is in
      use by a different client. Therefore, it is up to the kernel driver to
      make the optimal choice.
      
      v2: remove amdgpu_queue_mapper_funcs
      v3: made amdgpu_queue_mgr per context instead of per-fd
      v4: add context_put on error paths
      v5: rebase and include new IPs UVD_ENC & VCN_*
      v6: drop unused amdgpu_ring_is_valid_index (Alex)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
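The mapping behaviour described in the commit above (a map from user ring ids to rings that is filled on first use, never changes afterwards, and currently follows an identity policy) can be sketched as follows. This is an illustrative model only: `struct ring`, `struct queue_mapper`, `identity_pick` and `queue_map` are hypothetical names, not the driver's types or functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RINGS 4

/* Stand-in for a hardware ring; only its kernel-side id is modelled. */
struct ring {
    int kernel_id;
};

/* Hypothetical mapper: user ring id -> ring, NULL until first use. */
struct queue_mapper {
    struct ring *map[MAX_RINGS];
};

/* Identity policy: user ring id N selects kernel ring N, if it exists. */
static struct ring *identity_pick(struct ring *rings, size_t n_rings,
                                  unsigned user_id)
{
    return user_id < n_rings ? &rings[user_id] : NULL;
}

/*
 * Resolve a user ring id. The first successful lookup fixes the mapping;
 * later lookups return the same ring, which preserves FIFO ordering for
 * work submitted to that user ring.
 */
static struct ring *queue_map(struct queue_mapper *m, struct ring *rings,
                              size_t n_rings, unsigned user_id)
{
    assert(m && rings);
    if (user_id >= MAX_RINGS)
        return NULL;
    if (!m->map[user_id])
        m->map[user_id] = identity_pick(rings, n_rings, user_id);
    return m->map[user_id];
}
```

A different policy would only replace `identity_pick`; because the result is cached in the map, a user ring id keeps resolving to the same ring whichever policy chose it.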
  7. 25 May, 2017 3 commits
  8. 18 May, 2017 1 commit
  9. 29 Apr, 2017 1 commit
  10. 08 Apr, 2017 1 commit
  11. 07 Apr, 2017 1 commit
  12. 05 Apr, 2017 1 commit
  13. 30 Mar, 2017 11 commits
  14. 11 Mar, 2017 1 commit
  15. 10 Feb, 2017 1 commit
  16. 28 Jan, 2017 2 commits
  17. 24 Jan, 2017 1 commit
  18. 07 Dec, 2016 1 commit
  19. 11 Nov, 2016 2 commits
  20. 09 Nov, 2016 1 commit
  21. 26 Oct, 2016 4 commits