1. 13 Sep 2017 (10 commits)
  2. 01 Sep 2017 (1 commit)
  3. 30 Aug 2017 (2 commits)
  4. 29 Aug 2017 (1 commit)
  5. 18 Aug 2017 (5 commits)
  6. 16 Aug 2017 (3 commits)
  7. 14 Jul 2017 (1 commit)
    • drm/amdgpu: Throttle visible VRAM moves separately · 00f06b24
      Committed by John Brooks
      The BO move throttling code is designed to allow VRAM to fill quickly if it
      is relatively empty. However, this does not take into account situations
      where the visible VRAM is smaller than total VRAM, and total VRAM may not
      be close to full but the visible VRAM segment is under pressure. In such
      situations, visible VRAM would experience unrestricted swapping and
      performance would drop.
      
      Add a separate counter specifically for moves involving visible VRAM, and
      check it before moving BOs there.
      
      v2: Only perform calculations for separate counter if visible VRAM is
          smaller than total VRAM. (Michel Dänzer)
      v3: [Michel Dänzer]
      * Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
        flag to determine whether to account a move for visible VRAM in most
        cases.
      * Use a single
      
      	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
      
        block in amdgpu_cs_get_threshold_for_moves.
      
      Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
      Signed-off-by: John Brooks <john@fastquake.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  8. 06 7月, 2017 1 次提交
  9. 30 6月, 2017 2 次提交
  10. 17 6月, 2017 2 次提交
    • amdgpu: use drm sync objects for shared semaphores (v6) · 660e8558
      Committed by Dave Airlie
      This creates new command submission chunks for amdgpu to attach
      in and out sync objects to the submission.
      
      Sync objects are managed via the drm syncobj ioctls.
      
      The command submission interface is enhanced with two new
      chunks, one for syncobj pre submission dependencies,
      and one for post submission sync obj signalling,
      and just takes a list of handles for each.
      
      This is based on work originally done by David Zhou at AMD,
      with input from Christian König on what things should look like.
      
      In theory VkFences could be backed with sync objects and
      just get passed into the cs as syncobj handles as well.
      
      NOTE: this interface addition needs a version bump to expose
      it to userspace.
      
      TODO: update to dep_sync when rebasing onto amdgpu master.
      (with this - r-b from Christian)
      
      v1.1: keep file reference on import.
      v2: move to using syncobjs
      v2.1: change some APIs to just use p pointer.
      v3: make more robust against CS failures, we now add the
      wait sems but only remove them once the CS job has been
      submitted.
      v4: rewrite names of API and base on new syncobj code.
      v5: move post deps earlier, rename some apis
      v6: lookup post deps earlier, and just replace fences
      in post deps stage (Christian)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • amdgpu/cs: split out fence dependency checking (v2) · 6f0308eb
      Committed by Dave Airlie
      This just splits out the fence dependency checking into its own
      function to make it easier to add semaphore dependencies.
      
      v2: rebase onto other changes.
      
      v1-Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  11. 09 Jun 2017 (1 commit)
  12. 01 Jun 2017 (1 commit)
    • drm/amdgpu: untie user ring ids from kernel ring ids v6 · effd924d
      Committed by Andres Rodriguez
      Add amdgpu_queue_mgr, a mechanism that decouples usermode ring ids
      from the kernel's ring ids.
      
      The queue manager maintains a per-file descriptor map of user ring ids
      to amdgpu_ring pointers. Once a map is created it is permanent (this is
      required to maintain FIFO execution guarantees for a context's ring).
      
      Different queue map policies can be configured for each HW IP.
      Currently all HW IPs use the identity mapper, i.e. kernel ring id is
      equal to the user ring id.
      
      The purpose of this mechanism is to distribute the load across multiple
      queues more effectively for HW IPs that support multiple rings.
      Userspace clients are unable to check whether a specific resource is in
      use by a different client. Therefore, it is up to the kernel driver to
      make the optimal choice.
      
      v2: remove amdgpu_queue_mapper_funcs
      v3: made amdgpu_queue_mgr per context instead of per-fd
      v4: add context_put on error paths
      v5: rebase and include new IPs UVD_ENC & VCN_*
      v6: drop unused amdgpu_ring_is_valid_index (Alex)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  13. 25 May 2017 (3 commits)
  14. 18 May 2017 (1 commit)
  15. 29 Apr 2017 (1 commit)
  16. 08 Apr 2017 (1 commit)
  17. 07 Apr 2017 (1 commit)
  18. 05 Apr 2017 (1 commit)
  19. 30 Mar 2017 (2 commits)
    • drm/amdgpu: changes in gfx DMAframe scheme (v2) · e9d672b2
      Committed by Monk Liu
      1) Adapt to Vulkan:
      Use a double SWITCH_BUFFER to replace the 128-NOP workaround.
      With Vulkan, the UMD can insert 7~16 IBs per submission, so a
      256 DW allocation cannot hold the whole DMAframe if those 128
      NOPs are still inserted; the CP team suggests using double
      SWITCH_BUFFERs instead of the tricky 128-NOP workaround.
      
      2) Fix the CE VM fault issue introduced with MCBP:
      One more COND_EXEC is needed to wrap the IB part (the original
      one is for the VM switch part).
      
      This change fixes a VM fault caused by the following scenario:
      
      >CE passes the original COND_EXEC (no MCBP issued at this moment)
       and proceeds as normal.
      
      >DE catches up to this COND_EXEC, but this time MCBP has been
       issued, so DE treats all following packets as NOPs. The following
       VM switch packets now look like NOPs to DE, so DE doesn't do a
       VM flush at all.
      
      >Now CE proceeds to the first IBc and triggers a VM fault, because
       DE didn't do a VM flush for this DMAframe.
      
      3) Change the estimated alloc size for gfx9.
      With the new DMAframe scheme, emit_frame_size must be modified
      for gfx9.
      
      4) No need to insert 128 NOPs after the gfx8 VM flush anymore,
      because a double SWITCH_BUFFER is appended to the VM flush; gfx7
      already uses a double SWITCH_BUFFER after vm_flush, so no change
      is needed there.
      
      5) Change emit_frame_size for gfx8
      
      v2: squash in BUG removal from Monk
      Signed-off-by: Monk Liu <Monk.Liu@amd.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>