1. 23 Jun, 2015 14 commits
  2. 22 Jun, 2015 2 commits
  3. 29 May, 2015 1 commit
  4. 21 May, 2015 1 commit
  5. 08 May, 2015 2 commits
  6. 30 Apr, 2015 1 commit
  7. 24 Apr, 2015 3 commits
    • drm/i915: Fix up the vma aliasing ppgtt binding · 0875546c
      Daniel Vetter committed
      Currently we have the problem that the decision whether ptes need to
      be (re)written is splattered all over the codebase. Move all that into
      i915_vma_bind. This needs a few changes:
      - Just reuse the PIN_* flags for i915_vma_bind and do the conversion
        to vma->bound in there to avoid duplicating the conversion code all
        over.
      - We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
        around) explicit, add PIN_USER for that.
      - Two callers want to update ptes, give them a PIN_UPDATE for that.
      
      Of course we still want to avoid double-binding, but that should be
      taken care of:
      - A ppgtt vma will only ever see PIN_USER, so no issue with
        double-binding.
      - A ggtt vma with aliasing ppgtt needs both types of binding, and we
        track that properly now.
      - A ggtt vma without aliasing ppgtt could be bound twice. In the
        lower-level ->bind_vma functions hence unconditionally set
        GLOBAL_BIND when writing the ggtt ptes.
      
      There's still a bit of room for cleanup, but that's for follow-up
      patches.
      
      v2: Fixup fumbles.
      
      v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
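The flag conversion described in the commit above can be sketched as follows. This is a minimal, self-contained illustration and not the driver's code: the flag values, `struct sketch_vma`, and `sketch_vma_bind` are hypothetical stand-ins, and the real `i915_vma_bind` calls into the address space's `->bind_vma` hook rather than bumping a counter.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only; not the kernel's. */
enum { PIN_GLOBAL = 1 << 0, PIN_USER = 1 << 1, PIN_UPDATE = 1 << 2 };
enum { GLOBAL_BIND = 1 << 0, LOCAL_BIND = 1 << 1 };

struct sketch_vma {
	uint32_t bound;       /* bindings whose ptes are already written */
	unsigned pte_writes;  /* stands in for calls to the low-level bind hook */
};

/* Convert PIN_* flags to bind flags and return the bindings (re)written. */
static uint32_t sketch_vma_bind(struct sketch_vma *vma, uint32_t pin_flags)
{
	uint32_t bind_flags = 0;

	if (pin_flags & PIN_GLOBAL)
		bind_flags |= GLOBAL_BIND;
	if (pin_flags & PIN_USER)
		bind_flags |= LOCAL_BIND;

	if (pin_flags & PIN_UPDATE)
		bind_flags |= vma->bound;   /* caller asked to rewrite existing ptes */
	else
		bind_flags &= ~vma->bound;  /* avoid double-binding */

	if (bind_flags) {
		vma->pte_writes++;
		vma->bound |= bind_flags;
	}
	return bind_flags;
}
```

Keeping the conversion in one function means callers only ever pass `PIN_*` flags, and the decision to skip or rewrite ptes lives in a single place.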
    • drm/i915: Don't use atomics for pg_dirty_rings · 9258811c
      Daniel Vetter committed
      It's already protected by the bkl^Wdev->struct_mutex. While at it
      realign some related code.
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
    • drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt · 71b7e54f
      Daniel Vetter committed
      We load the ppgtt ptes once per gpu reset/driver load/resume and
      that's all that's needed. Note that this only blows up when we're
      using the allocate_va_range funcs and not the special-purpose ones
      used. With this change we can get rid of that duplication.
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
  8. 20 Apr, 2015 1 commit
    • drm/i915: Dont clear PIN_GLOBAL in the execbuf pinning fallback · 0229da32
      Daniel Vetter committed
      PIN_GLOBAL is set only when userspace asked for it, and that
      is only the case for the gen6 PIPE_CONTROL workaround. We're not
      allowed to just clear this.
      
      The important part of the fallback is to drop the restriction to
      the mappable range.
      
      This issue has been introduced in
      
      commit edf4427b
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Wed Jan 14 11:20:56 2015 +0000
      
          drm/i915: Fallback to using CPU relocations for large batch buffers
      
      v2: Chris pointed out that we also fail to set PIN_GLOBAL when the
      buffer is already bound. Fix this up too.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
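The fix described above amounts to dropping only the mappable restriction on the retry while leaving PIN_GLOBAL alone. A minimal sketch, assuming hypothetical flag values and helper names (`initial_pin_flags`, `fallback_pin_flags`) that do not appear in the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; not the kernel's. */
enum { PIN_MAPPABLE = 1 << 0, PIN_GLOBAL = 1 << 2 };

/*
 * Flags for the first pin attempt. PIN_GLOBAL follows the userspace
 * request even when the buffer is already bound (the v2 fix); the
 * mappable restriction only matters for a buffer not yet bound.
 */
static uint32_t initial_pin_flags(bool needs_gtt, bool needs_map,
				  bool already_bound)
{
	uint32_t flags = 0;

	if (needs_gtt)
		flags |= PIN_GLOBAL;
	if (!already_bound && needs_map)
		flags |= PIN_GLOBAL | PIN_MAPPABLE;
	return flags;
}

/* Fallback: drop the mappable-range restriction, keep everything else. */
static uint32_t fallback_pin_flags(uint32_t flags)
{
	return flags & ~(uint32_t)PIN_MAPPABLE;
}
```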
  9. 10 Apr, 2015 2 commits
  10. 01 Apr, 2015 1 commit
    • drm/i915: Rename 'do_execbuf' to 'execbuf_submit' · f3dc74c0
      John Harrison committed
      The submission portion of the execbuffer code path was abstracted into a
      function pointer indirection as part of the legacy vs execlist work. The two
      implementation functions are called 'i915_gem_ringbuffer_submission' and
      'intel_execlists_submission' but the pointer was called 'do_execbuf'. There is
      already a 'i915_gem_do_execbuffer' function (which is what calls the pointer
      indirection). The name of the pointer is therefore considered to be backwards
      and should be changed.
      
      This patch renames it to 'execbuf_submit' which is hopefully a bit clearer.
      
      For: VIZ-5115
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: Tomas Elf <tomas.elf@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  11. 30 Mar, 2015 1 commit
  12. 27 Mar, 2015 1 commit
  13. 20 Mar, 2015 2 commits
    • drm/i915: Track page table reload need · 563222a7
      Ben Widawsky committed
      This patch was formerly known as, "Force pd restore when PDEs change,
      gen6-7." I had to change the name because it is needed for GEN8 too.
      
      The real issue this is trying to solve is when a new object is mapped
      into the current address space. The GPU does not snoop the new mapping
      so we must do the gen specific action to reload the page tables.
      
      GEN8 and GEN7 do differ in the way they load page tables for the RCS.
      GEN8 does so with the context restore, while GEN7 requires the proper
      load commands in the command streamer. Non-render is similar for both.
      
      Caveat for GEN7
      The docs say you cannot change the PDEs of a currently running context.
      We never map new PDEs of a running context, and expect them to be
      present - so I think this is okay. (We can unmap, but this should also
      be okay since we only unmap unreferenced objects that the GPU shouldn't
      be trying to va->pa xlate.) The MI_SET_CONTEXT command does have a flag
      to signal that even if the context is the same, force a reload. It's
      unclear exactly what this does, but I have a hunch it's the right thing
      to do.
      
      The logic assumes that we always emit a context switch after mapping new
      PDEs, and before we submit a batch. This is the case today, and has been
      the case since the inception of hardware contexts. A note in the comment
      lets the user know.
      
      It's not just for gen8. If the current context has mappings change, we
      need a context reload to switch
      
      v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
      and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
      is always null.
      
      v3: Invalidate PPGTT TLBs inside alloc_va_range.
      
      v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
      pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
      neither ctx->ppgtt and aliasing_ppgtt exist.
      
      v5: Removed references to teardown_va_range.
      
      v6: Updated needs_pd_load_pre/post.
      
      v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
      comment about updated PDEs to object_pin/bind (Mika).
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
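The dirty tracking described above can be pictured as a per-ppgtt bitmask with one bit per ring. The sketch below is an assumption-laden illustration: `SKETCH_NUM_RINGS`, `struct sketch_ppgtt`, `needs_pd_load`, and `pd_loaded` are hypothetical, and only the names `mark_tlbs_dirty` and `pd_dirty_rings` come from the commit message. A plain integer is used, on the assumption that a lock serializes access.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_RINGS 5  /* hypothetical ring count */

struct sketch_ppgtt {
	uint32_t pd_dirty_rings;  /* bit n set: ring n must reload page tables */
};

/* A new mapping was added: every ring must reload before its next batch. */
static void mark_tlbs_dirty(struct sketch_ppgtt *ppgtt)
{
	ppgtt->pd_dirty_rings = (1u << SKETCH_NUM_RINGS) - 1;
}

/* Checked on the context switch path before submitting a batch. */
static bool needs_pd_load(const struct sketch_ppgtt *ppgtt, unsigned ring)
{
	return (ppgtt->pd_dirty_rings & (1u << ring)) != 0;
}

/* Called once the ring has emitted its page table reload. */
static void pd_loaded(struct sketch_ppgtt *ppgtt, unsigned ring)
{
	ppgtt->pd_dirty_rings &= ~(1u << ring);
}
```

Each ring clears only its own bit, so a ring that has not yet switched context still sees the pending reload.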
    • drm/i915: Fallback to using CPU relocations for large batch buffers · edf4427b
      Chris Wilson committed
      If the batch buffer is too large to fit into the aperture and we need a
      GTT mapping for relocations, we currently fail. This only applies to a
      subset of machines for a subset of environments, quite undesirable. We
      can simply check after failing to insert the batch into the GTT as to
      whether we only need a mappable binding for relocation and, if so, we can
      revert to using a non-mappable binding and an alternate relocation
      method. However, using relocate_entry_cpu() is excruciatingly slow for
      large buffers on non-LLC as the entire buffer requires clflushing before
      and after the relocation handling. Alternatively, we can implement a
      third relocation method that only clflushes around the relocation entry.
      This is still slower than updating through the GTT, so we prefer using
      the GTT where possible, but is orders of magnitude faster as we
      typically do not have to then clflush the entire buffer.
      
      An alternative idea of using a temporary WC mapping of the backing store
      is promising (it should be faster than using the GTT itself), but
      requires fairly extensive arch/x86 support - along the lines of
      kmap_atomic_prot_pfn() (which is not universally implemented even for
      x86).
      
      Testcase: igt/gem_exec_big #pnv,byt
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add a WARN_ONCE for the impossible reloc case and explain in
      a short comment why we want to avoid ping-pong.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
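The choice between the three relocation methods discussed above could be sketched like this. The enum, the function name, and the exact ordering of the checks are assumptions made for illustration; the commit message only establishes that the GTT is preferred where available, that the per-entry clflush path is the fallback for non-mappable bindings, and that the remaining case is treated as impossible.

```c
#include <stdbool.h>

/* Hypothetical names; the driver's helpers are relocate_entry_*(). */
enum reloc_method {
	RELOC_CPU,         /* write through a CPU mapping, no flushing needed */
	RELOC_GTT,         /* write through the mappable GTT aperture */
	RELOC_CLFLUSH,     /* CPU write, clflush only around the entry */
	RELOC_IMPOSSIBLE,  /* no usable method (the WARN_ONCE case) */
};

static enum reloc_method pick_reloc_method(bool cpu_coherent,
					   bool mappable_in_gtt,
					   bool has_clflush)
{
	if (cpu_coherent)       /* e.g. LLC: plain CPU writes are cheap */
		return RELOC_CPU;
	if (mappable_in_gtt)    /* preferred on non-LLC when the buffer fits */
		return RELOC_GTT;
	if (has_clflush)        /* large buffer bound outside the aperture */
		return RELOC_CLFLUSH;
	return RELOC_IMPOSSIBLE;
}
```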
  14. 18 Mar, 2015 2 commits
  15. 26 Feb, 2015 1 commit
    • drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading · 8e004efc
      John Harrison committed
      There is a flags word that is passed through the execbuffer code path all the
      way from initial decoding of the user parameters down to the very final dispatch
      buffer call. It is simply called 'flags'. Unfortunately, there are many other
      flags words floating around in the same blocks of code. Even more once the GPU
      scheduler arrives.
      
      This patch makes it more obvious exactly which flags word is which by renaming
      'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
      already have an 'I915_DISPATCH_' prefix on them and so are not quite so
      ambiguous.
      
      OTC-Jira: VIZ-1587
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      [danvet: Resolve conflict with Chris' rework of the bb parsing.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  16. 24 Feb, 2015 1 commit
  17. 27 Jan, 2015 1 commit
    • drm/i915: Specify bsd rings through exec flag · 8d360dff
      Zhipeng Gong committed
      On Skylake GT3 we have 2 Video Command Streamers (VCS), which are asymmetrical.
      For example, HEVC GPU commands can only be dispatched to the VCS1 ring.
      But userspace has no control over whether VCS1 or VCS2 is used. This patch
      introduces a mechanism to avoid the default ping-pong mode and use one
      specific ring through an execution flag. This mechanism is usable for all
      the platforms with 2 VCS rings.
      
      The open source usage is from these two commits in vaapi/intel:
      	commit 702050f04131a44ef8ac16651708ce8a8d98e4b8
      	Author: Zhao, Yakui <yakui.zhao@intel.com>
      	Date:   Mon Nov 17 12:44:19 2014 +0800
      
      	    Allow the batchbuffer to be submitted with override flag
      
      	commit a56efcdf27d11ad9b21664b4a2cda72d7f90f5a8
      	Author: Zhao Yakui <yakui.zhao@intel.com>
      	Date:   Mon Nov 17 12:44:22 2014 +0800
      
      	    Add the override flag to assure that HEVC video command
      		always uses BSD ring0 for SKL GT3 machine
      
      v2: fix whitespace (Rodrigo)
      v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
      v4: change the comment (Zhipeng)
      v5: address Daniel's comment (Zhipeng)
      Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
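One way to picture the mechanism above is a two-bit field in the execution flags that either leaves the choice to the default ping-pong or names a ring explicitly. The bit position, the `SK_BSD_*` names, and `select_bsd_ring` are hypothetical and chosen only for this sketch; the real uAPI definitions should be taken from the i915 header.

```c
#include <stdint.h>

/* Hypothetical encoding of the ring-selection field in the exec flags. */
#define SK_BSD_SHIFT   13
#define SK_BSD_MASK    (3u << SK_BSD_SHIFT)
#define SK_BSD_DEFAULT (0u << SK_BSD_SHIFT)  /* driver picks (ping-pong) */
#define SK_BSD_RING1   (1u << SK_BSD_SHIFT)  /* force the first VCS ring */
#define SK_BSD_RING2   (2u << SK_BSD_SHIFT)  /* force the second VCS ring */

/*
 * Returns the VCS ring index (0 or 1), or -1 for an invalid selector.
 * *next_default holds the ring the ping-pong mode will hand out next.
 */
static int select_bsd_ring(uint32_t exec_flags, unsigned *next_default)
{
	switch (exec_flags & SK_BSD_MASK) {
	case SK_BSD_DEFAULT: {
		int ring = (int)*next_default;

		*next_default ^= 1;  /* alternate between the two rings */
		return ring;
	}
	case SK_BSD_RING1:
		return 0;
	case SK_BSD_RING2:
		return 1;
	default:
		return -1;
	}
}
```

An explicit selection leaves the ping-pong state untouched, so clients that do not set the field keep alternating as before.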
  18. 08 Jan, 2015 1 commit
  19. 24 Dec, 2014 1 commit
  20. 16 Dec, 2014 1 commit