1. 01 May, 2013 1 commit
    • drm/i915: reference count for i915_hw_contexts · dce3271b
      Committed by Mika Kuoppala
      Enabling PPGTT and also the need to track which context was guilty of
      gpu hang (arb robustness enabling) have put pressure for struct i915_hw_context
      to be more than just a placeholder for hw context state.
      
      In order to track object lifetime properly in a multi peer usage, add reference
      counting for i915_hw_context.
      
      v2: track i915_hw_context pointers instead of using ctx_ids
      (from Chris Wilson)
      
      v3 (Ben): Get rid of do_release() and handle refcounting more compactly.
      (recommended by Chris)
      
      v4: kref_* put inside static inlines (Daniel Vetter)
      remove code duplication on freeing context (Chris Wilson)
      
      v5: idr_remove and ctx->file_priv = NULL in destroy ioctl (Chris)
      This actually will cause a problem if one destroys a context and later
      refers to the idea of the context (multiple contexts may have the same
      id, but only 1 will exist in the idr).
      
      v6: Strip out the request related stuff. Reworded commit message.
      Got rid of do_destroy and introduced i915_gem_context_release_handle,
      suggested by Chris Wilson.
      
      v7: idr_remove can't be called inside idr_for_each (Chris Wilson)
      
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v5)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v7)
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Squash sob lines, the patch ping-ponged between Ben and Mika
      a bit ...]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
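The get/put pattern described in the reference-counting commit above can be modelled in plain userspace C. This is only a sketch: `struct hw_context`, `ctx_create`, `ctx_get`, `ctx_put`, `ctx_free` and `contexts_freed` are hypothetical names, and a simple non-atomic counter stands in for the kernel's `struct kref`, so it is not safe for concurrent use and is not the driver's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct i915_hw_context. */
struct hw_context {
    unsigned int refcount; /* plays the role of the kernel's struct kref */
    int id;
};

/* Counts releases so a caller can observe when the last reference drops. */
static int contexts_freed;

/* Create a context holding one initial reference (the creator's). */
static struct hw_context *ctx_create(int id)
{
    struct hw_context *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return NULL;
    ctx->refcount = 1;
    ctx->id = id;
    return ctx;
}

/* Single release path, so freeing is never duplicated by callers. */
static void ctx_free(struct hw_context *ctx)
{
    contexts_freed++;
    free(ctx);
}

/* Take an extra reference for another user of the context. */
static inline void ctx_get(struct hw_context *ctx)
{
    assert(ctx->refcount > 0);
    ctx->refcount++;
}

/* Drop a reference; the context is freed when the last one goes away. */
static inline void ctx_put(struct hw_context *ctx)
{
    assert(ctx->refcount > 0);
    if (--ctx->refcount == 0)
        ctx_free(ctx);
}
```

Keeping the get/put helpers as small static inlines mirrors the v4 note, and routing every release through one function mirrors the "remove code duplication on freeing context" remark.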
  2. 18 Apr, 2013 1 commit
    • drm/i915: Use MLC (l3$) for context objects · 4615d4c9
      Committed by Chris Wilson
      Enabling context support increases SwapBuffers latency by about 20%
      (measured on an i7-3720qm). We can offset that loss slightly by enabling
      faster caching for the contexts. As they are not backed by any
      particular cache (such as the sampler or render caches) our only option
      is to select the generic mid-level cache. This reduces the latency of
      the swap by about 5%.
      
      Oddly this effect can be observed running smokin-guns on IVB at
      1280x1024:
      Using BLT copies for swaps: 151.67 fps
      Using Render copies for swaps (unpatched):  141.70 fps
      With contexts disabled: 150.23 fps
      With contexts in L3$: 150.77 fps
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  3. 28 Feb, 2013 1 commit
  4. 15 Feb, 2013 2 commits
  5. 29 Nov, 2012 1 commit
    • drm/i915: Preallocate next seqno before touching the ring · 9d773091
      Committed by Chris Wilson
      Based on the work by Mika Kuoppala, we realised that we need to handle
      seqno wraparound prior to committing our changes to the ring. The most
      obvious point then is to grab the seqno inside intel_ring_begin(), and
      then to reuse that seqno for all ring operations until the next request.
      As intel_ring_begin() can fail, the callers must already be prepared to
      handle such failure and so we can safely add further checks.
      
      This patch looks like it should be split up into the interface
      changes and the tweaks to move seqno wrapping from the execbuffer into
      the core seqno increment. However, I found no easy way to break it into
      incremental steps without introducing further broken behaviour.
      
      v2: Mika found a silly mistake and a subtle error in the existing code;
      inside i915_gem_retire_requests() we were resetting the sync_seqno of
      the target ring based on the seqno from this ring - which are only
      related by the order of their allocation, not retirement. Hence we were
      applying the optimisation that the rings were synchronised too early,
      fortunately the only real casualty there is the handling of seqno
      wrapping.
      
      v3: Do not forget to reset the sync_seqno upon module reinitialisation,
      ala resume.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
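The idea in the seqno commit above (reserve the sequence number when ring access begins, reuse it until the request is emitted, and handle wraparound at reservation time) can be illustrated with a small userspace model. Everything here is an assumption made for illustration: `struct ring_model`, `ring_begin`, `ring_add_request` and `seqno_passed` are hypothetical names, and treating 0 as a reserved "no seqno" value is an assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if seq1 is at or after seq2, even across a 32-bit wraparound.
 * Relies on the usual two's-complement conversion to int32_t.
 */
static inline bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical model of the per-ring state relevant to seqno allocation. */
struct ring_model {
    uint32_t next_seqno;        /* next value to hand out */
    uint32_t outstanding_seqno; /* 0 means nothing reserved yet */
};

/*
 * Reserve a seqno before any commands are written. Repeated calls reuse
 * the same reservation until the request is emitted.
 */
static uint32_t ring_begin(struct ring_model *ring)
{
    if (ring->outstanding_seqno == 0) {
        if (ring->next_seqno == 0)
            ring->next_seqno = 1; /* wrapped: skip the reserved value 0 */
        ring->outstanding_seqno = ring->next_seqno++;
    }
    return ring->outstanding_seqno;
}

/* Emit the request: consume the reservation so the next begin gets a new one. */
static uint32_t ring_add_request(struct ring_model *ring)
{
    uint32_t seqno = ring_begin(ring);

    ring->outstanding_seqno = 0;
    return seqno;
}
```

Because the reservation happens in the begin step, which callers already expect to fail, any wraparound handling can be done there before the ring contents are touched.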
  6. 12 Nov, 2012 1 commit
  7. 03 Oct, 2012 1 commit
  8. 02 Oct, 2012 1 commit
    • drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a · ac82ea2e
      Committed by Chris Wilson
      A side-effect of commit 7d54a904
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Fri Aug 10 10:18:10 2012 +0100
      
          drm/i915: Apply post-sync write for pipe control invalidates
      
      was that only a request to emit invalidate flush would result in the
      TLB being invalidated (since it requires synchronisation and so incurs a
      performance penalty). However, the stated w/a for hardware contexts is
      that the TLBs must be invalidated prior to a MI_SET_CONTEXT, yet the w/a
      itself did not request the TLBs to be invalidated...
      
      Note this w/a does not prevent the hard system hang I experience when
      using hw contexts (with rc6 enabled) on SNB GT1.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  9. 24 Aug, 2012 1 commit
  10. 06 Aug, 2012 1 commit
  11. 26 Jul, 2012 2 commits
  12. 25 Jul, 2012 3 commits
  13. 20 Jul, 2012 1 commit
  14. 30 Jun, 2012 1 commit
  15. 20 Jun, 2012 6 commits
  16. 18 Jun, 2012 1 commit
  17. 14 Jun, 2012 9 commits
    • drm/i915: reset the GPU on context fini · 8e96d9c4
      Committed by Ben Widawsky
      It's the only way we know how to make the GPU actually forget about the
      default context.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915/context: create & destroy ioctls · 84624813
      Committed by Ben Widawsky
      Add the interfaces to allow user space to create and destroy contexts.
      Contexts are destroyed automatically if the file descriptor for the dri
      device is closed.
      
      Following convention as usual here causes checkpatch warnings.
      
      v2: with is_initialized, no longer need to init at create
      drop the context switch on create (daniel)
      
      v3: Use interruptible lock (Chris)
      return -ENODEV in !GEM case (Chris)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: use the default context · dfabbcb4
      Committed by Ben Widawsky
      With the code to do HW context switches in place, have the driver load the
      default context for the render ring when the driver loads.
      
      The default context will be an ever present context that is available to
      switch to at any time for the given ring.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: possibly invalidate TLB before context switch · 12b0286f
      Committed by Ben Widawsky
      From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf
      
      [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's
      responsibility to invalidate the TLBs at least once after the previous
      context switch after any GTT mappings changed (including new GTT
      entries).  This can be done by a pipelined PIPE_CONTROL with TLB inv bit
      set immediately before MI_SET_CONTEXT.
      
      On GEN7 the invalidation mode is explicitly set, but this appears to be
      lacking for GEN6. Since I don't know the history on this, I've decided
      to dynamically read the value at ring init time, and use that value
      throughout.
      
      v2: better comment (daniel)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: Ivybridge MI_ARB_ON_OFF context w/a · e37ec39b
      Committed by Ben Widawsky
      The workaround itself applies to gen7 only (according to the docs) and
      as Eric Anholt points out shouldn't be required since we don't use HW
      scheduling features, and therefore arbitration. Though since it is a
      small, and simple addition, and we don't really understand the issue,
      just do it.
      
      FWIW, I eventually want to play with some of the arbitration stuff, and
      I'd hate to forget about this.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: ensure context objects are bound to the global gtt · 3af7b857
      Committed by Daniel Vetter
      This way round we don't introduce any ugly layering violations and use
      the interface as I planned to use it.
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: context switch implementation · e0556841
      Committed by Ben Widawsky
      Implement the context switch code as well as the interfaces to do the
      context switch. This patch also doesn't match 1:1 with the RFC patches.
      The main difference is that from Daniel's responses the last context
      object is now stored instead of the last context. This allows us
      to free the context data structure and the context object independently.
      
      There is room for optimization: this code will pin the context object
      until the next context is active. The optimal way to do it is to
      actually pin the object, move it to the active list, do the context
      switch, and then unpin it. This allows the eviction code to actually
      evict the context object if needed.
      
      The context switch code is missing workarounds; they will be implemented
      in future patches.
      
      v2: actually do obj->dirty=1 in switch (daniel)
      Modified comment around above
      Remove flags to context switch (daniel)
      Move mi_set_context code to i915_gem_context.c (daniel)
      Remove seqno, use lazy request instead (daniel)
      
      v3: use i915_gem_request_next_seqno instead of
            outstanding_lazy_request (Daniel)
      remove id's from trace events (Daniel)
      Put the context BO in the instruction domain (Daniel)
      Don't unref the BO if context switch fails (Chris)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: context basic create & destroy · 40521054
      Committed by Ben Widawsky
      Invent an abstraction for a hw context which is passed around through
      the core functions. The main bit a hw context holds is the buffer object
      which backs the context. The rest of the members are just helper
      functions. Specifically the ring member, which could likely go away if
      we decide to never implement whatever other hw context support exists.
      
      Of note here is the introduction of the 64k alignment constraint for the
      BO. If contexts become heavily used, we should consider tweaking this
      down to 4k. Until the contexts are merged and tested a bit though, I
      think 64k is a nice start (based on docs).
      
      Since we don't yet switch contexts, there is really not much complexity
      here. Creation/destruction works pretty much as one would expect. An idr
      is used to generate the context id numbers which are unique per file
      descriptor.
      
      v2: add DRM_DEBUG_DRIVERS to distinguish ENOMEM failures (ben)
      convert a BUG_ON to WARN_ON, default destruction is still fatal (ben)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    • drm/i915: preliminary context support · 254f965c
      Committed by Ben Widawsky
      Very basic code for context setup/destruction in the driver.
      
      Adds the file i915_gem_context.c. This file implements HW context
      support. On gen5+ a HW context consists of an opaque GPU object which is
      referenced at times of context saves and restores.  With RC6 enabled,
      the context is also referenced as the GPU enters and exits RC6
      (GPU has its own internal power context, except on gen5).  Though
      something like a context does exist for the media ring, the code only
      supports contexts for the render ring.
      
      In software, there is a distinction between contexts created by the
      user, and the default HW context. The default HW context is used by GPU
      clients that do not request setup of their own hardware context. The
      default context's state is never restored to help prevent programming
      errors. This would happen if a client ran and piggy-backed off another
      client's GPU state.  The default context only exists to give the GPU some
      offset to load as the current to invoke a save of the context we
      actually care about. In fact, the code could likely be constructed,
      albeit in a more complicated fashion, to never use the default context,
      though that limits the driver's ability to swap out, and/or destroy
      other contexts.
      
      All other contexts are created as a request by the GPU client. These
      contexts store GPU state, and thus allow GPU clients to not re-emit
      state (and potentially query certain state) at any time. The kernel
      driver makes certain that the appropriate commands are inserted.
      
      There are 4 entry points into the contexts, init, fini, open, close.
      The names are self-explanatory except that init can be called during
      reset, and also during pm thaw/resume. As we expect our context to be
      preserved across these events, we do not reinitialize in this case.
      
      As Adam Jackson pointed out, the cutoff of 1MB where a HW context is
      considered too big is arbitrary. The reason for this is even though
      context sizes are increasing with every generation, they have yet to
      eclipse even 32k. If we somehow read back way more than that, it
      probably means BIOS has done something strange, or we're running on a
      platform that wasn't designed for this.
      
      v2: rename load/unload to init/fini (daniel)
      remove ILK support for get_size() (indirectly daniel)
      add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel)
      added comments (Ben)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
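The size cutoff discussed in the preliminary-support commit above amounts to a simple sanity check on the context size read back from the hardware. The following is a minimal sketch only: `CTX_SIZE_LIMIT` and `ctx_size_is_sane` are hypothetical names, and both the rejection of a zero size and the choice to treat exactly 1MB as acceptable are assumptions of this example, not details taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* The (admittedly arbitrary) 1MB cutoff mentioned in the commit message. */
#define CTX_SIZE_LIMIT ((size_t)1024 * 1024)

/*
 * Decide whether a context size reported by the hardware looks plausible.
 * Real context sizes were said to be below 32k at the time, so anything
 * above the cutoff suggests an odd BIOS or an unsupported platform.
 */
static bool ctx_size_is_sane(size_t size_bytes)
{
    if (size_bytes == 0) /* assumption: a zero size is never valid */
        return false;
    return size_bytes <= CTX_SIZE_LIMIT;
}
```

A caller in this model would simply skip enabling context support when the check fails, rather than trying to allocate an implausibly large object.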