1. 12 Nov, 2012 3 commits
  2. 18 Oct, 2012 1 commit
    • C
      drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers · d7d4eedd
      Chris Wilson committed
      With the introduction of per-process GTT space, the hardware designers
      thought it wise to also limit the ability to write to MMIO space to only
      a "secure" batch buffer. The ability to rewrite registers is the only
      way to program the hardware to perform certain operations like scanline
      waits (required for tear-free windowed updates). So we either have a
      choice of adding an interface to perform those synchronized updates
      inside the kernel, or we permit certain processes to write
      to the "safe" registers from within their command streams. This patch
      exposes the ability to submit a SECURE batch buffer to
      DRM_ROOT_ONLY|DRM_MASTER processes.
      
      v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
      bit (bit 13, accidentally not set). Also add a comment explaining why
      secure batches need a global gtt binding.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      [danvet: added hsw fixup.]
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d7d4eedd
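The permission gate described in d7d4eedd can be illustrated with a minimal userspace sketch. It assumes a secure batch is requested through a flag and is granted only when the caller is both DRM master and root. All names here (`struct submitter`, `validate_exec_flags`, `EXEC_SECURE`, `DISPATCH_SECURE`) and the flag values are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define EXEC_SECURE      (1u << 0)  /* userspace asks for a secure batch */
#define DISPATCH_SECURE  (1u << 0)  /* handed down to the ring dispatch */

struct submitter {
	bool is_master;  /* stands in for DRM_MASTER */
	bool is_root;    /* stands in for DRM_ROOT_ONLY */
};

/*
 * Returns 0 and fills *dispatch_flags on success, or -EPERM when a secure
 * batch is requested by a caller that is not both master and root.
 */
static int validate_exec_flags(const struct submitter *who,
			       uint32_t exec_flags, uint32_t *dispatch_flags)
{
	assert(who && dispatch_flags);
	*dispatch_flags = 0;

	if (exec_flags & EXEC_SECURE) {
		if (!who->is_master || !who->is_root)
			return -EPERM;
		*dispatch_flags |= DISPATCH_SECURE;
	}
	return 0;
}
```

Ordinary (non-secure) submissions pass through unchanged, so only callers that ask for the privileged path pay for the check.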
  3. 03 Oct, 2012 2 commits
  4. 20 Sep, 2012 1 commit
    • C
      drm/i915: Replace the array of pages with a scatterlist · 9da3da66
      Chris Wilson committed
      Rather than have multiple data structures for describing our page layout
      in conjunction with the array of pages, we can migrate all users over to
      a scatterlist.
      
      One major advantage this offers, other than unifying the page tracking
      structures, is that we replace the vmalloc'ed array (which can be up to
      a megabyte in size) with a chain of individual pages, which helps reduce
      memory pressure.
      
      The disadvantage is that we then do not have a simple array to iterate,
      or to access randomly. The common case for this is in the relocation
      processing, which will typically fit within a single scatterlist page
      and so be almost the same cost as the simple array. For iterating over
      the array, the extra function call could be optimised away, but in
      reality it is insignificant next to the cost of either binding the
      pages or performing the pwrite/pread.
      
      v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the
      trivial compile error from rebasing.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      9da3da66
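The trade-off described in 9da3da66 (cheap sequential iteration, but random access that has to walk a chain) can be sketched with a simplified chained page list. This is not the kernel's scatterlist API; `struct page_chunk`, `chunk_lookup`, `chunk_for_each` and `PAGES_PER_CHUNK` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_CHUNK 4  /* tiny on purpose; a real chunk would fill a page */

struct page_chunk {
	void *pages[PAGES_PER_CHUNK];
	size_t count;             /* valid entries in pages[] */
	struct page_chunk *next;  /* NULL terminates the chain */
};

/* Random access: walk the chain until we reach the chunk holding `index`. */
static void *chunk_lookup(const struct page_chunk *c, size_t index)
{
	while (c) {
		assert(c->count <= PAGES_PER_CHUNK);
		if (index < c->count)
			return c->pages[index];
		index -= c->count;
		c = c->next;
	}
	return NULL;  /* index lies past the end of the chain */
}

/* Sequential iteration: one optional callback per page, in order. */
static size_t chunk_for_each(const struct page_chunk *c,
			     void (*fn)(void *page, void *data), void *data)
{
	size_t visited = 0;

	for (; c; c = c->next) {
		for (size_t i = 0; i < c->count; i++) {
			if (fn)
				fn(c->pages[i], data);
			visited++;
		}
	}
	return visited;
}
```

A lookup that lands in the first chunk costs about the same as indexing a flat array, which matches the commit's observation about relocation processing usually fitting within a single scatterlist page.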
  5. 03 Sep, 2012 3 commits
    • P
      drm/i915: add workarounds to gen7_render_ring_flush · f3987631
      Paulo Zanoni committed
      From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section
      detailing the various workarounds:
      
      "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit
      set must be issued before a pipe-control command that has the State
      Cache Invalidate bit set."
      
      Note that the public Bspec has different numbering; it's Vol2Part1,
      Section 1.10.4.1 "PIPE_CONTROL" there.
      
      There's also a second workaround for the PIPE_CONTROL command itself:
      
      "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not
      counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set,
      must have a CS_STALL bit set"
      
      For simplicity we just set the CS_STALL bit on every pipe_control on
      gen7+.
      
      Note that this massively helps on some hsw machines: together with the
      following patch to unconditionally set the CS_STALL bit on every
      pipe_control, it prevents a gpu hang every few seconds.
      
      This is a regression that has been introduced in the pipe_control
      cleanup:
      
      commit 6c6cf5aa
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Fri Jul 20 18:02:28 2012 +0100
      
          drm/i915: Only apply the SNB pipe control w/a to gen6
      
      It looks like the massive snb pipe_control workaround also papered
      over any issues on ivb and hsw.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      [danvet: squashed both workarounds together, pimped commit message
      with Bspec citations, regression commit citation and changed the
      comment in the code a bit to clarify that we unconditionally set
      CS_STALL to avoid being hurt by trying to be clever.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      f3987631
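The second workaround quoted in f3987631 admits two policies: counting commands and stalling on every fourth, or stalling unconditionally as the commit chose. The sketch below contrasts the two. The bit positions and all names (`PC_CS_STALL`, `struct pc_counter`, `pc_flags_every_fourth`, `pc_flags_always_stall`) are hypothetical and do not reflect the hardware encoding or the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for this sketch; not the hardware encoding. */
#define PC_CS_STALL               (1u << 0)
#define PC_STATE_CACHE_INVALIDATE (1u << 1)
#define PC_READ_CACHE_INVALIDATE  (1u << 2)

struct pc_counter {
	unsigned since_stall;  /* counted commands since the last CS stall */
};

/* Literal reading: stall on every 4th command that is not read-invalidate-only. */
static uint32_t pc_flags_every_fourth(struct pc_counter *c, uint32_t flags)
{
	assert(c);

	if (flags & PC_CS_STALL) {
		c->since_stall = 0;  /* caller already stalls; restart the count */
		return flags;
	}

	bool read_invalidate_only =
		flags != 0 && (flags & ~PC_READ_CACHE_INVALIDATE) == 0;
	if (read_invalidate_only)
		return flags;        /* these commands are not counted */

	if (++c->since_stall == 4) {
		flags |= PC_CS_STALL;
		c->since_stall = 0;
	}
	return flags;
}

/* The simpler policy: no bookkeeping, every command stalls. */
static uint32_t pc_flags_always_stall(uint32_t flags)
{
	return flags | PC_CS_STALL;
}
```

The unconditional variant needs no per-ring state, which is the simplicity argument the commit message makes.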
    • P
      drm/i915: add workarounds directly to gen6_render_ring_flush · b3111509
      Paulo Zanoni committed
      This is possible since gen7+ now runs the new gen7_render_ring_flush
      function.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b3111509
    • P
      drm/i915: add gen7_render_ring_flush · 4772eaeb
      Paulo Zanoni committed
      For now, just a copy of gen6_render_ring_flush. Different gens have
      different workarounds, so we want different functions.
      Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      4772eaeb
  6. 24 Aug, 2012 1 commit
  7. 14 Aug, 2012 1 commit
  8. 10 Aug, 2012 1 commit
    • C
      drm/i915: Lazily apply the SNB+ seqno w/a · b2eadbc8
      Chris Wilson committed
      Avoid the forcewake overhead when simply retiring requests, as often the
      last seen seqno is good enough to satisfy the retirement process and will
      be promptly re-run in any case. Only ensure that we force the coherent
      seqno read when we are explicitly waiting upon a completion event to be
      sure that none go missing, and also for when we are reporting seqno
      values in case of error or debugging.
      
      This greatly reduces the load for userspace using the busy-ioctl to
      track active buffers, for instance halving the CPU used by X in pushing
      the pixels from a software render (flash). The effect will be magnified
      even further with userptr, which provides a zero-copy upload path in
      that instance, or in similar instances where X is simply compositing
      DRI buffers.
      
      v2: Reverse the polarity of the tachyon stream. Daniel suggested that
      'force' was too generic for the parameter name and that 'lazy_coherency'
      better encapsulated the semantics of it being an optimization and its
      purpose. Also notice that gen6_get_seqno() is only used by gen6/7
      chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that
      function.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b2eadbc8
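The `lazy_coherency` idea from b2eadbc8 can be sketched as follows: the retire path accepts a possibly stale seqno, while an explicit wait pays for a coherent read. The structure layout, the `forced_reads` counter, and the helper names (`force_coherent_read`, `request_retired`, `wait_completed`) are assumptions made for this illustration, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ring {
	uint32_t status_page_seqno;  /* value the GPU last wrote to memory */
	unsigned forced_reads;       /* counts the expensive coherent reads */
};

/* Hypothetical stand-in for the register access that forces coherency. */
static void force_coherent_read(struct ring *ring)
{
	ring->forced_reads++;
}

static uint32_t get_seqno(struct ring *ring, bool lazy_coherency)
{
	assert(ring);
	if (!lazy_coherency)
		force_coherent_read(ring);
	return ring->status_page_seqno;
}

/* Wraparound-tolerant "has seq1 reached seq2?" comparison. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Retiring is re-run promptly anyway, so a stale value is acceptable. */
static bool request_retired(struct ring *ring, uint32_t seqno)
{
	return seqno_passed(get_seqno(ring, true), seqno);
}

/* An explicit wait must not miss a completion, so force coherency. */
static bool wait_completed(struct ring *ring, uint32_t seqno)
{
	return seqno_passed(get_seqno(ring, false), seqno);
}
```

In this model only `wait_completed` increments `forced_reads`, which mirrors where the commit says the overhead is still worth paying.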
  9. 08 Aug, 2012 2 commits
  10. 26 Jul, 2012 3 commits
  11. 20 Jul, 2012 2 commits
  12. 05 Jul, 2012 2 commits
    • D
      drm/i915: don't return a spurious -EIO from intel_ring_begin · de2b9985
      Daniel Vetter committed
      The issue with this check is that it results in userspace receiving an
      -EIO while the gpu reset hasn't completed, resulting in fallback to sw
      rendering or worse.
      
      Now there's also a stern comment in intel_ring_wait_seqno saying that
      intel_ring_begin should not return -EAGAIN, ever, because some callers
      can't handle that. But after an audit of the callsites I don't see any
      issues. I guess the last problematic spot disappeared with the removal
      of the pipelined fencing code.
      
      So do the right thing and call check_wedge, which should properly
      decide whether an -EAGAIN or -EIO is appropriate if wedged is set.
      
      Note that the early check for a wedged gpu before touching the ring is
      rather important (and it took me quite some time of acting like the
      densest doofus to figure that out): If we don't do that and the gpu
      died for good, not having been resurrected by the reset code, userspace
      can merrily fill up the entire ring until it notices that something is
      amiss.
      
      Allowing userspace to emit more rendering, even though we know that it
      will fail, can't lead to anything good (and by experience can lead to
      all sorts of havoc, including angering the OOM gods and hard-hanging
      the hw for good).
      
      v2: Fix EAGAIN misspelling, noticed by Chris Wilson.
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      de2b9985
    • D
      drm/i915: non-interruptible sleeps can't handle -EAGAIN · d6b2c790
      Daniel Vetter committed
      So don't return -EAGAIN, even in the case of a gpu hang. Remap it to
      -EIO instead. Note that this isn't really an issue with
      interruptibility, but more that we have quite a few codepaths (mostly
      around kms stuff) that simply can't handle any errors and hence not
      even -EAGAIN. Instead of adding proper failure paths so that we could
      restart these ioctls we've opted for the cheap way out of sleeping
      non-interruptibly.  Which works everywhere but when the gpu dies,
      which this patch fixes.
      
      So essentially interruptible == false means 'wait for the gpu or die
      trying'.
      
      This patch is a bit ugly because intel_ring_begin is all non-interruptible
      and hence only returns -EIO. But as the comment in there says,
      auditing all the callsites would be a pain.
      
      To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno
      and intel_wait_ring_buffer. Also use the opportunity to clarify the
      different cases in i915_gem_check_wedge a bit with comments.
      
      v2: Don't access dev_priv->mm.interruptible from check_wedge - we
      might not hold dev->struct_mutex, making this racy. Instead pass
      interruptible in as a parameter. I've noticed this because I've hit a
      BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been
      added in
      
      commit b4aca010
      Author: Ben Widawsky <ben@bwidawsk.net>
      Date:   Wed Apr 25 20:50:12 2012 -0700
      
          drm/i915: extract some common olr+wedge code
      
      although that commit is missing any justification for this. I guess
      it's just copy&paste, because the same commit adds the same BUG_ON
      check to check_olr, where it indeed makes sense.
      
      But in check_wedge everything we access is protected by other means,
      so this is superfluous. And because it now gets in the way (we add a
      new caller in __wait_seqno, which can be called without
      dev->struct_mutex) let's just remove it.
      
      v3: Group all the i915_gem_check_wedge refactoring into this patch, so
      that this patch here is all about not returning -EAGAIN to callsites
      that can't handle syscall restarting.
      
      v4: Add clarification of what interruptible == false means in our code,
      requested by Ben Widawsky.
      
      v5: Fix EAGAIN misspelling noticed by Chris Wilson.
      Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d6b2c790
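The error-code selection discussed in de2b9985 and d6b2c790 can be condensed into a small decision function. This is a sketch of the described behaviour, not the driver's `i915_gem_check_wedge`. It assumes the hang state reduces to two booleans, and `struct gpu_state`, its fields, and `check_wedge` are names introduced here for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct gpu_state {
	bool wedged;             /* a hang was detected */
	bool recovery_complete;  /* the reset handler has finished */
};

/*
 * Returns 0 when the GPU is usable, -EAGAIN when a reset is still pending
 * and the caller can restart, and -EIO otherwise.
 */
static int check_wedge(const struct gpu_state *gpu, bool interruptible)
{
	assert(gpu);

	if (!gpu->wedged)
		return 0;

	/* Non-interruptible callers cannot restart: never hand them -EAGAIN. */
	if (!interruptible)
		return -EIO;

	/* The reset finished but the GPU is still wedged: the reset failed. */
	if (gpu->recovery_complete)
		return -EIO;

	/* The reset is still pending: ask the caller to try again. */
	return -EAGAIN;
}
```

Passing `interruptible` as a parameter, rather than reading shared state inside the function, follows the v2 note about avoiding a racy access without the mutex held.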
  13. 29 Jun, 2012 1 commit
  14. 14 Jun, 2012 2 commits
    • B
      drm/i915: possibly invalidate TLB before context switch · 12b0286f
      Ben Widawsky committed
      From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf
      
      [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's
      responsibility to invalidate the TLBs at least once after the previous
      context switch after any GTT mappings changed (including new GTT
      entries).  This can be done by a pipelined PIPE_CONTROL with TLB inv bit
      set immediately before MI_SET_CONTEXT.
      
      On GEN7 the invalidation mode is explicitly set, but this appears to be
      lacking for GEN6. Since I don't know the history on this, I've decided
      to dynamically read the value at ring init time, and use that value
      throughout.
      
      v2: better comment (daniel)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      12b0286f
    • B
      drm/i915: PIPE_CONTROL_TLB_INVALIDATE · cc0f6398
      Ben Widawsky committed
      This has shown up in several other patches. It's required for the next
      context workaround.
      
      I tested this one on its own and saw no differences in basic tests
      (performance or otherwise). This patch is relatively likely to cause
      regressions, hence why it's split out.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      cc0f6398
  15. 13 Jun, 2012 1 commit
  16. 05 Jun, 2012 2 commits
  17. 31 May, 2012 1 commit
  18. 30 May, 2012 1 commit
    • C
      drm/i915: Reset last_retired_head when resetting ring · c3b20037
      Chris Wilson committed
      When we reset the ring control registers, including the HEAD and TAIL of
      the ring, we also need to reset associated state. In this instance, we
      were failing to reset the cached value of ring->last_retired_head and so
      upon the first request for more space following a resume would
      potentially (depending on a narrow race window) believe that the HEAD had
      advanced much further than reality.
      
      This is a regression from:
      
      commit a71d8d94
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Wed Feb 15 11:25:36 2012 +0000
      
          drm/i915: Record the tail at each request and use it to estimate the head
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@vger.kernel.org # 3.4
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      c3b20037
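The stale-cache problem fixed by c3b20037 can be reproduced in a toy ring model. It assumes free space is computed from a head that may be overridden by a cached `last_retired_head`, with -1 meaning "nothing cached". `RING_SIZE`, `RING_GAP`, `ring_space` and `ring_reset` are illustrative names and values, not the driver's.

```c
#include <assert.h>

#define RING_SIZE 4096
#define RING_GAP  8  /* slack so that tail never catches head exactly */

struct ring {
	int head;
	int tail;
	int last_retired_head;  /* -1 means "no cached value" */
};

static int ring_space(struct ring *ring)
{
	assert(ring);

	/* Prefer the head position recorded when the last request retired. */
	if (ring->last_retired_head != -1) {
		ring->head = ring->last_retired_head;
		ring->last_retired_head = -1;
	}

	int space = ring->head - (ring->tail + RING_GAP);
	if (space < 0)
		space += RING_SIZE;
	return space;
}

static void ring_reset(struct ring *ring)
{
	assert(ring);
	ring->head = 0;
	ring->tail = 0;
	ring->last_retired_head = -1;  /* the fix: drop the stale cached head */
}
```

Without the last assignment in `ring_reset`, the first `ring_space` call after a reset would adopt the old cached head and report a position the hardware never reached.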
  19. 25 May, 2012 1 commit
  20. 07 May, 2012 1 commit
  21. 06 May, 2012 1 commit
    • D
      drm/i915: add interface to simulate gpu hangs · e5eb3d63
      Daniel Vetter committed
      gpu reset is a very important piece of our infrastructure.
      Unfortunately we only really test it by actually hanging the gpu,
      which often has bad side-effects for the entire system. And the gpu
      hang handling code is one of the rather complicated pieces of code we
      have, consisting of
      - hang detection
      - error capture
      - actual gpu reset
      - reset of all the gem bookkeeping
      - reinitialization of the entire gpu
      
      This patch adds a debugfs interface to selectively stop rings by ceasing
      to update the hw tail pointer, which will result in the gpu no longer
      updating its head pointer and eventually in the hangcheck firing.
      This way we can exercise the gpu hang code under controlled conditions
      without a dying gpu taking down the entire system.
      
      Patch motivated by me forgetting to properly reinitialize ppgtt after
      a gpu reset.
      
      Usage:
      
      echo $((1 << $ringnum)) > i915_ring_stop # stops one ring
      
      echo 0xffffffff > i915_ring_stop # stops all, future-proof version
      
      then run whatever testload is desired. i915_ring_stop automatically
      resets after a gpu hang is detected to avoid hanging the gpu too fast
      and declaring it wedged.
      
      v2: Incorporate feedback from Chris Wilson.
      
      v3: Add the missing cleanup.
      
      v4: Fix up inconsistent size of ring_stop_read vs _write, noticed by
      Eugeni Dodonov.
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
      Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      e5eb3d63
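The mechanism behind `i915_ring_stop` in e5eb3d63 can be modelled as a per-ring bitmask that suppresses the hardware tail update. The sketch below is a simplified model under that assumption. `struct device`, `struct ring`, `ring_advance` and `hang_detected` are hypothetical names, and the real driver writes the tail to a register rather than to a struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct device {
	uint32_t stop_rings;  /* mask written through the debugfs file */
};

struct ring {
	unsigned id;
	uint32_t sw_tail;  /* how far the driver has written commands */
	uint32_t hw_tail;  /* what the hardware has been told */
};

static bool ring_stopped(const struct device *dev, const struct ring *ring)
{
	return (dev->stop_rings & (UINT32_C(1) << ring->id)) != 0;
}

static void ring_advance(const struct device *dev, struct ring *ring,
			 uint32_t new_tail)
{
	assert(dev && ring && ring->id < 32);

	ring->sw_tail = new_tail;
	if (ring_stopped(dev, ring))
		return;  /* the hardware never sees the new tail */
	ring->hw_tail = new_tail;
}

/* Mirrors the automatic reset of the mask once a hang has been detected. */
static void hang_detected(struct device *dev)
{
	assert(dev);
	dev->stop_rings = 0;
}
```

Writing `1 << ringnum` into the mask, as in the usage lines of the commit message, stalls only that ring; clearing the mask on hang detection lets submissions flow again after the reset.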
  22. 03 May, 2012 7 commits