1. 20 9月, 2012 2 次提交
  2. 25 8月, 2012 1 次提交
    • C
      drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer · 7788a765
      Chris Wilson 提交于
      If we need to stall in order to complete the pin_and_fence operation
      during execbuffer reservation, there is a high likelihood that the
      operation will be interrupted by a signal (thanks X!). In order to
      simplify the cleanup along that error path, the object was
      unconditionally unbound and the error propagated. However, being
      interrupted here is far more common than I would like and so we can
      strive to avoid the extra work by eliminating the forced unbind.
      
      v2: In discussion over the indecent colour of the new functions and
      unwind path, we realised that we can use the new unreserve function to
      clean up the code even further.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7788a765
  3. 24 8月, 2012 2 次提交
  4. 21 8月, 2012 1 次提交
    • C
      drm/i915: Track unbound pages · 6c085a72
      Chris Wilson 提交于
      When dealing with a working set larger than the GATT, or even the
      mappable aperture when touching through the GTT, we end up with evicting
      objects only to rebind them at a new offset again later. Moving an
      object into and out of the GTT requires clflushing the pages, thus
      causing a double-clflush penalty for rebinding.
      
      To avoid having to clflush on rebinding, we can track the pages as they
      are evicted from the GTT and only relinquish those pages on memory
      pressure.
      
      As usual, if it were not for the handling of out-of-memory condition and
      having to manually shrink our own bo caches, it would be a net reduction
      of code. Alas.
      
      Note: The patch also contains a few changes to the last-hope
      evict_everything logic in i916_gem_execbuffer.c - we no longer try to
      only evict the purgeable stuff in a first try (since that's superflous
      and only helps in OOM corner-cases, not fragmented-gtt trashing
      situations).
      
      Also, the extraction of the get_pages retry loop from bind_to_gtt (and
      other callsites) to get_pages should imo have been a separate patch.
      
      v2: Ditch the newly added put_pages (for unbound objects only) in
      i915_gem_reset. A quick irc discussion hasn't revealed any important
      reason for this, so if we need this, I'd like to have a git blame'able
      explanation for it.
      
      v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Split out code movements and rant a bit in the commit message
      with a few Notes. Done v2]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6c085a72
  5. 06 8月, 2012 1 次提交
  6. 26 7月, 2012 7 次提交
  7. 25 7月, 2012 1 次提交
  8. 20 7月, 2012 1 次提交
    • C
      drm/i915: Insert a flush between batches if the breadcrumb was dropped · 09cf7c9a
      Chris Wilson 提交于
      If we drop the breadcrumb request after a batch due to a signal for
      example we aim to fix it up at the next opportunity. In this case we
      emit a second batchbuffer with no waits upon the first and so no
      opportunity to insert the missing request, so we need to emit the
      missing flush for coherency. (Note that that invalidating the render
      cache is the same as flushing it, so there should have been no
      observable corruption.)
      
      Note that beside simply adding the missing flush, avoiding potential
      render corruption, this will also fix at least parts of the problem
      introduced by some funny interaction of these two commits:
      
      commit de2b9985
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jul 4 22:52:50 2012 +0200
      
          drm/i915: don't return a spurious -EIO from intel_ring_begin
      
      which allowed intel_ring_begin to return -ERESTARTSYS and
      
      commit cc889e0f
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jun 13 20:45:19 2012 +0200
      
          drm/i915: disable flushing_list/gpu_write_list
      
      which essentially disabled the flushing list.
      
      The issue happens when we submit a batch & emit it, but get
      interrupted (thanks to the first patch) while trying to emit the
      flush. On the next batch we still assume that the full gpu domain
      handling is in effect and hence compute the invalidate&flushing
      domains. But thanks to the 2nd patch we totally ignore these and only
      invalidate all gpu domains, presuming that any required flushes have
      been issued already.  Which is wrong and eventually results in us
      updating the new write_domain values with the computed
      pending_write_domain values, which leaves an object with write_domain
      == 0 on the gpu_write_list.
      
      As soon as we try to unbind that object, things blow up.
      
      Fix this by emitting the missing flush according to the new
      ring->gpu_caches_dirty flag.
      
      Note that this does _not_ fix all the current cases where we end up
      with an object on the flushing_list that can't be flushed.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52040Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add bug explanation to commit message.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      09cf7c9a
  9. 20 6月, 2012 1 次提交
    • D
      drm/i915: disable flushing_list/gpu_write_list · cc889e0f
      Daniel Vetter 提交于
      This is just the minimal patch to disable all this code so that we can
      do decent amounts of QA before we rip it all out.
      
      The complicating thing is that we need to flush the gpu caches after
      the batchbuffer is emitted. Which is past the point of no return where
      execbuffer can't fail any more (otherwise we risk submitting the same
      batch multiple times).
      
      Hence we need to add a flag to track whether any caches associated
      with that ring are dirty. And emit the flush in add_request if that's
      the case.
      
      Note that this has a quite a few behaviour changes:
      - Caches get flushed/invalidated unconditionally.
      - Invalidation now happens after potential inter-ring sync.
      
      I've bantered around a bit with Chris on irc whether this fixes
      anything, and it might or might not. The only thing clear is that with
      these changes it's much easier to reason about correctness.
      
      Also rip out a lone get_next_request_seqno in the execbuffer
      retire_commands function. I've dug around and I couldn't figure out
      why that is still there, with the outstanding lazy request stuff it
      shouldn't be necessary.
      
      v2: Chris Wilson complained that I also invalidate the read caches
      when flushing after a batchbuffer. Now optimized.
      
      v3: Added some comments to explain the new flushing behaviour.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      cc889e0f
  10. 14 6月, 2012 1 次提交
    • B
      drm/i915/context: switch contexts with execbuf2 · 6e0a69db
      Ben Widawsky 提交于
      Use the rsvd1 field in execbuf2 to specify the context ID associated
      with the workload. This will allow the driver to do the proper context
      switch when/if needed.
      
      v2: Add checks for context switches on rings not supporting contexts.
      Before the code would silently ignore such requests.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      6e0a69db
  11. 20 5月, 2012 1 次提交
  12. 08 5月, 2012 1 次提交
    • C
      drm/i915: Limit calling mark-busy only for potential scanouts · acb87dfb
      Chris Wilson 提交于
      The principle of intel_mark_busy() is that we want to spot the
      transition of when the display engine is being used in order to bump
      powersaving modes and increase display clocks. As such it is only
      important when the display is changing, i.e. when rendering to the
      scanout or other sprite/plane, and these are characterised by being
      pinned.
      
      v2: Mark the whole device as busy on execbuffer and pageflips as well
      and rebase against dinq for the minor bug fix to be immediately
      applicable.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: fix compile fail.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      acb87dfb
  13. 03 5月, 2012 2 次提交
    • D
      drm/i915: disallow clip rects on gen5+ · 6ebebc92
      Daniel Vetter 提交于
      Unfortunately there has been dri1 userspace that used gem to manage
      the gtt and hence also needed cliprects in the execbuf ioctl. So
      we can't ever remove that code without breaking the ioctl abi.
      
      But at least we can disable it on gen5+, because these horrible
      versions of mesa have not supported these chips.
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6ebebc92
    • B
      drm/i915: remove do_retire from i915_wait_request · b2da9fe5
      Ben Widawsky 提交于
      This originates from a hack by me to quickly fix a bug in an earlier
      patch where we needed control over whether or not waiting on a seqno
      actually did any retire list processing. Since the two operations aren't
      clearly related, we should pull the parameter out of the wait function,
      and make the caller responsible for retiring if the action is desired.
      
      The only function call site which did not get an explicit retire_request call
      (on purpose) is i915_gem_inactive_shrink(). That code was already calling
      retire_request a second time.
      
      v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b2da9fe5
  14. 24 4月, 2012 2 次提交
  15. 18 4月, 2012 2 次提交
  16. 13 4月, 2012 2 次提交
    • B
      drm/i915: use semaphores for the display plane · 2911a35b
      Ben Widawsky 提交于
      In theory this will have performance and power improvements. Performance
      because we don't need to stall when the scanout BO is busy, and power
      because we don't have to stall when the BO is busy (and the ring can
      even go to sleep if the HW supports it).
      
      v2:
      squash 2 patches into 1 (me)
      un-inline the enable_semaphores function (Daniel)
      remove comment about SNB hangs from i915_gem_object_sync (Chris)
      rename intel_enable_semaphores to i915_semaphore_is_enabled (me)
      removed page flip comment; "no why" (Chris)
      
      To address other comments from Daniel (irc):
      update the comment to say 'vt-d is crap, don't enable semaphores'
        - I think you misinterpreted Chris' comment, it already exists.
      checking out whether we can pageflip on the render ring on ivb (didn't
      work on early silicon)
        - We don't want to enable workarounds for early silicon unless we have
          to.
        - I can't find any references in the docs about this.
      optionally use it if the fb is already busy on the render ring
        - This should be how the code already worked, unless I am
          misunderstanding your meaning.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2911a35b
    • C
      drm/i915: Reorganise rules for get_fence/put_fence · 9a5a53b3
      Chris Wilson 提交于
      By simplifying the rules to calling get_fence when writing to the
      through the GTT in a tiled manner, and calling put_fence before writing
      to the object through the GTT in a linear manner, the code becomes
      clearer and there is less chance of making a mistake.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      [danvet: fixed up conflict with ppgtt code and spelling in a new
      comment.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9a5a53b3
  17. 01 4月, 2012 1 次提交
  18. 27 3月, 2012 2 次提交
  19. 26 3月, 2012 1 次提交
    • C
      drm/i915: Batch copy_from_user for relocation processing · 1d83f442
      Chris Wilson 提交于
      Originally the code tried to allocate a large enough array to perform
      the copy using vmalloc, performance wasn't great and throughput was
      improved by processing each individual relocation entry separately.
      This too is not as efficient as one would desire. A compromise would be
      to allocate a single page, or to allocate a few entries on the stack,
      and process the copy in batches. The latter gives simpler code and more
      consistent performance due to a lack of heuristic.
      
      x11perf -copywinwin10:	n450/pnv	i3-330m		i5-2520m (cpu)
                     before: 	  249000	 785000		 1280000 (80%)
                       page:	  264000	 896000		 1280000 (65%)
                   on-stack:	  264000	 902000		 1280000 (67%)
      
      v2: Use 512-bytes of stack for batching rather than allocate a page.
      v3: Tidy the code slightly with more descriptive variable names
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1d83f442
  20. 21 3月, 2012 1 次提交
    • D
      drm/i915: implement SNB workaround for lazy global gtt · 149c8407
      Daniel Vetter 提交于
      PIPE_CONTROL on snb needs global gtt mappings in place to workaround a
      hw gotcha. No other commands need such a workaround. Luckily we can
      detect a PIPE_CONTROL commands easily because they have a write_domain
      = I915_GEM_DOMAIN_INSTRUCTION (and nothing else has that).
      
      v2: Binding the target of such a reloc into the global gtt actually
      works instead of binding the source, which is rather pointless ...
      
      v3: Kill a superflous has_global_gtt_mapping assignement noticed by
      Chris Wilson.
      Reviewed-and-tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      149c8407
  21. 10 2月, 2012 1 次提交
  22. 09 2月, 2012 1 次提交
  23. 30 1月, 2012 3 次提交
  24. 26 1月, 2012 1 次提交
  25. 04 1月, 2012 1 次提交
    • E
      drm/i915: Add support for resetting the SO write pointers on gen7. · ae662d31
      Eric Anholt 提交于
      These registers are automatically incremented by the hardware during
      transform feedback to track where the next streamed vertex output
      should go.  Unlike the previous generation, which had a packet for
      setting the corresponding registers to a defined value, gen7 only has
      MI_LOAD_REGISTER_IMM to do so.  That's a secure packet (since it loads
      an arbitrary register), so we need to do it from the kernel, and it
      needs to be settable atomically with the batchbuffer execution so that
      two clients doing transform feedback don't stomp on each others'
      state.
      
      Instead of building a more complicated interface involcing setting the
      registers to a specific value, just set them to 0 when asked and
      userland can tweak its pointers accordingly.
      Signed-off-by: NEric Anholt <eric@anholt.net>
      Reviewed-by: NEugeni Dodonov <eugeni.dodonov@intel.com>
      Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: NKeith Packard <keithp@keithp.com>
      ae662d31