1. 26 Jul, 2012 7 commits
  2. 25 Jul, 2012 1 commit
  3. 20 Jul, 2012 1 commit
    • C
      drm/i915: Insert a flush between batches if the breadcrumb was dropped · 09cf7c9a
      Chris Wilson committed
      If we drop the breadcrumb request after a batch, due to a signal for
      example, we aim to fix it up at the next opportunity. In this case we
      emit a second batchbuffer with no waits upon the first and so no
      opportunity to insert the missing request, so we need to emit the
      missing flush for coherency. (Note that invalidating the render
      cache is the same as flushing it, so there should have been no
      observable corruption.)
      
      Note that besides simply adding the missing flush, avoiding potential
      render corruption, this will also fix at least parts of the problem
      introduced by some funny interaction of these two commits:
      
      commit de2b9985
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jul 4 22:52:50 2012 +0200
      
          drm/i915: don't return a spurious -EIO from intel_ring_begin
      
      which allowed intel_ring_begin to return -ERESTARTSYS and
      
      commit cc889e0f
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jun 13 20:45:19 2012 +0200
      
          drm/i915: disable flushing_list/gpu_write_list
      
      which essentially disabled the flushing list.
      
      The issue happens when we submit a batch & emit it, but get
      interrupted (thanks to the first patch) while trying to emit the
      flush. On the next batch we still assume that the full gpu domain
      handling is in effect and hence compute the invalidate&flushing
      domains. But thanks to the 2nd patch we totally ignore these and only
      invalidate all gpu domains, presuming that any required flushes have
      been issued already.  Which is wrong and eventually results in us
      updating the new write_domain values with the computed
      pending_write_domain values, which leaves an object with write_domain
      == 0 on the gpu_write_list.
      
      As soon as we try to unbind that object, things blow up.
      
      Fix this by emitting the missing flush according to the new
      ring->gpu_caches_dirty flag.
      
      Note that this does _not_ fix all the current cases where we end up
      with an object on the flushing_list that can't be flushed.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52040
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add bug explanation to commit message.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      09cf7c9a
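The failure mode and the fix described in this commit can be sketched as a small userspace model. This is an illustration under stated assumptions, not the driver code: `struct ring_state`, `try_flush`, `submit_batch`, `signal_pending` and `MODEL_ERESTARTSYS` are hypothetical stand-ins; only the flag name `gpu_caches_dirty` comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ERESTARTSYS 512 /* hypothetical stand-in for the kernel-internal errno */

/* Hypothetical model of per-ring state; not the real ring structure. */
struct ring_state {
    bool gpu_caches_dirty; /* a flush is still owed for an earlier batch */
    bool signal_pending;   /* simulates a signal interrupting ring emission */
    int flushes_emitted;
    int batches_emitted;
};

/* Try to emit a flush; fails without side effects if a signal is pending. */
static int try_flush(struct ring_state *ring)
{
    assert(ring != NULL);
    if (ring->signal_pending)
        return -MODEL_ERESTARTSYS;
    ring->flushes_emitted++;
    ring->gpu_caches_dirty = false;
    return 0;
}

/* Submit one batch: first make up for a dropped flush, then emit the batch,
 * then attempt the trailing flush. */
static int submit_batch(struct ring_state *ring)
{
    int ret;

    assert(ring != NULL);
    if (ring->gpu_caches_dirty) {
        ret = try_flush(ring);
        if (ret)
            return ret; /* nothing emitted yet, so restarting is safe */
    }
    ring->batches_emitted++;
    ring->gpu_caches_dirty = true;
    /* Past the point of no return: an interrupted flush only leaves the
     * flag set, so the next submission emits it. */
    (void)try_flush(ring);
    return 0;
}
```

In this model an interrupted trailing flush is never lost: the flag stays set, and the next call to `submit_batch` emits the flush before the new batch.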
  4. 20 Jun, 2012 1 commit
    • D
      drm/i915: disable flushing_list/gpu_write_list · cc889e0f
      Daniel Vetter committed
      This is just the minimal patch to disable all this code so that we can
      do decent amounts of QA before we rip it all out.
      
      The complicating thing is that we need to flush the gpu caches after
      the batchbuffer is emitted. Which is past the point of no return where
      execbuffer can't fail any more (otherwise we risk submitting the same
      batch multiple times).
      
      Hence we need to add a flag to track whether any caches associated
      with that ring are dirty. And emit the flush in add_request if that's
      the case.
      
      Note that this has quite a few behaviour changes:
      - Caches get flushed/invalidated unconditionally.
      - Invalidation now happens after potential inter-ring sync.
      
      I've bantered around a bit with Chris on irc whether this fixes
      anything, and it might or might not. The only thing clear is that with
      these changes it's much easier to reason about correctness.
      
      Also rip out a lone get_next_request_seqno in the execbuffer
      retire_commands function. I've dug around and I couldn't figure out
      why that is still there, with the outstanding lazy request stuff it
      shouldn't be necessary.
      
      v2: Chris Wilson complained that I also invalidate the read caches
      when flushing after a batchbuffer. Now optimized.
      
      v3: Added some comments to explain the new flushing behaviour.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      cc889e0f
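The dirty-flag idea in this commit (record that caches are dirty once the batch is emitted, then flush in add_request) can be sketched as follows. This is a minimal model, assuming a hypothetical `struct ring_model` with illustrative `emit_batch` and `add_request` helpers; the field name `gpu_caches_dirty` is borrowed from the follow-up commit 09cf7c9a, and none of this is the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a ring; not the real driver structure. */
struct ring_model {
    bool gpu_caches_dirty; /* set once a batch may have dirtied GPU caches */
    int flushes_emitted;   /* counts flushes, for illustration only */
};

/* Emitting a batch is the point of no return: execbuffer must not fail
 * afterwards, so we only record that a flush is now owed. */
static void emit_batch(struct ring_model *ring)
{
    assert(ring != NULL);
    ring->gpu_caches_dirty = true;
}

/* Emit the owed flush, if any, when the request is added. */
static void add_request(struct ring_model *ring)
{
    assert(ring != NULL);
    if (ring->gpu_caches_dirty) {
        ring->flushes_emitted++;
        ring->gpu_caches_dirty = false;
    }
}
```

Keeping the flush decision in one flag is what makes the behaviour easier to reason about: a flush is owed exactly when the flag is set.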
  5. 14 Jun, 2012 1 commit
    • B
      drm/i915/context: switch contexts with execbuf2 · 6e0a69db
      Ben Widawsky committed
      Use the rsvd1 field in execbuf2 to specify the context ID associated
      with the workload. This will allow the driver to do the proper context
      switch when/if needed.
      
      v2: Add checks for context switches on rings not supporting contexts.
      Before the code would silently ignore such requests.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      6e0a69db
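A rough sketch of how a context ID might be read from the `rsvd1` field and rejected on rings without context support. Everything here is an assumption made for illustration: the cut-down `struct execbuf2_model`, the `struct ring_caps` flag, the 32-bit mask, treating ID 0 as "no specific context", and the choice of `-EPERM` as the error are not taken from the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DEFAULT_CONTEXT 0u /* assumption: 0 means "no specific context" */

/* Hypothetical cut-down execbuf2 argument; only the field discussed above. */
struct execbuf2_model {
    uint64_t rsvd1; /* carries the context ID */
};

/* Hypothetical per-ring capability flag. */
struct ring_caps {
    bool supports_contexts;
};

/* Returns 0 and stores the context ID, or a negative errno if a context
 * switch was requested on a ring that cannot perform one. */
static int get_context_id(const struct execbuf2_model *args,
                          const struct ring_caps *ring, uint32_t *ctx_id)
{
    uint32_t id;

    assert(args != NULL && ring != NULL && ctx_id != NULL);
    id = (uint32_t)(args->rsvd1 & 0xffffffffu); /* assumed 32-bit ID */
    if (id != MODEL_DEFAULT_CONTEXT && !ring->supports_contexts)
        return -EPERM; /* assumed error code: reject, do not silently ignore */
    *ctx_id = id;
    return 0;
}
```

The explicit error corresponds to the v2 note: a request that cannot be honoured is reported to the caller instead of being dropped.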
  6. 20 May, 2012 1 commit
  7. 08 May, 2012 1 commit
    • C
      drm/i915: Limit calling mark-busy only for potential scanouts · acb87dfb
      Chris Wilson committed
      The principle of intel_mark_busy() is that we want to spot the
      transition of when the display engine is being used in order to bump
      powersaving modes and increase display clocks. As such it is only
      important when the display is changing, i.e. when rendering to the
      scanout or other sprite/plane, and these are characterised by being
      pinned.
      
      v2: Mark the whole device as busy on execbuffer and pageflips as well
      and rebase against dinq for the minor bug fix to be immediately
      applicable.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: fix compile fail.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      acb87dfb
  8. 03 May, 2012 2 commits
    • D
      drm/i915: disallow clip rects on gen5+ · 6ebebc92
      Daniel Vetter committed
      Unfortunately there has been dri1 userspace that used gem to manage
      the gtt and hence also needed cliprects in the execbuf ioctl. So
      we can't ever remove that code without breaking the ioctl abi.
      
      But at least we can disable it on gen5+, because these horrible
      versions of mesa have not supported these chips.
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      6ebebc92
    • B
      drm/i915: remove do_retire from i915_wait_request · b2da9fe5
      Ben Widawsky committed
      This originates from a hack by me to quickly fix a bug in an earlier
      patch where we needed control over whether or not waiting on a seqno
      actually did any retire list processing. Since the two operations aren't
      clearly related, we should pull the parameter out of the wait function,
      and make the caller responsible for retiring if the action is desired.
      
      The only function call site which did not get an explicit retire_request call
      (on purpose) is i915_gem_inactive_shrink(). That code was already calling
      retire_request a second time.
      
      v2: don't modify any behavior except i915_gem_inactive_shrink (Daniel)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      b2da9fe5
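The refactoring described here, splitting "wait" from "retire" and letting the caller compose them, can be sketched like this. The struct and function names below are hypothetical and the wait is modelled as completing immediately; the real functions take different arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a ring's request bookkeeping. */
struct seqno_ring {
    uint32_t completed; /* last seqno the GPU has finished */
    uint32_t retired;   /* last seqno whose resources were released */
};

/* Wait only: the model "waits" by marking the seqno complete. */
static int wait_request(struct seqno_ring *ring, uint32_t seqno)
{
    assert(ring != NULL);
    if (ring->completed < seqno) /* seqno wraparound ignored in this sketch */
        ring->completed = seqno;
    return 0;
}

/* Retire only: release everything that has completed. */
static void retire_requests(struct seqno_ring *ring)
{
    assert(ring != NULL);
    ring->retired = ring->completed;
}

/* A caller that wants the old do_retire behaviour composes the two. */
static int wait_then_retire(struct seqno_ring *ring, uint32_t seqno)
{
    int ret = wait_request(ring, seqno);

    if (ret)
        return ret;
    retire_requests(ring);
    return 0;
}
```

With the parameter gone, a call site that should not retire simply calls the wait function alone.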
  9. 24 Apr, 2012 2 commits
  10. 18 Apr, 2012 2 commits
  11. 13 Apr, 2012 2 commits
    • B
      drm/i915: use semaphores for the display plane · 2911a35b
      Ben Widawsky committed
      In theory this will have performance and power improvements. Performance
      because we don't need to stall when the scanout BO is busy, and power
      because we don't have to stall when the BO is busy (and the ring can
      even go to sleep if the HW supports it).
      
      v2:
      squash 2 patches into 1 (me)
      un-inline the enable_semaphores function (Daniel)
      remove comment about SNB hangs from i915_gem_object_sync (Chris)
      rename intel_enable_semaphores to i915_semaphore_is_enabled (me)
      removed page flip comment; "no why" (Chris)
      
      To address other comments from Daniel (irc):
      update the comment to say 'vt-d is crap, don't enable semaphores'
        - I think you misinterpreted Chris' comment, it already exists.
      checking out whether we can pageflip on the render ring on ivb (didn't
      work on early silicon)
        - We don't want to enable workarounds for early silicon unless we have
          to.
        - I can't find any references in the docs about this.
      optionally use it if the fb is already busy on the render ring
        - This should be how the code already worked, unless I am
          misunderstanding your meaning.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      2911a35b
    • C
      drm/i915: Reorganise rules for get_fence/put_fence · 9a5a53b3
      Chris Wilson committed
      By simplifying the rules to calling get_fence when writing to the
      object through the GTT in a tiled manner, and calling put_fence before writing
      to the object through the GTT in a linear manner, the code becomes
      clearer and there is less chance of making a mistake.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      [danvet: fixed up conflict with ppgtt code and spelling in a new
      comment.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      9a5a53b3
  12. 01 Apr, 2012 1 commit
  13. 27 Mar, 2012 2 commits
  14. 26 Mar, 2012 1 commit
    • C
      drm/i915: Batch copy_from_user for relocation processing · 1d83f442
      Chris Wilson committed
      Originally the code tried to allocate a large enough array to perform
      the copy using vmalloc, performance wasn't great and throughput was
      improved by processing each individual relocation entry separately.
      This too is not as efficient as one would desire. A compromise would be
      to allocate a single page, or to allocate a few entries on the stack,
      and process the copy in batches. The latter gives simpler code and more
      consistent performance due to a lack of heuristic.
      
      x11perf -copywinwin10:	n450/pnv	i3-330m		i5-2520m (cpu)
                     before: 	  249000	 785000		 1280000 (80%)
                       page:	  264000	 896000		 1280000 (65%)
                   on-stack:	  264000	 902000		 1280000 (67%)
      
      v2: Use 512-bytes of stack for batching rather than allocate a page.
      v3: Tidy the code slightly with more descriptive variable names
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      1d83f442
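The on-stack batching described in this commit can be sketched in userspace as below. This is an illustration only: `struct reloc_model` is a hypothetical two-field stand-in for the real relocation entry, `model_copy_from_user` replaces the kernel's `copy_from_user` with a plain `memcpy`, and the per-entry "processing" is reduced to summing one field. The 512-byte stack budget follows the v2 note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation entry; the real one has more fields. */
struct reloc_model {
    uint64_t offset;
    uint64_t delta;
};

#define STACK_BYTES 512
#define BATCH_ENTRIES (STACK_BYTES / sizeof(struct reloc_model))

/* Stand-in for copy_from_user(): returns the number of bytes NOT copied. */
static size_t model_copy_from_user(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

/* Copy and process `count` entries in stack-sized batches.
 * Returns 0 on success, -1 if a copy fails. `sum` accumulates the delta
 * fields, which shows every entry was visited exactly once. */
static int process_relocs(const struct reloc_model *user, size_t count,
                          uint64_t *sum)
{
    struct reloc_model stack[BATCH_ENTRIES];
    size_t i;

    assert(sum != NULL);
    while (count > 0) {
        size_t n = count < BATCH_ENTRIES ? count : BATCH_ENTRIES;

        if (model_copy_from_user(stack, user, n * sizeof(stack[0])))
            return -1;
        for (i = 0; i < n; i++)
            *sum += stack[i].delta;
        user += n;
        count -= n;
    }
    return 0;
}
```

A fixed stack buffer needs no allocation and no size heuristic, which matches the commit's argument for simpler code and more consistent performance.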
  15. 21 Mar, 2012 1 commit
    • D
      drm/i915: implement SNB workaround for lazy global gtt · 149c8407
      Daniel Vetter committed
      PIPE_CONTROL on snb needs global gtt mappings in place to workaround a
      hw gotcha. No other commands need such a workaround. Luckily we can
      detect PIPE_CONTROL commands easily because they have a write_domain
      = I915_GEM_DOMAIN_INSTRUCTION (and nothing else has that).
      
      v2: Binding the target of such a reloc into the global gtt actually
      works instead of binding the source, which is rather pointless ...
      
      v3: Kill a superfluous has_global_gtt_mapping assignment noticed by
      Chris Wilson.
      Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      149c8407
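The detection rule in this commit (a relocation whose write_domain is exactly the instruction domain identifies a PIPE_CONTROL target) might look roughly like this. The structs and helper names are hypothetical, the `is_gen6` parameter stands in for the driver's generation check, and the constant's value is an assumption that should be checked against the i915 uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror I915_GEM_DOMAIN_INSTRUCTION from the i915 uapi header. */
#define MODEL_DOMAIN_INSTRUCTION 0x00000010u

/* Hypothetical cut-down relocation entry and target object. */
struct reloc_entry_model {
    uint32_t write_domain;
};

struct target_model {
    bool has_global_gtt_mapping;
};

/* True when the relocation target must be bound into the global GTT. */
static bool needs_global_gtt_bind(bool is_gen6,
                                  const struct reloc_entry_model *reloc,
                                  const struct target_model *target)
{
    assert(reloc != NULL && target != NULL);
    return is_gen6 &&
           reloc->write_domain == MODEL_DOMAIN_INSTRUCTION &&
           !target->has_global_gtt_mapping;
}

/* Per the v2 note, the target of the relocation is bound, not the source. */
static void apply_snb_workaround(bool is_gen6,
                                 const struct reloc_entry_model *reloc,
                                 struct target_model *target)
{
    if (needs_global_gtt_bind(is_gen6, reloc, target))
        target->has_global_gtt_mapping = true; /* models the bind */
}
```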
  16. 10 Feb, 2012 1 commit
  17. 09 Feb, 2012 1 commit
  18. 30 Jan, 2012 3 commits
  19. 26 Jan, 2012 1 commit
  20. 04 Jan, 2012 3 commits
  21. 27 Dec, 2011 1 commit
  22. 17 Dec, 2011 1 commit
  23. 22 Sep, 2011 1 commit
    • B
      drm/i915: Dumb down the semaphore logic · c8c99b0f
      Ben Widawsky committed
      While I think the previous code is correct, it was hard to follow and
      hard to debug. Since we already have a ring abstraction, might as well
      use it to handle the semaphore updates and compares.
      
      I don't expect this code to make semaphores better or worse, but you
      never know...
      
      v2:
      Remove magic per Keith's suggestions.
      Ran Daniel's gem_ring_sync_loop test on this.
      
      v3:
      Ignored one of Keith's suggestions.
      
      v4:
      Removed some bloat per Daniel's recommendation.
      
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Keith Packard <keithp@keithp.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Keith Packard <keithp@keithp.com>
      c8c99b0f
  24. 22 Jun, 2011 1 commit
    • E
      Revert "drm/i915: Kill GTT mappings when moving from GTT domain" · e92d03bf
      Eric Anholt committed
      This reverts commit 4a684a41.
      Userland has always been required to set the object's domain to GTT
      before using it through a GTT mapping, it's not something that the
      kernel is supposed to enforce.  (The pagefault support is so that we
      can handle multiple mappings without userland having to pin across
      them, not so that userland can use GTT after GPU domains without
      telling the kernel).
      
      Fixes 19.2% +/- 0.8% (n=6) performance regression in cairo-gl
      firefox-talos-gfx on my T420 laptop.
      Signed-off-by: Keith Packard <keithp@keithp.com>
      e92d03bf
  25. 23 Mar, 2011 1 commit
    • C
      drm/i915: Disable pagefaults along execbuffer relocation fast path · d4aeee77
      Chris Wilson committed
      Along the fast path for relocation handling, we attempt to copy directly
      from the user data structures whilst holding our mutex. This causes
      lockdep to warn about circular lock dependencies if we need to pagefault
      the user pages. [Since when handling a page fault on a mmapped bo, we
      need to acquire the struct mutex whilst already holding the mm
      semaphore, it is then verboten to acquire the mm semaphore when already
      holding the struct mutex. The likelihood of the user passing in the
      relocations contained in a GTT mmapped bo is low, but conceivable for
      extreme pathology.] In order to force the mm to return EFAULT rather
      than handle the pagefault, we therefore need to disable pagefaults
      across the relocation fast path.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@kernel.org
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      d4aeee77
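The pattern behind this fix, a copy that reports EFAULT instead of faulting while the mutex is held, can be modelled in userspace as below. This is a sketch under assumptions: the structs, `copy_nofault` and `copy_relocs` are hypothetical, page residency is a simple flag, and the slow-path retry after dropping the mutex is an assumed companion to the fast path, not something this commit message spells out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user buffer: `resident` is false when touching the data
 * would require a page fault. */
struct user_buf_model {
    const void *data;
    bool resident;
};

/* Hypothetical device state tracking whether the struct mutex is held. */
struct dev_model {
    bool mutex_held;
};

/* Stand-in for a copy with page faults disabled: instead of faulting (and
 * taking the mm semaphore under our mutex) it reports -EFAULT. */
static int copy_nofault(void *dst, const struct user_buf_model *src,
                        size_t len)
{
    assert(dst != NULL && src != NULL);
    if (!src->resident)
        return -EFAULT;
    memcpy(dst, src->data, len);
    return 0;
}

/* Fast path under the mutex; on -EFAULT, drop the mutex, let the fault
 * happen, retake the mutex and retry. */
static int copy_relocs(struct dev_model *dev, void *dst,
                       struct user_buf_model *src, size_t len)
{
    int ret;

    assert(dev != NULL && dev->mutex_held);
    ret = copy_nofault(dst, src, len);
    if (ret != -EFAULT)
        return ret;
    dev->mutex_held = false; /* slow path: faulting is allowed here */
    src->resident = true;    /* models the page being faulted in */
    dev->mutex_held = true;
    return copy_nofault(dst, src, len);
}
```

The point of the model is the ordering: the fault is only ever serviced while the mutex is not held, which avoids the lock inversion lockdep warns about.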