1. 22 11月, 2012 1 次提交
    • C
      drm/i915: Remove bogus test for a present execbuffer · be7cb634
      Chris Wilson 提交于
      The intention of checking obj->gtt_offset!=0 is to verify that the
      target object was listed in the execbuffer and had been bound into the
      GTT. This is guarranteed by the earlier rearrangement to split the
      execbuffer operation into reserve and relocation phases and then
      verified by the check that the target handle had been processed during
      the reservation phase.
      
      However, the actual checking of obj->gtt_offset==0 is bogus as we can
      indeed reference an object at offset 0. For instance, the framebuffer
      installed by the BIOS often resides at offset 0 - causing EINVAL as we
      legimately try to render using the stolen fb.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NEric Anholt <eric@anholt.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      be7cb634
  2. 12 11月, 2012 1 次提交
    • B
      drm/i915: Stop using AGP layer for GEN6+ · e76e9aeb
      Ben Widawsky 提交于
      As a quick hack we make the old intel_gtt structure mutable so we can
      fool a bunch of the existing code which depends on elements in that data
      structure. We can/should try to remove this in a subsequent patch.
      
      This should preserve the old gtt init behavior which upon writing these
      patches seems incorrect. The next patch will fix these things.
      
      The one exception is VLV which doesn't have the preserved flush control
      write behavior. Since we want to do that for all GEN6+ stuff, we'll
      handle that in a later patch. Mainstream VLV support doesn't actually
      exist yet anyway.
      
      v2: Update the comment to remove the "voodoo"
      Check that the last pte written matches what we readback
      
      v3: actually kill cache_level_to_agp_type since most of the flags will
      disappear in an upcoming patch
      
      v4: v3 was actually not what we wanted (Daniel)
      Make the ggtt bind assertions better and stricter (Chris)
      Fix some uncaught errors at gtt init (Chris)
      Some other random stuff that Chris wanted
      
      v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a
      tad more robust by mapping everything != CACHE_NONE to the cached agp
      flag - we have a 1:1 uncached mapping, but different modes of
      cacheable (at least on later generations). Suggested by Chris Wilson.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e76e9aeb
  3. 18 10月, 2012 1 次提交
    • C
      drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers · d7d4eedd
      Chris Wilson 提交于
      With the introduction of per-process GTT space, the hardware designers
      thought it wise to also limit the ability to write to MMIO space to only
      a "secure" batch buffer. The ability to rewrite registers is the only
      way to program the hardware to perform certain operations like scanline
      waits (required for tear-free windowed updates). So we either have a
      choice of adding an interface to perform those synchronized updates
      inside the kernel, or we permit certain processes the ability to write
      to the "safe" registers from within its command stream. This patch
      exposes the ability to submit a SECURE batch buffer to
      DRM_ROOT_ONLY|DRM_MASTER processes.
      
      v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
      bit (bit 13, accidentally not set). Also add a comment explaining why
      secure batches need a global gtt binding.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      [danvet: added hsw fixup.]
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d7d4eedd
  4. 03 10月, 2012 2 次提交
  5. 20 9月, 2012 3 次提交
  6. 25 8月, 2012 1 次提交
    • C
      drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer · 7788a765
      Chris Wilson 提交于
      If we need to stall in order to complete the pin_and_fence operation
      during execbuffer reservation, there is a high likelihood that the
      operation will be interrupted by a signal (thanks X!). In order to
      simplify the cleanup along that error path, the object was
      unconditionally unbound and the error propagated. However, being
      interrupted here is far more common than I would like and so we can
      strive to avoid the extra work by eliminating the forced unbind.
      
      v2: In discussion over the indecent colour of the new functions and
      unwind path, we realised that we can use the new unreserve function to
      clean up the code even further.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7788a765
  7. 24 8月, 2012 2 次提交
  8. 21 8月, 2012 1 次提交
    • C
      drm/i915: Track unbound pages · 6c085a72
      Chris Wilson 提交于
      When dealing with a working set larger than the GATT, or even the
      mappable aperture when touching through the GTT, we end up with evicting
      objects only to rebind them at a new offset again later. Moving an
      object into and out of the GTT requires clflushing the pages, thus
      causing a double-clflush penalty for rebinding.
      
      To avoid having to clflush on rebinding, we can track the pages as they
      are evicted from the GTT and only relinquish those pages on memory
      pressure.
      
      As usual, if it were not for the handling of out-of-memory condition and
      having to manually shrink our own bo caches, it would be a net reduction
      of code. Alas.
      
      Note: The patch also contains a few changes to the last-hope
      evict_everything logic in i916_gem_execbuffer.c - we no longer try to
      only evict the purgeable stuff in a first try (since that's superflous
      and only helps in OOM corner-cases, not fragmented-gtt trashing
      situations).
      
      Also, the extraction of the get_pages retry loop from bind_to_gtt (and
      other callsites) to get_pages should imo have been a separate patch.
      
      v2: Ditch the newly added put_pages (for unbound objects only) in
      i915_gem_reset. A quick irc discussion hasn't revealed any important
      reason for this, so if we need this, I'd like to have a git blame'able
      explanation for it.
      
      v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Split out code movements and rant a bit in the commit message
      with a few Notes. Done v2]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6c085a72
  9. 06 8月, 2012 1 次提交
  10. 26 7月, 2012 7 次提交
  11. 25 7月, 2012 1 次提交
  12. 20 7月, 2012 1 次提交
    • C
      drm/i915: Insert a flush between batches if the breadcrumb was dropped · 09cf7c9a
      Chris Wilson 提交于
      If we drop the breadcrumb request after a batch due to a signal for
      example we aim to fix it up at the next opportunity. In this case we
      emit a second batchbuffer with no waits upon the first and so no
      opportunity to insert the missing request, so we need to emit the
      missing flush for coherency. (Note that that invalidating the render
      cache is the same as flushing it, so there should have been no
      observable corruption.)
      
      Note that beside simply adding the missing flush, avoiding potential
      render corruption, this will also fix at least parts of the problem
      introduced by some funny interaction of these two commits:
      
      commit de2b9985
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jul 4 22:52:50 2012 +0200
      
          drm/i915: don't return a spurious -EIO from intel_ring_begin
      
      which allowed intel_ring_begin to return -ERESTARTSYS and
      
      commit cc889e0f
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Wed Jun 13 20:45:19 2012 +0200
      
          drm/i915: disable flushing_list/gpu_write_list
      
      which essentially disabled the flushing list.
      
      The issue happens when we submit a batch & emit it, but get
      interrupted (thanks to the first patch) while trying to emit the
      flush. On the next batch we still assume that the full gpu domain
      handling is in effect and hence compute the invalidate&flushing
      domains. But thanks to the 2nd patch we totally ignore these and only
      invalidate all gpu domains, presuming that any required flushes have
      been issued already.  Which is wrong and eventually results in us
      updating the new write_domain values with the computed
      pending_write_domain values, which leaves an object with write_domain
      == 0 on the gpu_write_list.
      
      As soon as we try to unbind that object, things blow up.
      
      Fix this by emitting the missing flush according to the new
      ring->gpu_caches_dirty flag.
      
      Note that this does _not_ fix all the current cases where we end up
      with an object on the flushing_list that can't be flushed.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52040Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add bug explanation to commit message.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      09cf7c9a
  13. 20 6月, 2012 1 次提交
    • D
      drm/i915: disable flushing_list/gpu_write_list · cc889e0f
      Daniel Vetter 提交于
      This is just the minimal patch to disable all this code so that we can
      do decent amounts of QA before we rip it all out.
      
      The complicating thing is that we need to flush the gpu caches after
      the batchbuffer is emitted. Which is past the point of no return where
      execbuffer can't fail any more (otherwise we risk submitting the same
      batch multiple times).
      
      Hence we need to add a flag to track whether any caches associated
      with that ring are dirty. And emit the flush in add_request if that's
      the case.
      
      Note that this has a quite a few behaviour changes:
      - Caches get flushed/invalidated unconditionally.
      - Invalidation now happens after potential inter-ring sync.
      
      I've bantered around a bit with Chris on irc whether this fixes
      anything, and it might or might not. The only thing clear is that with
      these changes it's much easier to reason about correctness.
      
      Also rip out a lone get_next_request_seqno in the execbuffer
      retire_commands function. I've dug around and I couldn't figure out
      why that is still there, with the outstanding lazy request stuff it
      shouldn't be necessary.
      
      v2: Chris Wilson complained that I also invalidate the read caches
      when flushing after a batchbuffer. Now optimized.
      
      v3: Added some comments to explain the new flushing behaviour.
      
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      cc889e0f
  14. 14 6月, 2012 1 次提交
    • B
      drm/i915/context: switch contexts with execbuf2 · 6e0a69db
      Ben Widawsky 提交于
      Use the rsvd1 field in execbuf2 to specify the context ID associated
      with the workload. This will allow the driver to do the proper context
      switch when/if needed.
      
      v2: Add checks for context switches on rings not supporting contexts.
      Before the code would silently ignore such requests.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      6e0a69db
  15. 20 5月, 2012 1 次提交
  16. 08 5月, 2012 1 次提交
    • C
      drm/i915: Limit calling mark-busy only for potential scanouts · acb87dfb
      Chris Wilson 提交于
      The principle of intel_mark_busy() is that we want to spot the
      transition of when the display engine is being used in order to bump
      powersaving modes and increase display clocks. As such it is only
      important when the display is changing, i.e. when rendering to the
      scanout or other sprite/plane, and these are characterised by being
      pinned.
      
      v2: Mark the whole device as busy on execbuffer and pageflips as well
      and rebase against dinq for the minor bug fix to be immediately
      applicable.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: fix compile fail.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      acb87dfb
  17. 03 5月, 2012 2 次提交
    • D
      drm/i915: disallow clip rects on gen5+ · 6ebebc92
      Daniel Vetter 提交于
      Unfortunately there has been dri1 userspace that used gem to manage
      the gtt and hence also needed cliprects in the execbuf ioctl. So
      we can't ever remove that code without breaking the ioctl abi.
      
      But at least we can disable it on gen5+, because these horrible
      versions of mesa have not supported these chips.
      Acked-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6ebebc92
    • B
      drm/i915: remove do_retire from i915_wait_request · b2da9fe5
      Ben Widawsky 提交于
      This originates from a hack by me to quickly fix a bug in an earlier
      patch where we needed control over whether or not waiting on a seqno
      actually did any retire list processing. Since the two operations aren't
      clearly related, we should pull the parameter out of the wait function,
      and make the caller responsible for retiring if the action is desired.
      
      The only function call site which did not get an explicit retire_request call
      (on purpose) is i915_gem_inactive_shrink(). That code was already calling
      retire_request a second time.
      
      v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      b2da9fe5
  18. 24 4月, 2012 2 次提交
  19. 18 4月, 2012 2 次提交
  20. 13 4月, 2012 2 次提交
    • B
      drm/i915: use semaphores for the display plane · 2911a35b
      Ben Widawsky 提交于
      In theory this will have performance and power improvements. Performance
      because we don't need to stall when the scanout BO is busy, and power
      because we don't have to stall when the BO is busy (and the ring can
      even go to sleep if the HW supports it).
      
      v2:
      squash 2 patches into 1 (me)
      un-inline the enable_semaphores function (Daniel)
      remove comment about SNB hangs from i915_gem_object_sync (Chris)
      rename intel_enable_semaphores to i915_semaphore_is_enabled (me)
      removed page flip comment; "no why" (Chris)
      
      To address other comments from Daniel (irc):
      update the comment to say 'vt-d is crap, don't enable semaphores'
        - I think you misinterpreted Chris' comment, it already exists.
      checking out whether we can pageflip on the render ring on ivb (didn't
      work on early silicon)
        - We don't want to enable workarounds for early silicon unless we have
          to.
        - I can't find any references in the docs about this.
      optionally use it if the fb is already busy on the render ring
        - This should be how the code already worked, unless I am
          misunderstanding your meaning.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2911a35b
    • C
      drm/i915: Reorganise rules for get_fence/put_fence · 9a5a53b3
      Chris Wilson 提交于
      By simplifying the rules to calling get_fence when writing to the
      through the GTT in a tiled manner, and calling put_fence before writing
      to the object through the GTT in a linear manner, the code becomes
      clearer and there is less chance of making a mistake.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      [danvet: fixed up conflict with ppgtt code and spelling in a new
      comment.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9a5a53b3
  21. 01 4月, 2012 1 次提交
  22. 27 3月, 2012 2 次提交
  23. 26 3月, 2012 1 次提交
    • C
      drm/i915: Batch copy_from_user for relocation processing · 1d83f442
      Chris Wilson 提交于
      Originally the code tried to allocate a large enough array to perform
      the copy using vmalloc, performance wasn't great and throughput was
      improved by processing each individual relocation entry separately.
      This too is not as efficient as one would desire. A compromise would be
      to allocate a single page, or to allocate a few entries on the stack,
      and process the copy in batches. The latter gives simpler code and more
      consistent performance due to a lack of heuristic.
      
      x11perf -copywinwin10:	n450/pnv	i3-330m		i5-2520m (cpu)
                     before: 	  249000	 785000		 1280000 (80%)
                       page:	  264000	 896000		 1280000 (65%)
                   on-stack:	  264000	 902000		 1280000 (67%)
      
      v2: Use 512-bytes of stack for batching rather than allocate a page.
      v3: Tidy the code slightly with more descriptive variable names
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      1d83f442
  24. 21 3月, 2012 1 次提交
    • D
      drm/i915: implement SNB workaround for lazy global gtt · 149c8407
      Daniel Vetter 提交于
      PIPE_CONTROL on snb needs global gtt mappings in place to workaround a
      hw gotcha. No other commands need such a workaround. Luckily we can
      detect a PIPE_CONTROL commands easily because they have a write_domain
      = I915_GEM_DOMAIN_INSTRUCTION (and nothing else has that).
      
      v2: Binding the target of such a reloc into the global gtt actually
      works instead of binding the source, which is rather pointless ...
      
      v3: Kill a superflous has_global_gtt_mapping assignement noticed by
      Chris Wilson.
      Reviewed-and-tested-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      149c8407
  25. 10 2月, 2012 1 次提交