1. 18 12月, 2013 2 次提交
    • B
      drm/i915: Create bind/unbind abstraction for VMAs · 6f65e29a
      Ben Widawsky 提交于
      To sum up what goes on here, we abstract the vma binding, similarly to
      the previous object binding. This helps for distinguishing legacy
      binding, versus modern binding. To keep the code churn as minimal as
      possible, I am leaving in insert_entries(). It serves as the per
      platform pte writing basically. bind_vma and insert_entries do share a
      lot of similarities, and I did have designs to combine the two, but as
      mentioned already... too much churn in an already massive patchset.
      
      What follows are the 3 commits which existed discretely in the original
      submissions. Upon rebasing on Broadwell support, it became clear that
      separation was not good, and only made for more error prone code. Below
      are the 3 commit messages with all their history.
      
      drm/i915: Add bind/unbind object functions to VMA
      drm/i915: Use the new vm [un]bind functions
      drm/i915: reduce vm->insert_entries() usage
      
      drm/i915: Add bind/unbind object functions to VMA
      
      As we plumb the code with more VM information, it has become more
      obvious that the easiest way to deal with bind and unbind is to simply
      put the function pointers in the vm, and let those choose the correct
      way to handle the page table updates. This change allows many places in
      the code to simply be vm->bind, and not have to worry about
      distinguishing PPGTT vs GGTT.
      
      Notice that this patch has no impact on functionality. I've decided to
      save the actual change until the next patch because I think it's easier
      to review that way. I'm happy to squash the two, or let Daniel do it on
      merge.
      
      v2:
      Make ggtt handle the quirky aliasing ppgtt
      Add flags to bind object to support above
      Don't ever call bind/unbind directly for PPGTT until we have real, full
      PPGTT (use NULLs to assert this)
      Make sure we rebind the ggtt if there already is a ggtt binding.  This
      happens on set cache levels.
      Use VMA for bind/unbind (Daniel, Ben)
      
      v3: Reorganize ggtt_vma_bind to be more concise and easier to read
      (Ville). Change logic in unbind to only unbind ggtt when there is a
      global mapping, and to remove a redundant check if the aliasing ppgtt
      exists.
      
      v4: Make the bind function a bit smarter about the cache levels to avoid
      unnecessary multiple remaps. "I accept it is a wart, I think unifying
      the pin_vma / bind_vma could be unified later" (Chris)
      Removed the git notes, and put version info here. (Daniel)
      
      v5: Update the comment to not suck (Chris)
      
      v6:
      Move bind/unbind to the VMA. It makes more sense in the VMA structure
      (always has, but I was previously lazy). With this change, it will allow
      us to keep a distinct insert_entries.
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      
      drm/i915: Use the new vm [un]bind functions
      
      Building on the last patch which created the new function pointers in
      the VM for bind/unbind, here we actually put those new function pointers
      to use.
      
      Split out as a separate patch to aid in review. I'm fine with squashing
      into the previous patch if people request it.
      
      v2: Updated to address the smart ggtt which can do aliasing as needed
      Make sure we bind to global gtt when mappable and fenceable. I thought
      we could get away without this initialy, but we cannot.
      
      v3: Make the global GTT binding explicitly use the ggtt VM for
      bind_vma(). While at it, use the new ggtt_vma helper (Chris)
      
      At this point the original mailing list thread diverges. ie.
      
      v4^:
      use target_obj instead of obj for gen6 relocate_entry
      vma->bind_vma() can be called safely during pin. So simply do that
      instead of the complicated conditionals.
      Don't restore PPGTT bound objects on resume path
      Bug fix in resume path for globally bound Bos
      Properly handle secure dispatch
      Rebased on vma bind/unbind conversion
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      
      drm/i915: reduce vm->insert_entries() usage
      
      FKA: drm/i915: eliminate vm->insert_entries()
      
      With bind/unbind function pointers in place, we no longer need
      insert_entries. We could, and want, to remove clear_range, however it's
      not totally easy at this point. Since it's used in a couple of place
      still that don't only deal in objects: setup, ppgtt init, and restore
      gtt mappings.
      
      v2: Don't actually remove insert_entries, just limit its usage. It will
      be useful when we introduce gen8. It will always be called from the vma
      bind/unbind.
      
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6f65e29a
    • B
      drm/i915: Make pin count per VMA · d7f46fc4
      Ben Widawsky 提交于
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      d7f46fc4
  2. 26 11月, 2013 1 次提交
  3. 09 11月, 2013 1 次提交
  4. 07 11月, 2013 1 次提交
  5. 16 10月, 2013 1 次提交
  6. 01 10月, 2013 1 次提交
  7. 20 9月, 2013 2 次提交
    • B
      drm/i915: Do remaps for all contexts · 3ccfd19d
      Ben Widawsky 提交于
      On both Ivybridge and Haswell, row remapping information is saved and
      restored with context. This means, we never actually properly supported
      the l3 remapping because our sysfs interface is asynchronous (and not
      tied to any context), and the known faulty HW would be reused by the
      next context to run.
      
      Not that due to the asynchronous nature of the sysfs entry, there is no
      point modifying the registers for the existing context. Instead we set a
      flag for all contexts to load the correct remapping information on the
      next run. Interested clients can use debugfs to determine whether or not
      the row has been remapped.
      
      One could propose at this point that we just do the remapping in the
      kernel. I guess since we have to maintain the sysfs interface anyway,
      I'm not sure how useful it is, and I do like keeping the policy in
      userspace; (it wasn't my original decision to make the
      interface the way it is, so I'm not attached).
      
      v2: Force a context switch when we have a remap on the next switch.
      (Ville)
      Don't let userspace use the interface with disabled contexts.
      
      v3: Don't force a context switch, just let it nop
      Improper context slice remap initialization, 1<<1 instead of 1<<i, but I
      rewrote it to avoid a second round of confusion.
      Error print moved to error path (All Ville)
      Added a comment on why the slice remap initialization happens.
      
      CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      3ccfd19d
    • B
      drm/i915: Keep a list of all contexts · a33afea5
      Ben Widawsky 提交于
      I have implemented this patch before without creating a separate list
      (I'm having trouble finding the links, but the messages ids are:
      <1364942743-6041-2-git-send-email-ben@bwidawsk.net>
      <1365118914-15753-9-git-send-email-ben@bwidawsk.net>)
      
      However, the code is much simpler to just use a list and it makes the
      code from the next patch a lot more pretty.
      
      As you'll see in the next patch, the reason for this is to be able to
      specify when a context needs to get L3 remapping. More details there.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a33afea5
  8. 04 9月, 2013 2 次提交
    • D
      drm/i915: It's its! · 508842a0
      Damien Lespiau 提交于
      Signed-off-by: NDamien Lespiau <damien.lespiau@intel.com>
      Acked-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      508842a0
    • C
      drm/i915: Do not add an interrupt for a context switch · c0321e2c
      Chris Wilson 提交于
      We use the request to ensure we hold a reference to the context for the
      duration that it remains in use by the ring. Each request only holds a
      reference to the current context, hence we emit a request after
      switching contexts with the final reference to the old context. However,
      the extra interrupt caused by that request is not useful (no timing
      critical function will wait for the context object), instead the overhead
      of servicing the IRQ shows up in some (lightweight) benchmarks. In order
      to keep the useful property of using the request to manage the context
      lifetime, we want to add a dummy request that is associated with the
      interrupt from the subsequent real request following the batch.
      
      The extra interrupt was added as a side-effect of using
      i915_add_request() in
      
      commit 112522f6
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu May 2 16:48:07 2013 +0300
      
          drm/i915: put context upon switching
      
      v2: Daniel convinced me that the request here was solely for context
      lifetime tracking and that we have the active ref to keep the object
      alive whilst the MI_SET_CONTEXT. So the only concern then is which
      context should get the blame for MI_SET_CONTEXT failing. The old scheme
      added a request for the old context so that any hang upto and including
      the switch away would mark the old context as guilty. Now any hang here
      implicates the new context. However since we have already gone through a
      complete flush with the last context in its last request, and all that
      lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we
      should be safe in not unduly placing blame on the new context.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c0321e2c
  9. 08 8月, 2013 1 次提交
    • B
      drm/i915: mm_list is per VMA · ca191b13
      Ben Widawsky 提交于
      formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
      
      The mm_list is used for the active/inactive LRUs. Since those LRUs are
      per address space, the link should be per VMx .
      
      Because we'll only ever have 1 VMA before this point, it's not incorrect
      to defer this change until this point in the patch series, and doing it
      here makes the change much easier to understand.
      
      Shamelessly manipulated out of Daniel:
      "active/inactive stuff is used by eviction when we run out of address
      space, so needs to be per-vma and per-address space. Bound/unbound otoh
      is used by the shrinker which only cares about the amount of memory used
      and not one bit about in which address space this memory is all used in.
      Of course to actual kick out an object we need to unbind it from every
      address space, but for that we have the per-object list of vmas."
      
      v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
      
      v3: Moved earlier in the series
      
      v4: Add dropped message from v3
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      [danvet: Frob patch to apply and use vma->node.size directly as
      discused with Ben. Also drop a needles BUG_ON before move_to_inactive,
      the function itself has the same check.]
      [danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA
      in destroy", specifically unlink the vma from the mm_list in
      vma_unbind (to keep it symmetric with bind_to_vm) instead of
      vma_destroy.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ca191b13
  10. 06 8月, 2013 2 次提交
  11. 16 7月, 2013 1 次提交
  12. 09 7月, 2013 1 次提交
    • B
      drm/i915: Getter/setter for object attributes · f343c5f6
      Ben Widawsky 提交于
      Soon we want to gut a lot of our existing assumptions how many address
      spaces an object can live in, and in doing so, embed the drm_mm_node in
      the object (and later the VMA).
      
      It's possible in the future we'll want to add more getter/setter
      methods, but for now this is enough to enable the VMAs.
      
      v2: Reworked commit message (Ben)
      Added comments to the main functions (Ben)
      sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
      sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
      (Daniel)
      
      v3: Rebased on new reserve_node patch
      Changed DRM_DEBUG_KMS to actually work (will need fixing later)
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      f343c5f6
  13. 01 7月, 2013 1 次提交
    • B
      drm/i915: Fix context sizes on HSW · a0de80a0
      Ben Widawsky 提交于
      With updates to the spec, we can actually see the context layout, and
      how many dwords are allocated. That table suggests we need 70720 bytes
      per HW context. Rounded up, this is 18 pages. Looking at what lives
      after the current 4 pages we use, I can't see too much important (mostly
      it's d3d related), but there are a couple of things which look scary. I
      am hopeful this can explain some of our odd HSW failures.
      
      v2: Make the context only 17 pages. The power context space isn't used
      ever, and execlists aren't used in our driver, making the actual total
      66944 bytes.
      
      v3: Add a comment to the code. (Jesse & Paulo)
      Reported-by: N"Azad, Vinit" <vinit.azad@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: NJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a0de80a0
  14. 13 6月, 2013 2 次提交
  15. 01 6月, 2013 1 次提交
    • B
      drm/i915: context debug messages · bb036413
      Ben Widawsky 提交于
      Add some debug messages to help figure out what goes wrong on context
      initialization.
      
      Later in the PPGTT series, I ended up having a lot of failures after
      reset. In many cases it was extra difficult to debug because I hadn't
      even realized that contexts failed to reinitialize after reset (again an
      artifact of some later patches).
      
      This fairly benign patch does help debug some potential issues which
      arise later.
      Signed-off-by: NBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      bb036413
  16. 11 5月, 2013 1 次提交
  17. 06 5月, 2013 2 次提交
  18. 04 5月, 2013 1 次提交
  19. 01 5月, 2013 1 次提交
    • M
      drm/i915: reference count for i915_hw_contexts · dce3271b
      Mika Kuoppala 提交于
      Enabling PPGTT and also the need to track which context was guilty of
      gpu hang (arb robustness enabling) have put pressure for struct i915_hw_context
      to be more than just a placeholder for hw context state.
      
      In order to track object lifetime properly in a multi peer usage, add reference
      counting for i915_hw_context.
      
      v2: track i915_hw_context pointers instead of using ctx_ids
      (from Chris Wilson)
      
      v3 (Ben): Get rid of do_release() and handle refcounting more compactly.
      (recommended by Chis)
      
      v4: kref_* put inside static inlines (Daniel Vetter)
      remove code duplication on freeing context (Chris Wilson)
      
      v5: idr_remove and ctx->file_priv = NULL in destroy ioctl (Chris)
      This actually will cause a problem if one destroys a context and later
      refers to the idea of the context (multiple contexts may have the same
      id, but only 1 will exist in the idr).
      
      v6: Strip out the request related stuff. Reworded commit message.
      Got rid of do_destroy and introduced i915_gem_context_release_handle,
      suggested by Chris Wilson.
      
      v7: idr_remove can't be called inside idr_for_each (Chris Wilson)
      
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v5)
      Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v7)
      Reviewed-by: NBen Widawsky <ben@bwidawsk.net>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Squash sob lines, the patch ping-ponged between Ben and Mika
      a bit ...]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      dce3271b
  20. 18 4月, 2013 1 次提交
    • C
      drm/i915: Use MLC (l3$) for context objects · 4615d4c9
      Chris Wilson 提交于
      Enabling context support increases SwapBuffers latency by about 20%
      (measured on an i7-3720qm). We can offset that loss slightly by enabling
      faster caching for the contexts. As they are not backed by any
      particular cache (such as the sampler or render caches) our only option
      is to select the generic mid-level cache. This reduces the latency of
      the swap by about 5%.
      
      Oddly this effect can be observed running smokin-guns on IVB at
      1280x1024:
      Using BLT copies for swaps: 151.67 fps
      Using Render copies for swaps (unpatched):  141.70 fps
      With contexts disabled: 150.23 fps
      With contexts in L3$: 150.77 fps
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Cc: Kenneth Graunke <kenneth@whitecape.org>
      Reviewed-by: NKenneth Graunke <kenneth@whitecape.org>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      4615d4c9
  21. 28 2月, 2013 1 次提交
  22. 15 2月, 2013 2 次提交
  23. 29 11月, 2012 1 次提交
    • C
      drm/i915: Preallocate next seqno before touching the ring · 9d773091
      Chris Wilson 提交于
      Based on the work by Mika Kuoppala, we realised that we need to handle
      seqno wraparound prior to committing our changes to the ring. The most
      obvious point then is to grab the seqno inside intel_ring_begin(), and
      then to reuse that seqno for all ring operations until the next request.
      As intel_ring_begin() can fail, the callers must already be prepared to
      handle such failure and so we can safely add further checks.
      
      This patch looks like it should be split up into the interface
      changes and the tweaks to move seqno wrapping from the execbuffer into
      the core seqno increment. However, I found no easy way to break it into
      incremental steps without introducing further broken behaviour.
      
      v2: Mika found a silly mistake and a subtle error in the existing code;
      inside i915_gem_retire_requests() we were resetting the sync_seqno of
      the target ring based on the seqno from this ring - which are only
      related by the order of their allocation, not retirement. Hence we were
      applying the optimisation that the rings were synchronised too early,
      fortunately the only real casualty there is the handling of seqno
      wrapping.
      
      v3: Do not forget to reset the sync_seqno upon module reinitialisation,
      ala resume.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      9d773091
  24. 12 11月, 2012 1 次提交
  25. 03 10月, 2012 1 次提交
  26. 02 10月, 2012 1 次提交
    • C
      drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a · ac82ea2e
      Chris Wilson 提交于
      A side-effect of commit 7d54a904
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Fri Aug 10 10:18:10 2012 +0100
      
          drm/i915: Apply post-sync write for pipe control invalidates
      
      was that only a request to emit invalidate flush would result in the
      TLB being invalidated (since it requires synchronisation and so incurs a
      performance penalty). However, the stated w/a for hardware contexts is
      that the TLBs must be invalidated prior to a MI_SET_CONTEXT, yet the w/a
      itself did not request the TLBs to be invalidated...
      
      Note this w/a does not prevent the hard system hang I experience when
      using hw contexts (with rc6 enabled) on SNB GT1.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ac82ea2e
  27. 24 8月, 2012 1 次提交
  28. 06 8月, 2012 1 次提交
  29. 26 7月, 2012 2 次提交
  30. 25 7月, 2012 3 次提交