1. 10 Apr, 2015 11 commits
    •
      drm/i915: Allocate context objects from stolen · 149c86e7
      Chris Wilson committed
      As we never expose context objects directly to userspace, we can forgo
      allocating a first-class GEM object for them and prefer to use the
      limited resource of reserved/stolen memory for them. Note this means
      that their initial contents are undefined.
      
      However, a downside of using stolen objects for execlists is that we
      cannot access the physical address directly (thanks MCH!) which prevents
      their use.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Remove request->uniq · d7b9ca2f
      Chris Wilson committed
      We already assign a unique identifier to every request: seqno. That
      someone felt like adding a second one without even mentioning why and
      tweaking ABI smells very fishy.
      
      Fixes regression from
      commit b3a38998
      Author: Nick Hoath <nicholas.hoath@intel.com>
      Date:   Thu Feb 19 16:30:47 2015 +0000
      
          drm/i915: Fix a use after free, and unbalanced refcounting
      
      v2: Rebase
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Nick Hoath <nicholas.hoath@intel.com>
      Cc: Thomas Daniel <thomas.daniel@intel.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Jani Nikula <jani.nikula@intel.com>
      [danvet: Fixup because different merge order.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Reduce locking in execlist command submission · a6111f7b
      Chris Wilson committed
      This eliminates six needless spin lock/unlock pairs when writing out
      ELSP.
      
      v2: Respin with my preferred colour.
      v3: Mostly back to the original colour
      
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1]
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Remove unused variable in intel_lrc.c · 19ee66af
      Daniel Vetter committed
      Already tagged this one and 0-day builder is failing me.
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
    •
      drm/i915: Remove vestigal DRI1 ring quiescing code · 595e1eeb
      Chris Wilson committed
      After the removal of DRI1, all access to the rings is through requests
      and so we can always be sure that there is a request to wait upon to
      free up available space. The fallback code only existed so that we could
      quiesce the GPU following unmediated access by DRI1.
      
      v2: Rebase
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Use the global runtime-pm wakelock for a busy GPU for execlists · 4bb1bedb
      Chris Wilson committed
      When we submit a request to the GPU, we first take the rpm wakelock, and
      only release it once the GPU has been idle for a small period of time
      after all requests have been completed. This means that we are sure no
      new interrupt can arrive whilst we do not hold the rpm wakelock and so
      can drop the individual get/put around every single request inside
      execlists.
      
      Note: to close one potential issue we should mark the GPU as busy
      earlier in __i915_add_request.
      
      To elaborate: The issue is that we emit the irq signalling sequence
      before we grab the rpm reference, which means we could miss the
      resulting interrupt (since that's not set up when suspended). The only
      bad side effect is a missed interrupt; gt mmio writes automatically
      wake up the hw itself. But otoh we have an umbrella rpm reference for
      the entirety of execbuf, as long as that's there we're covered.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Explain a bit more about the add_request issue, which after
      some irc chatting with Chris turns out to not be an issue really.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Use simpler form of spin_lock_irq(execlist_lock) · b5eba372
      Chris Wilson committed
      We can use the simpler spinlock form to disable interrupts as we are
      always outside of an irq/softirq handler.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915/gen8: Dynamic page table allocations · d7b2633d
      Michel Thierry committed
      This finishes off the dynamic page table allocations, in the legacy
      3-level style that already exists. Almost everything has already been set
      up to this point; the patch finishes off the enabling by setting the
      appropriate function pointers.
      
      In LRC mode, contexts need to know the PDPs when they are populated. With
      dynamic page table allocations, these PDPs may not exist yet. Check if
      PDPs have been allocated and use the scratch page if they do not exist yet.
      
      Before submission, update the PDPs in the logical ring context as PDPs
      have been allocated.
      
      v2: Update aliasing/true ppgtt allocate/teardown/clear functions for
      gen 6 & 7.
      
      v3: Rebase.
      
      v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either
      teardown_va_range or clear_range functions exist (Daniel).
      
      v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed
      for aliasing ppgtt. Zombie tracking was originally added for teardown
      function and is no longer required.
      
      v6: Update err_out case in gen8_alloc_va_range (missed from latest
      rebase).
      
      v7: Rebase after s/page_tables/page_table/.
      
      v8: Updated scratch_pt check after scratch flag was removed in previous
      patch.
      
      v9: Note that lrc mode needs to be updated to support init state without
      any PDP.
      
      v10: Unmap correct page_table in gen8_alloc_va_range's error case, clean-up
      gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs
      during page table allocation.
      
      v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken
      until it was updated to handle the init case without any PDP.
      
      v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps
      static and don't abuse inline functions. (Mika)
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
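The scratch-page fallback that the commit above describes for LRC mode can be illustrated with a small userspace sketch. Every name here (`toy_ppgtt`, `toy_pd`, `toy_pdp_addr`, the four-slot PDP array) is hypothetical and chosen for illustration only; these are not the driver's structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_PDPS 4 /* assumed slot count for this sketch */

/* Hypothetical stand-in for a page directory with a DMA address. */
struct toy_pd {
    uint64_t daddr;
};

/* Hypothetical stand-in for a PPGTT: a PDP slot is NULL until allocated. */
struct toy_ppgtt {
    struct toy_pd *pdp[TOY_NUM_PDPS];
    uint64_t scratch_daddr; /* address of the shared scratch page */
};

/*
 * Pick the address to write into the context image for PDP slot i:
 * the real page directory if it exists, otherwise the scratch page.
 */
static uint64_t toy_pdp_addr(const struct toy_ppgtt *ppgtt, size_t i)
{
    assert(i < TOY_NUM_PDPS);
    return ppgtt->pdp[i] ? ppgtt->pdp[i]->daddr : ppgtt->scratch_daddr;
}
```

Calling `toy_pdp_addr()` again just before submission would pick up any slot that has been allocated in the meantime, which mirrors the "update the PDPs before submission" step in the message.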
    •
      drm/i915/gen8: Split out mappings · e5815a2e
      Michel Thierry committed
      When we do dynamic page table allocations for gen8, we'll need to have
      more control over how and when we map page tables, similar to gen6.
      In particular, DMA mappings for page directories/tables occur at allocation
      time.
      
      This patch adds the functionality and calls it at init, which should
      have no functional change.
      
      The PDPEs are still a special case for now. We'll need a function for
      that in the future as well.
      
      v2: Handle renamed unmap_and_free_page functions.
      v3: Updated after teardown_va logic was removed.
      v4: Rebase after s/page_tables/page_table/.
      v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of
      memory, and update populate_lr_context to handle this new case (proper
      tracking will be added later in the patch series).
      v6: Assign lrc page directory pointer addresses using a macro. (Mika)
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Split the batch pool by engine · 06fbca71
      Chris Wilson committed
      I woke up one morning and found 50k objects sitting in the batch pool
      and every search seemed to iterate the entire list... Painting the
      screen in oils would provide a more fluid display.
      
      One issue with the current design is that we only check for retirements
      on the current ring when preparing to submit a new batch. This means
      that we can have thousands of "active" batches on another ring that we
      have to walk over. The simplest way to avoid that is to split the pools
      per ring and then our LRU execution ordering will also ensure that the
      inactive buffers remain at the front.
      
      v2: execlists still requires duplicate code.
      v3: execlists requires more duplicate code
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
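A minimal sketch of the per-engine split described in the commit above, assuming a simple singly linked list per engine. The names (`toy_pools`, `toy_batch`, `toy_pool_get`) and the engine count are hypothetical; the real batch pool is more involved.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_ENGINES 3 /* assumed engine count for this sketch */

struct toy_batch {
    size_t size;
    int active;             /* non-zero while the GPU may still use it */
    struct toy_batch *next;
};

/* One list per engine instead of a single shared list. */
struct toy_pools {
    struct toy_batch *head[TOY_NUM_ENGINES];
};

/*
 * Search only the given engine's list for an idle buffer that is large
 * enough, and report how many nodes had to be visited.
 */
static struct toy_batch *toy_pool_get(struct toy_pools *pools, int engine,
                                      size_t size, size_t *visited)
{
    struct toy_batch *b;

    assert(engine >= 0 && engine < TOY_NUM_ENGINES);
    *visited = 0;
    for (b = pools->head[engine]; b; b = b->next) {
        (*visited)++;
        if (!b->active && b->size >= size)
            return b;
    }
    return NULL;
}
```

With one list per engine, a lookup for one engine never walks the active batches that belong to another, which is the cost the message complains about.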
    •
      drm/i915: Do not set L3-LLC Coherency bit in ctx descriptor · 51847fb9
      Arun Siluvery committed
      According to Spec this is a reserved bit for Gen9+ and should not be set.
      
      Change-Id: I0215fb7057b94139b7a2f90ecc7a0201c0c93ad4
      Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  2. 01 Apr, 2015 3 commits
  3. 26 Feb, 2015 3 commits
    •
      drm/i915: Cache ringbuf pointer in request structure · 98e1bd4a
      John Harrison committed
      In execlist mode, the ringbuf is a function of the ring and context whereas in
      legacy mode, it is derived from the ring alone. Thus the calculation required to
      determine the ringbuf pointer from the ring (and context) also needs to test
      execlist mode or not. This is messy.
      
      Further, the request structure holds a pointer to both the ring and the context
      for which it was created. Thus, given a request, it is possible to derive the
      ringbuf in either legacy or execlist mode. Hence it is only necessary to pass
      the request in to all the low level functions rather than some combination of
      request, ring, context and ringbuf. However, rather than recalculating it each
      time, it is much simpler to just cache the ringbuf pointer in the request
      structure itself.
      
      Caching the pointer means the calculation is done once at request creation time
      and all further code can simply read it directly from the request structure.
      
      OTC-Jira: VIZ-5115
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      [danvet: Drop contentless comment in lrc alloc request entirely. And
      spelling fix in the commit message.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
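The caching idea in the commit above can be sketched as follows. This is a simplified illustration: the types and `toy_request_init` are hypothetical, and the context is assumed to hold a single ringbuf rather than one per engine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ringbuf {
    int id;
};

struct toy_ring {
    struct toy_ringbuf *buffer;   /* legacy mode: ringbuf belongs to the ring */
};

struct toy_context {
    struct toy_ringbuf *ringbuf;  /* execlist mode: simplified to one per context */
};

struct toy_request {
    struct toy_ring *ring;
    struct toy_context *ctx;
    struct toy_ringbuf *ringbuf;  /* cached once at creation */
};

/* Resolve the ringbuf once, so later code never has to test the mode. */
static void toy_request_init(struct toy_request *req, struct toy_ring *ring,
                             struct toy_context *ctx, bool execlists)
{
    assert(ring != NULL && ctx != NULL);
    req->ring = ring;
    req->ctx = ctx;
    req->ringbuf = execlists ? ctx->ringbuf : ring->buffer;
}
```

Low-level helpers can then take only the request and read `req->ringbuf`, instead of receiving some combination of ring, context and ringbuf.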
    •
      drm/i915: Add missing trace point to LRC execbuff code path · 5e4be7bd
      John Harrison committed
      There is a trace point in the legacy execbuffer execution path that is missing
      from the execlist path. Trace points are extremely useful for debugging and are
      used by various automated validation tests. Hence, this patch adds the missing
      trace point back in.
      
      OTC-Jira: VIZ-5115
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading · 8e004efc
      John Harrison committed
      There is a flags word that is passed through the execbuffer code path all the
      way from initial decoding of the user parameters down to the very final dispatch
      buffer call. It is simply called 'flags'. Unfortunately, there are many other
      flags words floating around in the same blocks of code. There will be even more
      once the GPU scheduler arrives.
      
      This patch makes it more obvious exactly which flags word is which by renaming
      'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
      already have an 'I915_DISPATCH_' prefix on them and so are not quite so
      ambiguous.
      
      OTC-Jira: VIZ-1587
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      [danvet: Resolve conflict with Chris' rework of the bb parsing.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  4. 25 Feb, 2015 2 commits
    •
      drm/i915: Create page table allocators · 06fda602
      Ben Widawsky committed
      As we move toward dynamic page table allocation, it becomes much easier
      to manage our data structures if we do things less coarsely by
      breaking up all of our actions into individual tasks.  This makes the
      code easier to write, read, and verify.
      
      Aside from the dissection of the allocation functions, the patch
      statically allocates the page table structures without a page directory.
      This remains the same for all platforms.
      
      The patch itself should not have much functional difference. The primary
      noticeable difference is the fact that page tables are no longer
      allocated, but rather statically declared as part of the page directory.
      This has non-zero overhead, but things gain additional complexity as a
      result.
      
      This patch exists for a few reasons:
      1. Splitting out the functions allows easily combining GEN6 and GEN8
      code. Page tables have no difference based on GEN8. As we'll see in a
      future patch when we add the DMA mappings to the allocations, it
      requires only one small change to make work, and error handling should
      just fall into place.
      
      2. Unless we always want to allocate all page tables under a given PDE,
      we'll have to eventually break this up into an array of pointers (or
      pointer to pointer).
      
      3. Having the discrete functions is easier to review, and understand.
      All allocations and frees now take place in just a couple of locations.
      Reviewing, and catching leaks should be easy.
      
      4. Less important: the GFP flags are confined to one location, which
      makes playing around with such things trivial.
      
      v2: Updated commit message to explain why this patch exists
      
      v3: For lrc, s/pdp.page_directory[i].daddr/pdp.page_directory[i]->daddr/
      
      v4: Renamed free_pt/pd_single functions to unmap_and_free_pt/pd (Daniel)
      
      v5: Added additional safety checks in gen8 clear/free/unmap.
      
      v6: Use WARN_ON and return -EINVAL in alloc_pt_range (Mika).
      
      v7: Make err_out loop symmetrical to the way we allocate in
      alloc_pt_range. Also s/page_tables/page_table and correct commit
      message (Mika)
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    •
      drm/i915: Complete page table structures · 7324cc04
      Ben Widawsky committed
      Move the remaining members over to the new page table structures.
      
      This can be squashed with the previous commit if desired. The reasoning
      is the same as that patch. I simply felt it is easier to review if split.
      
      v2: In lrc: s/ppgtt->pd_dma_addr[i]/ppgtt->pdp.page_directory[i].daddr/
      v3: Rebase.
      v4: Rebased after s/page_tables/page_table/.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  5. 24 Feb, 2015 3 commits
    •
      drm/i915: Fix a use after free, and unbalanced refcounting · b3a38998
      Nick Hoath committed
      When converting from implicitly tracked execlist queue items to ref counted
      requests, not all frees of requests were replaced with unrefs, and extraneous
      refs/unrefs of contexts were added.
      Correct the unbalanced refcount & replace the frees.
      Remove a noisy warning when hitting the request creation path.
      
      drm_i915_gem_request and intel_context are both kref reference counted
      structures. Upon allocation, drm_i915_gem_request's ref count should be
      bumped using kref_init. When a context is assigned to the request,
      the context's reference count should be bumped using i915_gem_context_reference.
      i915_gem_request_free will reduce the context reference count when
      the request is freed.
      
      Problem introduced in
      commit 6d3d8274
      Author:     Nick Hoath <nicholas.hoath@intel.com>
      AuthorDate: Thu Jan 15 13:10:39 2015 +0000
      
           drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request
      
      v2: Added comments explaining how the ctx pointer and the request object should
      be ref-counted. Removed noisy warning.
      
      v3: Cleaned up the language used in the commit & the header
      description (Thanks David Gordon)
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652
      Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
      Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
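The reference-counting rule stated in the commit above (the request starts with one reference, takes a reference on its context, and drops that context reference only when the request itself is freed) can be sketched in plain C. This uses bare integer counters rather than the kernel's `kref`, is not thread-safe, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_context {
    int refcount;
};

struct toy_request {
    int refcount;
    struct toy_context *ctx;
};

/* Allocate a request holding one reference of its own and one on ctx. */
static struct toy_request *toy_request_alloc(struct toy_context *ctx)
{
    struct toy_request *req = malloc(sizeof(*req));

    if (!req)
        return NULL;
    req->refcount = 1;   /* plays the role of kref_init */
    ctx->refcount++;     /* the request now pins its context */
    req->ctx = ctx;
    return req;
}

static void toy_request_reference(struct toy_request *req)
{
    req->refcount++;
}

/* Drop one reference; returns 1 if this call freed the request. */
static int toy_request_unreference(struct toy_request *req)
{
    assert(req->refcount > 0);
    if (--req->refcount > 0)
        return 0;
    req->ctx->refcount--; /* context ref is released only on final free */
    free(req);
    return 1;
}
```

Replacing a direct `free(req)` with `toy_request_unreference(req)` is the kind of substitution the message refers to: a direct free while another holder still has a reference is what produces a use after free.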
    •
      drm/i915: Reset logical ring contexts' head and tail during GPU reset · 3e5b6f05
      Thomas Daniel committed
      Work was getting left behind in LRC contexts during reset.  This causes a hang
      if the GPU is reset when HEAD==TAIL because the context's ringbuffer head and
      tail don't get reset and retiring a request doesn't alter them, so the ring
      still appears full.
      
      Added a function intel_lr_context_reset() to reset head and tail on a LRC and
      its ringbuffer.
      
      Call intel_lr_context_reset() for each context in i915_gem_context_reset() when
      in execlists mode.
      
      Testcase: igt/pm_rps --run-subtest reset #bdw
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
      [danvet: Flatten control flow in the lrc reset code a notch.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
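To illustrate why stale head and tail values matter, here is a rough sketch of a ring free-space calculation with a reset helper. The ring size, the reserved gap and the `toy_*` names are assumptions made for the example; it does not reproduce `intel_lr_context_reset()` or the driver's actual bookkeeping.

```c
#include <assert.h>

#define TOY_RING_SIZE    4096
#define TOY_RING_RESERVE 64   /* assumed gap always kept between tail and head */

struct toy_ringbuf {
    int head;  /* software's last known read position */
    int tail;  /* software's write position */
};

/* Free space as seen by software, based only on its cached head and tail. */
static int toy_ring_space(const struct toy_ringbuf *rb)
{
    int space;

    assert(rb->head >= 0 && rb->head < TOY_RING_SIZE);
    assert(rb->tail >= 0 && rb->tail < TOY_RING_SIZE);
    space = rb->head - (rb->tail + TOY_RING_RESERVE);
    if (space < 0)
        space += TOY_RING_SIZE;
    return space;
}

/* After a GPU reset nothing is in flight, so both pointers restart at zero. */
static void toy_ringbuf_reset(struct toy_ringbuf *rb)
{
    rb->head = 0;
    rb->tail = 0;
}
```

If the cached head stays at 0 while the tail has advanced to 4000, the sketch reports only 32 bytes free even though the GPU will never consume the old contents; after `toy_ringbuf_reset()` it reports 4032.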
    •
      drm/i915: Request full SSEU enablement on Gen9 · 0cea6502
      Jeff McGee committed
      On Gen9 the render power gating can leave slice/subslice/EU in
      a partially enabled state. We must make an explicit request for
      full SSEU enablement through the Render Power Clock State
      register when resuming render work. This register is save/
      restored in the logical ring context image for execlist
      submission mode. Initialize its value in each LRC image to
      request full enablement according to the device SSEU config.
      
      Thanks to Sharma Ankitprasad and Akash Goel for highlighting the
      issue and proposing the initial fix on which this patch is based.
      
      v2: Adjusted the names of the power gating support flags to fit
          update of an earlier patch.
      Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
      Reviewed-by: Akash Goel <akash.goel@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
  6. 14 Feb, 2015 5 commits
  7. 10 Feb, 2015 1 commit
    •
      drm/i915: Insert a command barrier on BLT/BSD cache flushes · f0a1fb10
      Chris Wilson committed
      This looked like an odd regression from
      
      commit ec5cc0f9
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Jun 12 10:28:55 2014 +0100
      
          drm/i915: Restrict GPU boost to the RCS engine
      
      but in reality it uncovered a much older coherency bug. The issue that
      boosting the GPU frequency on the BCS ring was masking was that we could
      wake the CPU up after completion of a BCS batch and inspect memory prior
      to the write cache being fully evicted. In order to serialise the
      breadcrumb interrupt (and so ensure that the CPU's view of memory is
      coherent) we need to perform a post-sync operation in the MI_FLUSH_DW.
      
      v2: Fix all the MI_FLUSH_DW (bsd plus the duplication in execlists).
      
      Also fix the invalidate_domains mask in gen8_emit_flush() for ring !=
      VCS.
      
      Testcase: gpuX-rcs-gpu-read-after-write
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: stable@vger.kernel.org
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
  8. 31 Jan, 2015 1 commit
  9. 27 Jan, 2015 8 commits
  10. 13 Jan, 2015 1 commit
    •
      drm/i915: Reset CSB read pointer in ring init · c0a03a2e
      Thomas Daniel committed
      A previous commit enabled execlists by default:
      
             commit 27401d12 ("drm/i915/bdw: Enable execlists by default where supported")
      
      This allowed routine testing of execlists which exposed a regression when
      resuming from suspend.  The cause was tracked down to the recent changes to the
      ring init sequence:
      
             commit 35a57ffb ("drm/i915: Only init engines once")
      
      During a suspend/resume cycle the hardware Context Status Buffer write pointer
      is reset.  However since the recent changes to the init sequence the software CSB
      read pointer is no longer reset.  This means that context status events are not
      handled correctly and new contexts are not written to the ELSP, resulting in an
      apparent GPU hang.
      
      Pending further changes to the ring init code, just move the
      ring->next_context_status_buffer initialization into gen8_init_common_ring to
      fix this regression.
      
      v2: Moved init into gen8_init_common_ring rather than context_enable after
      feedback from Daniel Vetter.  Updated commit msg to reflect this and also cite
      commits related to the regression.  Fixed bz link to correct bug.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88096
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Gordon <david.s.gordon@intel.com>
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
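The mismatch between a reset hardware write pointer and a stale software read pointer can be shown with a small circular-buffer sketch. The slot count, the assumption that the write pointer restarts at 0, and the helper name `toy_csb_pending` are all illustrative and not taken from the driver.

```c
#include <assert.h>

#define TOY_CSB_ENTRIES 6 /* assumed number of status buffer slots */

/*
 * Number of entries software believes are unread, given its own read
 * pointer and the write pointer reported by hardware.
 */
static unsigned int toy_csb_pending(unsigned int read_ptr,
                                    unsigned int write_ptr)
{
    assert(read_ptr < TOY_CSB_ENTRIES && write_ptr < TOY_CSB_ENTRIES);
    return (write_ptr + TOY_CSB_ENTRIES - read_ptr) % TOY_CSB_ENTRIES;
}
```

Suppose both pointers were at 4 before suspend, the hardware pointer restarts at 0 on resume, and two new events arrive. With the stale read pointer, `toy_csb_pending(4, 2)` yields 4, so software would walk entries that do not correspond to the new events. Resetting the read pointer during ring init gives `toy_csb_pending(0, 2)`, which is 2 and matches what the hardware wrote.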
  11. 16 Dec, 2014 1 commit
  12. 15 Dec, 2014 1 commit
    •
      drm/i915/bdw: Enable execlists by default where supported · 27401d12
      Thomas Daniel committed
      Execlist support in the i915 driver is now considered good enough for the
      feature to be enabled by default on Gen8 and later and routinely tested.
      Adjusted i915 parameters structure initialization to reflect this and updated
      the comment in intel_sanitize_enable_execlists().
      
      There's still work to do before we can let the wider masses onto it,
      but there's still time left before the 3.20 cutoff.
      
      v2: Update the MODULE_PARM_DESC too.
      
      Issue: VIZ-2020
      Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
      [danvet: Add note that there's still some work left to do.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>