1. 23 6月, 2015 11 次提交
    • J
      drm/i915: Update ring->dispatch_execbuffer() to take a request structure · 53fddaf7
      John Harrison 提交于
      Updated the various ring->dispatch_execbuffer() implementations to take a
      request instead of a ring.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      53fddaf7
    • J
      drm/i915: Update ring->emit_request() to take a request structure · c4e76638
      John Harrison 提交于
      Updated the ring->emit_request() implementation to take a request instead of a
      ringbuf/request pair. Also removed its use of the OLR for obtaining the
      request's seqno.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c4e76638
    • J
      drm/i915: Update ring->add_request() to take a request structure · ee044a88
      John Harrison 提交于
      Updated the various ring->add_request() implementations to take a request
      instead of a ring. This removes their reliance on the OLR to obtain the seqno
      value that the request should be tagged with.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      ee044a88
    • J
      drm/i915: Update ring->emit_flush() to take a request structure · 7deb4d39
      John Harrison 提交于
      Updated the various ring->emit_flush() implementations to take a request instead
      of a ringbuf/context pair.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7deb4d39
    • J
      drm/i915: Update ring->flush() to take a requests structure · a84c3ae1
      John Harrison 提交于
      Updated the various ring->flush() functions to take a request instead of a ring.
      Also updated the tracer to include the request id.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      [danvet: Rebase since I didn't merge the addition of req->uniq.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a84c3ae1
    • J
      drm/i915: Update flush_all_caches() to take request structures · 4866d729
      John Harrison 提交于
      Updated the *_ring_flush_all_caches() functions to take requests instead of
      rings or ringbuf/context pairs.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      4866d729
    • J
      drm/i915: Update a bunch of execbuffer helpers to take request structures · 2f20055d
      John Harrison 提交于
      Updated *_ring_invalidate_all_caches(), i915_reset_gen7_sol_offsets() and
      i915_emit_box() to take request structures instead of ring or ringbuf/context
      pairs.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      2f20055d
    • J
      drm/i915: Update queue_flip() to take a request structure · 6258fbe2
      John Harrison 提交于
      Updated the display page flip code to do explicit request creation and
      submission rather than relying on the OLR and just hoping that the request
      actually gets submitted at some random point.
      
      The sequence is now to create a request, queue the work to the ring, assign the
      known request to the flip queue work item then actually submit the work and post
      the request.
      
      Note that every single flip function used to finish with
      '__intel_ring_advance(ring);'. However, immediately after they return there is
      now an add request call which will do the advance anyway. Thus the many
      duplicate advance calls have been removed.
      
      v2: Updated commit message with comment about advance removal.
      
      v3: The request can now be allocated by the _sync() code earlier on. Thus the
      page flip path does not necessarily need to allocate a new request, it may be
      able to re-use one.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      6258fbe2
    • J
      drm/i915: Update init_context() to take a request structure · 8753181e
      John Harrison 提交于
      Now that everything above has been converted to use requests, it is possible to
      update init_context() to take a request pointer instead of a ring/context pair.
      
      For: VIZ-5115
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: NTomas Elf <tomas.elf@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      8753181e
    • J
      drm/i915: Reserve ring buffer space for i915_add_request() commands · 29b1b415
      John Harrison 提交于
      It is a bad idea for i915_add_request() to fail. The work will already have been
      send to the ring and will be processed, but there will not be any tracking or
      management of that work.
      
      The only way the add request call can fail is if it can't write its epilogue
      commands to the ring (cache flushing, seqno updates, interrupt signalling). The
      reasons for that are mostly down to running out of ring buffer space and the
      problems associated with trying to get some more. This patch prevents that
      situation from happening in the first place.
      
      When a request is created, it marks sufficient space as reserved for the
      epilogue commands. Thus guaranteeing that by the time the epilogue is written,
      there will be plenty of space for it. Note that a ring_begin() call is required
      to actually reserve the space (and do any potential waiting). However, that is
      not currently done at request creation time. This is because the ring_begin()
      code can allocate a request. Hence calling begin() from the request allocation
      code would lead to infinite recursion! Later patches in this series remove the
      need for begin() to do the allocate. At that point, it becomes safe for the
      allocate to call begin() and really reserve the space.
      
      Until then, there is a potential for insufficient space to be available at the
      point of calling i915_add_request(). However, that would only be in the case
      where the request was created and immediately submitted without ever calling
      ring_begin() and adding any work to that request. Which should never happen. And
      even if it does, and if that request happens to fall down the tiny window of
      opportunity for failing due to being out of ring space then does it really
      matter because the request wasn't doing anything in the first place?
      
      v2: Updated the 'reserved space too small' warning to include the offending
      sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
      re-initialisation of tracking state after a buffer wrap to keep the sanity
      checks accurate.
      
      v3: Incremented the reserved size to accommodate Ironlake (after finally
      managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
      
      v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
      
      For: VIZ-5115
      CC: Tomas Elf <tomas.elf@intel.com>
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      29b1b415
    • A
      drm/i915/gen8: Add infrastructure to initialize WA batch buffers · 17ee950d
      Arun Siluvery 提交于
      Some of the WA are to be applied during context save but before restore and
      some at the end of context save/restore but before executing the instructions
      in the ring, WA batch buffers are created for this purpose and these WA cannot
      be applied using normal means. Each context has two registers to load the
      offsets of these batch buffers. If they are non-zero, HW understands that it
      need to execute these batches.
      
      v1: In this version two separate ring_buffer objects were used to load WA
      instructions for indirect and per context batch buffers and they were part
      of every context.
      
      v2: Chris suggested to include additional page in context and use it to load
      these WA instead of creating separate objects. This will simplify lot of things
      as we need not explicity pin/unpin them. Thomas Daniel further pointed that GuC
      is planning to use a similar setup to share data between GuC and driver and
      WA batch buffers can probably share that page. However after discussions with
      Dave who is implementing GuC changes, he suggested to use an independent page
      for the reasons - GuC area might grow and these WA are initialized only once and
      are not changed afterwards so we can share them share across all contexts.
      
      The page is updated with WA during render ring init. This has an advantage of
      not adding more special cases to default_context.
      
      We don't know upfront the number of WA we will applying using these batch buffers.
      For this reason the size was fixed earlier but it is not a good idea. To fix this,
      the functions that load instructions are modified to report the no of commands
      inserted and the size is now calculated after the batch is updated. A macro is
      introduced to add commands to these batch buffers which also checks for overflow
      and returns error.
      We have a full page dedicated for these WA so that should be sufficient for
      good number of WA, anything more means we have major issues.
      The list for Gen8 is small, same for Gen9 also, maybe few more gets added
      going forward but not close to filling entire page. Chris suggested a two-pass
      approach but we agreed to go with single page setup as it is a one-off routine
      and simpler code wins.
      
      One additional option is offset field which is helpful if we would like to
      have multiple batches at different offsets within the page and select them
      based on some criteria. This is not a requirement at this point but could
      help in future (Dave).
      
      Chris provided some helpful macros and suggestions which further simplified
      the code, they will also help in reducing code duplication when WA for
      other Gen are added. Add detailed comments explaining restrictions.
      Use do {} while(0) for wa_ctx_emit() macro.
      
      (Many thanks to Chris, Dave and Thomas for their reviews and inputs)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Dave Gordon <david.s.gordon@intel.com>
      Signed-off-by: NRafael Barbalho <rafael.barbalho@intel.com>
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      17ee950d
  2. 15 6月, 2015 2 次提交
  3. 10 4月, 2015 1 次提交
    • C
      drm/i915: Split the batch pool by engine · 06fbca71
      Chris Wilson 提交于
      I woke up one morning and found 50k objects sitting in the batch pool
      and every search seemed to iterate the entire list... Painting the
      screen in oils would provide a more fluid display.
      
      One issue with the current design is that we only check for retirements
      on the current ring when preparing to submit a new batch. This means
      that we can have thousands of "active" batches on another ring that we
      have to walk over. The simplest way to avoid that is to split the pools
      per ring and then our LRU execution ordering will also ensure that the
      inactive buffers remain at the front.
      
      v2: execlists still requires duplicate code.
      v3: execlists requires more duplicate code
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      06fbca71
  4. 01 4月, 2015 1 次提交
  5. 18 3月, 2015 1 次提交
  6. 26 2月, 2015 1 次提交
    • J
      drm/i915: Rename 'flags' to 'dispatch_flags' for better code reading · 8e004efc
      John Harrison 提交于
      There is a flags word that is passed through the execbuffer code path all the
      way from initial decoding of the user parameters down to the very final dispatch
      buffer call. It is simply called 'flags'. Unfortuantely, there are many other
      flags words floating around in the same blocks of code. Even more once the GPU
      scheduler arrives.
      
      This patch makes it more obvious exactly which flags word is which by renaming
      'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word
      already have an 'I915_DISPATCH_' prefix on them and so are not quite so
      ambiguous.
      
      OTC-Jira: VIZ-1587
      Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
      [danvet: Resolve conflict with Chris' rework of the bb parsing.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      8e004efc
  7. 24 2月, 2015 1 次提交
  8. 14 2月, 2015 1 次提交
  9. 27 1月, 2015 2 次提交
  10. 03 12月, 2014 6 次提交
  11. 20 11月, 2014 3 次提交
    • C
      drm/i915: Remove DRI1 ring accessors and API · 5c6c6003
      Chris Wilson 提交于
      With the deprecation of UMS, and by association DRI1, we have a tough
      choice when updating the ring access routines. We either rewrite the
      DRI1 routines blindly without testing (so likely to be broken) or take
      the liberty of declaring them no longer supported and remove them
      entirely. This takes the latter approach.
      
      v2: Also remove the DRI1 sarea updates
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      [danvet: Fix rebase conflicts.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      5c6c6003
    • T
      drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demand · 7ba717cf
      Thomas Daniel 提交于
      Same as with the context, pinning to GGTT regardless is harmful (it
      badly fragments the GGTT and can even exhaust it).
      
      Unfortunately, this case is also more complex than the previous one
      because we need to map and access the ringbuffer in several places
      along the execbuffer path (and we cannot make do by leaving the
      default ringbuffer pinned, as before). Also, the context object
      itself contains a pointer to the ringbuffer address that we have to
      keep updated if we are going to allow the ringbuffer to move around.
      
      v2: Same as with the context pinning, we cannot really do it during
      an interrupt. Also, pin the default ringbuffers objects regardless
      (makes error capture a lot easier).
      
      v3: Rebased. Take a pin reference of the ringbuffer for each item
      in the execlist request queue because the hardware may still be using
      the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update
      is executed.  The ringbuffer must remain pinned until the context save
      is complete.  No longer pin and unpin ringbuffer in
      populate_lr_context() - this transient address is meaningless and the
      pinning can cause a sleep while atomic.
      
      v4: Moved ringbuffer pin and unpin into the lr_context_pin functions.
      Downgraded pinning check BUG_ONs to WARN_ONs.
      
      v5: Reinstated WARN_ONs for unexpected execlist states.  Removed unused
      variable.
      
      Issue: VIZ-4277
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NAkash Goel <akash.goels@gmail.com>
      Reviewed-by: Deepak S<deepak.s@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      7ba717cf
    • T
      drm/i915/bdw: Clean up execlist queue items in retire_work · c86ee3a9
      Thomas Daniel 提交于
      No longer create a work item to clean each execlist queue item.
      Instead, move retired execlist requests to a queue and clean up the
      items during retire_requests.
      
      v2: Fix legacy ring path broken during overzealous cleanup
      
      v3: Update idle detection to take execlists queue into account
      
      v4: Grab execlist lock when checking queue state
      
      v5: Fix leaking requests by freeing in execlists_retire_requests.
      
      Issue: VIZ-4274
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: NDeepak S <deepak.s@linux.intel.com>
      Reviewed-by: NAkash Goel <akash.goels@gmail.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      c86ee3a9
  12. 14 11月, 2014 1 次提交
    • M
      drm/i915: Initialize workarounds in logical ring mode too · 771b9a53
      Michel Thierry 提交于
      Following the legacy ring submission example, update the
      ring->init_context() hook to support the execlist submission mode.
      
      v2: update to use the new workaround macros and cleanup unused code.
      This takes care of both bdw and chv workarounds.
      
      v2.1: Add missing call to init_context() during deferred context creation.
      
      v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf
      from the context (Mika/Daniel).
      
      v4: Merge init_context interfaces back, the legacy mode only needs the ring,
      but the lrc mode needs the ring and context (Mika).
      
      Issue: VIZ-4092
      Issue: GMIN-3475
      Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c
      Cc: Deepak S <deepak.s@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: NMichel Thierry <michel.thierry@intel.com>
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
      [danvet: Align function paramater lists properly.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      771b9a53
  13. 03 9月, 2014 1 次提交
    • A
      drm/i915/bdw: Apply workarounds in render ring init function · 86d7f238
      Arun Siluvery 提交于
      For BDW workarounds are currently initialized in init_clock_gating() but
      they are lost during reset, suspend/resume etc; this patch moves the WAs
      that are part of register state context to render ring init fn otherwise
      default context ends up with incorrect values as they don't get initialized
      until init_clock_gating fn.
      
      v2: Add workarounds to golden render state
      This method has its own issues, first of all this is different for
      each gen and it is generated using a tool so adding new workaround
      and mainitaining them across gens is not a straightforward process.
      
      v3: Use LRIs to emit these workarounds (Ville)
      Instead of modifying the golden render state the same LRIs are
      emitted from within the driver.
      
      v4: Use abstract name when exporting gen specific routines (Chris)
      
      For: VIZ-4092
      Signed-off-by: NArun Siluvery <arun.siluvery@linux.intel.com>
      Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      86d7f238
  14. 15 8月, 2014 3 次提交
    • T
      drm/i915/bdw: Handle context switch events · e981e7b1
      Thomas Daniel 提交于
      Handle all context status events in the context status buffer on every
      context switch interrupt. We only remove work from the execlist queue
      after a context status buffer reports that it has completed and we only
      attempt to schedule new contexts on interrupt when a previously submitted
      context completes (unless no contexts are queued, which means the GPU is
      free).
      
      We canot call intel_runtime_pm_get() in an interrupt (or with a spinlock
      grabbed, FWIW), because it might sleep, which is not a nice thing to do.
      Instead, do the runtime_pm get/put together with the create/destroy request,
      and handle the forcewake get/put directly.
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v2: Unreferencing the context when we are freeing the request might free
      the backing bo, which requires the struct_mutex to be grabbed, so defer
      unreferencing and freeing to a bottom half.
      
      v3:
      - Ack the interrupt inmediately, before trying to handle it (fix for
      missing interrupts by Bob Beckett <robert.beckett@intel.com>).
      - Update the Context Status Buffer Read Pointer, just in case (spotted
      by Damien Lespiau).
      
      v4: New namespace and multiple rebase changes.
      
      v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
      interrupt", as suggested by Daniel.
      Signed-off-by: NOscar Mateo <oscar.mateo@intel.com>
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [danvet: Checkpatch ...]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      e981e7b1
    • M
      drm/i915/bdw: Two-stage execlist submit process · acdd884a
      Michel Thierry 提交于
      Context switch (and execlist submission) should happen only when
      other contexts are not active, otherwise pre-emption occurs.
      
      To assure this, we place context switch requests in a queue and those
      request are later consumed when the right context switch interrupt is
      received (still TODO).
      
      v2: Use a spinlock, do not remove the requests on unqueue (wait for
      context switch completion).
      Signed-off-by: NThomas Daniel <thomas.daniel@intel.com>
      
      v3: Several rebases and code changes. Use unique ID.
      
      v4:
      - Move the queue/lock init to the late ring initialization.
      - Damien's kmalloc review comments: check return, use sizeof(*req),
      do not cast.
      
      v5:
      - Do not reuse drm_i915_gem_request. Instead, create our own.
      - New namespace.
      
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v1)
      Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v2-v5)
      Reviewed-by: NDamien Lespiau <damien.lespiau@intel.com>
      [davnet: Checkpatch + wash-up s/BUG_ON/WARN_ON/.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      acdd884a
    • O
      drm/i915: Add temporary ring->ctx backpointer · 582d67f0
      Oscar Mateo 提交于
      The execlist patches have a bit a convoluted and long history and due
      to that have the actual submission still misplaced deeply burried in
      the low-level ringbuffer handling code. This design goes back to the
      legacy ringbuffer code with its tricky lazy request and simple work
      submissiion using ring tail writes. For that reason they need a
      ring->ctx backpointer.
      
      The goal is to unburry that code and move it up into a level where the
      full execlist context is available so that we can ditch this
      backpointer. Until that's done make it really obvious that there's
      work still to be done.
      
      Cc: Oscar Mateo <oscar.mateo@intel.com>
      Cc: Thomas Daniel <thomas.daniel@intel.com>
      Acked-by: NThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      582d67f0
  15. 12 8月, 2014 5 次提交