1. 01 11月, 2019 1 次提交
  2. 30 10月, 2019 1 次提交
    • C
      drm/i915/gem: Make context persistence optional · a0e04715
      Chris Wilson 提交于
      Our existing behaviour is to allow contexts and their GPU requests to
      persist past the point of closure until the requests are complete. This
      allows clients to operate in a 'fire-and-forget' manner where they can
      setup a rendering pipeline and hand it over to the display server and
      immediately exit. As the rendering pipeline is kept alive until
      completion, the display server (or other consumer) can use the results
      in the future and present them to the user.
      
      The compute model is a little different. They have little to no buffer
      sharing between processes as their kernels tend to operate on a
      continuous stream, feeding the results back to the client application.
      These kernels operate for an indeterminate length of time, with many
      clients wishing that the kernel was always running for as long as they
      keep feeding in the data, i.e. acting like a DSP.
      
      Not all clients want this persistent "desktop" behaviour and would prefer
      that the contexts are cleaned up immediately upon closure. This ensures
      that when clients are run without hangchecking (e.g. for compute kernels
      of indeterminate runtime), any GPU hang or other unexpected workloads
      are terminated with the process and does not continue to hog resources.
      
      The default behaviour for new contexts is the legacy persistence mode,
      as some desktop applications are dependent upon the existing behaviour.
      New clients will have to opt in to immediate cleanup on context
      closure. If the hangchecking modparam is disabled, so is persistent
      context support -- all contexts will be terminated on closure.
      
      We expect this behaviour change to be welcomed by compute users, who
      have often been caught between a rock and a hard place. They disable
      hangchecking to avoid their kernels being "unfairly" declared hung, but
      have also experienced true hangs that the system was then unable to
      clean up. Naturally, this leads to bug reports.
      
      Testcase: igt/gem_ctx_persistence
      Link: https://github.com/intel/compute-runtime/pull/228Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Reviewed-by: NJon Bloomfield <jon.bloomfield@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Acked-by: NJason Ekstrand <jason@jlekstrand.net>
      Link: https://patchwork.freedesktop.org/patch/msgid/20191029202338.8841-1-chris@chris-wilson.co.uk
      a0e04715
  3. 26 10月, 2019 1 次提交
  4. 24 10月, 2019 2 次提交
  5. 18 10月, 2019 1 次提交
  6. 04 10月, 2019 6 次提交
  7. 25 9月, 2019 1 次提交
  8. 20 9月, 2019 1 次提交
    • C
      drm/i915: Mark i915_request.timeline as a volatile, rcu pointer · d19d71fc
      Chris Wilson 提交于
      The request->timeline is only valid until the request is retired (i.e.
      before it is completed). Upon retiring the request, the context may be
      unpinned and freed, and along with it the timeline may be freed. We
      therefore need to be very careful when chasing rq->timeline that the
      pointer does not disappear beneath us. The vast majority of users are in
      a protected context, either during request construction or retirement,
      where the timeline->mutex is held and the timeline cannot disappear. It
      is those few off the beaten path (where we access a second timeline) that
      need extra scrutiny -- to be added in the next patch after first adding
      the warnings about dangerous access.
      
      One complication, where we cannot use the timeline->mutex itself, is
      during request submission onto hardware (under spinlocks). Here, we want
      to check on the timeline to finalize the breadcrumb, and so we need to
      impose a second rule to ensure that the request->timeline is indeed
      valid. As we are submitting the request, it's context and timeline must
      be pinned, as it will be used by the hardware. Since it is pinned, we
      know the request->timeline must still be valid, and we cannot submit the
      idle barrier until after we release the engine->active.lock, ergo while
      submitting and holding that spinlock, a second thread cannot release the
      timeline.
      
      v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
      as we need it, and tidy up acquiring the timeline with a bit of
      refactoring (i915_active_add_request)
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
      d19d71fc
  9. 03 9月, 2019 1 次提交
  10. 31 8月, 2019 1 次提交
  11. 24 8月, 2019 1 次提交
  12. 20 8月, 2019 1 次提交
  13. 17 8月, 2019 1 次提交
  14. 10 8月, 2019 3 次提交
  15. 08 8月, 2019 1 次提交
  16. 06 8月, 2019 1 次提交
  17. 02 8月, 2019 1 次提交
  18. 31 7月, 2019 1 次提交
  19. 30 7月, 2019 2 次提交
  20. 17 7月, 2019 1 次提交
  21. 13 7月, 2019 1 次提交
  22. 11 7月, 2019 1 次提交
  23. 05 7月, 2019 1 次提交
  24. 22 6月, 2019 1 次提交
  25. 21 6月, 2019 2 次提交
  26. 20 6月, 2019 1 次提交
    • C
      drm/i915/execlists: Preempt-to-busy · 22b7a426
      Chris Wilson 提交于
      When using a global seqno, we required a precise stop-the-workd event to
      handle preemption and unwind the global seqno counter. To accomplish
      this, we would preempt to a special out-of-band context and wait for the
      machine to report that it was idle. Given an idle machine, we could very
      precisely see which requests had completed and which we needed to feed
      back into the run queue.
      
      However, now that we have scrapped the global seqno, we no longer need
      to precisely unwind the global counter and only track requests by their
      per-context seqno. This allows us to loosely unwind inflight requests
      while scheduling a preemption, with the enormous caveat that the
      requests we put back on the run queue are still _inflight_ (until the
      preemption request is complete). This makes request tracking much more
      messy, as at any point then we can see a completed request that we
      believe is not currently scheduled for execution. We also have to be
      careful not to rewind RING_TAIL past RING_HEAD on preempting to the
      running context, and for this we use a semaphore to prevent completion
      of the request before continuing.
      
      To accomplish this feat, we change how we track requests scheduled to
      the HW. Instead of appending our requests onto a single list as we
      submit, we track each submission to ELSP as its own block. Then upon
      receiving the CS preemption event, we promote the pending block to the
      inflight block (discarding what was previously being tracked). As normal
      CS completion events arrive, we then remove stale entries from the
      inflight tracker.
      
      v2: Be a tinge paranoid and ensure we flush the write into the HWS page
      for the GPU semaphore to pick in a timely fashion.
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
      22b7a426
  27. 17 6月, 2019 1 次提交
  28. 15 6月, 2019 1 次提交
    • C
      drm/i915: Keep contexts pinned until after the next kernel context switch · ce476c80
      Chris Wilson 提交于
      We need to keep the context image pinned in memory until after the GPU
      has finished writing into it. Since it continues to write as we signal
      the final breadcrumb, we need to keep it pinned until the request after
      it is complete. Currently we know the order in which requests execute on
      each engine, and so to remove that presumption we need to identify a
      request/context-switch we know must occur after our completion. Any
      request queued after the signal must imply a context switch, for
      simplicity we use a fresh request from the kernel context.
      
      The sequence of operations for keeping the context pinned until saved is:
      
       - On context activation, we preallocate a node for each physical engine
         the context may operate on. This is to avoid allocations during
         unpinning, which may be from inside FS_RECLAIM context (aka the
         shrinker)
      
       - On context deactivation on retirement of the last active request (which
         is before we know the context has been saved), we add the
         preallocated node onto a barrier list on each engine
      
       - On engine idling, we emit a switch to kernel context. When this
         switch completes, we know that all previous contexts must have been
         saved, and so on retiring this request we can finally unpin all the
         contexts that were marked as deactivated prior to the switch.
      
      We can enhance this in future by flushing all the idle contexts on a
      regular heartbeat pulse of a switch to kernel context, which will also
      be used to check for hung engines.
      
      v2: intel_context_active_acquire/_release
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
      ce476c80
  29. 11 6月, 2019 2 次提交