1. 12 Mar 2020, 1 commit
    • drm/i915: Defer semaphore priority bumping to a workqueue · 14a0d527
      Authored by Chris Wilson
      Since the semaphore fence may be signaled from inside an interrupt
      handler, from inside a request holding its request->lock, we cannot
      then take the engine->active.lock to process the semaphore priority
      bump, as we may traverse our call tree and end up on another held
      request.
      
      CPU 0:
      [ 2243.218864]  _raw_spin_lock_irqsave+0x9a/0xb0
      [ 2243.218867]  i915_schedule_bump_priority+0x49/0x80 [i915]
      [ 2243.218869]  semaphore_notify+0x6d/0x98 [i915]
      [ 2243.218871]  __i915_sw_fence_complete+0x61/0x420 [i915]
      [ 2243.218874]  ? kmem_cache_free+0x211/0x290
      [ 2243.218876]  i915_sw_fence_complete+0x58/0x80 [i915]
      [ 2243.218879]  dma_i915_sw_fence_wake+0x3e/0x80 [i915]
      [ 2243.218881]  signal_irq_work+0x571/0x690 [i915]
      [ 2243.218883]  irq_work_run_list+0xd7/0x120
      [ 2243.218885]  irq_work_run+0x1d/0x50
      [ 2243.218887]  smp_irq_work_interrupt+0x21/0x30
      [ 2243.218889]  irq_work_interrupt+0xf/0x20
      
      CPU 1:
      [ 2242.173107]  _raw_spin_lock+0x8f/0xa0
      [ 2242.173110]  __i915_request_submit+0x64/0x4a0 [i915]
      [ 2242.173112]  __execlists_submission_tasklet+0x8ee/0x2120 [i915]
      [ 2242.173114]  ? i915_sched_lookup_priolist+0x1e3/0x2b0 [i915]
      [ 2242.173117]  execlists_submit_request+0x2e8/0x2f0 [i915]
      [ 2242.173119]  submit_notify+0x8f/0xc0 [i915]
      [ 2242.173121]  __i915_sw_fence_complete+0x61/0x420 [i915]
      [ 2242.173124]  ? _raw_spin_unlock_irqrestore+0x39/0x40
      [ 2242.173137]  i915_sw_fence_complete+0x58/0x80 [i915]
      [ 2242.173140]  i915_sw_fence_commit+0x16/0x20 [i915]
      
      Closes: https://gitlab.freedesktop.org/drm/intel/issues/1318
      Fixes: b7404c7e ("drm/i915: Bump ready tasks ahead of busywaits")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v5.2+
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200310101720.9944-1-chris@chris-wilson.co.uk
      (cherry picked from commit 209df10b)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      14a0d527
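      For illustration, a condensed sketch of the shape of the fix, with
      names approximating the upstream patch: the fence callback now only
      queues an irq_work, so engine->active.lock is taken later, without
      request->lock held (the real callback also handles FENCE_FREE):

          static void semaphore_work_cb(struct irq_work *wrk)
          {
                  struct i915_request *rq =
                          container_of(wrk, typeof(*rq), semaphore_work);

                  /* No request->lock held here, so taking the
                   * engine->active.lock inside the bump can no longer
                   * recurse onto a held request. */
                  i915_schedule_bump_priority(rq, I915_PRIORITY_NOSEMAPHORE);
                  i915_request_put(rq);
          }

          static int __i915_sw_fence_call
          semaphore_notify(struct i915_sw_fence *fence,
                           enum i915_sw_fence_notify state)
          {
                  struct i915_request *rq =
                          container_of(fence, typeof(*rq), semaphore);

                  if (state == FENCE_COMPLETE) {
                          i915_request_get(rq); /* keep rq for the worker */
                          init_irq_work(&rq->semaphore_work, semaphore_work_cb);
                          irq_work_queue(&rq->semaphore_work);
                  }

                  return NOTIFY_DONE;
          }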
  2. 12 Feb 2020, 2 commits
  3. 06 Jan 2020, 1 commit
  4. 25 Dec 2019, 1 commit
  5. 23 Dec 2019, 1 commit
  6. 20 Dec 2019, 2 commits
  7. 18 Dec 2019, 1 commit
  8. 14 Dec 2019, 1 commit
  9. 12 Oct 2019, 1 commit
  10. 10 Oct 2019, 1 commit
  11. 04 Oct 2019, 4 commits
  12. 23 Sep 2019, 1 commit
  13. 20 Sep 2019, 1 commit
    • drm/i915: Mark i915_request.timeline as a volatile, rcu pointer · d19d71fc
      Authored by Chris Wilson
      The request->timeline is only valid until the request is retired (i.e.
      before it is completed). Upon retiring the request, the context may be
      unpinned and freed, and along with it the timeline may be freed. We
      therefore need to be very careful when chasing rq->timeline that the
      pointer does not disappear beneath us. The vast majority of users are in
      a protected context, either during request construction or retirement,
      where the timeline->mutex is held and the timeline cannot disappear. It
      is those few off the beaten path (where we access a second timeline) that
      need extra scrutiny -- to be added in the next patch after first adding
      the warnings about dangerous access.
      
      One complication, where we cannot use the timeline->mutex itself, is
      during request submission onto hardware (under spinlocks). Here, we want
      to check on the timeline to finalize the breadcrumb, and so we need to
      impose a second rule to ensure that the request->timeline is indeed
      valid. As we are submitting the request, its context and timeline must
      be pinned, as it will be used by the hardware. Since it is pinned, we
      know the request->timeline must still be valid, and we cannot submit the
      idle barrier until after we release the engine->active.lock, ergo while
      submitting and holding that spinlock, a second thread cannot release the
      timeline.
      
      v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
      as we need it, and tidy up acquiring the timeline with a bit of
      refactoring (i915_active_add_request)
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
      d19d71fc
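      As a sketch of what the rcu marking buys those off-path readers
      (hypothetical helper; the hardened accessors land in the next patch,
      and this assumes the timeline itself is freed under RCU):

          static u32 read_timeline_seqno(struct i915_request *rq)
          {
                  struct intel_timeline *tl;
                  u32 seqno = 0;

                  rcu_read_lock();
                  tl = rcu_dereference(rq->timeline);
                  if (tl) /* may be retired and freed at any moment */
                          seqno = tl->seqno;
                  rcu_read_unlock();

                  return seqno;
          }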
  14. 16 Aug 2019, 1 commit
  15. 14 Aug 2019, 1 commit
    • drm/i915: Push the wakeref->count deferral to the backend · a79ca656
      Authored by Chris Wilson
      If the backend wishes to defer the wakeref parking, make it responsible
      for unlocking the wakeref (i.e. bumping the counter). This allows it to
      time the unlock much more carefully in case it happens to need the
      wakeref to be active during its deferral.
      
      For instance, during engine parking we may choose to emit an idle
      barrier (a request). To do so, we borrow the engine->kernel_context
      timeline and to ensure exclusive access we keep the
      engine->wakeref.count as 0. However, to submit that request to HW may
      require an intel_engine_pm_get() (e.g. to keep the submission tasklet
      alive) and before we allow that we have to rewake our wakeref to avoid a
      recursive deadlock.
      
      <4> [257.742916] IRQs not enabled as expected
      <4> [257.742930] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:169 __local_bh_enable_ip+0xa9/0x100
      <4> [257.742936] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 btusb btrtl btbcm btintel snd_hda_intel snd_intel_nhlt bluetooth snd_hda_codec coretemp snd_hwdep crct10dif_pclmul snd_hda_core crc32_pclmul ecdh_generic ecc ghash_clmulni_intel snd_pcm r8169 realtek lpc_ich prime_numbers i2c_hid
      <4> [257.742991] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G     U  W         5.3.0-rc3-g5d0a06cd532c-drmtip_340+ #1
      <4> [257.742998] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F6 02/17/2015
      <4> [257.743008] RIP: 0010:__local_bh_enable_ip+0xa9/0x100
      <4> [257.743017] Code: 37 5b 5d c3 8b 80 50 08 00 00 85 c0 75 a9 80 3d 0b be 25 01 00 75 a0 48 c7 c7 f3 0c 06 ac c6 05 fb bd 25 01 01 e8 77 84 ff ff <0f> 0b eb 89 48 89 ef e8 3b 41 06 00 eb 98 e8 e4 5c f4 ff 5b 5d c3
      <4> [257.743025] RSP: 0018:ffffa78600003cb8 EFLAGS: 00010086
      <4> [257.743035] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000010302
      <4> [257.743042] RDX: 0000000080010302 RSI: 0000000000000000 RDI: 00000000ffffffff
      <4> [257.743050] RBP: ffffffffc0494bb3 R08: 0000000000000000 R09: 0000000000000001
      <4> [257.743058] R10: 0000000014c8f0e9 R11: 00000000fee2ff8e R12: ffffa23ba8c38008
      <4> [257.743065] R13: ffffa23bacc579c0 R14: ffffa23bb7db0f60 R15: ffffa23b9cc8c430
      <4> [257.743074] FS:  0000000000000000(0000) GS:ffffa23bbba00000(0000) knlGS:0000000000000000
      <4> [257.743082] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4> [257.743089] CR2: 00007fe477b20778 CR3: 000000011f72a000 CR4: 00000000001006f0
      <4> [257.743096] Call Trace:
      <4> [257.743104]  <IRQ>
      <4> [257.743265]  __i915_request_commit+0x240/0x5d0 [i915]
      <4> [257.743427]  ? __i915_request_create+0x228/0x4c0 [i915]
      <4> [257.743584]  __engine_park+0x64/0x250 [i915]
      <4> [257.743730]  ____intel_wakeref_put_last+0x1c/0x70 [i915]
      <4> [257.743878]  i915_sample+0x2ee/0x310 [i915]
      <4> [257.744030]  ? i915_pmu_cpu_offline+0xb0/0xb0 [i915]
      <4> [257.744040]  __hrtimer_run_queues+0x11e/0x4b0
      <4> [257.744068]  hrtimer_interrupt+0xea/0x250
      <4> [257.744079]  ? lockdep_hardirqs_off+0x79/0xd0
      <4> [257.744101]  smp_apic_timer_interrupt+0x96/0x280
      <4> [257.744114]  apic_timer_interrupt+0xf/0x20
      <4> [257.744125] RIP: 0010:__do_softirq+0xb3/0x4ae
      
      v2: Keep the priority_hint assert
      v3: That assert was desperately trying to point out my bug. Sorry, little
      assert.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111378
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190813190705.23869-1-chris@chris-wilson.co.uk
      a79ca656
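      An illustrative toy model of the rule this patch establishes (not
      the i915 code): the backend's park callback re-arms the count itself
      before doing anything that may recursively take the wakeref:

          struct wakeref {
                  atomic_t count; /* 0 while parked or parking */
          };

          static int engine_park(struct wakeref *wf)
          {
                  /*
                   * We enter with wf->count == 0, i.e. exclusive access
                   * to the kernel context timeline. Emitting the idle
                   * barrier may itself need the wakeref active, so the
                   * backend unlocks it (0 -> 1) before submitting,
                   * avoiding the recursive 0 -> 1 deadlock.
                   */
                  atomic_set(&wf->count, 1); /* backend-owned unlock */

                  /* ... emit the idle barrier request here ... */

                  return -EBUSY; /* parking deferred; retry on next put */
          }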
  16. 10 Jul 2019, 1 commit
  17. 21 Jun 2019, 1 commit
  18. 20 Jun 2019, 1 commit
    • drm/i915/execlists: Preempt-to-busy · 22b7a426
      Authored by Chris Wilson
      When using a global seqno, we required a precise stop-the-world event to
      handle preemption and unwind the global seqno counter. To accomplish
      this, we would preempt to a special out-of-band context and wait for the
      machine to report that it was idle. Given an idle machine, we could very
      precisely see which requests had completed and which we needed to feed
      back into the run queue.
      
      However, now that we have scrapped the global seqno, we no longer need
      to precisely unwind the global counter and only track requests by their
      per-context seqno. This allows us to loosely unwind inflight requests
      while scheduling a preemption, with the enormous caveat that the
      requests we put back on the run queue are still _inflight_ (until the
      preemption request is complete). This makes request tracking much more
      messy, as at any point then we can see a completed request that we
      believe is not currently scheduled for execution. We also have to be
      careful not to rewind RING_TAIL past RING_HEAD on preempting to the
      running context, and for this we use a semaphore to prevent completion
      of the request before continuing.
      
      To accomplish this feat, we change how we track requests scheduled to
      the HW. Instead of appending our requests onto a single list as we
      submit, we track each submission to ELSP as its own block. Then upon
      receiving the CS preemption event, we promote the pending block to the
      inflight block (discarding what was previously being tracked). As normal
      CS completion events arrive, we then remove stale entries from the
      inflight tracker.
      
      v2: Be a tinge paranoid and ensure we flush the write into the HWS page
      for the GPU semaphore to pick up in a timely fashion.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
      22b7a426
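      A condensed sketch of the new tracking, following the upstream
      struct intel_engine_execlists fields added by this patch:

          /* Each ELSP submission is its own NULL-terminated block. */
          struct i915_request *inflight[EXECLIST_MAX_PORTS + 1]; /* on HW */
          struct i915_request *pending[EXECLIST_MAX_PORTS + 1];  /* in ELSP */

          /* Condensed from process_csb(): on the CS event acking the
           * submission, promote the pending block to inflight; on a
           * normal completion, just step past the retired head. */
          if (promote) {
                  memcpy(execlists->inflight, execlists->pending,
                         sizeof(execlists->pending));
                  execlists->active = execlists->inflight;
                  WRITE_ONCE(execlists->pending[0], NULL);
          } else {
                  execlists->active++;
          }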
  19. 15 Jun 2019, 1 commit
  20. 22 May 2019, 2 commits
  21. 27 Apr 2019, 1 commit
  22. 25 Apr 2019, 2 commits
    • drm/i915: Invert the GEM wakeref hierarchy · 79ffac85
      Authored by Chris Wilson
      In the current scheme, on submitting a request we take a single global
      GEM wakeref, which trickles down to wake up all GT power domains. This
      is undesirable as we would like to be able to localise our power
      management to the available power domains and to remove the global GEM
      operations from the heart of the driver. (The intent there is to push
      global GEM decisions to the boundary as used by the GEM user interface.)
      
      Now during request construction, each request is responsible via its
      logical context to acquire a wakeref on each power domain it intends to
      utilize. Currently, each request takes a wakeref on the engine(s) and
      the engines themselves take a chipset wakeref. This gives us a
      transition on each engine which we can extend if we want to insert more
      power management control (such as soft rc6). The global GEM operations
      that currently require a struct_mutex are reduced to listening to pm
      events from the chipset GT wakeref. As we reduce the struct_mutex
      requirement, these listeners should evaporate.
      
      Perhaps the biggest immediate change is that this removes the
      struct_mutex requirement around GT power management, allowing us greater
      flexibility in request construction. Another important knock-on effect,
      is that by tracking engine usage, we can insert a switch back to the
      kernel context on that engine immediately, avoiding any extra delay or
      inserting global synchronisation barriers. This makes tracking when an
      engine and its associated contexts are idle much easier -- important for
      when we forgo our assumed execution ordering and need idle barriers to
      unpin used contexts. In the process, it means we remove a large chunk of
      code whose only purpose was to switch back to the kernel context.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
      79ffac85
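      In code terms the inverted hierarchy looks roughly like this
      (intel_engine_pm_get/put are the upstream entry points; the wrapper
      function is hypothetical):

          static void example_request_pm(struct intel_engine_cs *engine)
          {
                  /* Engine wakeref; on 0 -> 1 it takes the GT (chipset)
                   * wakeref in turn - no global struct_mutex needed. */
                  intel_engine_pm_get(engine);

                  /* ... construct and submit the request ... */

                  /* On the final put, the engine immediately switches
                   * back to the kernel context (the idle barrier). */
                  intel_engine_pm_put(engine);
          }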
    • drm/i915: Pass intel_context to i915_request_create() · 2ccdf6a1
      Authored by Chris Wilson
      Start acquiring the logical intel_context and using that as our primary
      means for request allocation. This is the initial step to allow us to
      avoid requiring struct_mutex for request allocation along the
      perma-pinned kernel context, but it also provides a foundation for
      breaking up the complex request allocation to handle different scenarios
      inside execbuf.
      
      For the purpose of emitting a request from inside retirement (see the
      next patch for engine power management), we also need to lift control
      over the timeline mutex to the caller.
      
      v2: Note that the request carries the active reference upon construction.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
      2ccdf6a1
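      The resulting allocation flow, using the API names this patch
      introduces (for emitting from inside retirement, __i915_request_create()
      is the variant that leaves the timeline mutex to the caller):

          static int emit_on_context(struct intel_context *ce)
          {
                  struct i915_request *rq;

                  rq = i915_request_create(ce); /* takes the timeline mutex */
                  if (IS_ERR(rq))
                          return PTR_ERR(rq);

                  /* ... emit commands into rq->ring ... */

                  i915_request_add(rq); /* commit, release timeline mutex */
                  return 0;
          }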
  23. 11 Apr 2019, 1 commit
    • drm/i915: Bump ready tasks ahead of busywaits · b7404c7e
      Authored by Chris Wilson
      Consider two tasks that are running in parallel on a pair of engines
      (vcs0, vcs1), but then must complete on a shared engine (rcs0). To
      maximise throughput, we want to run the first ready task on rcs0 (i.e.
      the first task that completes on either of vcs0 or vcs1). When using
      semaphores, however, we will instead queue onto rcs0 in submission order.
      
      To resolve this incorrect ordering, we want to re-evaluate the priority
      queue as each request becomes ready. Normally this happens because
      we only insert into the priority queue requests that are ready, but with
      semaphores we are inserting ahead of their readiness and to compensate
      we penalize those tasks with reduced priority (so that tasks that do not
      need to busywait should naturally be run first). However, given a series
      of tasks that each use semaphores, the queue degrades into submission
      fifo rather than readiness fifo, and so to counter this we give a small
      boost to semaphore users as their dependent tasks are completed (and so
      we no longer require any busywait prior to running the user task as they
      are then ready themselves).
      
      v2: Fixup irqsave for schedule_lock (Tvrtko)
      
      Testcase: igt/gem_exec_schedule/semaphore-codependency
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
      Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190409152922.23894-1-chris@chris-wilson.co.uk
      b7404c7e
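      A condensed sketch of the boost, close to the callback this patch
      adds (SEMAPHORE_BOOST stands in for the internal priority bonus,
      whose exact value evolved over later patches; the deferral to a
      worker came later, in commit 14a0d527 above):

          static int __i915_sw_fence_call
          semaphore_notify(struct i915_sw_fence *fence,
                           enum i915_sw_fence_notify state)
          {
                  struct i915_request *rq =
                          container_of(fence, typeof(*rq), semaphore);

                  if (state == FENCE_COMPLETE)
                          /* Signaler done: lift the semaphore penalty so
                           * the now-ready request re-sorts ahead of
                           * still-busywaiting peers. */
                          i915_schedule_bump_priority(rq, SEMAPHORE_BOOST);

                  return NOTIFY_DONE;
          }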
  24. 05 Apr 2019, 1 commit
  25. 22 Mar 2019, 1 commit
    • drm/i915: Allow contexts to share a single timeline across all engines · ea593dbb
      Authored by Chris Wilson
      Previously, our view has been always to run the engines independently
      within a context. (Multiple engines happened before we had contexts and
      timelines, so they always operated independently and that behaviour
      persisted into contexts.) However, at the user level the context often
      represents a single timeline (e.g. GL contexts) and userspace must
      ensure that the individual engines are serialised to present that
      ordering to the client (or forget about this detail entirely and hope no
      one notices - a fair ploy if the client can only directly control one
      engine themselves ;)
      
      In the next patch, we will want to construct a set of engines that
      operate as one, that have a single timeline interwoven between them, to
      present a single virtual engine to the user. (They submit to the virtual
      engine, and then we decide which engine to execute on.)
      
      To that end, we want to be able to create contexts which have a single
      timeline (fence context) shared between all engines, rather than multiple
      timelines.
      
      v2: Move the specialised timeline ordering to its own function.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190322092325.5883-4-chris@chris-wilson.co.uk
      ea593dbb
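      From userspace, the sharing is requested at context creation; a
      sketch assuming the uAPI flag this series introduces:

          #include <errno.h>
          #include <xf86drm.h>
          #include <drm/i915_drm.h>

          static int create_single_timeline_ctx(int fd, __u32 *ctx_id)
          {
                  struct drm_i915_gem_context_create_ext arg = {
                          .flags = I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
                  };

                  if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &arg))
                          return -errno;

                  /* Requests on all engines of this context now join a
                   * single fence timeline and execute in order. */
                  *ctx_id = arg.ctx_id;
                  return 0;
          }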
  26. 06 Mar 2019, 1 commit
  27. 02 Mar 2019, 2 commits
    • drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+ · e8861964
      Authored by Chris Wilson
      Having introduced per-context seqno, we now have a means to identify
      progress across the system without fear of the rollback that befell
      the global_seqno. That is, we can program a MI_SEMAPHORE_WAIT operation in
      advance of submission safe in the knowledge that our target seqno and
      address is stable.
      
      However, since we are telling the GPU to busy-spin on the target address
      until it matches the signaling seqno, we only want to do so when we are
      sure that busy-spin will be completed quickly. To achieve this we only
      submit the request to HW once the signaler is itself executing (modulo
      preemption causing us to wait longer), and we only do so for default and
      above priority requests (so that idle priority tasks never themselves
      hog the GPU waiting for others).
      
      As might be reasonably expected, HW semaphores excel in inter-engine
      synchronisation microbenchmarks (where the 3x reduced latency / increased
      throughput more than offset the power cost of spinning on a second ring)
      and show a significant improvement (up to ~10%; most see no change)
      for single clients that utilize multiple engines (typically media players
      and transcoders), without regressing multiple clients that can saturate
      the system or changing the power envelope dramatically.
      
      v3: Drop the older NEQ branch, now we pin the signaler's HWSP anyway.
      v4: Tell the world and include it as part of scheduler caps.
      
      Testcase: igt/gem_exec_whisper
      Testcase: igt/benchmarks/gem_wsim
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-3-chris@chris-wilson.co.uk
      e8861964
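      Condensed from the patch, the wait is a single MI_SEMAPHORE_WAIT
      poll against the signaler's HWSP seqno (the upstream version also
      looks up hwsp_offset via i915_timeline_read_hwsp()):

          static int emit_semaphore_wait(struct i915_request *to,
                                         struct i915_request *from,
                                         u32 hwsp_offset)
          {
                  u32 *cs;

                  cs = intel_ring_begin(to, 4);
                  if (IS_ERR(cs))
                          return PTR_ERR(cs);

                  /* Busy-spin on the GPU until *hwsp >= from's seqno. */
                  *cs++ = MI_SEMAPHORE_WAIT |
                          MI_SEMAPHORE_GLOBAL_GTT |
                          MI_SEMAPHORE_POLL |
                          MI_SEMAPHORE_SAD_GTE_SDD;
                  *cs++ = from->fence.seqno;
                  *cs++ = hwsp_offset;
                  *cs++ = 0;

                  intel_ring_advance(to, cs);
                  return 0;
          }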
    • drm/i915: Keep timeline HWSP allocated until idle across the system · ebece753
      Authored by Chris Wilson
      In preparation for enabling HW semaphores, we need to keep the in-flight
      timeline HWSP alive until its use across the entire system has completed,
      as any other timeline active on the GPU may still refer back to the
      already retired timeline. We both have to delay recycling available
      cachelines and unpinning old HWSP until the next idle point.
      
      An easy option would be to simply keep all used HWSP until the system as
      a whole was idle, i.e. we could release them all at once on parking.
      However, on a busy system, we may never see a global idle point,
      essentially meaning the resource will be leaked until we are forced to
      do a GC pass. We already employ a fine-grained idle detection mechanism
      for vma, which we can reuse here so that each cacheline can be freed
      immediately after the last request using it is retired.
      
      v3: Keep track of the activity of each cacheline.
      v4: cacheline_free() on canceling the seqno tracking
      v5: Finally with a testcase to exercise wraparound
      v6: Pack cacheline into empty bits of page-aligned vaddr
      v7: Use i915_utils to hide the pointer casting around bit manipulation
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190301170901.8340-2-chris@chris-wilson.co.uk
      ebece753
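      A condensed view of the per-cacheline tracking (names approximate
      the upstream struct i915_timeline_cacheline):

          /* Each HWSP cacheline carries its own activity tracker, so it
           * is recycled as soon as the last request referencing it is
           * retired, instead of waiting for a global idle point. */
          struct i915_timeline_cacheline {
                  struct i915_active active; /* retires -> cacheline_free() */
                  struct i915_timeline_hwsp *hwsp;
                  void *vaddr; /* page address | cacheline index, packed
                                * with the i915_utils ptr helpers (v6/v7) */
          };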
  28. 28 Feb 2019, 1 commit
  29. 26 Feb 2019, 1 commit
  30. 06 Feb 2019, 1 commit
  31. 30 Jan 2019, 2 commits
    • drm/i915: Replace global breadcrumbs with per-context interrupt tracking · 52c0fdb2
      Authored by Chris Wilson
      A few years ago, see commit 688e6c72 ("drm/i915: Slaughter the
      thundering i915_wait_request herd"), the issue of handling multiple
      clients waiting in parallel was brought to our attention. The
      requirement was that every client should be woken immediately upon its
      request being signaled, without incurring any cpu overhead.
      
      Handling certain fragility of our hw meant that we could not do a
      simple check inside the irq handler (some generations required almost
      unbounded delays before we could be sure of seqno coherency) and so
      request completion checking required delegation.
      
      Before commit 688e6c72, the solution was simple. Every client
      waiting on a request would be woken on every interrupt and each would do
      a heavyweight check to see if their request was complete. Commit
      688e6c72 introduced an rbtree so that only the earliest waiter on
      the global timeline would be woken, and it would in turn wake the next and so on.
      (Along with various complications to handle requests being reordered
      along the global timeline, and also a requirement for kthread to provide
      a delegate for fence signaling that had no process context.)
      
      The global rbtree depends on knowing the execution timeline (and global
      seqno). Without knowing that order, we must instead check all contexts
      queued to the HW to see which may have advanced. We trim that list by
      only checking queued contexts that are being waited on, but still we
      keep a list of all active contexts and their active signalers that we
      inspect from inside the irq handler. By moving the waiters onto the fence
      signal list, we can combine the client wakeup with the dma_fence
      signaling (a dramatic reduction in complexity, but does require the HW
      being coherent, the seqno must be visible from the cpu before the
      interrupt is raised - we keep a timer backup just in case).
      
      Having previously fixed all the issues with irq-seqno serialisation (by
      inserting delays onto the GPU after each request instead of random delays
      on the CPU after each interrupt), we can rely on the seqno state to
      perform direct wakeups from the interrupt handler. This allows us to
      preserve our single context switch behaviour of the current routine,
      with the only downside that we lose the RT priority sorting of wakeups.
      In general, direct wakeup latency of multiple clients is about the same
      (about 10% better in most cases) with a reduction in total CPU time spent
      in the waiter (about 20-50% depending on gen). Average herd behaviour is
      improved, but at the cost of not delegating wakeups on task_prio.
      
      v2: Capture fence signaling state for error state and add comments to
      warm even the most cold of hearts.
      v3: Check if the request is still active before busywaiting
      v4: Reduce the amount of pointer misdirection with list_for_each_safe
      and using a local i915_request variable inside the loops
      v5: Add a missing pluralisation to a purely informative selftest message.
      
      References: 688e6c72 ("drm/i915: Slaughter the thundering i915_wait_request herd")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129205230.19056-2-chris@chris-wilson.co.uk
      52c0fdb2
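      A condensed sketch of the irq-side walk after this patch (from the
      reworked intel_breadcrumbs.c; b is &engine->breadcrumbs):

          /* Check only contexts that have signaling requests; each
           * ce->signals list is in submission order, so stop at the
           * first incomplete request. */
          list_for_each_entry(ce, &b->signalers, signal_link) {
                  list_for_each_entry_safe(rq, rn, &ce->signals, signal_link) {
                          if (!__request_completed(rq))
                                  break;

                          list_del(&rq->signal_link);
                          dma_fence_signal(&rq->fence); /* wake all waiters */
                  }
          }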
    • drm/i915: Identify active requests · 85474441
      Authored by Chris Wilson
      To allow requests to forgo a common execution timeline, one question we
      need to be able to answer is "is this request running?". To track
      whether a request has started on HW, we can emit a breadcrumb at the
      beginning of the request and check its timeline's HWSP to see if the
      breadcrumb has advanced past the start of this request. (This is in
      contrast to the global timeline where we need only ask if we are on the
      global timeline and if the timeline has advanced past the end of the
      previous request.)
      
      There is still confusion from a preempted request, which has already
      started but relinquished the HW to a high priority request. For the
      common case, this discrepancy should be negligible. However, for
      identification of hung requests, knowing which one was running at the
      time of the hang will be much more important.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190129185452.20989-2-chris@chris-wilson.co.uk
      85474441
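      The check itself reduces to comparing the timeline's HWSP against
      the seqno just before this request, as in the upstream helper:

          /* A request has started once the breadcrumb emitted at its
           * head has executed, i.e. once the HWSP seqno passes that of
           * its predecessor on the same timeline. */
          static inline bool
          __i915_request_has_started(const struct i915_request *rq)
          {
                  return i915_seqno_passed(hwsp_seqno(rq),
                                           rq->fence.seqno - 1);
          }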